$$%% examples \newcommand{\exGraph}{\graph_{\mathrm{ex}}} \newcommand{\exOnto}{\onto_{\mathrm{ex}}} \newcommand{\exMappings}{\mappings_{\mathrm{ex}}} \newcommand{\exExtensions}{\extensions_{\mathrm{ex}}} \newcommand{\exRule}{r_{\mathrm{ex}}} \newcommand{\RDFSrules}{\rules_{\mathrm{RDFS}}} %% RDF \newcommand{\triple}[3]{(#1, #2, #3)} \newcommand{\tuple}[1]{\langle #1 \rangle} \newcommand{\subject}{\mathtt{s}} \newcommand{\prop}{\mathtt{p}} \newcommand{\object}{\mathtt{o}} \newcommand{\blank}{\_{:}b} \newcommand{\blankn}[1]{\_{:}#1} \newcommand{\irin}[1]{{:}\mathrm{#1}} \newcommand{\class}{\mathtt{c}} \newcommand{\nsrdf}{\mathrm{rdf{:}}} \newcommand{\nsrdfs}{\mathrm{rdfs{:}}} \newcommand{\rdftype}{\mathrm{rdf{:}type}} \newcommand{\rdfLiteral}{\mathrm{rdf{:}Literal}} \newcommand{\rdfssubClassOf}{\mathrm{rdfs{:}subClassOf}} \newcommand{\rdfssubPropertyOf}{\mathrm{rdfs{:}subPropertyOf}} \newcommand{\rdfsdomain}{\mathrm{rdfs{:}domain}} \newcommand{\rdfsrange}{\mathrm{rdfs{:}range}} \newcommand{\rdfsClass}{\mathrm{rdfs{:}Class}} \newcommand{\rdfProperty}{\mathrm{rdf{:}Property}} \newcommand{\xsdint}{\mathrm{xsd{:}int}} %% \newcommand{\type}{\tau} \newcommand{\subclass}{\prec_{sc}} \newcommand{\subproperty}{\prec_{sp}} \newcommand{\domain}{\hookleftarrow_{d}} \newcommand{\range}{\hookrightarrow_{r}} \newcommand{\rdfentailment}{\vdash_{^\mathrm{RDF}}} \newcommand{\RDFS}[1]{\mathrm{RDFS}(#1)} \newcommand{\aka}{a.k.a.~} \newcommand{\etc}{etc} \newcommand{\wrt}{w.r.t.~} \newcommand{\st}{s.t.~} \newcommand{\ie}{i.e.,~} \newcommand{\eg}{e.g.,~} \newcommand{\graph}{G} \newcommand{\rules}{\mathcal{R}} \newcommand{\sources}{\mathcal{S}} \newcommand{\views}{\mathcal{V}} \newcommand{\extensions}{\mathcal{E}} \newcommand{\onto}{\mathcal{O}} \newcommand{\mappings}{\mathcal{M}} \newcommand{\modelsrdf}{\models_\rules} \newcommand{\bgp}{P} \newcommand{\Bl}[1]{\mathrm{Bl}(#1)} \newcommand{\Val}[1]{\mathrm{Val}(#1)} \newcommand{\Var}[1]{\mathrm{Var(#1)}} 
\newcommand{\ext}[1]{\mathrm{ext}(#1)} \newcommand{\cert}{\mathrm{cert}} \newcommand{\ans}{\mathrm{ans}} \newcommand{\query}{\leftarrow} \newcommand{\body}[1]{\textrm{body}(#1)} \newcommand{\head}[1]{\textrm{head}(#1)} \newcommand{\cs}{\mathrm{cs}} \newcommand{\lcs}{\mathrm{lcs}} \newcommand{\cl}{\mathrm{cl}} \newcommand{\lua}{\mathrm{lua}} \newcommand{\lur}{\mathrm{lur}} \newtheorem{lemma}{Lemma} \newtheorem{definition}{Definition} \newtheorem{problem}{Problem} \newtheorem{property}{Property} \newtheorem{corollary}{Corollary} \newtheorem{example}{Example} \newtheorem{theorem}{Theorem} \newcommand{\URIs}{\mathscr U} \newcommand{\IRIs}{\mathscr I} \newcommand{\BNodes}{\mathscr B} \newcommand{\Literals}{\mathscr L} \newcommand{\Variables}{\mathscr V} % DB \newcommand{\CQ}{\ensuremath{\mathtt{CQ}}\xspace} \newcommand{\UCQ}{\ensuremath{\mathtt{UCQ}}\xspace} \newcommand{\SQL}{\ensuremath{\mathtt{SQL}}\xspace} \newcommand{\rel}[1]{\mathsf{#1}} % Cost model \newcommand{\cans}[1]{|#1|_t} \newcommand{\cref}[1]{|#1|_r} \newcommand{\db}{\mathtt{db}} % DL \newcommand{\cn}{\ensuremath{N_{C}}\xspace} \newcommand{\rn}{\ensuremath{N_{R}}\xspace} \newcommand{\inds}{\ensuremath{N_{I}}\xspace} \newcommand{\ainds}{\ensuremath{\mathrm{Ind}}\xspace} \newcommand{\funct}{\mathit{funct} \ } \newcommand{\KB}{\mathcal{K}\xspace} \newcommand{\dlr}{DL-Lite$_{\mathcal{R}}$\xspace} % Logics \newcommand{\FOL}{\ensuremath{\mathtt{FOL}}\xspace} \newcommand{\datalog}{\ensuremath{\mathtt{Datalog}}\xspace} \newcommand{\dllite}{DL-Lite\xspace} \newcommand{\true}{\mathrm{true}} \newcommand{\false}{\mathrm{false}} \newcommand{\dis}{\mathtt{dis}} \newcommand{\vars}[1]{\ensuremath{\mathrm{vars}(#1)}} %\newcommand{\terms}[1]{\ensuremath{\mathrm{terms}(#1)}} %math \renewcommand{\phi}{\varphi} \newcommand\eqdef{\stackrel{\mathclap{\normalfont\mbox{def}}}{=}} \newcommand\restr[2]{#1_{|#2}} \newcommand{\ontoBody}[1]{\mathrm{body}_\onto(#1)} %proof of the rewriting theorem 
\newcommand{\rdfGraph}{\graph^{\mappings}_{\extensions}} \newcommand\systemGraph{\graph^{\mappings \cup \mappings^{\text{STD}}_\onto}_{\extensions \cup \extensions_\onto}} \newcommand\viewsGraph{\graph^{\mappings^{\rules,\onto} \cup \mappings^{\text{STD}}_\onto}_{\extensions \cup \extensions_\onto}} \newcommand{\standMappings}{\mappings^{\text{STD}}_\onto} \newcommand{\reminder}[1]{[\vadjust{\vbox to0pt{\vss\hbox to0pt{\hss{\Large $\Longrightarrow$}}}}{{\textsf{\small #1}}}]} %\newcommand{\FG}[1]{\textcolor{blue}{\reminder{FG:~#1}}} \newcommand{\extVersion}{false} \newcommand{\printIfExtVersion}[2] { \ifthenelse{\equal{\extVersion}{true}}{#1}{} \ifthenelse{\equal{\extVersion}{false}}{#2}{} } \newcommand{\bda}{\true} \newcommand{\ifBDA}[2]% {% \ifthenelse{\equal{\bda}{true}}{#1}{}% \ifthenelse{\equal{\bda}{false}}{#2}{}% } %%% Local Variables: %%% TeX-master: "paper" %%% End: $$

BSBM-Based Benchmarks

1. Experiments Based on BSBM

In the experiments, we define four RIS:

  • S1 = (O1, R, M1, E1) with one small relational datasource,
  • S2 = (O2, R, M2, E2) with one big relational datasource,
  • S3 = (O1, R, M3, E3) with two small heterogeneous datasources,
  • S4 = (O2, R, M4, E4) with two big heterogeneous datasources.

NEW: You can visualize the small RIS using Obi-Wan.

We experiment with four approaches:

  • REW-CA reformulates the query w.r.t. Ra and Rc, then rewrites the reformulations using the mappings,
  • REW-C reformulates the query w.r.t. Rc, then rewrites the reformulations using the mappings saturated w.r.t. Ra,
  • MAT materializes, in pre-processing, the RDF graph induced by the mappings and saturates this materialization w.r.t. Ra and Rc; at query time, it evaluates the query on the saturated materialization,
  • MAT-CA materializes, in pre-processing, the RDF graph induced by the mappings; at query time, it reformulates the query w.r.t. Rc and Ra and evaluates the reformulations on the materialization.

1.2. Experiment for REW

Our experiments show that the REW approach is slower than REW-C: (i) at query time, REW behaves like REW-C when the query does not query the ontology part of the graph, and (ii) on the 6 queries below, which do query the ontology, the size of the rewriting explodes under REW in comparison with REW-C. The size of the rewriting (the number of queries in the union) is given in the column N_REW of the following tables, and the corresponding rewriting time, in ms, in the column T_REW.

This explosion of the rewriting size leads to an explosion of the time spent optimizing the rewriting. The optimization steps are called CLASH, COVER and CORE: respectively, they remove the rewritings whose answer set is empty according to the dictionary defining the extensions, minimize the union of rewritings by removing redundant ones, and compute the core of each rewriting (removing redundant query atoms). The columns prefixed by N_ contain the size of the rewriting after the corresponding optimization step, and the columns prefixed by T_ contain the corresponding time. The last column, T_OP, contains the time spent optimizing the execution plan in Tatooine; ongoing work may change this time, but not the previous ones.
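To make the COVER step concrete, here is a minimal sketch of union minimization based on the classical homomorphism test for conjunctive query containment. It is not the Tatooine implementation: the query representation (a head tuple and a list of triple atoms, with variables prefixed by `?`) and the function names are assumptions made for the illustration.

```python
def is_var(term):
    """Variables are written with a leading '?'; everything else is a constant."""
    return term.startswith("?")


def extend(mapping, src, dst):
    """Try to extend `mapping` so that the tuple `src` is sent onto `dst`."""
    if len(src) != len(dst):
        return None
    new = dict(mapping)
    for s, d in zip(src, dst):
        if is_var(s):
            if new.setdefault(s, d) != d:
                return None
        elif s != d:
            return None
    return new


def homomorphism(atoms, target, mapping):
    """Backtracking search for a mapping sending every atom into `target`."""
    if not atoms:
        return mapping
    for candidate in target:
        new = extend(mapping, atoms[0], candidate)
        if new is not None:
            found = homomorphism(atoms[1:], target, new)
            if found is not None:
                return found
    return None


def contained_in(q1, q2):
    """q1 is contained in q2 iff q2 maps homomorphically into q1, head onto head."""
    (head1, body1), (head2, body2) = q1, q2
    start = extend({}, head2, head1)
    return start is not None and homomorphism(body2, body1, start) is not None


def cover(union):
    """Drop every query contained in another one; among equivalent ones keep the first."""
    kept = []
    for i, q in enumerate(union):
        redundant = any(
            contained_in(q, other) and (j < i or not contained_in(other, q))
            for j, other in enumerate(union)
            if j != i
        )
        if not redundant:
            kept.append(q)
    return kept


# Hypothetical rewritings: the second one subsumes the first one.
q_specific = (("?x",), [("?x", "rdf:type", "bsbm:Product"),
                        ("?x", "bsbm:producer", "?p")])
q_general = (("?x",), [("?x", "rdf:type", "bsbm:Product")])
print(cover([q_specific, q_general]) == [q_general])
```

Each containment test is a backtracking search, and COVER compares the rewritings pairwise, which is consistent with the T_COVER values growing quickly with the size of the rewriting.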

The number of reformulations and the time of the reformulation process are given in the columns N_REF and T_REF, respectively. The number of triples in the query is given in the column N_TRI.

1.2.1. Experiment on small scenario

We observe that, in the small scenario, the rewriting produced by REW is larger than that of REW-C by a multiplicative factor ranging from 29 to 74. The first table concerns the REW approach and the second the REW-C approach.

Table 1: statistics of REW on S1 and S3
INPUT  N_TRI  N_REF  N_REW  N_CLASH  N_COVER  T_REF  T_REW  T_CLASH  T_COVER  T_CORE  T_OP
Q13    4      nan    1005   576      304      nan    106    1        119      247     175
Q13a   4      nan    2010   1152     606      nan    123    2        371      487     441
Q13b   4      nan    16080  9216     4400     nan    435    24       17531    4982    11909
Q22    4      nan    929    539      385      nan    70     0        76       54      126
Q22a   4      nan    929    539      385      nan    69     1        77       55      123
Q23    7      nan    5922   4474     856      nan    555    9        978      268     707

Table 2: statistics of REW-C on S1 and S3
INPUT  N_TRI  N_REF  N_REW  N_CLASH  N_COVER  T_REF  T_REW  T_CLASH  T_COVER  T_CORE  T_OP
Q13    4      16     25     nan      21       1      184    1        14       44      19
Q13a   4      16     50     nan      42       1      185    0        31       84      41
Q13b   4      16     400    nan      336      1      223    0        268      1166    354
Q22    4      2      2      nan      2        1      10     0        0        0       1
Q22a   4      24     32     nan      32       0      73     0        1        4       7
Q23    7      64     80     56.0     24       738    1650   0        10       39      16

1.2.2. Experiment on large scenario

We observe that, in the large scenario, the rewriting produced by REW is larger than that of REW-C by a multiplicative factor ranging from 33 to 969. The first table concerns the REW approach and the second the REW-C approach.

Table 3: statistics of REW on S2 and S4
INPUT  N_TRI  N_REF  N_REW  N_CLASH  N_COVER  T_REF  T_REW  T_CLASH  T_COVER  T_CORE  T_OP
Q13    4      nan    25886  12992    10018    nan    1660   44       98580    6960    72717
Q13a   4      nan                             nan    >10min
Q13b   4      nan                             nan    >10min
Q22    4      nan    12865  6459     6401     nan    1000   18       19320    859     19142
Q22a   4      nan    12865  6459     6401     nan    995    18       19646    855     19138
Q23    7      nan    77538  56954    12888    nan    7426   130      353692   2477    163584

Table 4: statistics of REW-C on S2 and S4
INPUT  N_TRI  N_REF  N_REW  N_CLASH  N_COVER  T_REF  T_REW  T_CLASH  T_COVER  T_CORE  T_OP
Q13    4      16     50     nan      42       4      3198   1        29       89      30
Q13a   4      16     3200   nan      2688     3      3504   9        5396     4922    6366
Q13b   4      16     3200   nan      2688     3      3177   3        5385     5366    5343
Q22    4      24     32     nan      32       1      1878   0        0        4       5
Q22a   4      200    384    nan      384      3      12417  3        100      47      110
Q23    7      64     80     56.0     24       642    21428  1        7        32      24
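
The multiplicative factors of the large scenario can be recomputed from the N_REW values of Tables 3 and 4. The short script below is only a convenience check; the dictionary names are ours, and Q13a and Q13b are left out because Table 3 reports no rewriting size for them (their T_REW entry reads >10min).

```python
# N_REW values copied from Table 3 (REW) and Table 4 (REW-C).
n_rew_rew = {"Q13": 25886, "Q22": 12865, "Q22a": 12865, "Q23": 77538}
n_rew_rewc = {"Q13": 50, "Q22": 32, "Q22a": 384, "Q23": 80}

# Ratio of the REW rewriting size to the REW-C rewriting size, per query.
factors = {q: n_rew_rew[q] / n_rew_rewc[q] for q in n_rew_rew}

for query, factor in factors.items():
    print(f"{query}: x{factor:.1f}")
print(f"min x{min(factors.values()):.1f}, max x{max(factors.values()):.1f}")
```

The smallest factor is obtained for Q22a and the largest for Q23.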

2. Ontologies and Mappings Definitions

2.1. Ontologies O1 and O2

The ontologies O1 and O2 are built as the union of the core ontology with the product type hierarchy for 1M triples and for 50M triples, respectively, as defined below.

2.1.1. Core Ontology for BSBM

We created an ontology for the BSBM benchmark according to the RDF schema. Some parts of this ontology are imported from the FOAF ontology. You can download the ontology NT file, which contains 26 classes and 36 properties used in 40 subclass, 32 subproperty, 42 domain and 16 range definitions. This ontology contains a small hierarchy for the countries, with 14 subclasses.

bsbm-onto.png

Figure 1: The BSBM core ontology

2.1.2. Product Type Hierarchy

  1. 1 million triples

    The ontology contains 150 subclass definitions. The subclass hierarchy of products is a balanced tree of depth 3 with 151 classes, 96 of which are used in the data.

    product-type-hierarchy1m.png

  2. 50 million triples

    The ontology contains 2010 subclass definitions. The hierarchy is a balanced tree of depth 4 with 2011 classes, 1280 of which are used in the data:

    product-type-hierarchy50m.png

2.2. RIS mappings for relational sources M1 and M2

RIS mappings are either GAV mappings or GLAV mappings. The RDF integration systems have:

  • 307 mappings in M1,
  • 3863 mappings in M2.

2.2.1. GAV Mappings

GAV mappings copy a part of the relational BSBM schema into star-shaped triples. The extensions of the GAV mappings contain:

  • 90 178 tuples in E1 for M1,
  • 4 806 514 tuples in E2 for M2.

Mappings for product properties:

<$product,$producer,$label,$comment,$propertyNum1,$propertyNum2,$propertyTex1,$propertyTex2,$propertyTex3,$propertyTex4,$propertyTex5,$propertyTex6,$publisher,$publishDate> :- 
	triple($product,<bsbm:producer>,$producer),
	triple($product,<http://www.w3.org/2000/01/rdf-schema#label>,$label),
	triple($product,<http://www.w3.org/2000/01/rdf-schema#comment>,$comment),
	triple($product,<bsbm:productPropertyNumeric1>,$propertyNum1),
	triple($product,<bsbm:productPropertyNumeric2>,$propertyNum2),
	triple($product,<bsbm:productPropertyTextual1>,$propertyTex1),
	triple($product,<bsbm:productPropertyTextual2>,$propertyTex2),
	triple($product,<bsbm:productPropertyTextual3>,$propertyTex3),
	triple($product,<bsbm:productPropertyTextual4>,$propertyTex4),
	triple($product,<bsbm:productPropertyTextual5>,$propertyTex5),
	triple($product,<bsbm:productPropertyTextual6>,$propertyTex6),
	triple($product,<http://purl.org/dc/elements/1.1/publisher>,$publisher),
	triple($product,<http://purl.org/dc/elements/1.1/date>,$publishDate);
	^
	|
V1(<bsbm-int:dataFromProducer{3}/Product{0}>, <bsbm-int:dataFromProducer{3}/Producer{3}>, "{1}", "{2}", "{4}", "{5}", "{6}", "{7}", "{8}", "{9}", "{10}", "{11}", <bsbm-int:dataFromProducer{3}/Producer{3}>, "{13}")
	^
	|
(V0): SELECT nr, label, comment, producer, propertyNum1, propertyNum2, propertyTex1, propertyTex2, propertyTex3, propertyTex4, propertyTex5, propertyTex6, publisher, publishDate FROM Product

Mapping for product type properties:

<$type,$label,$comment,$publisher,$publishDate> :- 
	triple($type,<rdf:type>,<bsbm:ProductType>),
	triple($type,<http://www.w3.org/2000/01/rdf-schema#label>,$label),
	triple($type,<http://www.w3.org/2000/01/rdf-schema#comment>,$comment),
	triple($type,<http://purl.org/dc/elements/1.1/publisher>,$publisher),
	triple($type,<http://purl.org/dc/elements/1.1/date>,$publishDate);
	^
	|
V3(<bsbm-int:ProductType{0}>, "{1}", "{2}", <bsbm-int:StandardizationInstitution{3}>, "{4}")
	^
	|
(V2): SELECT nr, label, comment, publisher, publishDate FROM ProductType
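
A mapping listing can be read bottom-up: the SQL query produces tuples, the view turns each tuple into RDF terms, and the head variables bound to these terms instantiate the triples of the body. The sketch below replays this on the product type mapping, assuming that `{i}` denotes the i-th column (counting from 0) of the SQL result; the sample row is made up.

```python
# Term templates of view V3, copied from the mapping above.
V3_TEMPLATES = [
    "<bsbm-int:ProductType{0}>",
    '"{1}"',
    '"{2}"',
    "<bsbm-int:StandardizationInstitution{3}>",
    '"{4}"',
]
# Head variables, in the order of the view arguments.
HEAD_VARS = ["$type", "$label", "$comment", "$publisher", "$publishDate"]
# Triples of the mapping body.
BODY = [
    ("$type", "<rdf:type>", "<bsbm:ProductType>"),
    ("$type", "<http://www.w3.org/2000/01/rdf-schema#label>", "$label"),
    ("$type", "<http://www.w3.org/2000/01/rdf-schema#comment>", "$comment"),
    ("$type", "<http://purl.org/dc/elements/1.1/publisher>", "$publisher"),
    ("$type", "<http://purl.org/dc/elements/1.1/date>", "$publishDate"),
]


def instantiate(row):
    """Return the triples induced by one tuple of the SQL query (V2)."""
    binding = {var: tpl.format(*row) for var, tpl in zip(HEAD_VARS, V3_TEMPLATES)}
    return [tuple(binding.get(term, term) for term in atom) for atom in BODY]


# Made-up tuple (nr, label, comment, publisher, publishDate).
sample_row = (12, "sample label", "sample comment", 1, "2008-01-01")
for triple in instantiate(sample_row):
    print(triple)
```

All the triples share the same subject, which is the star shape mentioned at the beginning of this section.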

Mapping for producer properties:

<$producer,$label,$comment,$homepage,$publisher,$publishDate> :- 
	triple($producer,<rdf:type>,<bsbm:Producer>),
	triple($producer,<http://www.w3.org/2000/01/rdf-schema#label>,$label),
	triple($producer,<http://www.w3.org/2000/01/rdf-schema#comment>,$comment),
	triple($producer,<http://xmlns.com/foaf/0.1/homepage>,$homepage),
	triple($producer,<http://purl.org/dc/elements/1.1/publisher>,$publisher),
	triple($producer,<http://purl.org/dc/elements/1.1/date>,$publishDate);
	^
	|
V11(<bsbm-int:dataFromProducer{0}/Producer{0}>, "{1}", "{2}", <{3}>, <bsbm-int:dataFromProducer{4}/Producer{4}>, "{5}")
	^
	|
(V10): SELECT nr, label, comment, homepage, publisher, publishDate FROM Producer

Mapping for offer properties:

<$offer,$product,$vendor,$price,$validFrom,$validTo,$deliveryDays,$offerWebpage,$publisher,$publishDate> :- 
	triple($offer,<rdf:type>,<bsbm:Offer>),
	triple($offer,<bsbm:product>,$product),
	triple($offer,<bsbm:vendor>,$vendor),
	triple($offer,<bsbm:price>,$price),
	triple($offer,<bsbm:validFrom>,$validFrom),
	triple($offer,<bsbm:validTo>,$validTo),
	triple($offer,<bsbm:deliveryDays>,$deliveryDays),
	triple($offer,<bsbm:offerWebpage>,$offerWebpage),
	triple($offer,<http://purl.org/dc/elements/1.1/publisher>,$publisher),
	triple($offer,<http://purl.org/dc/elements/1.1/date>,$publishDate);
	^
	|
V17(<bsbm-int:dataFromVendor{3}/Offer{0}>, <bsbm-int:dataFromProducer{2}/Product{1}>, <bsbm-int:dataFromVendor{3}/Vendor{3}>, "{4}", "{5}", "{6}", "{7}", <{8}>, <bsbm-int:dataFromVendor{9}/Vendor{9}>, "{10}")
	^
	|
(V16): SELECT nr, product, producer, vendor, price, validFrom, validTo, deliveryDays, offerWebpage, publisher, publishDate FROM Offer

Mapping for review properties:

<$review,$product,$reviewDate,$title,$text,$rating1,$rating2,$publisher,$publishDate> :- 
	triple($review,<bsbm:reviewFor>,$product),
	triple($review,<bsbm:reviewDate>,$reviewDate),
	triple($review,<http://purl.org/dc/elements/1.1/title>,$title),
	triple($review,<http://purl.org/stuff/rev#text>,$text),
	triple($review,<bsbm:rating1>,$rating1),
	triple($review,<bsbm:rating2>,$rating2),
	triple($review,<http://purl.org/dc/elements/1.1/publisher>,$publisher),
	triple($review,<http://purl.org/dc/elements/1.1/date>,$publishDate);
	^
	|
V21(<bsbm-int:dataFromRatingSite{10}/Review{0}>, <bsbm-int:dataFromProducer{2}/Product{1}>, "{4}", "{5}", "{6}", "{8}", "{9}", <bsbm-int:dataFromRatingSite{10}/RatingSite{10}>, "{11}")
	^
	|
(V20): SELECT nr, product, producer, person, reviewDate, title, text, language, rating1, rating2,  publisher, publishDate FROM Review

Mapping for vendor properties:

<$vendor,$label,$comment,$homepage,$publisher,$publishDate> :- 
	triple($vendor,<rdf:type>,<bsbm:Vendor>),
	triple($vendor,<http://www.w3.org/2000/01/rdf-schema#label>,$label),
	triple($vendor,<http://www.w3.org/2000/01/rdf-schema#comment>,$comment),
	triple($vendor,<http://xmlns.com/foaf/0.1/homepage>,$homepage),
	triple($vendor,<http://purl.org/dc/elements/1.1/publisher>,$publisher),
	triple($vendor,<http://purl.org/dc/elements/1.1/date>,$publishDate);
	^
	|
V13(<bsbm-int:dataFromVendor{0}/Vendor{0}>, "{1}", "{2}", <{3}>, <bsbm-int:dataFromVendor{4}/Vendor{4}>, "{5}")
	^
	|
(V12): SELECT nr, label, comment, homepage, publisher, publishDate FROM Vendor

Mappings for product types, where ?type is instantiated by the product types occurring in the data (the leaf product types of the hierarchy):

<$product> :- 
	triple($product, <rdf:type>, <bsbm-int:ProductType?type>);
	^
	|
V23(<bsbm-int:dataFromProducer{1}/Product{0}>)
	^
	|
(V22): SELECT P.nr AS nr, P.producer AS producer FROM PRODUCT AS P, PRODUCTTYPEPRODUCT AS PT WHERE PT.product = P.nr AND PT.producttype = ?type
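
Such a parameterized listing stands for one mapping per value of ?type. A minimal sketch of this expansion, assuming a plain textual substitution of ?type in both the triple and the SQL query (the identifiers 7 and 8 are made up):

```python
# Body triple and SQL query copied from the listing above.
TRIPLE_TEMPLATE = "triple($product, <rdf:type>, <bsbm-int:ProductType?type>)"
SQL_TEMPLATE = (
    "SELECT P.nr AS nr, P.producer AS producer "
    "FROM PRODUCT AS P, PRODUCTTYPEPRODUCT AS PT "
    "WHERE PT.product = P.nr AND PT.producttype = ?type"
)


def expand(type_ids):
    """Return one (triple, SQL query) pair per product type identifier."""
    return [
        (TRIPLE_TEMPLATE.replace("?type", str(t)), SQL_TEMPLATE.replace("?type", str(t)))
        for t in type_ids
    ]


for triple, sql in expand([7, 8]):
    print(triple)
    print(sql)
```

This is why the number of mappings grows with the number of product types used in the data.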

Mappings for country types, where ?type is instantiated by country types:

<$country> :- 
	triple($country,<rdf:type>,<bsbm:CountryType?type>);
	^
	|
V7703(<http://downlode.org/rdf/iso-3166/countries#{0}>)
	^
	|
(V7702): SELECT country FROM CountryType AS C WHERE C.countryType=?type

2.2.2. GLAV Mappings

GLAV extensions contain:

  • 257 210 tuples in E1 for M1,
  • 13 562 414 tuples in E2 for M2.

GLAV mappings partially expose the results of join queries over the BSBM data; these mappings expose incomplete knowledge. Each GLAV mapping \(q_{1}(\bar x) \leadsto q_{2}(\bar x)\) is not redundant with the previous GAV mappings \(\mappings_{\mathrm{GAV}}\), in the sense that \(q_{2}(\bar x)\) has no answer on the graph induced by \(\mappings_{\mathrm{GAV}}\) and their extensions.

Mapping for reviewer properties:

<$review,$name,$mbox_sha1sum,$country,$publisher,$publishDate> :- 
	triple($review,<http://purl.org/stuff/rev#reviewer>,$person),
	triple($person,<http://xmlns.com/foaf/0.1/name>,$name),
	triple($person,<http://xmlns.com/foaf/0.1/mbox_sha1sum>,$mbox_sha1sum),
	triple($person,<bsbm:country>,$country),
	triple($person,<http://purl.org/dc/elements/1.1/publisher>,$publisher),
	triple($person,<http://purl.org/dc/elements/1.1/date>,$publishDate);
	^
	|
V19(<bsbm-int:dataFromRatingSite{4}/Review{0}>, "{1}", "{2}", <http://downlode.org/rdf/iso-3166/countries#{3}>, <bsbm-int:dataFromRatingSite{4}/RatingSite{4}>, "{5}")
	^
	|
(V18): SELECT R.nr AS nr, P.name AS name, P.mbox_sha1sum AS mbox_sha1sum, P.country AS country, P.publisher AS publisher, P.publishDate AS publishDate FROM Person AS P, Review AS R WHERE R.person = P.nr

Mapping linking product feature properties to a product through an existential product feature:

<$product,$label,$comment,$publisher,$publishDate> :- 
	triple($product,<bsbm:productFeature>,$productFeature),
	triple($productFeature,<http://www.w3.org/2000/01/rdf-schema#label>,$label),
	triple($productFeature,<http://www.w3.org/2000/01/rdf-schema#comment>,$comment),
	triple($productFeature,<http://purl.org/dc/elements/1.1/publisher>,$publisher),
	triple($productFeature,<http://purl.org/dc/elements/1.1/date>,$publishDate);
	^
	|
V5(<bsbm-int:dataFromProducer{1}/Product{0}>, "{2}", "{3}", <bsbm-int:StandardizationInstitution{4}>, "{5}")
	^
	|
(V4): SELECT P.nr AS nr, P.producer AS producer, PF.label AS label, PF.comment AS comment, PF.publisher AS publisher, PF.publishDate AS publishDate FROM Product AS P, ProductFeatureProduct AS PFP, ProductFeature AS PF WHERE P.nr = PFP.product AND PFP.productFeature = PF.nr

Mapping linking two ratings of a review to its product through an existential review, parameterized by the country of its reviewer:

<$product,$rating3,$rating4,$country> :- 
	triple($review,<bsbm:reviewFor>,$product),
	triple($review,<bsbm:rating3>,$rating3),
	triple($review,<bsbm:rating4>,$rating4),
	triple($review,<http://purl.org/stuff/rev#reviewer>,$person),
	triple($person,<bsbm:country>,$country);
	^
	|
V7(<bsbm-int:dataFromProducer{1}/Product{0}>, "{2}", "{3}", <http://downlode.org/rdf/iso-3166/countries#{4}>)
	^
	|
(V6): SELECT P.nr AS nr, P.producer AS producer, R.rating3 AS rating3, R.rating4 AS rating4, PE.country AS country FROM Product AS P, Review AS R, Person AS PE WHERE P.nr = R.product AND R.person = PE.nr

Mapping linking a product to the country of its producer through an existential producer:

<$product,$country> :- 
	triple($product,<bsbm:producer>,$producer),
	triple($producer,<bsbm:country>,$country);
	^
	|
V9(<bsbm-int:dataFromProducer{1}/Product{0}>, <http://downlode.org/rdf/iso-3166/countries#{2}>)
	^
	|
(V8): SELECT P.nr AS nr, P.producer AS producer, PR.country AS country FROM Product AS P, Producer AS PR WHERE P.producer = PR.nr
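
In a GLAV mapping, some body variables (here $producer) do not appear in the head. The sketch below illustrates one way to read this, assuming that each such existential variable is materialized as a fresh blank node for every tuple of the extension; the blank node naming scheme and the sample rows are made up.

```python
from itertools import count

# Term templates of view V9 and body of the mapping, copied from the listing above.
HEAD_TEMPLATES = {
    "$product": "<bsbm-int:dataFromProducer{1}/Product{0}>",
    "$country": "<http://downlode.org/rdf/iso-3166/countries#{2}>",
}
BODY = [
    ("$product", "<bsbm:producer>", "$producer"),
    ("$producer", "<bsbm:country>", "$country"),
]

_fresh_ids = count()  # global counter used to name fresh blank nodes


def instantiate_glav(row):
    """Return the triples induced by one tuple (nr, producer, country)."""
    binding = {var: tpl.format(*row) for var, tpl in HEAD_TEMPLATES.items()}
    blanks = {}  # existential variable -> blank node, local to this tuple

    def term(t):
        if t in binding:
            return binding[t]
        if t.startswith("$"):  # variable absent from the head
            if t not in blanks:
                blanks[t] = "_:b{}".format(next(_fresh_ids))
            return blanks[t]
        return t  # constant (property IRI)

    return [tuple(term(t) for t in atom) for atom in BODY]


first = instantiate_glav((5, 2, "US"))
second = instantiate_glav((6, 2, "US"))
print(first)
print(second)
```

The two products get distinct blank producers even though both rows mention producer 2: the producer identity is not exposed, which is the incomplete knowledge that GLAV mappings are meant to model.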

Mapping linking an offer to the country of its vendor through an existential vendor:

<$offer,$country> :- 
	triple($offer,<bsbm:vendor>,$vendor),
	triple($vendor,<bsbm:country>,$country);
	^
	|
V15(<bsbm-int:dataFromVendor{1}/Offer{0}>, <http://downlode.org/rdf/iso-3166/countries#{2}>)
	^
	|
(V14): SELECT O.nr AS nr, O.vendor AS vendor, V.country AS country FROM Offer AS O, Vendor AS V WHERE O.vendor = V.nr

Mappings linking an offer to product numeric properties through an existential product, parameterized by the country of its producer and by its product type. The product type ?type is instantiated by the product types:

<$offer,$propertyNum3,$propertyNum4,$country> :- 
	triple($offer,<bsbm:product>,$product),
	triple($product,<rdf:type>,<bsbm-int:ProductType?type>),
	triple($product,<bsbm:productPropertyNumeric3>,$propertyNum3),
	triple($product,<bsbm:productPropertyNumeric4>,$propertyNum4),
	triple($product,<bsbm:producer>,$producer),
	triple($producer,<bsbm:country>,$country);
	^
	|
V2583(<bsbm-int:dataFromVendor{1}/Offer{0}>, "{2}", "{3}", <http://downlode.org/rdf/iso-3166/countries#{4}>)
	^
	|
(V2582): SELECT O.nr AS offer, O.vendor AS vendor, P.propertyNum3 AS propertyNum3, P.propertyNum4 AS propertyNum4, PR.country AS country FROM Offer AS O, Product AS P, Producer AS PR, PRODUCTTYPEPRODUCT AS PTP WHERE P.nr = O.product AND PR.nr = P.producer AND P.nr = PTP.product AND PTP.producttype=?type

Mappings linking a review to product numeric properties through an existential product, parameterized by the country of its producer and by its product type. The product type ?type is instantiated by the product types:

<$review,$propertyNum5,$propertyNum6,$country> :- 
	triple($review,<bsbm:reviewFor>,$product),
	triple($product,<rdf:type>,<bsbm-int:ProductType?type>),
	triple($product,<bsbm:productPropertyNumeric5>,$propertyNum5),
	triple($product,<bsbm:productPropertyNumeric6>,$propertyNum6),
	triple($product,<bsbm:producer>,$producer),
	triple($producer,<bsbm:country>,$country);
	^
	|
V5141(<bsbm-int:dataFromRatingSite{1}/Review{0}>, "{2}", "{3}", <http://downlode.org/rdf/iso-3166/countries#{4}>)
	^
	|
(V5140): SELECT R.nr AS review, R.publisher AS publisher, P.propertyNum5 AS propertyNum5, P.propertyNum6 AS propertyNum6, PR.country AS country FROM Review AS R, Product AS P, Producer AS PR, Producttypeproduct AS PTP WHERE P.nr = R.product AND P.producer = PR.nr AND P.nr = PTP.product AND PTP.producttype=?type

2.2.3. RIS Configurations

The RIS configurations by source size are:

2.3. RIS mappings for heterogeneous sources M3 and M4

RIS mappings are either GAV mappings or GLAV mappings. The RDF integration systems have:

  • 307 mappings in M3, 12 of which are on MongoDB,
  • 3863 mappings in M4, 12 of which are on MongoDB.

We show here only the mappings on MongoDB; the others are the same as in M1 or M2, i.e., on the Postgres datasource.

2.3.1. GAV Mappings on MongoDB

Mapping for offer properties:

<$offer,$product,$vendor,$price,$validFrom,$validTo,$deliveryDays,$offerWebpage,$publisher,$publishDate> :- 
	triple($offer,<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>,<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/Offer>),
	triple($offer,<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/product>,$product),
	triple($offer,<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/vendor>,$vendor),
	triple($offer,<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/price>,$price),
	triple($offer,<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/validFrom>,$validFrom),
	triple($offer,<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/validTo>,$validTo),
	triple($offer,<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/deliveryDays>,$deliveryDays),
	triple($offer,<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/offerWebpage>,$offerWebpage),
	triple($offer,<http://purl.org/dc/elements/1.1/publisher>,$publisher),
	triple($offer,<http://purl.org/dc/elements/1.1/date>,$publishDate);
	^
	|
V17(<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/dataFromVendor{3}/Offer{0}>, <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/dataFromProducer{2}/Product{1}>, <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/dataFromVendor{3}/Vendor{3}>, "{4}", "{5}", "{6}", "{7}", <{8}>, <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/dataFromVendor{9}/Vendor{9}>, "{10}")
	^
	|
(V16): [nr, product, producer, vendor, price, validfrom, validto, deliverydays, offerwebpage, publisher, publishdate]<- db.offer.aggregate([])

Mapping for vendor properties:

<$vendor,$label,$comment,$homepage,$publisher,$publishDate> :- 
	triple($vendor,<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>,<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/Vendor>),
	triple($vendor,<http://www.w3.org/2000/01/rdf-schema#label>,$label),
	triple($vendor,<http://www.w3.org/2000/01/rdf-schema#comment>,$comment),
	triple($vendor,<http://xmlns.com/foaf/0.1/homepage>,$homepage),
	triple($vendor,<http://purl.org/dc/elements/1.1/publisher>,$publisher),
	triple($vendor,<http://purl.org/dc/elements/1.1/date>,$publishDate);
	^
	|
V13(<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/dataFromVendor{0}/Vendor{0}>, "{1}", "{2}", <{3}>, <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/dataFromVendor{4}/Vendor{4}>, "{5}")
	^
	|
(V12): [nr, label, comment, homepage, publisher, publishdate]<- db.vendor.aggregate([])

Mappings for country types, with parameter $type:

<$country> :- 
	triple($country,<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>,<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/CountryType$type>);
	^
	|
V7705(<http://downlode.org/rdf/iso-3166/countries#{0}>)
	^
	|
(V7704): [country]<- db.countrytype.aggregate([{"$match":{"countrytype":$type}}])

2.3.2. GLAV Mappings on MongoDB

Mapping for product features:

<$product,$label,$comment,$publisher,$publishDate> :- 
	triple($product,<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/productFeature>,$productFeature),
	triple($productFeature,<http://www.w3.org/2000/01/rdf-schema#label>,$label),
	triple($productFeature,<http://www.w3.org/2000/01/rdf-schema#comment>,$comment),
	triple($productFeature,<http://purl.org/dc/elements/1.1/publisher>,$publisher),
	triple($productFeature,<http://purl.org/dc/elements/1.1/date>,$publishDate);
	^
	|
V5(<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/dataFromProducer{1}/Product{0}>, "{2}", "{3}", <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/StandardizationInstitution{4}>, "{5}")
	^
	|
(V4): [nr, producer, label, comment, publisher, publishdate]<- db.productfeatureproduct.aggregate([])

Mapping for producers of products:

<$product,$country> :- 
	triple($product,<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/producer>,$producer),
	triple($producer,<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/country>,$country);
	^
	|
V9(<http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/dataFromProducer{1}/Product{0}>, <http://downlode.org/rdf/iso-3166/countries#{2}>)
	^
	|
(V8): [nr, producer, country]<- db.productproducercountry.aggregate([])

2.3.3. RIS Configurations

The RIS configurations by source size are:

2.4. Data loading

Data loading details can be found on a dedicated page.

2.5. Command lines

First, make sure that you have installed the Tatooine requirements, and modify conf/tatooine.conf if needed.

Each RIS is defined in a directory:

  • \(S_{1}\) in ris-bsbm-copy-existential-1m
  • \(S_{2}\) in ris-bsbm-copy-existential-50m
  • \(S_{3}\) in ris-glav-json-1m
  • \(S_{4}\) in ris-glav-json-50m

The methods for query answering are:

  • REW_OR for REW-C
  • REF for REW-CA
  • REW for REW, but only the rewriting is supported for now.
  • MAT_SAT; add the option -M to enable the materialization step

The following command line runs the experiments on \(S_{3}\) using the method REW-C (denoted REW_OR). The results of the experiments are written in the directory ris-glav-json-1m/runs/:

java -jar het2onto.jar -t conf/tatooine.conf -q ris-glav-json-1m/queries.txt -s ris-glav-json-1m/ris-bsbm-glav.json -d ris-glav-json-1m/runs/ -m REW_OR
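
To run several methods in a row, the command line can be assembled programmatically. The helper below is a hypothetical convenience script, not part of the distribution: it only reuses the options of the command above, and its default file names are those of \(S_{3}\), since the names used in the other RIS directories are not listed here.

```python
import shlex


def build_command(method, ris_dir="ris-glav-json-1m", ris_file="ris-bsbm-glav.json"):
    """Assemble the argument list of the command above for a given method."""
    return [
        "java", "-jar", "het2onto.jar",
        "-t", "conf/tatooine.conf",
        "-q", f"{ris_dir}/queries.txt",
        "-s", f"{ris_dir}/{ris_file}",
        "-d", f"{ris_dir}/runs/",
        "-m", method,
    ]


# Print the command lines for REW-C (REW_OR) and REW-CA (REF) without running them.
for method in ("REW_OR", "REF"):
    print(shlex.join(build_command(method)))
```

Passing the returned list to `subprocess.run` would launch the experiment; printing it first makes it easy to check the paths.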

2.6. Old Experiments

2.6.1. Relational Experiment of 27th March

  1. 1M triples
  2. 50M triples