Bases de Données / Databases

Site Web de l'équipe BD du LIP6 / LIP6 DB Web Site


Differences

The differences between two revisions of the page are shown below.

site:recherche:logiciels:sparqlwithspark [13/09/2016 17:57] – [Chain queries] hubert
site:recherche:logiciels:sparqlwithspark [17/11/2023 18:39] (current version) amann
Line 1: Line 1:
 ====== SPARQL query processing with Apache Spark ======
-This web page is a companion to the "SPARQL query processing with Apache Spark" paper submitted at EDBT 2017.
 +This wiki is a companion to the following publications:
 +  * [[https://hal.archives-ouvertes.fr/hal-01502519|SPARQL Graph Pattern Processing with Apache Spark]] 
 +  * [[https://arxiv.org/abs/1604.08903|SPARQL query processing with Apache Spark]] 
 +  * [[https://hal.archives-ouvertes.fr/hal-01214900|HAQWA: a Hash-based and Query Workload Aware Distributed RDF Store]] 
 +  * [[https://hal.archives-ouvertes.fr/hal-01214902|On the Evaluation of RDF Distribution Algorithms Implemented over Apache Spark]]
  
-It provides access to some resources related to the evaluation section.
 +It provides access to the resources related to the evaluation section of [[https://arxiv.org/abs/1604.08903|SPARQL query processing with Apache Spark]].
 +
 +See also [[en:site:recherche:logiciels:rdfdist]] concerning RDF distribution approaches using Spark.
  
 ===== Data sets =====
   * DrugBank
   * DBPedia
-  * LUBM
-  * WatDiv
 +  * LUBM: LU100M, LU1B
 +  * WatDiv: see [[en:site:recherche:logiciels:sparqlwithspark:datasetWatdiv]]
  
  
 ===== Query processing =====
 +
 +==== WatDiv queries ====
 +
 +
 +=== Query S1 ===
 +<code sparql>
 +SELECT ?v0 ?v1 ?v3 ?v4 ?v5 ?v6 ?v7 ?v8 ?v9 WHERE {
 +?v0 gr:includes ?v1 . %v2% gr:offers ?v0 .
 +?v0 gr:price ?v3 . ?v0 gr:serialNumber ?v4 .
 +?v0 gr:validFrom ?v5 . ?v0 gr:validThrough ?v6 .
 +?v0 sorg:eligibleQuantity ?v7 .
 +?v0 sorg:eligibleRegion ?v8 .
 +?v0 sorg:priceValidUntil ?v9 . }
 +</code>
 +
 +See [[en:site:recherche:logiciels:sparqlwithspark:watDivS1]]
 +
 +=== Query F5 ===
 +<code sparql>
 +SELECT ?v0 ?v1 ?v3 ?v4 ?v5 ?v6 WHERE {
 +?v0 gr:includes ?v1 . %v2% gr:offers ?v0 .
 +?v0 gr:price ?v3 . ?v0 gr:validThrough ?v4 .
 +?v1 og:title ?v5 . ?v1 rdf:type ?v6 . }
 +</code>
 +
 +See [[en:site:recherche:logiciels:sparqlwithspark:watDivF5]]
 +== Execution reports for F5 ==
 +
 +^Plan  ^  Execution report  ^
 +|SPARQL DF  |  {{:en:site:recherche:logiciels:f5_plan_sparql_df.png?300|SPARQL DF}}  |
 +|SPARQL Hybrid|  {{:en:site:recherche:logiciels:f5_plan_sparql_hybrid.png?300|SPARQL Hybrid}}  |
 +|S2RDF        |  {{:en:site:recherche:logiciels:f5_plan_s2rdf.png?300|S2RDF}}  |
 +|S2RDF+Hybrid        |  {{:en:site:recherche:logiciels:f5_plan_s2rdf_hybrid.png?300|S2RDF+Hybrid}}  |
 +
 +
 +=== Query C3 ===
 +<code sparql>
 +SELECT ?v0 WHERE {
 +?v0 wsdbm:likes ?v1 . ?v0 wsdbm:friendOf ?v2 .
 +?v0 dc:Location ?v3 . ?v0 foaf:age ?v4 .
 +?v0 wsdbm:gender ?v5 . ?v0 foaf:givenName ?v6 . }
 +</code>
 +
 +See [[en:site:recherche:logiciels:sparqlwithspark:watDivC3]]
 +
  
 ==== Star queries ====
 +Star queries over the DrugBank dataset
  
 +Star with 3 branches
  
 +<code sparql>
 +SELECT ?x ?a ?b
 +WHERE {
 + ?x <http://xmlns.com/foaf/0.1/page> <http://dbpedia.org/page/Ibuprofen>.
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/chebiId> ?a .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/casRegistryNumber> ?b . }
 +</code>
 +
 +Star with 5 branches
 +<code sparql>
 +SELECT ?x ?a ?b ?c ?d
 +WHERE {
 + ?x <http://xmlns.com/foaf/0.1/page> <http://dbpedia.org/page/Ibuprofen>.
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/chebiId> ?a .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/casRegistryNumber> ?b .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggDrugId> ?c .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggCompoundId> ?d .
 +}
 +</code>
 +
 +Star with 10 branches
 +<code sparql>
 +SELECT ?x ?a ?b ?c ?d ?g ?h ?i
 +WHERE {
 + ?x <http://xmlns.com/foaf/0.1/page> <http://dbpedia.org/page/Ibuprofen>.
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/chebiId> ?a .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/casRegistryNumber> ?b .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggDrugId> ?c .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggCompoundId> ?d .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/pharmacology> ?e .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/mechanismOfAction> ?f .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/predictedLogs> ?g .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/halfLife> ?h .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/dpdDrugIdNumber> ?i .
 +}
 +</code>
 +
 +Star with 15 branches
 +<code sparql>
 +SELECT ?x ?a ?b ?c ?d ?g ?h ?i ?j ?k ?l
 +WHERE {
 + ?x <http://xmlns.com/foaf/0.1/page> <http://dbpedia.org/page/Ibuprofen>.
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/chebiId> ?a .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/casRegistryNumber> ?b .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggDrugId> ?c .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggCompoundId> ?d .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/pharmacology> ?e .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/mechanismOfAction> ?f .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/predictedLogs> ?g .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/halfLife> ?h .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/dpdDrugIdNumber> ?i .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/contraindicationInsert> ?j .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/interactionInsert> ?k .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/structure> ?l.
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/state> ?m .
 + ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/rxlistLink> <http://www.rxlist.com/cgi/generic/ibup.htm> .}
 +</code>
 +
 +See [[en:site:recherche:logiciels:sparqlwithspark:star]]
 ==== Chain queries ====
 +
 Chain queries over DBPedia data set.
 +
 === Chain4 query ===
 Chain4 is
-<code>
 +<code sparql>
 SELECT ?x1, ?x2, ?x3, ?x4, ?x5  WHERE { ?x1 P1 ?x2 . ?x2 P2 ?x3 . ?x3 P3 ?x4 . ?x4 P4 ?x5 }
 </code>
 with properties
-<code>
 +<code scala>
 val P1 = 1389363200
 val P2 = 52239
Line 30: Line 144:
 val P4 = 1164156928
 </code>
 +See [[en:site:recherche:logiciels:sparqlwithspark:chain4| Chain4 query plans]]
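The Chain4 pattern above joins each triple pattern's object to the next pattern's subject. The following is only a minimal sketch of that idea over plain Scala collections, not the Spark plans evaluated in the paper. The names ''Chain4Sketch'', ''Triple'', ''hop'' and ''chain'' are hypothetical and introduced here for illustration. The predicate identifiers are the ones quoted on this page, with P3 taken from the earlier revision's plan listing.

<code scala>
// Hypothetical in-memory sketch (plain Scala collections, no Spark).
object Chain4Sketch {
  // A triple whose subject, predicate and object are dictionary-encoded as numbers.
  final case class Triple(s: Long, p: Long, o: Long)

  // Predicate identifiers quoted on this page for the Chain4 query.
  val P1 = 1389363200L
  val P2 = 52239L
  val P3 = 1164541952L
  val P4 = 1164156928L

  /** Extends every partial binding by one hop along predicate p. */
  def hop(triples: Seq[Triple], p: Long, partial: Seq[Vector[Long]]): Seq[Vector[Long]] = {
    // Index the selected triples by subject so the join becomes a lookup.
    val bySubject: Map[Long, Seq[Long]] =
      triples.filter(_.p == p).groupBy(_.s).map { case (s, ts) => s -> ts.map(_.o) }
    for {
      binding <- partial
      next    <- bySubject.getOrElse(binding.last, Seq.empty)
    } yield binding :+ next
  }

  /** Evaluates ?x1 P1 ?x2 . ?x2 P2 ?x3 . ... as a left-deep sequence of joins. */
  def chain(triples: Seq[Triple], predicates: Seq[Long]): Seq[Vector[Long]] =
    predicates match {
      case Seq() => Seq.empty
      case first +: rest =>
        // Seed with the first triple pattern, then join one pattern at a time.
        val seed = triples.filter(_.p == first).map(t => Vector(t.s, t.o))
        rest.foldLeft(seed)((acc, p) => hop(triples, p, acc))
    }
}
</code>

Calling ''Chain4Sketch.chain(triples, Seq(P1, P2, P3, P4))'' returns one vector of five values (?x1 to ?x5) per match. A distributed engine would replace the subject lookup with a partitioned or broadcast join, which is the choice the linked query plans compare.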
  
-The plans produced by each method are:
-  * SPARQL RDD:
-<code>
-val d1 = triples.filter{case(s,(p,o))=> p==1389363200}.map{case(x1,(p, x2)) => (x2, x1)}
-val d2 = triples.filter{case(s,(p,o))=> p==P2}.mapValues{case(p,x3) => x3}
-val d3 = triples.filter{case(s,(p,o))=> p==P3}.mapValues{case(p,x4) => x4}
-val d4 = triples.filter{case(s,(p,o))=> p==P4}.mapValues{case(p,x5) => x5}
-
-val j1 = d1.join(d2).map{case(x2,(x1, x3))=> (x3, (x1,x2))}
-val j2 = j1.join(d3).map{case(x3,((x1,x2), x4))=> (x4, (x1,x2,x3))}
-val j3 = j2.join(d4).map{case(x4,((x1,x2,x3), x5))=> (x5, (x1,x2,x3,x4))}
-j3.count
 +=== Chain6 query ===
 +<code sparql>
 +SELECT ?x1, ?x2, ?x3, ?x4, ?x5, ?x6, ?x7  WHERE { ?x1 P1 ?x2 . ?x2 P2 ?x3 . ?x3 P3 ?x4 . ?x4 P4 ?x5 . ?x5 P5 ?x6 . ?x6 P6 ?x7 }
 </code>
-
-  * SPARQL DF:
-<code>
-val t1 = df.where(s"p=$P1").select("s","o").withColumnRenamed("s", "x1").withColumnRenamed("o", "x2")
-val t2 = df.where(s"p=$P2").select("s","o").withColumnRenamed("s", "x2").withColumnRenamed("o", "x3")
-val t3 = df.where(s"p=$P3").select("s","o").withColumnRenamed("s", "x3").withColumnRenamed("o", "x4")
-val t4 = df.where(s"p=$P4").select("s","o").withColumnRenamed("s", "x4").withColumnRenamed("o", "x5")
-val res = t1.join(t2,Seq("x2")).join(t3,Seq("x3")).join(t4,Seq("x4"))
 +with properties
 +<code scala>
 +val P1 = 18843
 +val P2 = 5540
 +val P3 = 1179222016
 +val P4 = 1446076416
 +val P5 = 1446244352
 +val P6 = 36363
 </code>
 +See [[en:site:recherche:logiciels:sparqlwithspark:chain6| Chain6 query plans]]

-  * SPARQL Hybrid DF:
-<code>
-val P2 = 52239
-val P3 = 1164541952
-val P4 = 1164156928
-val subg = df.where(s"p in ($P2, $P3, $P4)")
-subg.persist
-subg.count
-val st2 = subg.where(s"p= $P2").select("s","o").withColumnRenamed("s", "x2").withColumnRenamed("o", "x3")
-val st3 = subg.where(s"p= $P3").select("s","o").withColumnRenamed("s", "x3").withColumnRenamed("o", "x4")
-val st4 = subg.where(s"p= $P4").select("s","o").withColumnRenamed("s", "x4").withColumnRenamed("o", "x5")
-val res = t1.join(st2,Seq("x2")).join(st3,Seq("x3")).join(st4,Seq("x4"))
-res.count
 +==== Snowflake queries ====
 +
 +SPARQL for Q8 from LUBM test suite
 +<code sparql>
 +PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
 +PREFIX ub: <http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl#>
 +SELECT ?X, ?Y, ?Z
 +WHERE
 +{?X rdf:type ub:Student .
 +  ?Y rdf:type ub:Department .
 +  ?X ub:memberOf ?Y .
 +  ?Y ub:subOrganizationOf <http://www.University0.edu> .
 +  ?X ub:emailAddress ?Z}
 </code>

-=== Chain6 query ===
 +See [[en:site:recherche:logiciels:sparqlwithspark:snowflakeQ8]]
 +
 +
 +===== Misc =====
 + [[en:site:recherche:logiciels:sparqlwithspark:utility| Utility tools]]
 +

-==== Snowflake queries ====

-==== WatDiv queries ====
  
site/recherche/logiciels/sparqlwithspark.1473782227.txt.gz · Last modified: by hubert