SPARQL query processing with Apache Spark
This web page is a companion to the paper “SPARQL query processing with Apache Spark”, submitted to EDBT 2017.
It provides access to some resources related to the evaluation section.
Data sets
- DrugBank
- DBPedia
- LUBM
- WatDiv
 
Query processing
Star queries
Chain queries
Chain queries over the DBPedia data set.
Chain4 query
The Chain4 query is:
SELECT ?x1 ?x2 ?x3 ?x4 ?x5 WHERE { ?x1 P1 ?x2 . ?x2 P2 ?x3 . ?x3 P3 ?x4 . ?x4 P4 ?x5 }
with the properties:
val P1 = 1389363200
val P2 = 52239
val P3 = 1164541952
val P4 = 1164156928
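To make the shape of the expected result concrete, the following is a minimal sketch of what a four-hop chain query computes, written with plain Scala collections rather than Spark. The object name `Chain4Sketch`, the helper `edges`, the toy triples and the property ids 1 to 4 are all hypothetical and stand in for the DBPedia identifiers listed above; it is a reference for the semantics only, not one of the evaluated plans.

```scala
object Chain4Sketch {
  // Hypothetical property ids standing in for P1..P4 (not the DBPedia ids).
  val P1 = 1L; val P2 = 2L; val P3 = 3L; val P4 = 4L

  // Toy (subject, property, object) triples.
  val triples: Seq[(Long, Long, Long)] = Seq(
    (10L, P1, 11L),
    (11L, P2, 12L),
    (12L, P3, 13L),
    (13L, P4, 14L),
    (20L, P1, 21L), // dead end: node 21 has no outgoing P2 edge
    (30L, P2, 12L)  // node 30 is not the object of a P1 edge, so no chain passes through it
  )

  // All (subject, object) pairs connected by property p.
  def edges(p: Long): Seq[(Long, Long)] =
    triples.collect { case (s, q, o) if q == p => (s, o) }

  // Nested-loop evaluation of the chain: each pattern's subject must equal
  // the previous pattern's object.
  val chain4: Seq[(Long, Long, Long, Long, Long)] =
    for {
      (x1, x2)  <- edges(P1)
      (x2b, x3) <- edges(P2) if x2b == x2
      (x3b, x4) <- edges(P3) if x3b == x3
      (x4b, x5) <- edges(P4) if x4b == x4
    } yield (x1, x2, x3, x4, x5)
}
```

On this toy input only the chain starting at node 10 satisfies all four patterns, so `Chain4Sketch.chain4` holds the single binding `(10, 11, 12, 13, 14)`.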
The plans produced by each method are:
- SPARQL RDD:
 
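// Selection for the first triple pattern, re-keyed by its object x2 for the join with d2
val d1 = triples.filter{case(s,(p,o))=> p==1389363200 || p==18843 || p==1089470464 || p==18970 }.map{case(x1,(p, x2))=> (x2, x1)}
// Selections for P2, P3 and P4: the subject stays the key, the object becomes the value
val d2 = triples.filter{case(s,(p,o))=> p==P2}.mapValues{case(p,x3) => x3}
val d3 = triples.filter{case(s,(p,o))=> p==P3}.mapValues{case(p,x4) => x4}
val d4 = triples.filter{case(s,(p,o))=> p==P4}.mapValues{case(p,x5) => x5}
// Join on x2, then re-key by x3 for the next join
val j1 = d1.join(d2).map{case(x2,(x1, x3))=> (x3, (x1,x2))}
// Join on x3, then re-key by x4
val j2 = j1.join(d3).map{case(x3,((x1,x2), x4))=> (x4, (x1,x2,x3))}
// Join on x4; the result is keyed by x5
val j3 = j2.join(d4).map{case(x4,((x1,x2,x3), x5))=> (x5, (x1,x2,x3,x4))}
// Trigger evaluation
j3.count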
- SPARQL DF:
 
val t1 = df.where(s"p=$P1").select("s","o").withColumnRenamed("s", "x1").withColumnRenamed("o", "x2")
val t2 = df.where(s"p=$P2").select("s","o").withColumnRenamed("s", "x2").withColumnRenamed("o", "x3")
val t3 = df.where(s"p=$P3").select("s","o").withColumnRenamed("s", "x3").withColumnRenamed("o", "x4")
val t4 = df.where(s"p=$P4").select("s","o").withColumnRenamed("s", "x4").withColumnRenamed("o", "x5")
val res = t1.join(t2,Seq("x2")).join(t3,Seq("x3")).join(t4,Seq("x4"))
- SPARQL Hybrid DF:
 
val P2 = 52239
val P3 = 1164541952
val P4 = 1164156928
// Extract the subgraph restricted to P2, P3 and P4 once and mark it for caching
val subg = df.where(s"p in ($P2, $P3, $P4)")
subg.persist
// Force evaluation so that the cached subgraph is materialized
subg.count
// Select each triple pattern from the cached subgraph
val st2 = subg.where(s"p=$P2").select("s","o").withColumnRenamed("s", "x2").withColumnRenamed("o", "x3")
val st3 = subg.where(s"p=$P3").select("s","o").withColumnRenamed("s", "x3").withColumnRenamed("o", "x4")
val st4 = subg.where(s"p=$P4").select("s","o").withColumnRenamed("s", "x4").withColumnRenamed("o", "x5")
// t1 is the P1 selection defined in the SPARQL DF plan
val res = t1.join(st2,Seq("x2")).join(st3,Seq("x3")).join(st4,Seq("x4"))
res.count
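The three plans read from two inputs, `triples` and `df`, that this page does not define. Judging from how they are used, `triples` looks like a pair RDD keyed by subject with `(property, object)` values, and `df` a DataFrame with columns `s`, `p` and `o`. The sketch below builds both from a handful of toy triples so the plans can be tried on a local Spark 2.x session. The object name `ToyInputs`, the helper names, the local master setting and the data are assumptions, not the setup used for the evaluation.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}

object ToyInputs {
  // Hypothetical property ids and toy (s, p, o) triples; not taken from DBPedia.
  val P1 = 1L; val P2 = 2L; val P3 = 3L; val P4 = 4L
  val raw: Seq[(Long, Long, Long)] = Seq(
    (10L, P1, 11L), (11L, P2, 12L), (12L, P3, 13L), (13L, P4, 14L), (20L, P1, 21L)
  )

  // Assumed DataFrame layout: one row per triple, columns s, p, o.
  def asDataFrame(spark: SparkSession): DataFrame = {
    import spark.implicits._
    raw.toDF("s", "p", "o")
  }

  // Assumed pair-RDD layout: subject as key, (property, object) as value.
  def asPairRDD(spark: SparkSession): RDD[(Long, (Long, Long))] =
    spark.sparkContext.parallelize(raw).map { case (s, p, o) => (s, (p, o)) }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only.
    val spark = SparkSession.builder().master("local[*]").appName("chain4-toy").getOrCreate()
    val df = asDataFrame(spark)
    val triples = asPairRDD(spark)
    df.show()
    println(triples.count())
    spark.stop()
  }
}
```

With `df` and `triples` bound this way in a Spark shell, and P1 to P4 set to the toy ids, the DF and Hybrid DF plans can be pasted unchanged. The RDD plan hard-codes the DBPedia ids in the `d1` filter, so that filter would need adapting to the toy ids.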
Chain6 query
Snowflake queries
WatDiv queries
site/recherche/logiciels/sparqlwithspark.1473782196.txt.gz · Last modified by hubert
                
                