Bases de Données / Databases

Site Web de l'équipe BD du LIP6 / LIP6 DB Web Site

Last modified: 16/09/2016 23:06 by hubert

{{indexmenu_n>2}}

====== WatDiv Query S1 plans ======
  
=== SPARQL DF plan ===
  
<code scala>
// random partitioning
val DATA = dfDefault

val t1 = DATA.where(s"(p=$idOffers and s=$retailer)").select("o").withColumnRenamed("o","s")

// pad t1 with one dummy subject per fragment so that every partition is
// non-empty: this fixes the bad join locality and the join tasks read
// their "input from memory"
val e1 = sc.parallelize(1 to NB_FRAGMENTS, NB_FRAGMENTS).map(x => -1).toDF("s")
val t1OK = t1.unionAll(e1)
var plan = t1OK
  

// triple patterns ordered by increasing size
val orderedProp = List(
  ("sorg", "priceValidUntil"),
  ("gr", "validFrom"), 
  ("gr", "validThrough"), 
  ("gr", "includes"),
  ("gr", "serialNumber"), 
  ("sorg", "eligibleQuantity"), 
  ("sorg", "eligibleRegion"), 
  ("gr", "price"))

val triples = orderedProp.map{case(ns, p) => {
  val idP = getIdP(ns, p)
  DATA.where(s"p=$idP").select("s","o").withColumnRenamed("o", s"o$idP")
}}

// join the remaining triple patterns on the subject
for( i <- triples) {
  plan = plan.join(i, "s")
}

// Execute query plan for S1
//----------------------------
queryTimeDFIter(plan, 10)
</code>
=== SPARQL Hybrid DF plan ===
  
<code scala>
  
val subset = df.where(s"(p=51 and s=$retailer) or p in (3,9,38,40,56,57,63,69)").persist
// Merging time=4,885s
  
val DATA = subset

val t1 = DATA.where(s"(p=$idOffers and s=$retailer)").select("o").withColumnRenamed("o","s")
val e1 = sc.parallelize(1 to NB_FRAGMENTS, NB_FRAGMENTS).map(x => -1).toDF("s")
val t1OK = t1.unionAll(e1)
var plan = t1OK


// triple patterns ordered by increasing size
val orderedProp = List(
  ("sorg", "priceValidUntil"),
  ("gr", "validFrom"), 
  ("gr", "validThrough"), 
  ("gr", "includes"),
  ("gr", "serialNumber"), 
  ("sorg", "eligibleQuantity"), 
  ("sorg", "eligibleRegion"), 
  ("gr", "price"))

val triples = orderedProp.map{case(ns, p) => {
  val idP = getIdP(ns, p)
  DATA.where(s"p=$idP").select("s","o").withColumnRenamed("o", s"o$idP")
}}

// join the remaining triple patterns on the subject
for( i <- triples) {
  plan = plan.join(i, "s")
}


// Execute query plan for S1
//----------------------------
queryTimeDFIter(plan, 10)
// 2,87 + 4,885 = 7,76s   INPUT=14+6.2=20,2GB SHFR=32KB
</code>
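The loops above build a left-deep star join: each vertical-partition table is joined to the growing plan on the subject column ''s''. The same logic can be sketched with plain Scala collections, as a toy stand-in for the DataFrame joins (the subjects and property values below are hypothetical, not WatDiv data):

<code scala>
// Toy model of the left-deep star join built by `plan = plan.join(i, "s")`.
// Each "VP table" maps a subject to the object of one property
// (hypothetical sample data, not the WatDiv dataset).
val offers = List(1, 2, 3)                                      // subjects from t1
val priceValidUntil = Map(1 -> "2016-12-31", 2 -> "2017-01-31") // subject 3 missing
val price = Map(1 -> "29.99", 2 -> "19.99", 3 -> "9.99")

// Start from the seed relation and join each property table on the subject.
var rows: List[(Int, List[String])] = offers.map(s => (s, Nil))
for (vp <- List(priceValidUntil, price)) {
  rows = rows.flatMap { case (s, os) => vp.get(s).map(o => (s, os :+ o)) }
}
// Inner-join semantics: only subjects present in every property table survive.
// rows == List((1,List(2016-12-31, 29.99)), (2,List(2017-01-31, 19.99)))
</code>

As in the DataFrame plan, putting the most selective (smallest) property tables first shrinks the intermediate result as early as possible.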
  
=== S2RDF plan ===
  
<code scala>
val VP2EXP=VP2Random
  
</code>
=== S2RDF+Hybrid plan ===
  
<code scala>
// VP's partitioned by subject
val VP2EXP=VP2Subject
</code>
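The point of using subject-partitioned VPs (''VP2Subject'') rather than random ones (''VP2Random'') is that when every vertical-partition table is hash-partitioned on the subject, matching rows of any two tables already sit in the same partition, so the subject-subject joins of the star plan need no shuffle. A minimal sketch of that co-location argument, using plain Scala collections rather than the Spark API (partition count and sample rows are hypothetical):

<code scala>
// Sketch: both VP tables hash-partitioned on the subject => local joins.
val numPartitions = 4
def partOf(subject: Int): Int = math.abs(subject.hashCode) % numPartitions

// Two hypothetical VP tables as (subject, object) pairs
val vpPrice    = List((1, "29.99"), (5, "9.99"), (9, "19.99"))
val vpIncludes = List((1, "prodA"), (5, "prodB"), (7, "prodC"))

def partitionBy(table: List[(Int, String)]): Map[Int, List[(Int, String)]] =
  table.groupBy { case (s, _) => partOf(s) }

val left  = partitionBy(vpPrice)
val right = partitionBy(vpIncludes)

// Join each partition pair locally; no row ever crosses partitions,
// because equal subjects hash to the same partition on both sides.
val localJoin = (0 until numPartitions).toList.flatMap { p =>
  for {
    (s1, o1) <- left.getOrElse(p, Nil)
    (s2, o2) <- right.getOrElse(p, Nil)
    if s1 == s2
  } yield (s1, o1, o2)
}
// localJoin == List((1,29.99,prodA), (5,9.99,prodB)), the same result
// a global (shuffled) join over the whole tables would produce.
</code>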