MU4IN803: SAM - Big Data Storage and Access (Feb 2020)
2023: see the Moodle 4IN803 website.
Old site (Feb 2020)
- Teacher: Hubert Naacke
- Course: Tuesdays at 4pm; check the room before each session.
- Group 1: Friday TME at 8.30am then TD
Check the TME and TD rooms on the M1 timetable. Exam: course, TD and TME documents are allowed.
Homework assignment: DM1
TMEJDBC 2020 can be done at home, locally, using H2. Access to the PPTI Oracle server is no longer needed for this TME.
2020 Calendar
Course date | Course | TD date | TD | TME at 8.30am | Instructor |
---|---|---|---|---|---|
28/01 | course 1: B+ trees | - | - | - | |
4/02 | course 2: Dynamic hashing | 4/02 | TD1: B+ trees | TME Index 2020 | H |
11/02 | course 3: query optimization | 11/02 | TD2: Hashing | TME Index 2020 (2) | S |
18/02 | course 4: Cost of operators | 18/02 | TD3: Query optimization | TME Jointure | H |
25/02 | course 5: Design by fragmentation and replication | 25/02 | TD4: Query optimization (continued) | TME Jointure (2) | H |
3/03 | course 6: Distributed query processing | 3/03 | TD5: Join queries | TME Jointure répartie | H |
10/03 | - revisions - | - | - | - | - |
17/03 | Exam 1 | - | - | - | - |
24/03 | course 7: Replication | 24/03 | TD6: Fragmentation-based design | TMEJDBC | H |
31/03 | course 8: Distributed transactions | 31/03 | TD7: Distributed queries | TMEJDBC (2) | H |
6/04 and 14/04 | Holidays | - | - | - | - |
21/04 | course 9: Parallel databases | 21/04 | TD8: Distributed queries (continued) | TMEJDBC (3) | H |
28/04 | course 10: Fault recovery + Spark demo | 28/04 | TD9: Distributed transactions | TMEJDBC (4) | H |
5/05 | - | 5/05 | TD10: Revisions | TME 10 on distributed deadlock, plus revisions | H |
12/05 | - revisions - | - | - | - | - |
19/05 | Exam 2 | - | - | - | - |
16/06 | Session 2 | - | - | - | - |
Course materials
TD materials
- TD1 to 5: handout part 1
- TD6 to 10: handout part 2
TME
Read the documentation on connecting to the Oracle 11 server
- TME Index 2020 (2 sessions)
- TME Jointure (2 sessions)
- TME Jointure répartie (1 session)
- TMEJDBC (4 sessions)
- tme2pc (1 session)
Miscellaneous
See the M1 DAC website and the M1 timetable.
Website with news and documents posted by the master's administrative office.
Former BDR2015 site with past exam papers: 2015 exam and 2016 mid-term exam.
Video: Really Big Data Analytics on Graphs with Trillions of Edges (presentation by Willy Zwaenepoel at the 2016 colloquium)
The DynamoDB service: see the paragraphs on defining a table, a primary key and secondary indexes.
The article mentioned in course 3 on query optimisation: How Good Are Query Optimizers, Really?.
The article mentioned during the TD on query optimisation: SIGMOD 2017: Access Path Selection in Main-Memory Optimized Data Systems: Should I Scan or Should I Probe?
Parallel sorting implemented for the SortBenchmark: experimental reports for SortBenchmark 2016 and SortBenchmark 2014
Spanner: OSDI 2012, TOCS 2013, SIGMOD 2017; TrueTime explained by E. Brewer (pdf)
Some interesting recent articles (EDBT 2020)
Cost of a query: A cost model that is “learned” with a neural network… Does this mean we no longer need to bother with cost formulas? Not quite… PDF
Aggregation queries on data streams: a highly efficient distributed solution: PDF
ML-Index: can we replace the intermediate levels of a B+ tree with a neural network? If so, for which queries? PDF
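To make the question above concrete, here is a minimal sketch of the general "learned index" idea, not the method of the article: a model predicts the position of a key in a sorted array, and a bounded search corrects the prediction. A simple linear model stands in for the neural network, and the class name `LinearLearnedIndex` and its methods are hypothetical names chosen for this illustration. The sketch assumes a non-empty collection of numeric keys.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index: a linear model replaces the inner levels of a tree.

    Assumption for this sketch: keys are numeric and the collection is non-empty.
    """

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        self.lo = self.keys[0]
        hi = self.keys[-1]
        # "Training": fit position ~ slope * (key - lo) through the two extremes.
        self.slope = (n - 1) / (hi - self.lo) if hi != self.lo else 0.0
        # Largest prediction error over the stored keys; it bounds the search window.
        self.max_err = max(
            abs(self._predict(k) - i) for i, k in enumerate(self.keys)
        )

    def _predict(self, key):
        # Predicted position, clamped to the valid index range.
        pos = int(round(self.slope * (key - self.lo)))
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return a position of key in the sorted array, or None if absent."""
        guess = self._predict(key)
        start = max(guess - self.max_err, 0)
        end = min(guess + self.max_err + 1, len(self.keys))
        # Binary search restricted to the error window around the prediction.
        i = bisect.bisect_left(self.keys, key, start, end)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None


index = LinearLearnedIndex([3, 8, 15, 16, 23, 42, 100])
position = index.lookup(23)   # position of key 23 in the sorted array
missing = index.lookup(7)     # None, since 7 is not stored
```

The sketch only handles point lookups on a static array. Which queries such a structure serves well, and how it copes with updates, is exactly what the article discusses.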
Distributed transactions: a more "distributed systems" article, aimed mainly at SAR students. The trick is to order transactions globally in a deterministic way before executing them. The article builds on the 2012 Calvin solution. Calvin is summarised in a way that is very accessible to Master 1 students in section 2.1. The rest is a bit harder to read. Q-Store: Distributed, Multi-partition Transactions via Queue-oriented Execution and Communication. PDF
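The idea of ordering transactions globally and deterministically before executing them can be sketched as follows. This is a simplified illustration under stated assumptions, not the Calvin or Q-Store protocol: it ignores partitioning, locking and the replication of the input log. The names `Txn`, `global_order`, `execute`, `deposit` and `double` are hypothetical, as is the choice of sorting by (epoch, node id, local sequence number).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Txn:
    epoch: int        # batch in which the transaction was submitted
    node_id: int      # node that received the transaction
    local_seq: int    # arrival order on that node within the epoch
    apply: Callable[[Dict[str, int]], None]  # deterministic state update


def global_order(txns: List[Txn]) -> List[Txn]:
    # Every replica computes the same total order from the same set of inputs.
    return sorted(txns, key=lambda t: (t.epoch, t.node_id, t.local_seq))


def execute(txns: List[Txn], initial: Dict[str, int]) -> Dict[str, int]:
    # Apply the transactions one by one, in the agreed order, on a private copy.
    state = dict(initial)
    for txn in global_order(txns):
        txn.apply(state)
    return state


def deposit(account: str, amount: int) -> Callable[[Dict[str, int]], None]:
    def run(state: Dict[str, int]) -> None:
        state[account] = state.get(account, 0) + amount
    return run


def double(account: str) -> Callable[[Dict[str, int]], None]:
    def run(state: Dict[str, int]) -> None:
        state[account] = state.get(account, 0) * 2
    return run


txns = [
    Txn(1, 2, 0, double("a")),
    Txn(1, 1, 0, deposit("a", 10)),
    Txn(2, 1, 0, deposit("a", 1)),
]
# Two replicas receive the same transactions in different arrival orders.
replica_1 = execute(txns, {"a": 5})
replica_2 = execute(list(reversed(txns)), {"a": 5})
```

Because both replicas sort the same input with the same key, they apply the deposit and the doubling in the same order and end in the same state, without coordinating at commit time.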