Author(s)
Term
4th term
Education
Publication year
2017
Submitted on
2017-06-05
Pages
64 pages
Abstract
RDF is becoming increasingly popular as a model for representing unstructured data on the Web. As a result, data sets reach web-scale sizes in the form of RDF graphs with billions of triples. Handling such large data sets requires distributed processing systems. SFRDF is one such system; it is based on the distributed framework Apache Flink and the partitioning technique known as ExtVP. When it was created in the fall of 2016, SFRDF was shown to be nearly competitive with state-of-the-art systems. We propose SFRDF+, an improved version of SFRDF. The improvements range from simple changes, such as introducing dictionary encoding, to more advanced features, such as join order optimization to generate better query plans. To find the best approach for generating query plans, we implement and evaluate approaches from the area of RDF processing, namely CliqueSquare, and from traditional database management systems, namely DPCCP and a greedy approach. We evaluate the dictionary encoding and the different join order optimization approaches and find that the only feasible approach for our system is the greedy one. We also modify the cost function of the greedy approach to prefer bushy plans, to see whether these yield better performance. This is not the case for the plans generated by our modified algorithm, but the standard greedy algorithm shows promising results, with an overall reduction in query response times.
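The abstract does not say how SFRDF+ implements dictionary encoding. As a minimal sketch of the general technique only, the Python below replaces each distinct RDF term with a small integer ID so that joins compare integers instead of long strings. The names `Dictionary`, `encode_triples`, and `decode_triples` and the `ex:` terms are hypothetical and are not taken from SFRDF+.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
EncodedTriple = Tuple[int, int, int]


class Dictionary:
    """Hypothetical in-memory dictionary: RDF term <-> integer ID."""

    def __init__(self) -> None:
        self._term_to_id: Dict[str, int] = {}
        self._id_to_term: List[str] = []

    def encode(self, term: str) -> int:
        # Assign the next free ID the first time a term is seen.
        if term not in self._term_to_id:
            self._term_to_id[term] = len(self._id_to_term)
            self._id_to_term.append(term)
        return self._term_to_id[term]

    def decode(self, term_id: int) -> str:
        # IDs are list positions, so decoding is a direct lookup.
        return self._id_to_term[term_id]


def encode_triples(triples: List[Triple], d: Dictionary) -> List[EncodedTriple]:
    """Encode subject, predicate, and object of every triple."""
    return [(d.encode(s), d.encode(p), d.encode(o)) for s, p, o in triples]


def decode_triples(encoded: List[EncodedTriple], d: Dictionary) -> List[Triple]:
    """Translate encoded triples back to their original terms."""
    return [(d.decode(s), d.decode(p), d.decode(o)) for s, p, o in encoded]


if __name__ == "__main__":
    d = Dictionary()
    data = [
        ("ex:alice", "ex:knows", "ex:bob"),
        ("ex:bob", "ex:knows", "ex:alice"),
    ]
    encoded = encode_triples(data, d)
    print(encoded)
    print(decode_triples(encoded, d) == data)
```

In a distributed setting the dictionary itself would have to be built and shared across workers, which this single-process sketch does not attempt.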
Keywords
SFRDF ; Flink ; Bushy ; Join order optimization ; Dictionary Encoding ; SPARQL ; RDF