Author(s)
Term
4. term
Education
Publication year
2023
Submitted on
2023-08-10
Pages
10 pages
Abstract
In recent years, graph data management, triplestores, and knowledge graphs have attracted increasing interest. However, querying triplestores efficiently remains challenging, as many optimization strategies from traditional databases are still unexplored. As a first step toward optimizing triplestores, this paper examines how to improve query execution time by addressing costly existence checks in join operations. To this end, we integrate a Bloom filter that resides entirely and compactly in memory and is used in place of disk-based indexes for existence checks. We furthermore apply triple statistics to determine the specific join operations in which Bloom filter existence checks benefit execution time. We extend a reference triplestore (Jena) with Bloom filters and integrate our approach for query optimization. We evaluate our approach, JenaBloom, on a large set of more than 1,500 queries and show its effectiveness on queries returning empty result sets as well as those returning non-empty result sets.
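The idea of answering existence checks from an in-memory Bloom filter before touching a slower index can be illustrated with a minimal Python sketch. This is not the JenaBloom implementation and does not use any Jena API: the class names `BloomFilter` and `TripleStore`, the filter parameters, and the use of a Python `set` as a stand-in for a disk-based index are all assumptions made for illustration. The statistics-based choice of which joins use the filter is not modelled here.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter; bit positions are derived by double hashing."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force h2 odd so it is never 0
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


class TripleStore:
    """Hypothetical store: a set stands in for the (slow) disk-based index."""

    def __init__(self, triples, num_bits: int = 1 << 16, num_hashes: int = 4) -> None:
        self._index = set(triples)
        self.index_probes = 0  # counts how often the slow index is consulted
        self._filter = BloomFilter(num_bits, num_hashes)
        for triple in self._index:
            self._filter.add(self._key(triple))

    @staticmethod
    def _key(triple) -> str:
        return "\x00".join(triple)

    def exists(self, triple) -> bool:
        # A negative filter answer is exact, so the index probe can be skipped.
        if not self._filter.might_contain(self._key(triple)):
            return False
        # A positive answer may be a false positive, so confirm against the index.
        self.index_probes += 1
        return triple in self._index


store = TripleStore([("alice", "knows", "bob"), ("bob", "knows", "carol")])
print(store.exists(("alice", "knows", "bob")))
print(store.exists(("alice", "knows", "carol")))
print("index probes:", store.index_probes)
```

Because a Bloom filter never produces false negatives, a negative answer lets the store skip the index probe entirely, which is where queries with empty results would be expected to gain the most. A positive answer still has to be confirmed against the index.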
Keywords
Documents