Term
4. term
Education
Publication year
2005
Submitted on
2012-02-14
Abstract
This report documents our work on engineering a dynamic XML full-text index, usable for querying structured XML documents with simple syntax. We focus primarily on designing an index structure feasible for incremental updates while also discussing associated maintenance strategies for efficiently migrating data from an in-memory index into disk based B-trees. The proposed index structure is designed as a series of cascading inverted indexes: One index kept in main memory containing the working set of documents, one incrementally built disk based B-tree containing documents recently ruled out of the former index, and finally one static disk based B-tree containing documents that have remained static in a period long time. The efficiency of minimizing the amount of data stored in the indexes is researched. We evaluate on various compression schemes in the context of compressing entire inverted lists vs. single postings. In extension, we propose a customized variable byte length encoding scheme for storing Dewey paths efficiently. We facilitate the concept of Meet operator in order to filter search results and return the most relevant XML elements. We refine our previously proposed algorithm for the Meet operator in order to increase the relevance of result sets. In conclusion we conduct empirical tests, showing that the implemented system performs reasonably within the intended environment.
Documents
Colophon: This page is part of the AAU Student Projects portal, which is run by Aalborg University. Here, you can find and download publicly available bachelor's theses and master's projects from across the university dating from 2008 onwards. Student projects from before 2008 are available in printed form at Aalborg University Library.
If you have any questions about AAU Student Projects or the research registration, dissemination and analysis at Aalborg University, please feel free to contact the VBN team. You can also find more information in the AAU Student Projects FAQs.