Author(s)
Term
4. term
Education
Publication year
2019
Submitted on
2019-06-07
Pages
15 pages
Abstract
The Case for Learned Index Structures by Kraska et al. proposes replacing data structures with machine learning models. The reasoning is that most data structures are general purpose, whereas a machine learning model can be specialized to a specific dataset. We propose to specialize this idea further by using meta-learning. By examining data characteristics called meta-features, we determined the complexity of datasets. Several machine learning models were tested and ranked by their performance using Multi-Criteria Decision Analysis. We then constructed a meta-learner which, based on the meta-features and the ranking of the machine learning models, predicts which model to use for a given dataset. Furthermore, we introduced the notion that different applications require machine learning models that excel at different aspects. The user can therefore specify which aspects their machine learning model should excel at. Our results showed superior performance compared to the base learned index model presented by Kraska et al.
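The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which Multi-Criteria Decision Analysis method, criteria, or candidate models the project used, so everything here is an assumption made for illustration: a weighted sum over min-max normalised criteria, three hypothetical criteria (lookup_ns, size_kb, mean_error), three hypothetical models, and made-up scores that are not results from the project.

```python
def rank_models(scores, weights, lower_is_better):
    """Rank models by a weighted sum of min-max normalised criteria.

    scores: {model_name: {criterion: value}}
    weights: {criterion: user-chosen importance}, normalised internally
    lower_is_better: criteria where a smaller raw value is preferable
    """
    total_weight = sum(weights.values())
    utilities = {name: 0.0 for name in scores}
    for criterion, weight in weights.items():
        values = [scores[name][criterion] for name in scores]
        low, high = min(values), max(values)
        span = high - low
        for name in scores:
            # Map the raw value to [0, 1]; ties on a criterion get 0.5.
            norm = 0.5 if span == 0 else (scores[name][criterion] - low) / span
            if criterion in lower_is_better:
                norm = 1.0 - norm
            utilities[name] += (weight / total_weight) * norm
    # Best model first.
    return sorted(utilities, key=utilities.get, reverse=True)


# Illustrative, made-up measurements (NOT results from the project).
scores = {
    "linear_regression": {"lookup_ns": 120.0, "size_kb": 1.0, "mean_error": 40.0},
    "decision_tree": {"lookup_ns": 180.0, "size_kb": 16.0, "mean_error": 15.0},
    "neural_net": {"lookup_ns": 300.0, "size_kb": 64.0, "mean_error": 5.0},
}
lower_is_better = {"lookup_ns", "size_kb", "mean_error"}

# Two hypothetical user preferences: one favours speed, one favours accuracy.
speed_first = {"lookup_ns": 0.7, "size_kb": 0.2, "mean_error": 0.1}
accuracy_first = {"lookup_ns": 0.1, "size_kb": 0.1, "mean_error": 0.8}

speed_ranking = rank_models(scores, speed_first, lower_is_better)
accuracy_ranking = rank_models(scores, accuracy_first, lower_is_better)
print(speed_ranking)
print(accuracy_ranking)
```

Under these assumptions, changing the weights changes which model ranks first, which is the user-specified trade-off the abstract refers to. In a setup like the one described, the top-ranked model for each dataset could serve as the training label that a meta-learner predicts from that dataset's meta-features.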