A master's thesis from Aalborg University


Adaptive Learned Index

Authors

Term

4th term

Education

Publication year

2019

Submitted on

Pages

15

Abstract

The idea behind learned indexes is to replace some general-purpose data structures, such as database indexes, with machine-learning models tailored to a specific dataset. We take this idea a step further with meta-learning: we describe each dataset by its meta-features (high-level characteristics that capture its structure and complexity) and use them to guide model selection. We evaluate several machine-learning models and rank them with Multi-Criteria Decision Analysis (MCDA), a method that balances multiple performance criteria. On top of this, we build a meta-learner that, given a dataset’s meta-features and the model rankings, predicts which model to use for that dataset. Because different applications value different aspects of performance, users can state which aspects matter most, and the system selects models accordingly. In our experiments, this approach outperformed the base learned index model introduced by Kraska et al.

[This abstract was generated with the help of AI]
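
The pipeline described in the abstract (meta-features, MCDA ranking with user-stated priorities, and a meta-learner) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the three meta-features, the weighted-sum form of MCDA, the 1-nearest-neighbour meta-learner, the function names, and all numbers are hypothetical choices made for the example.

```python
from statistics import mean, pstdev


def meta_features(keys):
    """Summarise a sorted key list with a few illustrative meta-features (assumed)."""
    gaps = [b - a for a, b in zip(keys, keys[1:])]
    avg_gap = mean(gaps)
    return {
        "size": len(keys),
        "mean_gap": avg_gap,
        # Coefficient of variation of the gaps: 0 for evenly spaced keys.
        "gap_cv": pstdev(gaps) / avg_gap if avg_gap else 0.0,
    }


def mcda_rank(scores, weights):
    """Rank models by a weighted sum over min-max normalised criteria.

    Assumes every criterion is a cost (lower is better). The weights
    encode which performance aspects the user cares about most.
    """
    total = sum(weights.values())
    utility = {model: 0.0 for model in scores}
    for criterion, weight in weights.items():
        values = [scores[m][criterion] for m in scores]
        lo, hi = min(values), max(values)
        for m in scores:
            # Best (lowest) value maps to 1, worst (highest) maps to 0.
            norm = 1.0 if hi == lo else (hi - scores[m][criterion]) / (hi - lo)
            utility[m] += weight / total * norm
    return sorted(utility, key=utility.get, reverse=True)


def predict_model(history, features):
    """1-nearest-neighbour meta-learner (an assumption for this sketch).

    `history` holds (meta_features, top_ranked_model) pairs from past
    datasets. Features are compared unscaled, which is a simplification.
    """
    def dist(stored, query):
        return sum((stored[k] - query[k]) ** 2 for k in stored)

    _, best = min(history, key=lambda item: dist(item[0], features))
    return best


# Hypothetical measurements, made up purely for illustration.
scores = {
    "linear": {"lookup_ns": 120.0, "size_kb": 1.0, "build_ms": 5.0},
    "neural_net": {"lookup_ns": 90.0, "size_kb": 40.0, "build_ms": 900.0},
    "btree": {"lookup_ns": 200.0, "size_kb": 300.0, "build_ms": 50.0},
}
speed_first = {"lookup_ns": 0.8, "size_kb": 0.1, "build_ms": 0.1}
size_first = {"lookup_ns": 0.1, "size_kb": 0.8, "build_ms": 0.1}

print(mcda_rank(scores, speed_first))
print(mcda_rank(scores, size_first))
```

Changing the weights changes the ranking, which is how a user preference can steer model selection in this sketch. A real meta-learner would be trained on many datasets and would normally scale the meta-features before comparing them.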