A master's thesis from Aalborg University

Sequence Tree Prediction

Term

2nd term

Publication year

2011

Pages

93

Abstract

This thesis studies sequence prediction for predictive text, where the goal is to guess the next word or token from the preceding ones. We introduce a new method, Sequence Trees, and define it formally. We also implement and test several optimizations that reduce memory use (space complexity) and improve speed (time complexity). We evaluate Sequence Trees on a predictive text task against two common baselines: n-grams (models based on fixed-length sequences) and classification trees (decision-tree classifiers). Sequence Trees achieve higher prediction accuracy than both baselines. Memory use remains a constraint for Sequence Trees, but it is lower than for classification trees. N-grams use even less memory but are less accurate. With the proposed optimizations, Sequence Trees combine high accuracy with low memory requirements in predictive text.

[This abstract was generated with the help of AI]
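
To make the n-gram baseline mentioned in the abstract concrete, the following is a minimal sketch of next-word prediction from fixed-length contexts. It is not taken from the thesis: the function names `train_ngram` and `predict_next`, the toy corpus, and the choice to return the single most frequent continuation are all assumptions made for illustration. Sequence Trees themselves are not sketched here, because the abstract does not define them.

```python
from collections import Counter, defaultdict


def train_ngram(tokens, n=2):
    """Count which token follows each context of n-1 tokens.

    Hypothetical helper for illustration; not the thesis implementation.
    """
    counts = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        counts[context][tokens[i + n - 1]] += 1
    return counts


def predict_next(counts, history, n=2):
    """Return the most frequent continuation of the last n-1 tokens.

    Returns None when the context was never seen during training.
    """
    context = tuple(history[-(n - 1):]) if n > 1 else ()
    followers = counts.get(context)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus, assumed for illustration only.
corpus = "the cat sat on the mat the cat ran".split()
bigram_counts = train_ngram(corpus, n=2)
trigram_counts = train_ngram(corpus, n=3)
```

A model like this stores one counter per observed context, which suggests why n-grams can be cheap in memory, and it looks only at a fixed window of preceding tokens, which is one plausible reason for the lower accuracy the abstract reports.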
