A master's thesis from Aalborg University


Attribute-enhanced Collaborative Topic Modelling

Author

Term

4th term

Education

Publication year

2019

Submitted on

Pages

94

Abstract

This thesis investigates solutions to the item cold-start problem in recommender systems using data from the Japanese internship platform 01Intern. It proposes CoAWILDA+, a hybrid approach that combines the strengths of collaborative filtering with multiple types of item attributes. The model integrates topic modeling of unstructured job text via AWILDA (Adaptive Windowing–based Incremental Latent Dirichlet Allocation) to capture thematic content and adapt to evolving data, and an Attribute-Enhancement variant of matrix factorization that incorporates structured item attributes into latent factors so that new jobs without interaction history can be represented. The thesis covers datasets (01Intern and MovieLens 100K), the relevance of data streams and concept drift, and training and evaluation methods. The AWILDA component is evaluated separately with emphasis on topic model properties (e.g., perplexity and document similarities) and the effects of drift adaptation, while recommendation quality is assessed using standard protocols and measures with particular attention to item cold-start. Specific quantitative findings are not provided in this excerpt; instead, the work presents the combined model, implementation details, and the evaluation framework that enables analysis of how blending text-derived topics with attribute-enhanced matrix factorization can mitigate cold-start in practice.
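The attribute-based representation of new jobs described above can be illustrated with a minimal sketch. It is not the thesis's CoAWILDA+ formulation: it assumes, purely for illustration, that an item's latent vector is the average of per-attribute latent vectors, optionally plus an item-specific factor once interactions exist. All names (`user_factors`, `attr_factors`, `item_vector`, `score`) and the random parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_users, n_factors, n_attrs = 4, 3, 5

# Hypothetical "learned" parameters; random values stand in for trained ones.
user_factors = rng.normal(size=(n_users, n_factors))
attr_factors = rng.normal(size=(n_attrs, n_factors))  # one latent vector per attribute


def item_vector(attr_ids, id_factor=None):
    """Build an item's latent vector from its structured attributes.

    Assumption: attributes are combined by averaging. A warm item may add
    its own item-specific factor; a cold-start item (no interactions yet)
    relies on the attribute average alone.
    """
    vec = attr_factors[attr_ids].mean(axis=0)
    if id_factor is not None:
        vec = vec + id_factor
    return vec


def score(user_id, item_vec):
    """Predicted preference as the user-item dot product."""
    return float(user_factors[user_id] @ item_vec)


# A brand-new job with attributes 0 and 3 and no interaction history.
new_job = item_vector([0, 3])
scores = [score(u, new_job) for u in range(n_users)]
ranking = np.argsort(scores)[::-1]  # users ordered by predicted interest
```

Because the new job's vector is built only from attribute vectors shared with existing jobs, it can be scored for every user before it receives any interactions, which is the core idea behind using item attributes against item cold-start.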

[This abstract has been generated with the help of AI directly from the project full text]