A master's thesis from Aalborg University

Unsupervised Feature Subset Selection

Author(s)

Term

4th term

Education

Publication year

2003

Submitted on

2012-02-14

Abstract

This master's thesis was developed in the domain of Decision Support Systems and covers the sparsely researched area of unsupervised feature subset selection for data clustering. In the report we discuss what characterizes features that are relevant for data clustering, and we propose new relevance score measures that produce a ranking of the features by relevance. The relevance scores, combined with a threshold, can be used in a filter approach in which the uninformative features are discarded. The report proposes two methods for setting the threshold, and the score measures are tested empirically on three synthetic and four real-world data sets. In a second step we propose to use the relevance rankings in a hybrid approach to unsupervised feature subset selection. This method performs unsupervised feature subset selection with fewer model inductions than ordinary wrapper approaches. Empirical tests show that both the filter and the hybrid approach perform satisfactorily.
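The filter idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's relevance score measures or its two threshold-setting methods, so everything below is an assumption made for illustration: `split_score` is a stand-in score (the fraction of a feature's variance explained by its best one-dimensional split into two groups), the threshold of 0.9 is arbitrary, and `nested_subsets` is only one plausible way a ranking could limit the number of candidate subsets a wrapper has to evaluate.

```python
def split_score(values):
    """Stand-in relevance score (NOT the thesis's measure): the fraction of
    variance explained by the best split of the sorted values into two groups.
    Values near 1 suggest two well-separated groups; a constant feature gets 0."""
    xs = sorted(values)
    n = len(xs)
    total_sum = sum(xs)
    mu = total_sum / n
    total_ss = sum((x - mu) ** 2 for x in xs)
    if total_ss < 1e-12:
        return 0.0
    best_between = 0.0
    left_sum = 0.0
    for i in range(1, n):  # i = size of the left group
        left_sum += xs[i - 1]
        left_mean = left_sum / i
        right_mean = (total_sum - left_sum) / (n - i)
        between = i * (left_mean - mu) ** 2 + (n - i) * (right_mean - mu) ** 2
        best_between = max(best_between, between)
    return best_between / total_ss


def rank_features(rows):
    """Score every column and return (scores, feature indices by descending score)."""
    scores = [split_score(column) for column in zip(*rows)]
    ranking = sorted(range(len(scores)), key=lambda j: -scores[j])
    return scores, ranking


def filter_features(scores, threshold):
    """Filter approach: keep the features whose score reaches the threshold."""
    return [j for j, s in enumerate(scores) if s >= threshold]


def nested_subsets(ranking):
    """Assumed hybrid-style candidates: top-1, top-2, ..., top-d features.
    A wrapper would induce one model per candidate, i.e. d models instead of
    one per arbitrary subset."""
    return [ranking[:k] for k in range(1, len(ranking) + 1)]


# Hypothetical toy data: feature 0 has two clear groups, feature 1 is evenly
# spread, feature 2 is constant.
rows = [
    [(0.0 if i < 20 else 10.0) + (i % 5) * 0.1, (i * 7 % 40) / 40, 1.0]
    for i in range(40)
]
scores, ranking = rank_features(rows)
selected = filter_features(scores, threshold=0.9)  # arbitrary threshold
candidates = nested_subsets(ranking)
print(ranking, selected, candidates)
```

On this toy data only feature 0 passes the threshold, and the ranking yields three nested candidate subsets rather than the seven non-empty subsets an exhaustive wrapper search would consider.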
