Multi-dimensional Classification - Data Mining using Data Cubes
Student thesis: Master thesis (including HD thesis)
- Peter Jensen
4. term, Computer Science, Master (Master Programme)
This thesis deals with the use of data mining on data structured as in
a data warehouse, also known as multi-dimensional data.
The theory regarding data warehouses is investigated with the purpose
of understanding how data is structured in them. Then a data set
dealing with product sales and customer payments
is analysed. There are two goals for this analysis: one is to create
a multi-dimensional design; the other, and more important, is to gain
experience in creating such designs in order to understand their
structure better.
We then attempt to analyse the multi-dimensional data using a traditional
data mining tool, Clementine. The aim of this analysis is to discover
weaknesses in traditional data mining tools when dealing with
multi-dimensional data. Next, we propose a way to analyse
multi-dimensional data in general, and we propose changes to decision
tree induction algorithms so that they make better use of the
multi-dimensional structure.
Finally, we evaluate the proposed way to analyse multi-dimensional
data using a prototype of a graphical interface, and we analyse some
of the proposed changes to decision tree induction using the data set
we have been working with.
Language | English |
---|---|
Publication date | Aug 2003 |