• Morten Gade
  • Michael Gade Nielsen
4. semester, Datalogi, Kandidat (Kandidatuddannelse)
This thesis documents the design, implementation and test of Probabilistic Relational Models (PRMs). PRMs are a graphical statistical approach to modeling relational data using the Relational Language. PRMs consist of two components; the dependency structure and the parameters.

Our design is based on simplicity, flexibility, and performance. We explain the search over possible structures, using a notation of potential parents. The potential parents are used in four different search algorithms; one greedy, two random, and a hybrid between greedy and random. Also, we explicitly explain learning the parameters using sufficient statistics, by considering internal and external dependencies and how to keep track of the context.

We perform five different tests, showing that the penalty term of the score function can be tweaked to control the trade off between maximum likelihood and complexity. Also, our limited scalability test shows that our implementation scales near linear, although our implementation could be further optimized. The search test of the four algorithms show that introducing randomness is beneficial if combined with a greedy approach. The results also show, that although greedy finds the best model, the hybrid approach comes close in less time.

In comparison with our prior work, it is clear that PRMs are a very good alternative to propositional data mining in terms of descriptiveness.
SprogEngelsk
Udgivelsesdatojun. 2005
ID: 61065122