A master's thesis from Aalborg University

Linkage Analysis with PDGs

Term

4th term

Publication year

2004

Abstract

This thesis examines linkage analysis, a method for locating genes by studying how DNA segments are inherited together. The goal was to investigate whether an existing algorithm, Fast Tree Traversal (developed by DeCode Genetics in Iceland), can be optimized using probabilistic graphical models (graph-based representations of uncertainty). The current Fast Tree Traversal implementation uses MTBDDs (multi-terminal binary decision diagrams, a compact decision-diagram data structure). The thesis has three parts: an introduction to linkage analysis, a survey of existing algorithms, and the development of a new algorithm based on PDGs. We implemented a single-point linkage analysis algorithm that outputs an RFG. To use this output as input to a multi-point algorithm, the RFG must first be normalized into a PDG. The implementation was tested with data from the Superlink website. Superlink is another linkage analysis algorithm that uses probabilistic graphical models, in its case Bayesian networks. On these data, the resulting RFG contains 105 nodes, which is very small compared with the theoretically possible 4^42 nodes.
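
To give a feel for the scale of that last comparison, the short sketch below works out the arithmetic. It uses only the two figures quoted in the abstract (105 nodes and 4^42 nodes); the variable names are illustrative and do not come from the thesis.

```python
# Compare the reported RFG size with the theoretical upper bound from the abstract.
# Both numbers are taken from the abstract; the names are hypothetical.
observed_nodes = 105
theoretical_nodes = 4 ** 42  # Python integers are arbitrary precision, so this is exact

# How many times smaller the observed graph is than the bound.
reduction_factor = theoretical_nodes / observed_nodes

print(f"theoretical bound: {theoretical_nodes:.3e} nodes")
print(f"observed RFG:      {observed_nodes} nodes")
print(f"reduction factor:  {reduction_factor:.3e}")
```

The bound is about 1.9 × 10^25 nodes, so the RFG obtained on the test data is smaller by a factor on the order of 10^23.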

[This abstract was generated with the help of AI]