Y-STR: Haplotype Frequency Estimation and Evidence Calculation
Student thesis: Master's thesis (incl. HD graduation project)
- Mikkel Meyer Andersen
4th semester, Mathematics, Master's (Master's programme)
Y-STR haplotype frequency estimation is important because it is required to calculate evidence. Unlike autosomal STR loci, the loci on the Y-chromosome cannot be assumed to be independent, so the joint probability of a haplotype does not factor into the product of the marginal probabilities. A statistical model that incorporates the proper dependence structure must therefore be created.
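The factorization issue can be illustrated with a minimal sketch. The two-locus haplotypes below are hypothetical toy data, not taken from the thesis, and the counting estimator is used only for illustration; it is not one of the models developed in the thesis.

```python
from collections import Counter

# Hypothetical toy sample: each tuple is (allele at locus 1, allele at locus 2).
haplotypes = [(14, 30), (14, 30), (14, 30), (15, 31), (15, 31), (14, 31)]
n = len(haplotypes)

joint_counts = Counter(haplotypes)
locus1_counts = Counter(h[0] for h in haplotypes)
locus2_counts = Counter(h[1] for h in haplotypes)


def joint_freq(h):
    """Relative frequency of the full haplotype in the sample."""
    return joint_counts[h] / n


def product_of_marginals(h):
    """Frequency the haplotype would have if the loci were independent."""
    return (locus1_counts[h[0]] / n) * (locus2_counts[h[1]] / n)


h = (14, 30)
print(joint_freq(h), product_of_marginals(h))
```

In this toy sample the observed haplotype frequency and the product of the per-locus frequencies differ, which is the situation a model for dependent loci has to handle.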
First, an existing method, the frequency surveying approach, is described; afterwards, new models are developed. These comprise a new method called ancestral awareness and models based on existing methods such as kernel smoothing and model-based clustering. A class of models, classification models, is also developed; examples include classification trees, support vector machines, and ordered logistic regression.
Methods for assessing performance are developed and then used to compare the models. Classification trees are found to be a good model, but they have the disadvantage of not using prior knowledge such as the single-step mutation model. Besides frequency estimation, evidence calculation is also considered in this thesis.
Language | English |
---|---|
Publication date | May 2010 |
Number of pages | 138 |
Publishing institution | Department of Mathematical Sciences, Aalborg University |