A master's thesis from Aalborg University

Y-STR: Haplotype Frequency Estimation and Evidence Calculation

Translated title

Y-STR: Estimation af haplotypefrekvens og evidensberegning

Term

4th term

Publication year

2010

Pages

138

Abstract

Y-STR haplotype frequency estimation is important because it is required in order to calculate the weight of evidence. Unlike autosomal STR loci, the loci on the Y-chromosome cannot be assumed to be independent, so the joint probability of a haplotype does not factor into the product of the marginal probabilities. A statistical model that properly incorporates this dependence must therefore be created. First an existing method, the frequency surveying approach, is described, and afterwards new models are developed. The new models considered are a new method called ancestral awareness and models based on existing methods such as kernel smoothing and model-based clustering. A class of models, classification models, is also developed; examples are classification trees, support vector machines, and ordered logistic regression. Methods to assess the performance of the models are developed and then used to compare them. Classification trees are found to perform well, but they have the disadvantage of not using prior knowledge such as the single-step mutation model. Besides frequency estimation, evidence calculation is also considered in this thesis.
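
The point about dependence between loci can be illustrated with a minimal sketch. It is not a method from the thesis: the haplotype database below is invented for illustration, and the function names `joint_frequency` and `product_of_marginals` are hypothetical. The sketch only compares the observed relative frequency of a full haplotype with the value obtained by multiplying per-locus allele frequencies as if the loci were independent.

```python
from collections import Counter

# Hypothetical toy database: each haplotype is a tuple of allele repeat
# counts at three made-up loci. Illustrative values only, not real data.
haplotypes = [
    (14, 13, 29),
    (14, 13, 29),
    (14, 13, 29),
    (15, 12, 30),
    (15, 12, 30),
    (14, 12, 29),
]


def joint_frequency(h, db):
    """Relative frequency of the complete haplotype h in the database."""
    return Counter(db)[h] / len(db)


def product_of_marginals(h, db):
    """Product of per-locus allele frequencies (valid only under independence)."""
    n = len(db)
    p = 1.0
    for locus, allele in enumerate(h):
        p *= sum(1 for g in db if g[locus] == allele) / n
    return p


target = (14, 13, 29)
joint = joint_frequency(target, haplotypes)
indep = product_of_marginals(target, haplotypes)

print(f"observed joint frequency: {joint:.3f}")
print(f"product of marginals:     {indep:.3f}")
```

In this toy database the target haplotype occurs in 3 of 6 entries (0.5), while the product of the marginal allele frequencies is (4/6)(3/6)(4/6) = 2/9, about 0.222. The gap arises because the alleles are inherited together, which is why a simple count or a product rule is not enough and a model of the dependence is needed.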