A master's thesis from Aalborg University

Palamut - An Expansion of the Bonito basecaller using language models

Author(s)

Term

4. term

Education

Publication year

2020

Submitted on

2020-06-11

Pages

17 pages

Abstract

In this paper, we discuss the methods used in modern basecallers and the end-to-end automatic speech recognition (ASR) architecture that the Bonito basecaller adopts to increase accuracy. We investigate whether accuracy can be increased by applying common ASR approaches to basecalling.

We extend the architecture of the Bonito nanopore basecaller with a decoder algorithm that can use language model probabilities to increase the accuracy of basecalls. We train and compare character-level n-gram and RNN language models.

Our results show that while introducing language models slightly increases the consensus accuracy of basecalls, our current language models decrease read accuracy by an equal margin. We conclude that the decrease in read accuracy is caused by poorly optimized decoder hyperparameters, and we present potential solutions to the problem.
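To make the idea of a decoder that uses language model probabilities more concrete, here is a minimal sketch of CTC prefix beam search with a character-level language model added by weighted log-linear interpolation, a common ASR technique. It is an illustration under stated assumptions and not the decoder from the thesis: the function names, the `lm(prefix, ch)` callback signature, the blank symbol at index 0, and the weights `alpha` (language model weight) and `beta` (per-base insertion bonus) are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) for a few values."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_beam_search(log_probs, alphabet="ACGT", lm=None,
                    alpha=0.5, beta=0.0, beam_width=8):
    """Decode CTC output frames into a base sequence.

    log_probs: list of frames; each frame holds log-probabilities for
               [blank] + alphabet (blank at index 0 is an assumption).
    lm:        optional callable lm(prefix, ch) -> log P(ch | prefix);
               a hypothetical interface, not an API of Bonito.
    """
    # prefix -> (log-prob of paths ending in blank, ... ending in non-blank)
    beams = {"": (0.0, NEG_INF)}
    for frame in log_probs:
        next_beams = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            p_total = logsumexp(p_b, p_nb)

            # Emit blank: the prefix is unchanged and now ends in blank.
            b, nb = next_beams[prefix]
            next_beams[prefix] = (logsumexp(b, p_total + frame[0]), nb)

            for i, ch in enumerate(alphabet, start=1):
                p = frame[i]
                if prefix and prefix[-1] == ch:
                    # A repeated symbol without a blank collapses into prefix.
                    b, nb = next_beams[prefix]
                    next_beams[prefix] = (b, logsumexp(nb, p_nb + p))
                    # A real repeat is only reachable through a blank.
                    source = p_b
                else:
                    source = p_total

                # Language model score applies only when the prefix grows.
                bonus = alpha * lm(prefix, ch) + beta if lm else 0.0
                new_prefix = prefix + ch
                b, nb = next_beams[new_prefix]
                next_beams[new_prefix] = (
                    b, logsumexp(nb, source + p + bonus))

        # Keep only the best beam_width prefixes.
        ranked = sorted(next_beams.items(),
                        key=lambda kv: logsumexp(*kv[1]), reverse=True)
        beams = dict(ranked[:beam_width])

    best_prefix, _ = max(beams.items(), key=lambda kv: logsumexp(*kv[1]))
    return best_prefix
```

In this sketch, `alpha`, `beta`, and `beam_width` are the kind of decoder hyperparameters the abstract refers to: a poorly chosen `alpha` lets the language model override the signal model on individual reads, which is one way read accuracy could drop even if consensus accuracy improves.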
