AutoBinner: A Metagenomic Binner combining Feature Learning and HDBSCAN
Term
4. term
Education
Publication year
2020
Submitted on
2020-06-11
Pages
65
Abstract
This report documents the development of the metagenomic binner AutoBinner. AutoBinner combines feature learning and clustering by using a stacked undercomplete autoencoder together with the clustering algorithm HDBSCAN. The autoencoder learns a feature embedding from abundance and composition features, and the embedding is then clustered. To give a full picture of AutoBinner, the report also explains how neural networks work in the context of autoencoders and feature learning. The performance of AutoBinner is evaluated by comparison with the state-of-the-art binners MetaBAT2 [Kang et al., 2019], CONCOCT [Alneberg et al., 2013], and VAMB [Nissen et al., 2018] on three datasets: CAMI Medium, CAMI High, and CAMI Airways. The results indicate that further refinement of AutoBinner is needed, yet we see potential in using autoencoders and HDBSCAN for metagenomic binning.
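As an illustration of the kind of input such an autoencoder could receive, the following minimal sketch builds one feature vector per contig. The abstract does not say how the composition and abundance features are computed; the sketch assumes normalized k-mer frequencies (tetranucleotides by default), a common choice in metagenomic binners, and a list of per-sample coverage values as abundance. The function names `kmer_frequencies` and `contig_features` are hypothetical and are not taken from AutoBinner.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=4):
    """Return normalized k-mer frequencies of a DNA sequence as a list.

    The vector has 4**k entries, ordered lexicographically over A, C, G, T.
    Windows containing any other character (e.g. N) are not counted.
    """
    sequence = sequence.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every window of length k; invalid windows are ignored below.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


def contig_features(sequence, abundances, k=4):
    """Concatenate composition (k-mer) and abundance features for one contig.

    Assumption: `abundances` holds one coverage value per sample.
    """
    return kmer_frequencies(sequence, k) + list(abundances)


# Toy contig with k=2: 16 dinucleotide frequencies + 3 abundance values.
features = contig_features("ACGTACGTNACGT", abundances=[12.5, 0.0, 3.1], k=2)
print(len(features))
```

Vectors of this form, one per contig, could then be stacked into a matrix, embedded by an autoencoder, and clustered; those steps are left out here because the abstract gives no architectural details.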
Keywords
Binner ; Autoencoder ; Binning ; Deep learning ; Feature learning ; TensorFlow ; Keras ; HDBSCAN ; Clustering