Deep Clustering for Metagenomic Binning
Translated title
Deep Clustering til Metagenomic Binning
Authors
Term
4th term
Publication year
2022
Submitted on
2022-06-17
Pages
117
Abstract
Deep learning is an area that is only sparsely explored for metagenomic binning. The existing deep learning-based approaches usually preprocess raw DNA sequences into input features such as composition and abundance and perform representation learning and clustering in two steps. The utilization of unprocessed DNA sequences as input shows promising results for gene prediction. Joint deep clustering leads to better results for image clustering than basic approaches like k-means clustering. In this report, we investigate the potential of joint end-to-end unsupervised learning and the utilization of unprocessed contigs as inputs for the task of metagenomic binning. We propose two new binners: Deep Convolutional Metagenomic Binner (DCMB) and Deep Stacked Metagenomic Binner (DSMB). Both binners utilize KL divergence-based joint deep clustering. The DCMB takes unprocessed contigs and the DSMB uses abundance and composition as inputs. The performance of both binners is benchmarked on the CAMI Low dataset and compared to metagenomic binners VAMB, MetaBat2, and SolidBin. The results show that metagenomic information requires preprocessing to obtain meaningful representations and that joint end-to-end learning slightly improves the number of recovered bins.
Keywords
metagenomics ; binning ; deep learning ; joint clustering ; end-to-end learning ; KL divergence ; VAMB ; metagenomic binning ; DCMB ; DSMB ; DVMB