A master's thesis from Aalborg University

Autoencoder techniques for survival analysis on renal cell carcinoma

Term

4th term

Publication year

2024

Submitted on

2024-06-07

Pages

14 pages

Abstract

Survival analysis plays a central role in the study of diseases by providing statistical methods and metrics for analyzing time-to-event data, which is crucial for understanding disease progression and the effectiveness of treatments. However, data in the medical domain is often high-dimensional, which complicates the application of these methods. For this reason, this work focuses on compressing the high-dimensional transcriptomic data of patients treated with an immunotherapy (avelumab + axitinib) and a tyrosine kinase inhibitor (TKI; sunitinib) into meaningful latent features using autoencoders. We then applied a statistical methodology based on the Cox proportional hazards model, a semi-parametric approach, combined with Breslow’s estimator to determine the survival functions of the patients and predict each patient's Progression-Free Survival (PFS). We extensively analyzed different penalties as well as their combinations. Because of the nature of transcriptomic data, we extended the model to accept not only tabular data but also its graph variant, in which the edges represent protein-to-protein interactions between genes; this proved to be a more meaningful approach. Finally, since neural networks, and autoencoders in particular, are often seen as black boxes, we worked on interpretability by measuring the mutual information between the genes in the original data and the latent features. This approach attempts to clarify which genes are most represented in which latent variables. Our results show that the most suitable type of autoencoder depends on the goal. To obtain accurate reconstruction, denoising autoencoders prove useful. To find meaningful representations of the data, the sparse variant is the best option. Moreover, these penalties can be combined to achieve both accurate representations and meaningful latent features. The interpretable models also suggested that genes such as LRP2 and ACE2 are highly related to renal cell carcinoma. We present this work as extensive research demonstrating the usefulness of autoencoders in high-dimensional problems.
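
To illustrate the survival step described in the abstract, the following is a minimal sketch of how Breslow's estimator turns Cox risk scores into per-patient survival curves. It is not the thesis implementation: the function name `breslow_survival`, its arguments, and the assumption that the risk scores come from a Cox model already fitted on the autoencoder's latent features are all hypothetical and chosen only for illustration.

```python
import numpy as np


def breslow_survival(times, events, risk_scores):
    """Breslow baseline cumulative hazard and per-patient survival curves.

    Hypothetical helper for illustration only.
    times:       observed follow-up times (e.g. PFS), one per patient
    events:      1 if progression was observed, 0 if the patient was censored
    risk_scores: linear predictor beta^T z from an already-fitted Cox model
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    hazard_ratios = np.exp(np.asarray(risk_scores, dtype=float))

    # Distinct times at which at least one event was observed (sorted).
    event_times = np.unique(times[events == 1])

    increments = np.empty_like(event_times)
    for k, t in enumerate(event_times):
        n_events = np.sum((times == t) & (events == 1))
        # Risk set: every patient still under observation at time t.
        at_risk = hazard_ratios[times >= t].sum()
        increments[k] = n_events / at_risk

    # Baseline cumulative hazard H0(t) evaluated at each event time.
    cum_hazard = np.cumsum(increments)

    # S(t | z) = exp(-H0(t) * exp(beta^T z)); one row per patient.
    survival = np.exp(-np.outer(hazard_ratios, cum_hazard))
    return event_times, cum_hazard, survival
```

When all risk scores are zero, every patient has a hazard ratio of one and the estimate reduces to the Nelson-Aalen cumulative hazard, which gives a simple sanity check for the sketch.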
