Improving Neural Networks for Predicting Sepsis from Imbalanced, Multivariate Time Series Data with High Missing Rates
Student thesis: Master Thesis and HD Thesis
- Martin Simonsen
- Mathias Højer Svendsen
- Simon Dam Nielsen
4th term, Software, Master (Master Programme)
In this project, we attempt to create a neural network model that outperforms an XGBoost model for predicting sepsis. The available data is highly imbalanced, multivariate time series data with high missing rates. To mitigate the problems that arise from this type of data, we experiment with modifications of the following time series models: LSTM, TCN, BRITS, and GRU-D. We propose a neural network architecture that uses features extracted from the data while incorporating the time series models. We experiment with the following: a class-weighted loss function, features extracted from demographics, missingness representations, features extracted from observation rates, and features extracted from delta representations. Through our experiments, we observe that the neural network models benefit from the observation rate, and that BRITS and GRU-D show the best results on separate datasets. We also observe that the missingness representations are beneficial as inputs to the models. Finally, we conclude that for one dataset, our model is preferable to XGBoost in a clinical setting because it is better calibrated, despite XGBoost's superior performance metrics.
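The missingness-related inputs named in the abstract can be illustrated with a minimal sketch. It assumes the conventions commonly used with GRU-D (a binary observation mask and a per-variable time-since-last-observation delta), a per-variable observation rate computed as the fraction of observed time steps, and inverse-frequency class weights. The function names `missingness_features` and `class_weights` are hypothetical, and the thesis itself may define these quantities differently.

```python
import numpy as np


def missingness_features(x, t):
    """Derive missingness inputs from one multivariate time series.

    x: (T, D) array with NaN marking missing values (assumed encoding).
    t: (T,) array of timestamps.
    Returns (mask, delta, obs_rate).
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)

    # Mask: 1.0 where a value was observed, 0.0 where it is missing.
    mask = (~np.isnan(x)).astype(float)

    # Delta: time since each variable was last observed (GRU-D convention).
    # If the previous step was observed, delta is just the time gap;
    # otherwise the gap is added to the previous delta.
    n_steps, n_vars = x.shape
    delta = np.zeros((n_steps, n_vars))
    for i in range(1, n_steps):
        gap = t[i] - t[i - 1]
        delta[i] = gap + (1.0 - mask[i - 1]) * delta[i - 1]

    # Observation rate: fraction of time steps at which each variable is observed.
    obs_rate = mask.mean(axis=0)
    return mask, delta, obs_rate


def class_weights(y):
    """Inverse-frequency weights for binary labels (assumes both classes occur)."""
    y = np.asarray(y, dtype=int)
    counts = np.bincount(y, minlength=2)
    return len(y) / (2.0 * counts)


# Toy example: 3 time steps, 2 variables, hourly sampling.
x = np.array([[1.0, np.nan],
              [np.nan, np.nan],
              [3.0, 4.0]])
t = np.array([0.0, 1.0, 2.0])
mask, delta, obs_rate = missingness_features(x, t)
weights = class_weights([0, 0, 0, 1])
```

In this sketch the rare positive class receives the larger weight, so misclassifying a sepsis case would contribute more to a weighted loss than misclassifying a non-sepsis case.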
Language | English |
---|---|
Publication date | 28 Jun 2021 |
Number of pages | 89 |
External collaborator | Enversion A/S |