An executive master's programme thesis from Aalborg University

Trusting Gut Instincts: Transformer-Based Extraction of Structured Data from Gut-Brain Axis Publications: A Model Ensembling and Weighted Training Approach for GutBrainIE

Authors

Term

4th term

Education

Publication year

2025

Submitted on

Pages

24

Abstract

We present our team’s (Gut-Instincts) solution to the GutBrainIE challenge, which asks systems to recognize key terms (named entities) and extract relations between them in biomedical articles about the gut–brain axis. To handle domain-specific language, we use transformer-based language models pretrained on biomedical text. For named-entity recognition (NER), we test three classification heads: (1) a dense (fully connected) layer, (2) a dense layer followed by a Conditional Random Field (CRF) layer, which models dependencies between neighboring labels, and (3) a bidirectional long short-term memory (BiLSTM) layer followed by a CRF. For relation extraction (RE), we add negative samples (pairs without a true relation) and vary the ratio of positive to negative examples. Across all tasks, we ensemble multiple models to reduce variability and improve robustness. Because the dataset mixes sources of different quality, we use weighted training so the models learn from all available data while giving higher weight to high-quality sources. Our experiments suggest that using a high negative-to-positive ratio, model ensembling, and weighted training improves performance on both NER and RE. In the GutBrainIE challenge, we placed second in NER task 6.1 with a micro F1 score of 0.8382, and first in all three RE tasks (6.2.1, 6.2.2, 6.2.3) with micro F1 scores of 0.6864, 0.6866, and 0.4635, respectively.
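
To make the RE negative-sampling setup concrete, here is a minimal Python sketch of how candidate pairs could be generated. The function name, the NO_RELATION label, and the sampling details are illustrative assumptions; the abstract only states that the ratio of positive to negative examples was varied.

import random

def build_re_examples(entities, gold_relations, neg_ratio=5, seed=13):
    # gold_relations: list of (head, tail, label) triples from the annotations.
    # Every other ordered entity pair becomes a candidate negative.
    rng = random.Random(seed)
    gold_pairs = {(h, t) for (h, t, _) in gold_relations}
    negatives = [(h, t, "NO_RELATION")
                 for h in entities for t in entities
                 if h != t and (h, t) not in gold_pairs]
    rng.shuffle(negatives)
    # Keep neg_ratio negatives per positive; the abstract reports that a
    # high negative-to-positive ratio improved results.
    negatives = negatives[: neg_ratio * max(1, len(gold_relations))]
    return list(gold_relations) + negatives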
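Ensembling for NER can be as simple as a per-token majority vote over the label sequences predicted by several models. The abstract does not specify the combination rule, so the voting scheme below is one plausible reading, not necessarily the one used.

from collections import Counter

def majority_vote(label_sequences):
    # label_sequences: one predicted label sequence per model, all the same
    # length; ties are broken by Counter's insertion order.
    return [Counter(token_labels).most_common(1)[0][0]
            for token_labels in zip(*label_sequences)]

# Example: three models vote on a two-token sentence.
# majority_vote([["B-X", "O"], ["B-X", "B-X"], ["B-X", "O"]]) -> ["B-X", "O"]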
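Finally, the weighted-training scheme amounts to scaling each example's loss by the quality of its annotation source. The PyTorch sketch below illustrates the idea for token classification; the tier names and weight values are assumptions for illustration, since the abstract only says that higher-quality sources get greater influence during optimization.

import torch
import torch.nn.functional as F

# Assumed quality tiers and weights -- illustrative only; the abstract does
# not give the actual values used by the team.
SOURCE_WEIGHTS = {"platinum": 1.0, "gold": 0.8, "silver": 0.5, "bronze": 0.25}

def weighted_token_loss(logits, labels, sources, ignore_index=-100):
    # logits: (batch, seq_len, num_labels); labels: (batch, seq_len),
    # with ignore_index marking padding/special tokens.
    batch, seq_len, num_labels = logits.shape
    per_token = F.cross_entropy(
        logits.reshape(-1, num_labels), labels.reshape(-1),
        ignore_index=ignore_index, reduction="none",
    ).reshape(batch, seq_len)
    # One weight per example, broadcast over its tokens.
    weights = torch.tensor([SOURCE_WEIGHTS[s] for s in sources],
                           device=logits.device).unsqueeze(1)
    mask = (labels != ignore_index).float()
    return (per_token * weights * mask).sum() / mask.sum().clamp(min=1.0)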
