Classifying the Reason for Adverse Events: -based on Statistical Natural Language Processing
Student thesis: Master Thesis and HD Thesis
- Marie Juul Hansen
- Nana Østergaard Rasmussen
4th term, Biomedical Engineering and Informatics, Master (Master Programme)
In 2007, 23,521 adverse events were reported in Denmark. The purpose of reporting adverse events is to create guidelines for preventing future events and hence to save lives. The reports are written in natural language, and it is time-consuming for the risk manager to read each report carefully to locate the reason for the adverse event.
This project examines whether adverse events can be classified by the reason for the event using statistical natural language processing. To this end, a system is developed as a proof of concept, consisting of a user interface and a classification model. The classification model is trained and tested on 132 adverse event reports, all containing the keyword 'EPM', and its purpose is to classify whether or not EPM is the reason for the adverse event. The model is based on a combination of a priori knowledge from domain experts and statistical knowledge.
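As an illustration of how expert knowledge and corpus statistics can be combined, the following is a minimal sketch and not the thesis's actual model, which the abstract does not specify. It assumes a multinomial Naive Bayes classifier with add-one smoothing whose vocabulary is restricted to cue words supplied by an expert. The cue list `EXPERT_CUES`, the labels `epm` / `not_epm`, and the toy reports are all invented English stand-ins.

```python
import math
from collections import Counter

# Hypothetical cue words a domain expert might supply (a priori knowledge).
EXPERT_CUES = {"crashed", "unavailable", "wrong"}

def features(text):
    """Keep only the tokens that the expert flagged as informative."""
    return [tok for tok in text.lower().split() if tok in EXPERT_CUES]

def train(reports):
    """Count classes and per-class cue occurrences from (text, label) pairs."""
    class_counts = Counter(label for _, label in reports)
    cue_counts = {label: Counter() for label in class_counts}
    for text, label in reports:
        cue_counts[label].update(features(text))
    return class_counts, cue_counts

def classify(text, model):
    """Return the label with the highest smoothed Naive Bayes log-score."""
    class_counts, cue_counts = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # class prior estimated from the data
        denom = sum(cue_counts[label].values()) + len(EXPERT_CUES)
        for tok in features(text):
            # Add-one smoothing over the expert vocabulary.
            score += math.log((cue_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented toy reports, for illustration only.
toy_reports = [
    ("epm crashed during medication entry", "epm"),
    ("epm unavailable so dose was delayed", "epm"),
    ("nurse gave wrong dose noted in epm", "not_epm"),
    ("patient fell report filed in epm", "not_epm"),
    ("wrong patient wristband recorded in epm", "not_epm"),
]
model = train(toy_reports)
```

Here the expert decides which words count as evidence, while the data decide how strongly each word points to either class. With so few cues and reports the sketch is only meant to show the division of labour between the two knowledge sources.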
The classification yields an F-measure of 0.946 for the reports where EPM is not the reason and an F-measure of 0.667 for the reports where EPM is the reason for the adverse event. These results can be improved by training the model on more adverse event reports.
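For readers unfamiliar with per-class F-measures, the sketch below shows how one value per class can be computed from gold and predicted labels. It assumes the balanced F1 variant (the harmonic mean of precision and recall), which the abstract does not state explicitly. The function names and the ten toy labels are hypothetical and do not reproduce the thesis's data or results.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (F1); returns 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def per_class_f(gold, predicted, label):
    """Treat `label` as the positive class and compute its F-measure."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == label and p == label for g, p in pairs)
    fp = sum(g != label and p == label for g, p in pairs)
    fn = sum(g == label and p != label for g, p in pairs)
    return f_measure(tp, fp, fn)

# Hypothetical toy labels: 8 reports where EPM is not the reason, 2 where it is.
gold = ["not_epm"] * 8 + ["epm"] * 2
predicted = ["not_epm"] * 7 + ["epm"] + ["epm", "not_epm"]

print(per_class_f(gold, predicted, "not_epm"))  # 0.875
print(per_class_f(gold, predicted, "epm"))      # 0.5
```

The toy numbers show the same pattern as the reported results: with few positive examples, a single misclassified report lowers the minority-class F-measure far more than it lowers the majority-class one.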
It can be concluded that statistical natural language processing can be used to classify adverse events, provided that a priori knowledge related to the specific keyword is included.
Language | Danish |
---|---|
Publication date | 2008 |
Number of pages | 119 |
Publishing institution | AAU |