Default Probability Prediction by Explainable Machine Learning for Small and Medium-Sized Enterprises
Author
Shadpour, Mehdi
Term
4th semester
Publication year
2024
Submitted on
2024-06-02
Pages
55
Abstract
Small and medium-sized enterprises (SMEs) are vital to economic growth, yet assessing their creditworthiness is complex and time-consuming for lenders. This study explores how explainable machine learning can improve credit risk assessment by predicting the probability of default for SMEs. We examine three groups of U.S. SMEs that received support from the Paycheck Protection Program (PPP) during the COVID-19 pandemic, aiming to forecast each firm’s probability of default (the chance that a borrower cannot repay a loan). We train three common machine learning models, eXtreme Gradient Boosting (XGBoost), logistic regression, and support vector machines (SVM), to estimate this risk. While these models can be highly accurate, they are often hard to interpret because they do not reveal which inputs matter most in an individual decision. To address this, we apply explainable AI (XAI) techniques, specifically SHAP and LIME, which show the direction and magnitude of each variable’s influence on a prediction. In this study’s findings, using XAI not only provides clear, case-by-case explanations but is also associated with more accurate, transparent, and comprehensive assessments of default risk. This approach can help financial decision-makers evaluate SME credit risk more efficiently and understand the key factors driving model decisions.
[This summary has been rewritten with the help of AI based on the project's original abstract]
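To make the idea of per-variable attribution concrete, the following is a minimal sketch and not the thesis's actual pipeline. It fits a small logistic regression on synthetic data and computes SHAP values in closed form, which is possible for a linear model when features are assumed independent: each contribution is the weight times the feature's deviation from its background mean, measured in log-odds. The feature names (`leverage`, `liquidity`), the data, and the helper functions are all hypothetical, and no SHAP or LIME library is used.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def linear_shap(w, b, X_background, x):
    """SHAP values (log-odds) for a linear model, assuming independent features."""
    n = len(X_background)
    means = [sum(row[j] for row in X_background) / n for j in range(len(w))]
    base = b + sum(wj * mj for wj, mj in zip(w, means))  # average log-odds
    phi = [wj * (xj - mj) for wj, xj, mj in zip(w, x, means)]
    return base, phi


# Synthetic data (assumption): higher leverage raises default risk,
# higher liquidity lowers it.
random.seed(0)
FEATURES = ["leverage", "liquidity"]  # hypothetical feature names
X, y = [], []
for _ in range(400):
    lev, liq = random.gauss(0, 1), random.gauss(0, 1)
    p = sigmoid(1.5 * lev - 1.0 * liq - 1.0)
    X.append([lev, liq])
    y.append(1 if random.random() < p else 0)

w, b = fit_logistic(X, y)

firm = [1.2, -0.5]  # one hypothetical firm to explain
base, phi = linear_shap(w, b, X, firm)
logit = b + sum(wj * xj for wj, xj in zip(w, firm))

print("predicted probability of default:", round(sigmoid(logit), 3))
for name, contribution in zip(FEATURES, phi):
    print(name, "contribution (log-odds):", round(contribution, 3))
```

The contributions sum, together with the base value, to the model's log-odds for that firm, which is the additivity property that makes such explanations readable case by case. Non-linear models such as gradient-boosted trees need dedicated explainers instead of this closed form.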
