Balancing Privacy and Accuracy in Machine Learning Models with Differential Privacy
Authors
Akter, Monia ; Hasan, Md Afridi
Term
4th semester
Education
Publication year
2025
Submitted on
2025-06-04
Pages
89
Abstract
Protecting personal data while training machine learning models is challenging. This thesis examines Differential Privacy (DP), a technique that limits how much a model can reveal about any single individual. We evaluate Logistic Regression, Decision Trees, Naive Bayes, and Neural Networks on the Adult Income dataset, training on both the original data and DP-protected data under different privacy budgets (levels of protection). Naive Bayes performs well with DP, consistent with its simple, probabilistic design. Ensemble models also maintain good accuracy across privacy levels. For neural networks, DP-SGD (a training method that enforces DP) offers a practical balance between accuracy and privacy and helps mitigate privacy attacks. Based on these results, the thesis recommends Naive Bayes, ensemble approaches, and DP-SGD for real-world scenarios where both privacy and accuracy matter.
[This summary has been rewritten with the help of AI based on the project's original abstract]
