AAU Student Projects - visit Aalborg University's student projects portal
A master's thesis from Aalborg University
Book cover


Leveraging Symbolic Execution on Event-based Languages to Detect Data-Races

Authors

;

Term

4. term

Education

Publication year

2020

Submitted on

Pages

40

Abstract

Samtidig kørsel (concurrency) kan gøre software mere responsivt, fordi flere dele kan køre parallelt eller på skift. Men samtidighed gør analyser af programmer sværere og kan give fejl, hvor to kørsler med samme input ender med forskellige resultater. En typisk fejl er en data-race (datakapløb), hvor to tråde tilgår den samme variabel samtidigt uden korrekt synkronisering, og mindst én af dem skriver. Symbolsk eksekvering er en metode, der gennemløber mulige programveje med symbolske (ubestemte) input for at finde fejl, som kun opstår under særlige betingelser. Metoden er mest brugt til sekventielle programmer. Denne afhandling undersøger, hvordan symbolsk eksekvering kan bruges til at finde data-races i samtidige programmer ved at omsætte programmet til en kontrolflowgraf, som beskriver mulige tildelinger og interaktioner på tværs af tråde. Fordi modellen er en over-approksimation af alle mulige adfærdsmønstre, kan den symbolske eksekvering nogle gange danne ugyldige modeller (uopnåelige forløb) og rapportere potentielle data-races, der viser sig at være falske positiver. Afhandlingen diskuterer, hvordan dette kan forbedres, og præsenterer en forbedring af den symbolske eksekvering, som har vist sig ret effektiv for samtidige programmer med to tråde. Derudover beskrives mulige afbødninger og udvidelser, som kan styrke analysen yderligere, så en separat, konkret kørsel ikke behøver at være nødvendig for at bekræfte, at en rapporteret data-race faktisk kan opstå.

Concurrency can make software feel more responsive by letting parts of a program run in parallel or interleave. However, it also makes analysis harder and can introduce nondeterministic bugs, where two runs with the same input produce different results. A common example is a data race, where two threads access the same variable at the same time without proper synchronization and at least one write occurs. Symbolic execution explores possible program paths using symbolic (unknown) inputs to find bugs that only appear under specific conditions. It has mostly been applied to sequential code. This thesis investigates how to use symbolic execution to detect data races in concurrent programs by transforming the program into a control-flow graph that captures possible assignments and interactions across threads. Because this design over-approximates all possible behaviors, the analysis can sometimes produce invalid models (infeasible executions) and report potential data races that are false positives. The thesis discusses how to reduce these issues and presents an improvement to symbolic execution that has shown to be quite effective for programs with two threads. It also outlines mitigations and extensions that could further strengthen the analysis, aiming to reduce the need for separate concrete executions to confirm that a reported data race can actually occur.

[This abstract was generated with the help of AI]