A master's thesis from Aalborg University

Bid-Ask Spread Forecasting: Modelling and Forecasting Bid–Ask Spreads with Integer-Valued Trawl Processes

Translated title

Bid-Ask Spread Forecasting: Modellering og prognosticering af bid-ask-spreads med heltalsværdige trawlprocesser

Author

Term

4th term

Publication year

2026

Submitted on

Abstract

This thesis studies how to model and forecast the high-frequency behavior of the bid-ask spread, the gap between the best buying and selling prices. The spread takes discrete, non-negative values, changes slowly (it is persistent), and is often more variable than simple models allow (overdispersion). These features make it well suited to integer-valued trawl (IVT) processes, a family of count time-series models that capture flexible memory. We compare six IVT specifications that combine two distributions for the size of fluctuations (Poisson and negative binomial) with three trawl functions that determine how the past influences the present (exponential, inverse-Gaussian, and Gamma). The models are estimated using pairwise composite likelihood, an efficient technique based on probabilities for pairs of observations, and are evaluated both by their in-sample fit and by their forecasting performance in an expanding-window design (the models are re-estimated as more data become available). The empirical application uses 5-second spread data for four liquid stocks. The results show that the negative binomial distribution improves the fit by allowing for overdispersion, while a Gamma trawl generally best matches the observed autocorrelation (the link between current and past values). Forecast performance depends strongly on the horizon: at 5 seconds, a Poisson-INGARCH benchmark is most competitive; at around 1 minute, negative-binomial IVT models perform best; and at longer horizons, simple persistence forecasts are hard to beat. Overall, IVT models provide an interpretable and useful framework for spread dynamics, but richer specifications require careful numerical diagnostics and do not unambiguously dominate simpler benchmarks.
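
To make the idea of a trawl process concrete, the following is a minimal sketch of simulating the simplest of the six combinations named in the abstract: Poisson-distributed fluctuations with an exponential trawl. It is not taken from the thesis. The function name, the parameter names `nu` (arrival rate) and `lam` (decay rate of the trawl), and all numerical values are assumptions chosen for illustration. The sketch uses the standard fact that, for this combination, each point of the underlying Poisson measure stays inside the trawl set for an exponentially distributed time, so the process has a Poisson marginal with mean `nu / lam` and autocorrelation `exp(-lam * h)` at lag `h`.

```python
import numpy as np


def simulate_poisson_exp_trawl(n_obs, nu, lam, dt=1.0, seed=0):
    """Simulate a Poisson IVT process with exponential trawl (illustrative).

    nu  : rate of the Poisson points per unit time (hypothetical name)
    lam : decay rate of the exponential trawl function (hypothetical name)
    dt  : spacing of the observation grid
    """
    rng = np.random.default_rng(seed)

    # Burn-in long enough that points arriving earlier have left the trawl
    # with overwhelming probability (survival probability exp(-20)).
    burn_in = 20.0 / lam
    horizon = n_obs * dt

    # Points of the Poisson measure: arrival times on [-burn_in, horizon).
    n_points = rng.poisson(nu * (burn_in + horizon))
    arrivals = rng.uniform(-burn_in, horizon, size=n_points)

    # With an exponential trawl, a point remains in the trawl set for an
    # Exp(lam)-distributed time after it arrives.
    departures = arrivals + rng.exponential(scale=1.0 / lam, size=n_points)

    # X_t = (# points arrived by t) - (# points departed by t).
    times = dt * np.arange(1, n_obs + 1)
    arrived = np.searchsorted(np.sort(arrivals), times, side="right")
    departed = np.searchsorted(np.sort(departures), times, side="right")
    return arrived - departed


def lag1_autocorrelation(x):
    """Sample autocorrelation at lag 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))


if __name__ == "__main__":
    x = simulate_poisson_exp_trawl(n_obs=20_000, nu=2.0, lam=0.5)
    print("sample mean:", x.mean(), "theoretical:", 2.0 / 0.5)
    print("lag-1 ACF:", lag1_autocorrelation(x), "theoretical:", np.exp(-0.5))
```

Swapping the exponential lifetime for a heavier-tailed one would give slower-decaying memory, which is the role the inverse-Gaussian and Gamma trawls play in the comparison; that extension is not shown here.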

[This abstract has been rewritten with the help of AI based on the project's original abstract]