A master's thesis from Aalborg University

DIPAAL: DIstributed PostgreSQL-based AIS Analytics and Loading

Term: 4th term
Publication year: 2023
Pages: 25

Abstract

AIS (Automatic Identification System) data show promise for analytical purposes, but because the data are not intended for analysis, they must be cleaned, processed, and stored before they are usable. This paper presents an extension of DIPAAL, a system consisting of an efficient and modular ETL process for loading AIS data and a distributed data warehouse storing the trajectories of ships. A spatially distributed data warehouse, with granularized cell and heatmap representations, is designed, developed, and evaluated. At the time of writing, DIPAAL stores 414 million kilometres of ship trajectories and more than 10 billion rows in the largest relation. The introduced granularized cell representation is found to resolve the out-of-memory errors of previous work while improving runtime by up to 324% compared to a trajectory-based query. The spatially divided shards are also found to enable a consistently good scale-up for both cell and heatmap analytics in large areas, ranging from 354% to 1164% with a 5x increase in workers. Lastly, the spatial divisions are found to become slightly skewed over time as traffic patterns evolve.