Bridge-DB: Query Optimization in a Multi-Database System

Student thesis: Master's thesis (including HD final project)

  • Rune Sanden Ettrup
  • Lisbeth Nielsen
4th semester, Software, Master (Master's programme)
In this paper we present Bridge-DB, a distributed database system. The system focuses on using multiple data sources without any prior knowledge of the underlying database architecture, and on simplifying interaction with multiple database systems.

Bridge-DB has its own query language, BQL, which supports all CRUD operations. It is connected to PostgreSQL and Neo4j, and its modular design allows it to support any type of storage mechanism through the implementation of a database driver module.

Our main contribution is the implementation of a cost-based optimizer that combines a dynamic cost model with a black-box cost model. The optimizer determines which database a query should be executed on, or whether the query should be enumerated and executed on multiple databases, in which case the optimizer post-processes the results to fulfill the query.
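As an illustration only, the following minimal sketch shows one way a black-box cost model with dynamic updates could pick a target database. It is not the implementation from the thesis: the class `BlackBoxCostModel`, the function `choose_database`, the query kinds, the smoothing weight, and the timings are all hypothetical. The sketch treats each database as a black box, learns costs purely from observed response times, and does not cover splitting a query across several databases.

```python
from dataclasses import dataclass, field


@dataclass
class BlackBoxCostModel:
    """Hypothetical cost model: learns costs from observed response times only."""
    alpha: float = 0.5          # weight given to the newest observation (assumed value)
    default_cost: float = 1.0   # cost assumed when a combination was never observed
    estimates: dict = field(default_factory=dict)

    def estimate(self, database: str, query_kind: str) -> float:
        """Return the current cost estimate in seconds."""
        return self.estimates.get((database, query_kind), self.default_cost)

    def observe(self, database: str, query_kind: str, seconds: float) -> None:
        """Update the estimate with an exponentially weighted moving average."""
        key = (database, query_kind)
        if key not in self.estimates:
            self.estimates[key] = seconds
        else:
            old = self.estimates[key]
            self.estimates[key] = (1 - self.alpha) * old + self.alpha * seconds


def choose_database(model: BlackBoxCostModel, databases: list, query_kind: str) -> str:
    """Pick the database with the lowest estimated cost for this kind of query."""
    return min(databases, key=lambda db: model.estimate(db, query_kind))


# Example with made-up timings: joins favor PostgreSQL, traversals favor Neo4j.
model = BlackBoxCostModel()
model.observe("postgresql", "join", 0.2)
model.observe("neo4j", "join", 0.8)
model.observe("postgresql", "traversal", 0.9)
model.observe("neo4j", "traversal", 0.1)
target = choose_database(model, ["postgresql", "neo4j"], "traversal")
```

Because the estimates are updated after every execution, the choice can shift as the data or workload changes, which is the sense in which this sketch is "dynamic".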

The new solution has been tested against two datasets, one biased towards Neo4j and one towards PostgreSQL, as well as a combination of both, in order to test the effectiveness of the cost model. Based on response times, data traffic, and the overhead of Bridge-DB, we show that our cost model yields better response times at the cost of increased data traffic between Neo4j and Bridge-DB.
Language: English
Publication date: 22 Jun. 2015
Number of pages: 38
ID: 213686734