A master's thesis from Aalborg University

Building an OLAP-XML Query Engine

Author

Term

10th term

Publication year

2003

Abstract

Many OLAP (Online Analytical Processing) systems struggle to handle fast-changing data, such as stock prices, when that data must be physically loaded into data cubes. At the same time, such data often resides in XML databases. There is therefore a need to connect XML data to OLAP logically, without moving or copying the data (logical federation). Previous work presented a solution for OLAP-XML federation, including data models, a query language called SQLXM, querying techniques, a physical algebra (the internal operators used to execute queries), and a prototype OLAP-XML query engine. This work adds practical query optimization to that engine. A query optimizer chooses the most efficient way to run a query. We combine rule-based and cost-based optimization: starting from an initial plan, the optimizer generates a set of alternative execution plans (a plan space) and selects the plan with the lowest estimated cost. New operators and transformation rules expand the range of plans. To keep the search manageable, we apply pruning methods such as Branch and Bound. We also integrate inlining (replacing a reference with its definition) into the physical algebra, and the new engine supports it. Experiments on the updated engine show that these optimizations significantly speed up queries and make the logical integration of OLAP and XML data more effective.

[This abstract was generated with the help of AI]
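
The abstract describes generating alternative plans from an initial plan and pruning the search with Branch and Bound. The sketch below illustrates that general idea on a toy problem and is not the thesis's algebra, rule set, or cost model. Everything in it is an assumption made for illustration: the hypothetical `Filter` operator, the single "reorder filters" transformation, and a cost model that charges a per-row cost for each filter applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Filter:
    """Hypothetical plan operator: keeps a fraction of its input rows."""
    name: str
    cost_per_row: float   # assumed cost of evaluating the filter on one row
    selectivity: float    # assumed fraction of rows that pass (0..1)


def plan_cost(order, input_rows):
    """Toy cost model: each filter pays cost_per_row for every row it sees."""
    rows = float(input_rows)
    total = 0.0
    for f in order:
        total += rows * f.cost_per_row
        rows *= f.selectivity
    return total


def branch_and_bound(filters, input_rows):
    """Search all filter orderings, pruning prefixes that cannot win.

    The initial plan (filters in the given order) supplies the first bound.
    Costs never decrease as a plan is extended, so the cost of a partial
    plan is a valid lower bound on every plan that completes it.
    """
    initial = tuple(filters)
    best_plan = initial
    best_cost = plan_cost(initial, input_rows)

    def extend(prefix, remaining, rows, cost_so_far):
        nonlocal best_plan, best_cost
        if cost_so_far >= best_cost:
            return  # bound: this branch cannot beat the best plan so far
        if not remaining:
            best_plan, best_cost = prefix, cost_so_far
            return
        for i, f in enumerate(remaining):
            extend(
                prefix + (f,),
                remaining[:i] + remaining[i + 1:],
                rows * f.selectivity,
                cost_so_far + rows * f.cost_per_row,
            )

    extend((), initial, float(input_rows), 0.0)
    return best_plan, best_cost


if __name__ == "__main__":
    filters = [
        Filter("A", cost_per_row=1.0, selectivity=0.9),
        Filter("B", cost_per_row=0.2, selectivity=0.1),
        Filter("C", cost_per_row=0.5, selectivity=0.5),
    ]
    plan, cost = branch_and_bound(filters, input_rows=1000)
    print([f.name for f in plan], cost)
```

In this toy setting the cheap, highly selective filter ends up first, and any ordering whose partial cost already reaches the best known total is abandoned without being completed. A real optimizer would need a cost model calibrated against the OLAP and XML components, which this sketch does not attempt.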