Using XML Data in OLAP Queries
Student project: Master's thesis (incl. HD graduation project)
- Karsten Riis
- Dennis Pedersen
4th semester, Computer Science, Master's (Master's programme)
The changing data requirements of today's dynamic business
environments are not handled well by current On-Line Analytical
Processing (OLAP) systems. Physically integrating unexpected data into
such systems is a time-consuming process, making logical
integration the better choice in many situations. The increasing use
of Extensible Markup Language (XML), e.g. in business-to-business (B2B)
applications, suggests that the required data will often be available
as XML data.
In this paper we present a flexible and theoretically well-founded approach to the logical federation of OLAP and XML data sources. This makes it possible to reference external XML data in OLAP queries, which allows XML data to be presented along with dimensional data in the result of an OLAP query and enables the use of XML data for selection and grouping. Special care is taken to ensure that semantic problems do not occur in the integration process. To demonstrate the capabilities of this approach, we present a multi-schema query language based on the SQL and XPath languages. A complete federated system is also presented, covering all important areas of a federated approach to the integration of OLAP and XML. This work includes a complete formal background, a collection of algebraic rewrite rules, architectural and procedural design, and several effective cost-based optimization techniques. A prototype is being developed, and initial experimental studies indicate that our federated approach is indeed a feasible alternative to physical integration. Thus, our federated approach provides a powerful and flexible way to handle unexpected or short-term data requirements as well as rapidly changing data. As almost all data sources can be efficiently wrapped in XML format, the approach also allows the logical integration of external data from sources such as relational, object-relational, and object databases, opening up entirely new application areas for OLAP.
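The idea of using external XML data for grouping in an OLAP query can be illustrated with a small sketch. This is not the query language or the system described in the thesis; the fact rows, the XML document, and the helper names `decorate` and `group_sales_by` are all hypothetical and chosen only for illustration. The sketch looks up an XML value for each dimension value via a simple path expression and then aggregates a measure by that XML value.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical fact rows from a cube: (supplier dimension value, sales measure).
FACTS = [("S1", 100), ("S2", 250), ("S3", 75), ("S1", 50)]

# Hypothetical external XML document describing the same suppliers.
SUPPLIERS_XML = """
<suppliers>
  <supplier id="S1"><country>Denmark</country></supplier>
  <supplier id="S2"><country>Germany</country></supplier>
  <supplier id="S3"><country>Denmark</country></supplier>
</suppliers>
"""


def decorate(xml_text, node_path, key_attr, value_path):
    """Map each dimension value (key_attr) to the XML value found at value_path."""
    root = ET.fromstring(xml_text)
    mapping = {}
    for node in root.findall(node_path):
        mapping[node.get(key_attr)] = node.findtext(value_path)
    return mapping


def group_sales_by(facts, mapping, missing="N/A"):
    """Sum the measure per XML value; unmatched dimension values go to `missing`."""
    totals = defaultdict(int)
    for key, sales in facts:
        totals[mapping.get(key, missing)] += sales
    return dict(totals)


countries = decorate(SUPPLIERS_XML, "./supplier", "id", "country")
result = group_sales_by(FACTS, countries)
print(result)
```

Dimension values without a matching XML node are collected under a placeholder group rather than dropped, so the totals still cover every fact row; this is one simple way to avoid silently losing data during integration, and is an assumption of the sketch rather than a statement of how the thesis handles the case.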
Language | English |
---|---|
Publication date | Jun. 2001 |