RelaXML - a Tool for Transferring Data between Relational Databases and XML Files

Student thesis: Master's thesis and HD graduation project

  • Steffen Ulsø Knudsen
  • Christian Thomsen
4th semester, Computer Science, Master (Master's programme)
This report describes the platform-independent tool RelaXML, which can be used for transferring data between relational databases and XML files. The tool uses SAX technology and is thus able to handle large files.
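As a rough illustration of why SAX suits large files, the following sketch counts elements in a stream without ever building an in-memory tree. It is not RelaXML code: the class name `SaxRowCounter` and the element name `row` are assumptions made for this example only.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of RelaXML.
class SaxRowCounter {

    /** Counts elements named elementName while streaming through the input. */
    static int countElements(InputStream in, String elementName) {
        final int[] count = {0};
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(in, new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attrs) {
                    // Called once per opening tag; no document tree is kept.
                    if (qName.equals(elementName)) {
                        count[0]++;
                    }
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
        return count[0];
    }

    /** Convenience overload for small in-memory documents. */
    static int countElements(String xml, String elementName) {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return countElements(new ByteArrayInputStream(bytes), elementName);
    }
}
```

Because the handler only sees one event at a time, memory use stays roughly constant regardless of how many `row` elements the file contains.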
The format of the XML file generated by RelaXML is specified by the user. Many formats are supported, including formats that group similar elements. Transformations to be applied to the data on export can be defined. For example, it is possible to encrypt sensitive data or convert between units.
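To make the idea of a unit-converting transformation concrete, here is a minimal sketch of a transformation with an export direction and its inverse for import. The class `UnitTransformation`, its method names, and the column names `weight_kg` and `weight_g` are hypothetical; the report defines RelaXML's actual transformation interface.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not RelaXML's transformation API.
class UnitTransformation {

    /** Export direction: the database holds kilograms, the XML carries grams. */
    static Map<String, Object> onExport(Map<String, Object> row) {
        Map<String, Object> out = new HashMap<>(row);
        BigDecimal kg = (BigDecimal) out.remove("weight_kg");
        out.put("weight_g", kg.movePointRight(3)); // multiply by 1000 exactly
        return out;
    }

    /** Import direction: the inverse mapping restores the database form. */
    static Map<String, Object> onImport(Map<String, Object> row) {
        Map<String, Object> out = new HashMap<>(row);
        BigDecimal g = (BigDecimal) out.remove("weight_g");
        out.put("weight_kg", g.movePointLeft(3)); // divide by 1000 exactly
        return out;
    }
}
```

Using `BigDecimal` keeps the round trip exact, so a value edited in the XML maps back to a single, unambiguous database value. A transformation without such an inverse could still be used for export, but the transformed data could not be mapped back on import.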
It is often required that the exported XML files can be reimported into the relational database. For instance, this is the case when the XML files have been updated or if the data should be imported into a new database. If some simple conditions are fulfilled, RelaXML is capable of importing the data again.
When doing an export, RelaXML gives guarantees about whether it is possible to import the data again. Furthermore, RelaXML can delete from the database the data contained in an XML document.
When an updated XML document is imported, RelaXML ensures that occurrences of redundant data are updated consistently. The user is allowed to update values in the XML and is not required to provide explicit information about which values have been changed.
In the report, formal descriptions of the export and import operations are given. Further, design and implementation issues are described. A performance study shows good performance: import and export through RelaXML have an overhead of about 100% (that is, roughly twice the running time) compared to direct use of SQL through JDBC.
The main contributions of the report are the guarantees on importability at export time and the ability to make very powerful and flexible transformations of the data both when exporting and importing.
Language: English
Publication date: Jun. 2004
ID: 61061691