Using the Expat XML Parser



Converting an XML schema to an internal format with the Expat XML parser, so that the schema can be used to store XML data in relational tables.

Increasingly, Extensible Markup Language (XML) is considered the format of choice for the representation and exchange of information among various applications on the Internet. The popularity of XML can be mostly attributed to its flexibility for representing many kinds of information. The use of tags makes XML data self-describing, and the extensible nature of XML makes it possible to define new kinds of documents for specialized purposes.

From the database perspective, this raises an exciting possibility. With large amounts of data stored in XML documents, it should be possible to query the contents of these documents. One should be able to issue queries over sets of XML documents to extract, synthesize, and analyze their contents. In fact, efficient storage of XML documents is now an active area of research in the database community.

Cost-based strategies to derive relational configurations for XML applications have been proposed and shown to provide substantially better configurations than heuristic methods. The general methodology in these strategies is to define a set of XML schema transformations that derive different relational configurations. Given an XML query workload, the quality of the relational configuration is evaluated by a costing function on the SQL equivalents of the XML queries. Since the search space is large, greedy heuristics are used to search through the associated space of relational configurations.
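The greedy search described above can be illustrated with a small Python sketch. Everything in it is a toy assumption made for illustration: a configuration is just the set of child elements inlined into the parent table, the only transformation is "inline one more element", and the cost function (`JOIN_COST`, `WIDTH_COST`, `WORKLOAD`) is invented here rather than taken from any real system. A real engine would cost the SQL equivalents of the XML queries instead.

```python
# Toy greedy search over relational configurations.
# A configuration is the frozenset of child elements inlined into the
# parent table; every other element lives in its own table.
# All names and numbers below are hypothetical, for illustration only.

ELEMENTS = ("title", "year", "review")
WORKLOAD = [{"title", "year"}, {"title"}, {"review"}]  # elements each query reads
JOIN_COST = 10   # assumed cost of joining one outlined element's table
WIDTH_COST = 4   # assumed per-query penalty for each inlined column


def cost(inlined):
    """Toy cost: joins for outlined elements plus a row-width penalty."""
    joins = sum(len(query - inlined) for query in WORKLOAD)
    width = len(inlined) * len(WORKLOAD)
    return JOIN_COST * joins + WIDTH_COST * width


def neighbors(inlined):
    """Configurations reachable by inlining one more element."""
    return [inlined | {e} for e in ELEMENTS if e not in inlined]


def greedy_search(initial, neighbors, cost):
    """Repeatedly move to the cheapest neighbor until none improves the cost."""
    current, current_cost = initial, cost(initial)
    while True:
        candidates = [(cost(c), c) for c in neighbors(current)]
        if not candidates:
            return current, current_cost
        best_cost, best = min(candidates, key=lambda pair: pair[0])
        if best_cost >= current_cost:
            return current, current_cost
        current, current_cost = best, best_cost


best, best_cost = greedy_search(frozenset(), neighbors, cost)
```

With this toy workload the search inlines only "title", the element read by the most queries, and stops at a cost of 32, because inlining a second element adds more width penalty than it saves in joins. Like any greedy method, it can stop at a local minimum.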

LegoDB is one such cost-based XML storage mapping engine. LegoDB leverages current XML and relational technologies. It models the target application with an XML Schema, XML data statistics, and an XQuery workload. The space of configurations is generated through XML Schema rewritings, and the best among the derived configurations is selected using cost estimates obtained through a standard relational optimizer.
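As a rough illustration of the schema-conversion step, the following sketch uses Python's standard `xml.parsers.expat` module (a binding to the Expat library) to read an XML Schema into a simple internal structure. The internal format (a list of root elements plus a dictionary of named complex types), the function name `parse_schema`, and the sample schema are all assumptions made for this example; they are not the format used by the application's actual code. The sketch also handles only named complex types and ignores elements nested inside anonymous ones.

```python
import xml.parsers.expat

XSD_NS = "http://www.w3.org/2001/XMLSchema"


def parse_schema(xsd_text):
    """Collect top-level elements and named complex types from an XSD string."""
    roots = []       # top-level element declarations as (name, type)
    types = {}       # complex type name -> list of (element name, element type)
    open_types = []  # stack of complex types currently open (None if anonymous)

    def start(name, attrs):
        # With namespace processing on, names arrive as "<uri> <local>".
        uri, _, local = name.rpartition(" ")
        if uri != XSD_NS:
            return
        if local == "complexType":
            type_name = attrs.get("name")
            if type_name is not None:
                types[type_name] = []
            open_types.append(type_name)
        elif local == "element":
            decl = (attrs.get("name"), attrs.get("type"))
            if not open_types:
                roots.append(decl)
            elif open_types[-1] is not None:
                types[open_types[-1]].append(decl)

    def end(name):
        uri, _, local = name.rpartition(" ")
        if uri == XSD_NS and local == "complexType":
            open_types.pop()

    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(xsd_text, True)  # True marks this as the final chunk
    return roots, types


# Hypothetical sample schema, used only to exercise the sketch.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="show" type="Show"/>
  <xs:complexType name="Show">
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="year" type="xs:integer"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""

roots, types = parse_schema(SAMPLE)
```

Because Expat is an event-driven parser, the structure is built up inside the start and end handlers rather than read from a finished tree. In a mapping to relational tables, each named complex type collected here would be a natural candidate for a table, with its child elements as columns.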


The code for this application can be found here.