Problems in XML data management
(Research Seminar, October 10th, 2002)
Mary Fernandez
AT&T Labs, Research
Abstract:
XML is a flexible data format that has rapidly become
the lingua franca of data exchange between inter-enterprise applications on
the Internet. Hundreds of application- and industry-specific XML
dialects already exist. Bioinformatics data, financial products,
legal documents, medical transcripts, and electronic-commerce transactions are
some examples of the diverse kinds of data that are exchanged in XML.
Because of XML's rapid adoption, however, data-management tools for XML are
still sparse and immature.
In this talk, we will consider three problems in XML data management: accessing
XML data via programmatic and query interfaces; publishing legacy (non-XML)
data in XML; and storing XML data in legacy storage systems. We will
briefly survey commercial and research solutions to each of these problems,
then focus on the problem of publishing relational data in XML. We
will describe our own solution: SilkRoute, a general, selective, and
efficient architecture for viewing and querying relational data in
XML. Lastly, we will identify some of the interesting research
problems in XML data management.
This talk is based on a full-day tutorial given at WWW 2002. SilkRoute is
joint work with Atsuyuki Morishima (Univ. of Tsukuba), Yana Kadiyska and
Dan Suciu (Univ. of Washington), and Wang-Chiew Tan (U.C. Santa Cruz).