Monday, January 26, 2009

Relational/XML hybrid database design

Lately, I have been reading about the database design (particularly, for relational and XML hybrid storage).

I felt that a column design (for complex types, like say purchase order) having XML data (like the native SQL type, 'XML') is quite useful, as compared to distributing the whole schema (say, the purchase order - PO schema) across multiple relational tables and columns (with simple types).

Following are some of the advantages for using the SQL type, 'XML' in RDBMS tables:

1. The database and nearly the whole application becomes shielded from the Schema changes. If say something in the PO Schema changes, we only change the PO XML Schema. The RDBMS data model need not change (the columns of the table would still be same, say a, b, po where the column po is of type 'XML'). The SQL statements in the application programs (say, the JDBC statements) also need not change.

The impact of the Schema change on the application will be quite less, as compared to the situation, if the data were kept in simple type columns.

2. Designing a complex type (say, a purchase order) in a 'XML' column makes good design sense. The complex type most of the times has hierarchical structure, as is XML. So it is a good design decision to design complex types as XML. This also has advantage during validation of the data during data inserts and updates. The XML data as a whole get's written to the database, and an internal schema in the database validates the incoming XML data. This is much more simpler and natural than validating the data, had it been stored in multiple tables and columns.

3. The XML data in the relational/XML hybrid database can be accessed using XQuery, and also updated using XQuery update facility.

I found DB2 with pureXML having an excellent support for all these features.

No comments: