Home
Tech Work
EU Developer Forum
Java Strasbourg
JGP.log
Perrin
Links
About
Related Reading
Click a partner
Home arrow Tech Work arrow XML and Databases arrow Towards a Standardization of Database Metadata in XML
Towards a Standardization of Database Metadata in XML PDF Print E-mail
User Rating: / 3
PoorBest 
Wednesday, 30 August 2006

Having knowledge of how you database is structured is not easy. It is often available through complex tools accessible to DBA (Database Administrator), not even developers. Management may not have access to those tools. This structure is data about the data, it is called metadata.

This article introduces the needs for a XML dialect to store the database's metadata.

Why is it important?

In our times, people want to measure impacts more closely than ever. What happens when my supplier delivers updates of the software? What impacts has this change on your database? What impacts has it on your data?

ITIL (IT Infrastructure Library) recommends that changes should be tracked in the CMDB (Configuration Management Database). Other recommendations, including Sarbanes-Oxley, impose such archives. And simply, it is a best practice with a lot of common sense.

The problem is that there is no standard way to access, store & exploit this information.

Why is it important to have a standard?

It is important to have a standard way of storing those elements because most companies, those days, have multiple databases and multiple vendors supplying the data servers itself. It is not rare to find a shop with an ERP on Oracle, a retail application on Informix and a few web sites handling data from MySQL. Nobody is shocked.

Tools to access all those databases exist from Quest, Embarcadero, etc. But a lot of tools remain proprietary (like Command Center of DB2) or dedicated such as AGS ServerStudio for Informix…

This is the purpose of an XML dialect for storing database schema (aka database structure or database metadata).

Why not DDL (Data Definition Language)?

DDL uses SQL (Structured Query Language) and is proprietary to every vendor, and may evolve over time. DDL is not easy to parse, unless you are fluent in EBNF (Extended Backus–Naur Form).

Seeing differences between two versions of DDL can be done by using the Unix diff command, which results is not very user-intuitive. Just try to bring the result of a diff over 2 DDL files, to your CIO (Chief Information Officer), for fun.

Requirements

Be extensible: all databases are different. Storing the different capabilities is not the goal, but it is critical to see where things differ and to be sure that you have a format that follows the evolution and your data server assets.

Proposal

Based on all those constraints & needs, we propose DBMXML, for DataBase Metadata eXtensible Mark-up Language

What benefits can we expect of it?

Offer a vision of a database. Applying a simple XSL (eXtensible Stylesheet Language) filter will give you HTML, PDF or even Word reports of what your database looks like.

Comparing two versions of the database. Going from development to production? What impacts has the new development on my production database (yes, all this should be documented, but real life is often catching on us).

Keeping track of the evolution. What is the difference between my production, testing and development database?  

Improve consistency. If your customers’ last name is 32 characters long in your ERP and only 10 characters on the web, you have a consistency problem which may result in data loss or alteration at some time.

Measure the impact of changes. If the result of the comparison of 2 version of your schema reflects that a field gets from 40 to 50 characters long has different consequences that adding a column that does not allow NULL values.

More benefits

For the CIO: have a permanent, consistent and historic vision of its databases.

For the project manager: have a consistent way to discuss and share information about database structure to his developers.

For the developer: Developers will be more cautious and aware about changes at the database level and will adapt the source code accordingly in a safer way.

For the DBA: increased visibility of the database, ability to build and use more consistent scripts, tools, etc.

Extensions

A variety of tools can be build on top of DBMXML: code generator, documentators, graphers, productivity scripts, administration tools…

Practical applications of such a language

XML2DDL or DdlUtils uses an XML dialect to create the DDL for creating or altering a database. Awoma proposed a framework, called ThinStructure, where it dynamically used an XML repository for building records, SQL statements, checking integrity…

Why not reuse someone else’s work?

We are not affected by the NIH (Not Invented Here) syndrome, but after very extensive search, over time, we did not find anything that would fulfill the needs. DBMXML is being designed with transformation in mind, meaning that others can use it.

Similar work

XML2DDL is a project to transform an XML file to DDL, currently for PostgreSQL, MySQL, Oracle and Firebird. It does not define a standard XML dialect. Problem: XML is not documented. Project lead: Scott Kirkwood, scott at forusers dot com. More details at http://xml2ddl.berlios.de.

DdlUtils is an Apache project focusing on delivering an easy-to-use component for working with Database Definition (DDL) files. These are XML files that contain the definition of a database schema, e.g. tables and columns. Problem: XML is documented by a DTD but limited (e.g. no information on stored procedure, no “extension” possible…). Project lead: Thomas Dudziak, tomdzk at gmail dot com. More details at http://db.apache.org/ddlutils.

What’s next?

The next steps will be to design the XML dialect that would achieve DBMXML.

Links

 

 

Last Updated ( Thursday, 31 August 2006 )