One Schema For Multiple XML Vocabularies

Want to have a single schema that can validate many different XML documents, even if they use different tags? This paper will show you how!

Here is a simple XML document containing information about a person's birthday and the company he works at:

<?xml version="1.0"?>
<Person>
    <Birthday>1970-01-01</Birthday>
    <Company>The ABC Company</Company>
</Person>

And here is another XML document containing the same type of information, but for an entirely different domain:

<?xml version="1.0"?>
<Soldier>
    <Unit>54th Infantry</Unit>
    <DOB>1976-12-25</DOB>
</Soldier>

Unfortunately, each document will require its own XML Schema and its own custom applications.

Suppose the documents are created by two different communities and the communities want to interoperate. To do so, each community must translate the other community's tag names into its own. If there are N communities, each using its own tag names, then N(N-1) translations are required, which grows on the order of N-squared. That's bad.

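We want interoperability across communities without all those translations, and a single schema that every community can use.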

Here's how to do it. First, note that person data is covered by the vCard standard. Both Birthday and DOB correspond to vCard's bday property. Both Company and Unit correspond to vCard's org property. So, add a class attribute to each element, using the vCard property names as class values:

<?xml version="1.0"?>
<Person class="vcard">
    <Birthday class="bday">1970-01-01</Birthday>
    <Company class="org">The ABC Company</Company>
</Person>

<?xml version="1.0"?>
<Soldier class="vcard">
    <Unit class="org">54th Infantry</Unit>
    <DOB class="bday">1976-12-25</DOB>
</Soldier>

Now we have interoperability. Within a community the applications simply use the tag names. Across communities, the applications use the standard class values.
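
As a rough illustration of the cross-community case, here is a minimal Python sketch that reads both documents by class value and never looks at the tag names. The function name `values_by_class` is hypothetical, and the sketch assumes each class value appears at most once per document. The XML declarations are omitted from the embedded strings for brevity.

```python
import xml.etree.ElementTree as ET

PERSON_XML = """<Person class="vcard">
    <Birthday class="bday">1970-01-01</Birthday>
    <Company class="org">The ABC Company</Company>
</Person>"""

SOLDIER_XML = """<Soldier class="vcard">
    <Unit class="org">54th Infantry</Unit>
    <DOB class="bday">1976-12-25</DOB>
</Soldier>"""


def values_by_class(xml_text):
    """Map each class value to the element's text, ignoring tag names.

    Only leaf elements (those without child elements) are collected,
    so the enclosing class="vcard" container is skipped.
    """
    root = ET.fromstring(xml_text)
    return {
        el.get("class"): (el.text or "").strip()
        for el in root.iter()
        if el.get("class") is not None and len(el) == 0
    }


# The same lookup works for both communities' documents.
print(values_by_class(PERSON_XML)["bday"])   # prints 1970-01-01
print(values_by_class(SOLDIER_XML)["bday"])  # prints 1976-12-25
```

An application written this way needs no per-community translation table: it asks for "bday" or "org" and gets the right value whichever tag name the community chose.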

What about validation? Only one schema needs to be created, and every community can use it.

Here's what the schema needs to express: "If an element has a class attribute with the value 'bday', then the element's value must be of type date; if the class attribute has the value 'org', then the element's value must be of type string (and let's further constrain it to a string of no more than 100 characters)."
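
To make the rule concrete, here is a minimal Python sketch of the same two checks. It is not Schematron and not a substitute for a real schema. It is an illustration under stated assumptions: the names `RULES`, `is_date` and `validate` are hypothetical, the date check accepts only plain YYYY-MM-DD values (a subset of what the XML Schema date type allows), and surrounding whitespace is trimmed before checking.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date

MAX_ORG_LENGTH = 100
DATE_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def is_date(text):
    """Accept only YYYY-MM-DD strings that name a real calendar date."""
    match = DATE_RE.fullmatch(text)
    if match is None:
        return False
    try:
        date(*map(int, match.groups()))
    except ValueError:  # e.g. month 13 or February 30
        return False
    return True


# The constraint applied to an element is chosen by its class value,
# not by its tag name.
RULES = {
    "bday": (is_date, "must be a date (YYYY-MM-DD)"),
    "org": (lambda s: len(s) <= MAX_ORG_LENGTH,
            f"must be at most {MAX_ORG_LENGTH} characters"),
}


def validate(xml_text):
    """Return a list of error messages; an empty list means valid."""
    errors = []
    for el in ET.fromstring(xml_text).iter():
        rule = RULES.get(el.get("class"))
        if rule is None:
            continue  # no class attribute, or no rule for this class value
        check, message = rule
        value = (el.text or "").strip()  # assumption: whitespace is trimmed
        if not check(value):
            errors.append(f"<{el.tag}> with class '{el.get('class')}' {message}")
    return errors


GOOD = """<Soldier class="vcard">
    <Unit class="org">54th Infantry</Unit>
    <DOB class="bday">1976-12-25</DOB>
</Soldier>"""

BAD = ("<Person class='vcard'><Birthday class='bday'>January 1st</Birthday>"
       "<Company class='org'>" + "x" * 101 + "</Company></Person>")

print(validate(GOOD))      # prints []
print(len(validate(BAD)))  # prints 2
```

Because the rules are keyed by class value, the same function validates documents from either community.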

Thus, the type of an element's value depends on the value of its class attribute. This is called a co-constraint. XML Schema 1.0 doesn't support co-constraints, but Schematron does. So, we can create one Schematron schema that every community can use. Here's the vCard Schematron schema.

Recap

We can have one schema that is used by multiple communities, without forcing each community to use the same tag names!

Each community simply creates XML instance documents using its own tag names. Everyone within the community understands those tag names, so there is perfect interoperability within the community. To facilitate interoperability across communities, a class attribute is added to each element. Each element's data is validated based upon the value of its class attribute. Schematron enables this co-constraint validation.

Pretty cool, aye?

Last Updated: December 31, 2007