by Thomas Erl
Despite the many ways XML has been utilized, its core purpose has always been to help the Internet separate presentation from content. By its very nature as a meta language, XML supplies a framework for the creation of custom meta tags.
Aside from multi-media content, a standard Web page generally consists of HTML tags that supply formatting and layout commands to the browser, and ASCII text that makes up the Web page content. XML tags add intelligence to Web page content, by defining a document structure (such as an invoice), and by identifying and classifying each piece of relevant text within the document body (such as the invoice number, customer name, invoice amount, etc.).
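As a rough illustration of the invoice example, the sketch below marks up a small invoice and reads the classified fields back using Python's standard xml.etree.ElementTree module. The element names (invoice, number, customer, amount) and the sample values are assumptions made for illustration, not part of any published vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical invoice vocabulary; element names and values are illustrative only.
INVOICE_XML = """
<invoice>
  <number>INV-1001</number>
  <customer>Acme Corp</customer>
  <amount currency="USD">249.50</amount>
</invoice>
"""

def read_invoice(xml_text):
    """Return the classified fields of an invoice document as a dict."""
    root = ET.fromstring(xml_text)
    amount = root.find("amount")
    return {
        "number": root.findtext("number"),
        "customer": root.findtext("customer"),
        "amount": float(amount.text),
        "currency": amount.get("currency"),
    }

invoice = read_invoice(INVOICE_XML)
print(invoice["number"], invoice["customer"], invoice["amount"])
```

Because each piece of text is labeled, a program can ask for the customer name directly instead of guessing where it sits on the page.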
Theoretically, all documents within an organization could be XML-enabled, providing a universal standard classification and categorization system. The most immediate benefit would be the addition of advanced searching capabilities, which could substantially streamline the process of locating information.
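The searching benefit can be sketched in a few lines. The example below assumes a small in-memory collection of documents and a hypothetical vocabulary (invoice, memo, customer, subject); a real intranet would rely on an indexing or search product rather than a loop like this.

```python
import xml.etree.ElementTree as ET

# Hypothetical document store: file name -> XML text. The vocabulary is assumed.
DOCUMENTS = {
    "inv-1001.xml": "<invoice><number>INV-1001</number><customer>Acme Corp</customer></invoice>",
    "inv-1002.xml": "<invoice><number>INV-1002</number><customer>Globex</customer></invoice>",
    "memo-7.xml": "<memo><subject>Acme Corp renewal</subject></memo>",
}

def find_documents(documents, doc_type, field, value):
    """Return the names of documents of doc_type whose field equals value."""
    matches = []
    for name, xml_text in documents.items():
        root = ET.fromstring(xml_text)
        # Match on document type and on a classified field, not on raw text.
        if root.tag == doc_type and root.findtext(field) == value:
            matches.append(name)
    return sorted(matches)

print(find_documents(DOCUMENTS, "invoice", "customer", "Acme Corp"))
```

A plain text search for "Acme Corp" would also return the memo; the structured query returns only invoices whose customer is Acme Corp.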
Over the longer term, enhancing the quality of corporate content this way leads to a superior document-sharing environment with advanced document management capabilities. The accessibility and programmability of these documents would be comparable to (and even interoperable with) information stored in traditional relational databases (as explained in Unifying Corporate Data and Documents).
Using XML this way is no longer a leap of faith. The technology is part of the mainstream, simple in nature, and produced by the W3C, the same standards organization responsible for HTML (both languages descend from SGML).
It is the actual implementation within an enterprise, however, where the challenges lie. Standardizing on XML as the document format for all published documents will significantly affect many parts of an organization, and will undoubtedly require a large and expensive conversion effort. Here are some of the major issues to look out for:
Vocabulary and Schema Authoring
Perhaps the most important factor to consider when introducing XML on any level is the creation and maintenance of XML vocabularies, their associated document structures, and the encapsulation of corporate business rules in DTDs or Schemas. In larger organizations, this responsibility can evolve into a new role: the XML Data Custodian.
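To give a sense of what encapsulating business rules means in practice, the sketch below checks an invoice against a few rules written in plain Python. This is a stand-in only: in a real deployment the rules would be declared in a DTD or Schema and enforced by a validating parser. The rules themselves (three required fields, a non-negative numeric amount) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical business rules for an invoice. A DTD or Schema would normally
# declare these; they are hand-coded here purely as a stand-in.
REQUIRED_FIELDS = ("number", "customer", "amount")

def check_invoice(xml_text):
    """Return a list of rule violations; an empty list means the document passes."""
    root = ET.fromstring(xml_text)
    if root.tag != "invoice":
        return ["root element must be <invoice>"]
    problems = []
    for field in REQUIRED_FIELDS:
        if not (root.findtext(field) or "").strip():
            problems.append("missing <%s>" % field)
    amount = (root.findtext("amount") or "").strip()
    if amount:
        try:
            if float(amount) < 0:
                problems.append("<amount> must not be negative")
        except ValueError:
            problems.append("<amount> must be numeric")
    return problems

good = "<invoice><number>INV-1</number><customer>Acme</customer><amount>10.00</amount></invoice>"
bad = "<invoice><number>INV-2</number><amount>abc</amount></invoice>"
print(check_invoice(good))
print(check_invoice(bad))
```

Keeping the rules in one shared definition, whatever form it takes, is what allows every authoring and publishing tool to apply them consistently.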
New tools will likely be required in order to allow non-technical users to convert existing documents or author new documents in the XML format. XML editing and publishing products come in all shapes and sizes. The key is to find a set of tools that is easy to use, yet still robust enough to handle large volumes of documents, batch processing, and multi-document meta tagging. It is also wise to settle on a set of tools compatible with your existing technical environment, as it may become necessary to integrate these products with your existing applications, or perhaps programmatically automate portions of the conversion process.
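Automating part of the conversion might look something like the sketch below, which assumes the legacy documents are simple "Label: value" text and derives element names from the labels. Both the input layout and the naming rule are hypothetical; real source formats will usually need a dedicated conversion tool.

```python
import xml.etree.ElementTree as ET

def text_to_xml(doc_type, text):
    """Convert 'Label: value' lines into an XML document of the given type."""
    root = ET.Element(doc_type)
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip lines that carry no label
        label, value = line.split(":", 1)
        # Derive an element name from the label: "Customer Name" -> "customer-name".
        # Labels containing characters that are illegal in XML names are not handled.
        tag = "-".join(label.strip().lower().split())
        if not tag:
            continue  # skip lines with an empty label
        ET.SubElement(root, tag).text = value.strip()
    return ET.tostring(root, encoding="unicode")

legacy = "Invoice Number: INV-1001\nCustomer Name: Acme Corp\nAmount: 249.50"
print(text_to_xml("invoice", legacy))
```

A batch run would apply the same function to every file in a folder, which is why tools that can be scripted are worth favoring.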
Any existing publishing process you may have will need to be altered in order to incorporate new publishing steps based on the use of new editing and publishing tools. If your organization uses a centralized model to manage the publishing processes for your intranet, for instance, the impact of XML will be limited to publishers (as opposed to users) having to learn new editing and publishing tools.
Corporate Intranet Upgrade
To support the .xml document format and take advantage of the new searching and document management features that XML introduces, your intranet may need to undergo an upgrade. Also, because XML adds metadata to the document content, documents may require more storage space.
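The storage effect can be estimated with a simple size comparison. The sketch below measures how many extra bytes the markup adds relative to the bare content; the sample strings are invented, and very short fields like these exaggerate the ratio compared with documents that contain long passages of prose.

```python
# Invented sample content: the same three values with and without markup.
plain = "INV-1001 Acme Corp 249.50"
marked_up = ("<invoice><number>INV-1001</number>"
             "<customer>Acme Corp</customer>"
             "<amount>249.50</amount></invoice>")

def markup_overhead(content, xml_text):
    """Return the extra bytes added by markup as a fraction of the content size."""
    content_bytes = len(content.encode("utf-8"))
    xml_bytes = len(xml_text.encode("utf-8"))
    return (xml_bytes - content_bytes) / content_bytes

print("overhead: %.0f%%" % (markup_overhead(plain, marked_up) * 100))
```

Running a measurement like this over a representative sample of your own documents gives a more useful planning figure than any general rule.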
Existing Document Formats
Another major obstacle to establishing XML as a standard document format is that some organizations already have one. MS Word, for example, may be the most popular document format in the world. Experts have speculated, however, that Microsoft is unlikely ever to want Word to fully support an open XML format, as doing so would lessen the strategic value of its proprietary .doc format. This could increase the effort of converting Word documents and may require the involvement of third-party tools.
Deciding on XML as your standard document format is a pretty safe bet, as it is highly unlikely that another meta-enabling technology will surface and succeed XML as the industry-standard meta document format during the next decade.
Adopting XML as the standard format for documents within your organization, however, will require a significant financial investment, as well as a great deal of effort. The benefits, though mostly strategic, are likely to reward you for years to come; as far as information access goes, XML is as good as it gets.
- Inside XML Schemas
- SOAP in a Nutshell
- Transforming Data with XSLT
- Understanding DTDs
- Why SAX is Good for DOM
- What You Should Know about XPath
- An XHTML Primer
- XLink - Inside and Out
- Data Access with XQuery
- XSL versus CSS
- Another Introduction to XML
- Unifying Corporate Data & Documents
- Replacing HTML Documents with XML
- Meta-Enable Your Enterprise
- The XML Data Custodian
- Integrating XML into the Enterprise
- The Wireless Enterprise