Unifying Corporate Documents and Data
by Thomas Erl

When reading about XML these days, you are bound to encounter articles that discuss the use of XML in two distinct areas: Web development and Web authoring. On the development side, the focus is typically on data management and application interoperability, whereas in the Web authoring world, XML is discussed as a meta-enabling technology that will add structure and "searchability" to traditional HTML documents.

Take a step back and look at both of these worlds and how they exist in your organization. If you have standardized on the use of XML for the development of your applications (and the data they manage), and if you have standardized on the use of XML meta tags as a means of structuring and categorizing your corporate documents, what do you have? You have created a global information framework with a common data classification and management platform. In other words, you have broadened your corporate information repositories to include not only information in application databases, but all of your corporate document sets.

Sounds pretty impressive, but how exactly did you achieve this? To answer that question, let's first look at how most organizations currently access their corporate information.

As you probably know, data used by applications and reporting tools is kept in databases accessed via specific protocols, such as ODBC and OLE DB, which support SQL-compliant queries of database records. Documents, on the other hand, usually reside on the local LAN, accessed via shares or intranet sites. Indexing engines provide full-text searching of document content, and some search products offer limited meta tag query parameters that are often based on high-level meta information, such as author, title, and date modified.

In this environment, it is not possible to run a query targeting specific types of data within documents, as they are not classified. Additionally, it is difficult - if not impossible - to run a single query across both sets of information. If you did want to search databases and documents, you'd have to write two separate queries and process the results with your application.

How does XML change all this? Well, by restructuring your documents using XML, you have turned every one of your corporate documents into a mini-repository, as searchable as any standard relational database.
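To make the "mini-repository" idea concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree module. The document layout (budgetForecast, author, created, requestedTotal) and the attribute names are assumptions invented for illustration; the article does not prescribe a schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical budget forecast document. Every element and attribute
# name here is an assumption made for this example.
FORECAST_XML = """
<budgetForecast>
  <author role="manager" department="Human Resources">J. Smith</author>
  <created>2002-03-04</created>
  <requestedTotal currency="USD">125000</requestedTotal>
  <notes>The total is higher than last year's request.</notes>
</budgetForecast>
"""


def requested_total(xml_text):
    """Read the requested total from its own element, not from free text."""
    root = ET.fromstring(xml_text.strip())
    return float(root.findtext("requestedTotal"))


def author_department(xml_text):
    """Read the author's department from an attribute on <author>."""
    root = ET.fromstring(xml_text.strip())
    return root.find("author").get("department")


if __name__ == "__main__":
    print(author_department(FORECAST_XML), requested_total(FORECAST_XML))
```

A full-text search for "total" would also match the notes element; addressing requestedTotal directly returns only the classified value.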

Web applications that can access the corporate documents over a local intranet can perform queries based on specific data elements (as opposed to the standard full-text searches), while searching databases at the same time. This means that you can now write a query like this:

"Search all budget forecasts created by human resources managers over the past week, and compare requested budget totals with totals actually spent over the past five years."

Here, the budget forecasts can exist as XML documents and the budget information from the past five years can reside in a database.
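One way an application might combine the two sources is sketched below in Python, with sqlite3 standing in for the corporate database and xml.etree.ElementTree reading the forecasts. The XML layout, the spending table, its columns, and all sample figures are hypothetical; this is an illustration under those assumptions, not a description of any particular product.

```python
import sqlite3
import xml.etree.ElementTree as ET
from datetime import date, timedelta

DEPARTMENT = "Human Resources"  # assumed attribute value


def forecast(department, role, created, total):
    """Build a hypothetical forecast document (schema is an assumption)."""
    return (
        f'<budgetForecast><author role="{role}" department="{department}">'
        f"A. Author</author><created>{created}</created>"
        f"<requestedTotal>{total}</requestedTotal></budgetForecast>"
    )


def recent_requested_total(xml_docs, today, days=7):
    """Sum requested totals from HR-manager forecasts created in the last `days` days."""
    cutoff = today - timedelta(days=days)
    total = 0.0
    for text in xml_docs:
        root = ET.fromstring(text.strip())
        author = root.find("author")
        created = date.fromisoformat(root.findtext("created"))
        if (author.get("department") == DEPARTMENT
                and author.get("role") == "manager"
                and cutoff <= created <= today):
            total += float(root.findtext("requestedTotal"))
    return total


def spent_by_year(conn, first_year, last_year):
    """Return {year: amount spent} for the department from the database."""
    rows = conn.execute(
        "SELECT year, SUM(amount) FROM spending "
        "WHERE department = ? AND year BETWEEN ? AND ? "
        "GROUP BY year ORDER BY year",
        (DEPARTMENT, first_year, last_year),
    )
    return {year: spent for year, spent in rows}


def requested_vs_spent(xml_docs, conn, today):
    """Difference between this week's requests and each of the past five years' spending."""
    requested = recent_requested_total(xml_docs, today)
    spent = spent_by_year(conn, today.year - 5, today.year - 1)
    return {year: requested - amount for year, amount in spent.items()}


def demo():
    """Run the comparison against made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE spending (department TEXT, year INTEGER, amount REAL)")
    conn.executemany("INSERT INTO spending VALUES (?, ?, ?)", [
        ("Human Resources", 2000, 90000.0),
        ("Human Resources", 2001, 110000.0),
        ("Engineering", 2001, 500000.0),
    ])
    docs = [
        forecast("Human Resources", "manager", "2002-03-04", 125000),  # matches
        forecast("Human Resources", "manager", "2002-01-10", 80000),   # too old
        forecast("Engineering", "manager", "2002-03-05", 300000),      # other department
    ]
    return requested_vs_spent(docs, conn, date(2002, 3, 6))


if __name__ == "__main__":
    print(demo())
```

Note that the application still joins the two result sets itself here; the point of the sketch is only that both sources can now be filtered on named data elements rather than on free text.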

XML-compliant databases and search engines may be able to support single queries like this, but an emerging technology developed by the World Wide Web Consortium (W3C), called XQuery, is worth noting. XQuery provides SQL-like functionality but is designed specifically as a query language for XML repositories.

This exciting new platform of unified information access will open up many new opportunities for data management, reporting and application interoperability. But that's another saga...
