http://www.sajim.co.za/student24.4nr4.asp?print=1
Student Work Vol.4(4) December 2002
Recasting hypertext markup language in
extended text markup language: is it the
foundation for the future of e-commerce?
M. van der Westhuizen
Technikon Pretoria
Post Graduate Diploma in Information Management
RAU University
mattievdw@absamail.co.za
Contents
1. Introduction
2. Hypertext markup language (HTML) vs. extensible markup language (XML) in
business to business (B2B) interoperability
3. Architectural layers of the Internet economy and the use of XML in B2B
interoperability
4. Basic architectural needs to conduct B2B over the Internet using XML standards and
Web services today
5. Universal business language (UBL) – the next step
6. Conceptual model for B2B integration
7. Descriptions for the components of architectural e-commerce
8. Examples of some of the exciting new technologies enabled by XML
9. XML repository requirements
10. Conclusion
11. References
1 Introduction
Hypertext markup language (HTML) and extensible markup language (XML) are the lingua
franca for publishing on the World-Wide Web.
Having gone through several stages of evolution, today's HTML has a wide range of
features reflecting the needs of a very diverse and international community wishing to make
information available on the Web.
These questions are asked: Should information professionals have basic HTML or XML
capabilities and an understanding of the Web's language? And is HTML or XML the
foundation of the future of electronic commerce?
According to the Business Internet Consortium (2002), e-business architects and business
managers, who are responsible for strategy and implementing business to business (B2B)
solutions, need a basic understanding of HTML or XML as they plan for implementations
and develop roadmaps for future business-to-business systems.
The focus of this research was on high-level components of both enabling technologies and
business processes for B2B automation. The aim was not, however, an architecture
description for direct implementation, and the research did not address the details of the
logical and physical models for e-commerce, or implementation details for each layer.
In this article, the primary purpose is to describe the architectural components needed in a
B2B environment. This is a high level view of B2B architecture aimed at delivering a
framework for B2B standards convergence and interoperability.
2 HTML vs. XML in B2B interoperability
XML is currently the most hyped technology in the electronic business economy. With the
downturn in today's economy, Web services shine on organizations like a beacon of hope.
Fueled by Microsoft's .NET initiative and IBM's WebSphere platform, Web services promise
a new level of compatibility across multiple technology platforms (Bloomberg 2002).
2.1 Web services begin with XML
This straightforward language enables different systems to talk to each other. Web services
use XML to create a set of industry-standard protocols for describing and exchanging
information and handling transactions between companies. Suddenly system integration is
easier and less expensive. Today's Webmasters, Internet technology executives, information
managers and e-business managers must understand what Web services are, how they can
be leveraged in system integration efforts within and outside the firewall, and how they can
help improve the bottom line.
2.2 HTML is a standard generalized markup language (SGML) application
HTML is an SGML application and is widely regarded as the standard publishing language
of the World-Wide Web.
2.3 International Organization for Standardization (ISO) HTML
ISO HTML language is an application of the International Standard ISO 8879. ISO, founded
in 1947, is a worldwide federation of national standards bodies from some 100 countries.
Among the standards it fosters is Open Systems Interconnection (OSI), a universal reference
model for communication protocols. Many countries have national standards organizations,
such as the American National Standards Institute (ANSI), that participate in and contribute
to the making of ISO standards.
According to ISO, 'ISO' is not an abbreviation. It is a word, derived from the Greek isos,
meaning 'equal', which is the root for the prefix 'iso' that occurs in a host of terms, such as
'isometric' (of equal measure or dimensions) and 'isonomy' (equality of laws, or of people
before the law). The name ISO is used around the world to denote the organization, thus
avoiding the assortment of abbreviations that would result from the translation of
'International Organization for Standardization' into the different national languages of
members. Whatever the country, the short form of the organization's name is always ISO.
SGML is a language for describing markup languages, particularly those in electronic
document exchange, document management and document publishing. HTML is an
example of a language defined in SGML. SGML has been around since the middle 1980s
and has remained quite stable. Much of this stability stems from the fact that the language is
both feature-rich and flexible. This flexibility, however, comes at a price, and the price is a
level of complexity that has inhibited its adoption in a diversity of environments, including
the World-Wide Web (W3C 2002).
HTML, as originally conceived, was to be a language for the exchange of scientific and
other technical documents, suitable for use by non-document specialists. HTML addressed
the problem of SGML complexity by specifying a small set of structural and semantic tags
suitable for authoring relatively simple documents. In addition to simplifying the document
structure, HTML added support for hypertext. Multimedia capabilities were added later. In a
remarkably short space of time, HTML became widely popular and rapidly outgrew its
original purpose. Since HTML's inception, there has been rapid invention of new elements
for use within HTML (as standard) and for adapting HTML to vertical, highly specialized,
markets. This plethora of new elements has led to interoperability problems for documents
across different platforms.
2.4 XML is the shorthand name for extensible markup language
According to different authors from the W3C (2002), XML was conceived as a means of
regaining the power and flexibility of SGML without most of its complexity. Although a
restricted form of SGML, XML nonetheless preserves most of SGML's power and richness
and retains all of SGML's commonly used features. While retaining these
beneficial features, XML removes many of the more complex features of SGML that make
the authoring and design of suitable software both difficult and costly (W3C 2002).
Many arguments in support of XML have taken strength from a critical appraisal of HTML.
HTML's limitations have fuelled the call for and interest in a technology such as XML.
2.5 HTML is a fixed tag set
HTML is a fixed tag set:
It only describes documents of a single type;
HTML data are hard to process;
browsers have permitted all manner of HTML messiness to pass, unchecked, into
semi-permanent residency in cyberspace; and
HTML documents that aspire to function like applications are clogging the Internet
with client-to-server traffic.
XML will change all of this, to be sure. However, no one can realistically expect the
volumes of active, useful HTML pages to become irrelevant overnight. In fact, HTML has
an important role to play in the brave new world of XML. What is that role?
2.6 HTML certainly has its past and current relevance
HTML provides a simple way of structuring hypertext documents and of placing references
in one document which point to another. These references, called 'links', may be presented
to readers of a document in such a way that a simple 'click' summons the linked document,
which is then presented to the reader. The reader has the impression of moving from one
document to another. This simple user interface has been wildly successful and as a result
the World-Wide Web, the 'Web', has become extremely popular.
2.7 XML data are smart data
In HTML:
P266 Laptop
Friendly Computer Shop
$1438
In XML:
P266 Laptop
Friendly Computer Shop
$1438
Both may look the same in your browser. However, HTML tells you how the data should
look, whereas XML tells you what they mean (Goldfarb 2002).
With XML, your browser knows there is a product, and it knows the model, dealer, and
price. From a group of these it can show you the cheapest product or closest dealer without
going back to the server.
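To make the contrast concrete, the following is a minimal sketch and not Goldfarb's own markup. The element names (product, model, dealer, price) are assumed from the description above, and the second product and its dealer are invented for illustration. It shows how a client-side program could pick the cheapest product from XML data it already holds, without a further request to the server.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML; element names are assumed from the description in the text,
# and the second product is invented for the sake of the comparison.
CATALOGUE = """
<products>
  <product>
    <model>P266 Laptop</model>
    <dealer>Friendly Computer Shop</dealer>
    <price>$1438</price>
  </product>
  <product>
    <model>P300 Laptop</model>
    <dealer>Example Dealer</dealer>
    <price>$1599</price>
  </product>
</products>
"""

def cheapest_product(xml_text):
    """Return (model, dealer, price) of the lowest-priced product."""
    root = ET.fromstring(xml_text)
    best = None
    for product in root.findall("product"):
        # The price element carries a currency sign; strip it before comparing.
        price = float(product.findtext("price").lstrip("$"))
        entry = (product.findtext("model"), product.findtext("dealer"), price)
        if best is None or price < best[2]:
            best = entry
    return best

print(cheapest_product(CATALOGUE))
```

The same three strings marked up only for presentation in HTML would give the program no reliable way to tell which string is the price.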
Unlike HTML, with XML you create your own tags, so they describe exactly what you need
to know. Because of that, your client-side applications can access data sources anywhere on
the Web, in any format. New 'middle-tier' servers sit between the data sources and the client,
translating everything into your own task-specific XML.
Nevertheless, XML data are not just smart data; they are also smart documents. That means
when you display the information, the model name can be a different font from the dealer
name, and the lowest price can be highlighted in green. Unlike HTML, where text is just
text to be rendered in a uniform way, with XML text is smart, so it can control the rendition
(Goldfarb 2002).
Moreover, you do not have to decide whether your information is data or documents; in
XML, it is always both at once. You can do data processing or document processing or both
at the same time.
In the frenzy of the growth, much of the discipline and good practices of the mature SGML
world has been lost, and browser developers have added additional features to the markup
language such as new tags and new semantics for tags. As a result, many documents have
been created which can only be rendered faithfully on a limited number of browsers.
Common Web practice is to hide any syntactic problems detected by the browsers and thus
the reader is not aware that a page being browsed is not always faithful to the original
authored document.
The International Standard was developed in an effort to ensure that it will remain possible
for an author to produce simple hypertext for the Web and be confident that a conforming
browser will be able to render the document faithfully. ISO/IEC 15445 represents a core of
the language to be supported by all conforming browsers, authoring and validating systems.
3 Architectural layers of the Internet economy and the use of XML in B2B
interoperability
There is a natural structure or hierarchy to the Internet economy that can be directly traced
to how businesses generate revenues. Based upon this type of structure, we broadly classify
the Internet economy into infrastructure and economic activity categories.
3.1 What is content and how do you manage it?
In understanding content management, it may be helpful to distinguish it from document
management and knowledge management. Document management deals with maintaining
and storing documents. Knowledge management is concerned with making information
accessible for decision-making through index, query and search mechanisms. While content
management shares some of the attributes of both document management (storing
information) and knowledge management (accessing information), it goes beyond them to
create a system for re-purposing and using information to drive business processes.
Because 'content' encompasses a wide variety of information objects, an expanded
repository and a new system of linking are required. The traditional method of transferring a
document into a document management repository will not work with a live feed. Nor will
the standard method of linking, object linking and embedding (OLE). In fact, the most
common method of reusing content is the familiar cut-and-paste technique and does not
involve linking. Content management requires a new enabling technology to accommodate
the dynamic array of information (Interleaf 2002).
By standardizing on XML, enterprises will be able to trade with anyone, any time, without
the need for the costly custom integration work that has been necessary in the past. But this
vision of XML-based 'plug-and-play' commerce is overly simplistic. Of course XML can be
used to create electronic catalogues, purchase orders, invoices, shipping notices and other
documents needed to conduct business. But XML by itself does not guarantee that these
documents can be understood by any business other than the one that created them. XML is
only the foundation on which additional standards can be defined to achieve the goal of true
interoperability.
Figure 1 (Interleaf 2002) shows how an effective content management system can help a
company improve customer satisfaction and loyalty, increase revenue and compete with
third-party and after-market parts suppliers.
Figure 1 Content management backbone
Content management encompasses a set of processes and technologies, enabling the creation
and packaging of content (documents, complex media, applets, components, etc.) as part of
a dynamic and integrated Web-centric environment.
Ultimately, the definition is decidedly Web-oriented, bringing up two questions: Can
content be shared between Web and non-Web uses? Are there new technologies that can
assist in this process?
First, content management requires new, enabling technology like XML. Second, content is
not merely documents and words, but graphics, audio, video clips, live feeds and software
components.
Enterprise content management emphasizes the need to address content management across
all forms and formats of information stored throughout the extended enterprise. Within the
enterprise, information is created using a wide variety of methods and tools, and this
information is revised frequently. Enterprise content management takes the smallest, most
appropriate units of information and allows them to be re-purposed and delivered in a
personalized form to the individual requiring information.
3.2 Architectural layers in the Internet economy
Figure 2 (Internet Indicators 2001) describes the different architectural layers for publishing
content on the Web. XML is often described as the colloquial speech of electronic
commerce.
Figure 2 Architectural layers in the Internet economy
The infrastructure category is further divided into two distinct but complementary 'layers':
the Internet infrastructure layer, which provides the physical infrastructure for e-commerce,
and the Internet applications infrastructure layer, which includes the software applications,
consulting, training and integration services that build on top of the network infrastructure
and make it feasible for organizations to engage in on-line commerce.
The economic activity categories are also subdivided into two layers: electronic
intermediaries and on-line transactions.
The intermediary layer involves the role of a third party in a variety of capacities: market
maker, provider of expertise or certification that makes it easier for buyers to choose sellers
and/or products, search and retrieval services that reduce transaction costs in an electronic
market, and other services that facilitate the conduct of on-line commerce.
The transactions layer involves direct transactions between buyers and sellers such as
manufacturers and e-tailers. While we could take the position that e-tailers are also an
intermediary between the consumers and manufacturers, the illustration above highlights the
difference between an e-tailer like Amazon.com and an electronic intermediary in the purest
sense of the term. An intermediary would also specify where a book or CD could be found
at the lowest price or shortest delivery time or some combination of criteria specified by the
consumer. By contrast, Amazon.com only displays its own catalog, prices, availability and
lead-time. Of course, it is true that for the case of e-tailers the difference between the two
topmost layers could be a matter of degree.
3.2.1 Layer one: the Internet infrastructure indicator
A physical economy critically depends on an efficient infrastructure involving
transportation, energy, raw materials and skilled workforce (Internet Indicators 2001).
Likewise, the growth of a digital economy depends on the ubiquitous presence of high speed
and intelligent electronic networks and the ability to share any type of content between all
agents in the economy. The importance of XML can therefore be understood. Accordingly,
the Internet Infrastructure Layer includes companies that manufacture or provide products
and services that make up the Internet network infrastructure. This layer includes companies
that provide telecommunications and fiber backbones, access and end-user networking
equipment necessary for the proliferation of Internet-based electronic commerce. This layer
includes the following types of companies (Internet Indicators 2001):
National and regional backbone providers (e.g. Qwest, MCI WorldCom);
service providers (e.g. America Online (AOL), Earthlink);
network equipment for backbones and service providers (e.g. Cisco, Lucent, 3Com);
conduit manufacturers (e.g. Corning); and
server and client hardware (e.g. Dell, Compaq and HP).
3.2.2 Layer two: the Internet applications infrastructure layer
Products and services in this layer build upon the above IP network infrastructure and make
it technologically feasible to perform business activities on-line. In addition to software
applications, this layer includes the human capital involved in the deployment of electronic
commerce and e-business applications. For example, Web design, Web consulting, and Web
integration through XML are considered as a part of this layer. This layer includes the
following categories (Internet Indicators 2001):
Internet consultants (e.g. MarchFIRST, Scient);
Internet commerce applications (e.g. Microsoft, Sun and IBM);
multimedia applications (e.g. RealNetworks and Macromedia);
Web development software (e.g. Adobe, Allaire and Vignette);
search engine software (e.g. Inktomi and Verity);
on-line training (e.g. Sylvan Prometric and SmartPlanet);
Web-enabled databases (e.g. Oracle, IBM DB2, MS SQL Server – only
Internet/Intranet related revenues are counted here);
network operating systems;
Web hosting and support services; and
transaction processing companies.
3.2.3 Layer three: the Internet intermediary indicator
Internet intermediaries increase the efficiency of electronic markets by facilitating the
meeting and interaction of buyers and sellers over the Internet. They act as catalysts in the
process through which investments in the infrastructure and applications layers are
transformed into business transactions.
While much has been written about a large-scale disintermediation in the transformation of
the physical to the digital economy, the Internet necessitates a new breed of intermediaries
whose roles are naturally information and knowledge intensive (Timmers, Stanford-Smith
and Kidd 1998).
In the physical world, intermediaries are distributors and dealers, whose primary role is to
increase the efficiency of distribution and to lower buyer transaction costs by locating close
to the customer population. In sharp contrast, physical proximity is not an issue on the
Internet; on-line search, evaluation, communication, coordination, assurance of vendor and
product or service quality are the important aspects in the Internet economy. Internet
intermediaries play a critical role in filling information and knowledge gaps, which would
otherwise impair the functioning of the Internet as a business channel. This layer includes:
Market makers in vertical industries (e.g. VerticalNet and PCOrder);
on-line travel agencies (e.g. TravelWeb and Travelocity);
on-line brokerages (e.g. E*trade, Schwab.com and DLJ direct);
content aggregators (e.g. Cnet and Cdnet);
portals or content providers (e.g. Yahoo! and Excite);
Internet advertisement brokers (e.g. DoubleClick and 24/7 Media);
on-line advertising (e.g. Yahoo, ESPN Sportszone); and
Web-based virtual malls (e.g. Lycos shopping).
3.2.4 Layer four: the Internet commerce indicator
Internet commerce includes companies that generate product and service sales to consumers
or businesses over the Internet. This indicator includes on-line retailing and other B2B and
business-to-consumer (B2C) transactions conducted on the Internet:
E-tailers selling books, music, apparel, flowers, etc. over the Web (e.g. Amazon.com,
and 1-800-flowers.com);
manufacturers selling products directly such as computer hardware and software (e.g.
Cisco, Dell, and IBM);
transportation service providers selling tickets over the Web (e.g. Delta, United and
Southwest);
online entertainment and professional services (e.g. ESPN Sportszone and guru.com);
and
shipping services (e.g. UPS and FedEx).
It is important to note that many companies operate at multiple layers. For instance,
Microsoft and IBM are important players at the Internet infrastructure, applications, and
Internet commerce layers, while AOL and Netscape have businesses that fall into all four
layers. Similarly, Cisco and Dell are important players at both the infrastructure and
commerce layers.
Even though the four-layer Internet economy framework makes it time-consuming to
separate revenues for multi-layer players, the framework presents a more realistic and
insightful view of the Internet economy than a monolithic conceptualization that does not
distinguish between different types of activities. Further, the multi-layered approach lets us
analyse how companies choose to enter one Internet layer and later extend their activities to
the other layers.
Each layer of the Internet economy is critically dependent on every other layer. For instance,
improvements in layer one can help all other layers in different ways. As the IP network
infrastructure turns to broadband technologies, applications vendors at layer two can create
multimedia applications that can benefit from the availability of high bandwidth. Companies
at layers three and four can benefit from improvements in both layers one and two –
providing media-rich content to consumers as well as new digital products and services
(information and software delivered on-line). This interdependence also exhibits itself in the
form of alliances where conduit and content providers or applications vendors and e-tailers
join hands to create bundled offerings that are valuable to consumers (Internet Indicators
2001).
4 Basic architectural needs to conduct B2B over the Internet using XML standards
and Web services today
The task of creating a universal XML business language is a challenging one. Most large
enterprises have already invested significant time and money in an e-business infrastructure
and are reluctant to change the way they conduct electronic business. Furthermore, every
company has different requirements for the information exchanges in a specific business
process, such as procurement and supply-chain optimization.
Universal business language (UBL) envisions a world where all companies, large and small,
can interact seamlessly with their trading partners as if they were part of the same virtual
enterprise. It would achieve that goal by standardizing the form of information exchange.
The lack of a standard for business
documents is not due to a shortage of specifications but rather to an overabundance. A
multitude of XML business libraries are already in existence. And this has created a big
interoperability problem for both users and system vendors. A company that adopts one of
these specifications is likely to find that many of the companies with which it would like to
trade are inaccessible due to incompatible definitions and XML encoding for many of the
same ordinary information elements – product and business descriptions, measurements,
dates, locations, and so on. Since use of any e-commerce standard requires significant
investment, this 'Tower of Babel' greatly increases both the cost of integration and the cost
of commercial software.
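The cost of incompatible encodings can be illustrated with a small sketch. Both vocabularies below are invented for this example and belong to no real XML business library. They encode the same order line, yet each needs its own reader before the data can be compared.

```python
import xml.etree.ElementTree as ET

# Two hypothetical encodings of the same order line; neither is a real standard.
VENDOR_A = "<Item><PartNo>X-100</PartNo><UnitPrice currency='USD'>12.50</UnitPrice></Item>"
VENDOR_B = "<line sku='X-100'><cost>12.50 USD</cost></line>"

def read_vendor_a(xml_text):
    """Vendor A puts the currency in an attribute and the part number in an element."""
    item = ET.fromstring(xml_text)
    price = item.find("UnitPrice")
    return {"sku": item.findtext("PartNo"),
            "price": float(price.text),
            "currency": price.get("currency")}

def read_vendor_b(xml_text):
    """Vendor B puts the part number in an attribute and the currency in the text."""
    line = ET.fromstring(xml_text)
    amount, currency = line.findtext("cost").split()
    return {"sku": line.get("sku"), "price": float(amount), "currency": currency}

# Every additional vocabulary needs another reader like the two above;
# with one agreed vocabulary a single reader would serve all partners.
print(read_vendor_a(VENDOR_A) == read_vendor_b(VENDOR_B))
```

With n trading partners each using a private vocabulary, the number of such readers grows with n, which is the integration cost a single standard is meant to remove.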
UBL proposes a single ubiquitous language for business communication that takes into
account both the requirements common to all enterprises and the specific needs of
companies in different regions and different vertical industries. Convergence on a single
standard will significantly reduce the cost of integration. By lowering the bar to adoption of
e-business technology, UBL promises to extend the efficiencies of automated enterprise
resource planning (ERP) systems beyond the individual enterprise (UBL 2001).
4.1 Documents, components and context
The primary deliverable of UBL is a set of standard formats for common business
documents such as invoices, purchase orders and advance shipment notices. These formats
are designed to be sufficient for the needs of many ordinary business transactions and, more
importantly, to serve as the starting point for further customization. To enable this
customization, the standard document formats will be made up of standard 'business
information entities', which are the common building blocks (addresses, prices, and so on)
that make up the bulk of most business documents. Basing all UBL document schemas on
the same core information entities maximizes the amount of information that can be shared
and reused among companies and applications. In a UBL-enabled world, companies publish
profiles of their requirements for the business documents involved in specific interactions
(UBL 2001). These profiles specify the business context of each transaction, that is, specific
parameters such as the industries and geographic regions of the trading partners. The context
parameters are applied to the standard formats to create new formats specific to a given
transactional setting. Since these context-specific formats are based on the standard
components, interoperability is guaranteed while taking into account the requirements of
each party to a particular transaction.
4.2 Opening e-commerce to small businesses
International agreement on a concrete XML syntax for business documents is the key to
bringing the majority of the world's businesses into electronic commerce (UBL 2001).
XML markup (the 'tags') transforms documents into hierarchical sets of information objects
with logical 'handles' that can easily be manipulated by simple pattern matching and text
processing tools like Perl, Python and Emacs. And free lightweight parsers can apply
rigorous structural and semantic validation to XML documents to ensure interoperability.
These two characteristics of XML documents mean that when UBL arrives, any reasonably
computer literate person with a PC and some free software tools will be able to interact with
the UBL-compliant purchasing system of a Fortune 500 company. Custom programming
with expensive data extraction and mapping software typical of electronic data interchange
(EDI) implementations will still be possible but no longer required. And taming the problem
of context-driven data requirements – met in EDI with implementation guides – should
greatly reduce the cost of adapting applications to the requirements of particular trading
relationships.
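As a rough illustration of the two characteristics described above, the sketch below uses only free tools from the Python standard library. The purchase order and its element names are hypothetical and are not taken from UBL. Note that the standard-library parser checks only that the document is well formed; validation against a schema would need an additional tool.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical purchase order; the element names are illustrative, not UBL.
ORDER = """<order>
  <buyer>Example Buyer Ltd</buyer>
  <line><item>Widget</item><quantity>10</quantity><price>2.50</price></line>
  <line><item>Gadget</item><quantity>4</quantity><price>5.00</price></line>
</order>"""

# Pattern matching: the tags act as handles for a plain regular expression.
items = re.findall(r"<item>(.*?)</item>", ORDER)

# Parsing: the parser rejects documents that are not well formed and gives
# structured access to each line of the order.
root = ET.fromstring(ORDER)
total = sum(int(line.findtext("quantity")) * float(line.findtext("price"))
            for line in root.findall("line"))

print(items, total)
```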
International agreement on a single vocabulary for electronic business will create an
environment where businesses of every size can automate their processes to exactly the
degree that they can afford – from manually programmed text-hacking systems at the low
end to off-the-shelf software in the midrange to completely automated and integrated
purchasing systems at the high end (UBL 2001). In addition, it will allow all of these
businesses to interoperate as if they were technologically at the same level.
The growth of the World-Wide Web demonstrated the inherent power of a standardized tag
set running over a ubiquitous transport layer. UBL plus ebXML messaging services will
allow businesses of every kind access to electronic commerce, just as HTML plus hypertext
transfer protocol (HTTP) allowed publishers of every kind access to the Internet. Like
HTML, UBL will someday be seen as a transitional technology. And like HTML, its effect
will be revolutionary.
4.3 Example of the impact of XML on everyday lives
Matty arrives at work at eight, grabs a coffee and sits down in front of her computer. Her
automated agent has posted a message on her active desktop, with links to a short company
profile, the company's 90-day stock history, and its Web site. All of this information has
been formatted according to her personal style sheet.
The company, named simply 'E-NTertainers', specializes in on-line entertainment. Matty
runs a full-text search on the company's name in her document repository of press releases
and news articles and finds three links. The most recent article explains that 'E-NTertainers'
is engaged in talks with Steve Hofmeyer about future rights to his interactive music videos.
If the deal goes through this would mean dramatically increased revenues and visibility for
'E-NTertainers'.
Matty presumes that the deal is forthcoming and that word has already started to leak out,
pushing up the stock. She drags the category indicated in the original alert, 'on-line
entertainment', to her address book, which builds a mailing list of clients interested in
companies in this area. She drags the press release and stock history into the message
window and adds a short note asking if anyone would be interested in acquiring the stock. A
large private investor quickly replies with a request for 20000 shares at the current price of
R55,25. She drags the order from this message to her automated ordering agent, which
places the order on-line and archives the transaction record, complete with the digital
signatures of both buyer and seller.
Three hours later Matty's mailer beeps, triggered by another automated agent instructed to
notify her of any new press releases from companies in which she has recently bought stock.
Sure enough, the deal between E-NTertainers and Eden has been announced and
E-NTertainers stock is shooting up. It reaches R120,25 before levelling off.
Not bad, she thinks, not even noon and I have already made over R1300000 for my client.
Just then, she hears another beep. It is her mother's birthday. Matty's calendar program calls
up an on-line buying agent, which automatically grabs her mother's mailing address from
her address book. It prompts her to enter a short note and then goes out in search of an
on-line flower shop with delivery service to Hermanus in the Western Cape Province. The
agent searches for one dozen red long-stem roses, at the best price, including delivery and
taxes. As for Matty, she settles back in her chair feeling vaguely guilty about how easy it all
is.
Wait a minute! Doesn't this sound a bit too much like science fiction? Just getting two
different software applications to work together is close to impossible, let alone a fully
integrated array of software agents connected to an active desktop, calendar, mail program,
financial information provider, document repository and Internet-based purchasing system.
Speaking of which, an automatic purchasing agent has a hard enough time getting consistent
price information from different on-line stores; how could it also find out where delivery is
available?
While viewing an on-line stock history is no problem, formatting it with a custom style
sheet in an e-mail message is very difficult. How could the formatting program figure out
how to rearrange the pieces of information according to Matty's style sheet? In addition,
wouldn't a full-text query on 'E-NTertainers' retrieve every document in the database?
The technology that is making scenarios like this feasible is XML, which is now clearly
emerging as HTML's equivalent for data, according to Hogan (2002). Unlike HTML elements,
XML data elements have well-defined tags that describe their content, enabling applications
and autonomous agents to extract useful information from them.
In addition, the available tags are not fixed but can be defined in a document type definition
(DTD) for a given application. The DTD describes which tags are allowed in the document
as well as defining a hierarchy that determines where tags may occur (e.g. a 'paragraph' may
occur in a 'chapter', but not vice versa).
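The containment idea behind a DTD can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the standard library's ElementTree parser does not validate against DTDs, so the rules are hand-written in a dictionary, and the tag names (book, chapter, paragraph) and the helper find_violations are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical containment rules standing in for a DTD's element declarations:
# each tag maps to the set of child tags it may contain.
ALLOWED_CHILDREN = {
    "book": {"chapter"},
    "chapter": {"paragraph"},
    "paragraph": set(),
}


def find_violations(element, rules=ALLOWED_CHILDREN):
    """Return (parent, child) tag pairs that break the containment rules."""
    violations = []
    for child in element:
        if child.tag not in rules.get(element.tag, set()):
            violations.append((element.tag, child.tag))
        # Keep descending so nested problems are reported as well.
        violations.extend(find_violations(child, rules))
    return violations


# A paragraph inside a chapter is allowed ...
valid_doc = ET.fromstring(
    "<book><chapter><paragraph>Text</paragraph></chapter></book>"
)
# ... but a chapter inside a paragraph is not.
invalid_doc = ET.fromstring(
    "<book><paragraph><chapter>Text</chapter></paragraph></book>"
)

valid_result = find_violations(valid_doc)
invalid_result = find_violations(invalid_doc)
```

A real deployment would rely on a validating parser and a published DTD rather than a hand-maintained dictionary.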
Although XML standards are very new, mainstream DTDs for describing software
components have already been deployed for applications needing standard-based, open
access to information.
Examples of these standard DTDs include Channel Definition Format (CDF) for describing
push content and Open Software Description (OSD). Because the investment in XML
development tools and expertise can be leveraged across a wide range of applications,
communication based on well-defined, standard DTDs will become increasingly important
in providing universal access to information across applications from different vendors
(Hogan 2002).
In Matty's case, XML DTDs for address book entries, stock price histories, company
information and the like are what enable her diverse set of applications to extract and act on
or interact with the relevant information.
Since XML provides a mechanism for tagging data with fields that describe the data, the
applications can pinpoint the exact data they need and share it in a manner that is easily
brokered between the applications. For example, searching for flowers for Matty's mother
simply entails:
Searching a directory for on-line flower shops;
connecting to these flower shops and searching for XML tags that correspond to the
appropriate delivery area for Hermanus in the Northern Cape Province;
then searching this subset of shops for the appropriate flowers:
Rose, Red, Long, none; and
from this subset, computing the corresponding costs by searching for the tagged price,
delivery and tax information and doing the appropriate calculation. The agent then
selects and purchases the least expensive offering.
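The selection steps above can be sketched as follows. Everything specific here is an assumption made for illustration: the catalogue layout, the tag names (shop, area, offer, flower, colour, price, delivery, tax), the shop names and the amounts are invented and do not come from any published DTD.

```python
from decimal import Decimal
import xml.etree.ElementTree as ET

# Hypothetical catalogue gathered from several on-line flower shops.
CATALOGUE = """<shops>
  <shop name="Shop A">
    <area>Hermanus</area>
    <offer>
      <flower>Rose</flower><colour>Red</colour>
      <price>180.00</price><delivery>40.00</delivery><tax>30.80</tax>
    </offer>
  </shop>
  <shop name="Shop B">
    <area>Hermanus</area>
    <offer>
      <flower>Rose</flower><colour>Red</colour>
      <price>170.00</price><delivery>60.00</delivery><tax>32.20</tax>
    </offer>
  </shop>
  <shop name="Shop C">
    <area>Pretoria</area>
    <offer>
      <flower>Rose</flower><colour>Red</colour>
      <price>100.00</price><delivery>10.00</delivery><tax>15.40</tax>
    </offer>
  </shop>
</shops>"""


def total_cost(offer):
    """Add up the three tagged amounts of one offer."""
    amounts = (Decimal(offer.findtext(tag)) for tag in ("price", "delivery", "tax"))
    return sum(amounts, Decimal("0"))


def cheapest_offer(xml_text, area, flower, colour):
    """Return (shop name, total cost) of the cheapest matching offer, or None."""
    best = None
    for shop in ET.fromstring(xml_text).iter("shop"):
        # Keep only shops that deliver to the requested area.
        if area not in {a.text for a in shop.findall("area")}:
            continue
        for offer in shop.findall("offer"):
            if offer.findtext("flower") == flower and offer.findtext("colour") == colour:
                cost = total_cost(offer)
                if best is None or cost < best[1]:
                    best = (shop.get("name"), cost)
    return best


choice = cheapest_offer(CATALOGUE, "Hermanus", "Rose", "Red")
```

Because every amount carries its own tag, the agent never has to guess which number on a page is the price and which is the tax.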
In fact, there is more to this story. Most of the tasks that could be accomplished in this way
would occur in the background without any interaction from Matty.
4.4 XML-enabled technologies with specific reference to e-commerce (EC)
The long-expected rise of electronic commerce has been hindered by the difficulty
encountered by consumers in finding the desired product among the myriad of vendors
setting up shop on the Internet, all with different product lines, prices, on-line viewing
capabilities, delivery options and so forth. So-called intelligent agents have not helped
because they have an even harder time than humans in trying to make sense of the digital
morass presented by HTML.
With XML repository technology, on-line stores can present product information in a
standard, structured format, independent of page design. Electronic commerce is obviously
focused on financial transactions. Using HTML, the user must manually wade through
HTML information to extract relevant data like price and tax (Freter 2002).
Unlike text, numbers have no inherent context. In other words, price means something, but
how do you know whether a number is associated with a price, a tax, an address, or
anything? XML creates this association, making human and machine interpretation a reality
(Freter 2002). XML is the catalyst that will finally unleash the explosive potential of
electronic commerce. The XML-aware query facilities of the repository make it possible to
retrieve relevant information directly and re-purpose it, as needed for processing by an
automatic agent or a user. By reducing the time needed to locate a product, a price or any
other relevant information on the Internet, XML repositories will play an important role in
making on-line shopping more efficient and enjoyable.
5 Universal business language (UBL) – the next step
Two previous efforts in particular lay the foundation of this B2B conceptual model.
Following a layered approach similar to the ISO and Gartner models described in the next
sections and taking the latest XML standards and Web services developments into
consideration results in a model that highlights what is needed to conduct B2B over the
Internet using XML standards and Web services today.
5.1 ISO open EDI view
One of the most widely used attempts at classifying electronic business transaction
standards is the ISO open EDI Reference Model (ISO/IEC 14662), as illustrated in Figure 3.
The ISO described a two-layer system, defining business transactions in terms of a business
operational view (BOV), which focuses on the business aspects of business transactions; and
a functional service view (FSV), which encompasses the information technology aspects of
these transactions (ISO/IEC 14662 1997).
5.1.1 Business operational view (BOV)
The BOV comprises:
The semantics of business data in business transactions and associated data
interchanges; and
rules for business transactions, namely operational conventions, agreements and
mutual obligations.
5.1.2 Functional service view (FSV)
The FSV comprises:
Functional capabilities, namely capability of initiating, opening, responding and
tracking the progress of transactions;
service interfaces, such as user application interfaces and transfer infrastructure
interface; and
protocols, namely security mechanism handling; protocols for inter-working of
information technology systems of different organizations; and translation
mechanism.
Figure 3 ISO open EDI reference model
5.2 Gartner Group view
A view produced by Gartner Group (2000) divides the conceptual architecture into three
major sections:
Top: business content and process standards – meaning of information and processes
Middle: message and associated structure standards – syntax
Base: messaging protocol and tools standards – communication.
A quote from the Gartner report (Gartner Group 2000) defines the terms as follows:
Meaning of information: the relationship between values in the fields and the external
world to which the data relates.
Process definitions: the business rules, the definition of the roles of the parties
involved, and the trigger events that provide the context for the exchange of
information. Process definitions should cover the complete set of business events
required to accomplish a business objective (e.g. placing an order would include steps
such as sourcing, issuing a purchase order, receiving acknowledgments and dealing
with changes) rather than just discrete steps (e.g. issuing a purchase order).
Syntax: the structure of the message, usually as a sequence of data fields.
Communication layer: the mechanisms by which messages will be transported from
party to party.
Based on the categories described above, a high-level functional block diagram of the
B2B architecture appears in Figure 4 (Gartner Group 2000).
Figure 4 Gartner B2B architecture functional blocks
Several basic architecture considerations (principles) were employed in developing this
model. They are illustrated in Table 1 (Business Internet Consortium 2000).
Table 1 Basic architectural principles
Openness: open standards, open platform.
Layered specifications: enabling technologies provide foundations to higher level
functions and business logic; complex problems are divided into manageable
layers, applying a divide-and-conquer methodology.
Loosely coupled: supports a loosely coupled approach to integrating trading
partners' business applications, as well as other B2B components, connecting
using messaging rather than programmable function calls.
Extensibility: architecture can grow over time to cover more business processes
and more industries.
Reuse: reuse of business objects and patterns; reuse of technologies and
architectural components, not reinventing the wheel.
Self-describing: each component has clearly defined interfaces to describe the
services provided and methods for interactions. The description is publishable
and searchable on a common registry.
Dynamic discovery and binding: services could be dynamically located on a
common registry and composed into more complex services or transactional
steps to serve particular business needs.
6 Conceptual model for B2B integration
Companies across all industries are realizing the fundamental benefits of using the Internet
to integrate their supply chains in an automated fashion. The potential to reduce inventory,
improve time-to-market, reduce transaction costs, and conduct business with a broader
network of supply chain partners has direct and measurable benefits to a company's bottom
line.
Figure 5 illustrates the components of the B2B architecture conceptual model for the
e-economy (Business Internet Consortium 2002).
Figure 5 Components of the B2B architecture conceptual model for the e-economy
Because of the benefits that result from supply chain integration, companies are exploring
open, XML-based standards to help remove the formidable barriers associated with
developing a common business language and protocol for Internet-based collaboration,
communication and commerce.
The conceptual model was developed with the input from many industry and technology
organizations and respected thought leaders. This model is divided into two general
sections: the Business Conceptual Model on the top and the Technical Conceptual Model at
the bottom.
The business conceptual model focuses on the definition, structures and formats of business
transactions, as well as the business processes that handle transactions.
The technical conceptual model provides the technical foundation that enables business
document and process definition. The model is also divided into left and right halves: the
left side represents the components that support business content (payload of business
transactions), while the right side represents layers associated with business processes.
The three vertically oriented layers (trading partner agreement, security and management)
have implications across all, or nearly all, layers.
7 Descriptions of the components for architectural e-commerce
Back-end integration provides hooks into the back-end enterprise systems through an API or a
shared messaging bus. It includes functions such as business logic processing and format
transformation.
7.1 Service orientated architecture – development platform for Web services
This layer provides basic development standards and tools (Java, C#, J2EE, .NET, etc.) and
related development environments. This layer also defines APIs that 'glue' e-business
transaction systems with the back-end enterprise resource planning (ERP) systems. In this
layer we find all the components that help us in writing the code that allows us to connect to
our back-end systems but also the business logic of the e-business software itself (Business
Internet Consortium 2002).
7.2 Network transport
This layer addresses the basic messaging transport protocols needed to communicate on the
Internet, as well as messaging services that provide for asynchronous publish/subscribe,
asynchronous message queuing and synchronous request and reply. Additionally, it addresses
how messages are placed on and taken off the transport bus. These standards specify
mechanisms for transporting messages in a secure and reliable way (UBL 2001).
7.3 Core XML standards
These are the basic XML protocols and associated W3C standards for defining document types
and for accessing the data within the documents. This syntax is used to express specifications
in the layers above for defining the representation of business content and processes (W3C 2002).
7.4 Messaging
These are standardized message and envelope structure and layout definitions, which have
specific technical purposes. This layer addresses the need to record session and
communication settings for message transport in order to enable coordination between
parties in a business transaction, including parameters that control reliable messaging,
secured messaging, etc. (UBL 2001).
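A minimal sketch of such an envelope, assuming invented element names (Envelope, Header, MessageId, From, To, ReliableDelivery, Body) rather than those of any particular messaging standard, might look like this. The helper wrap_in_envelope and the party names are hypothetical.

```python
import xml.etree.ElementTree as ET


def wrap_in_envelope(payload, message_id, sender, receiver, reliable=True):
    """Wrap a business payload in a header/body envelope (hypothetical layout)."""
    envelope = ET.Element("Envelope")
    # The header carries transport settings, not business content.
    header = ET.SubElement(envelope, "Header")
    ET.SubElement(header, "MessageId").text = message_id
    ET.SubElement(header, "From").text = sender
    ET.SubElement(header, "To").text = receiver
    ET.SubElement(header, "ReliableDelivery").text = "true" if reliable else "false"
    # The body carries the business document unchanged.
    body = ET.SubElement(envelope, "Body")
    body.append(payload)
    return envelope


order = ET.Element("PurchaseOrder")
ET.SubElement(order, "Item").text = "Red roses"

message = wrap_in_envelope(order, "msg-001", "buyer.example", "seller.example")
wire_format = ET.tostring(message, encoding="unicode")
```

Keeping the transport parameters in the header means the messaging layer can route and acknowledge a message without having to understand the business document in the body.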
7.5 Repository
These are standardized repository services that specify the structure, access protocol and
schemas for business content storage and retrieval, including each term, its constraints
and its representations. Many repository initiatives have emerged over recent years.
Repositories can provide a very diverse set of functionalities such as the storage of business
object definitions, business processes and data-dictionaries (UBL 2001).
7.6 Registry service
This service specifies the structure and access protocol of registries and repositories that
trading entities can access to discover each other's capabilities and services. While the
repository stores actual objects, the registry provides the management interfaces and
protocols to register and discover the entries of a repository. A registry covers naming,
directory, search, dynamic binding links, privacy, authentication, authorization,
identification services, etc. (W3C 2002).
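The division of labour between the two can be illustrated with a small in-memory sketch. The class and method names (Repository, Registry, register, discover) and the sample trading partners are assumptions made for this example; they are not taken from any registry specification.

```python
class Repository:
    """Stores the actual objects, keyed by an identifier."""

    def __init__(self):
        self._objects = {}

    def store(self, key, obj):
        self._objects[key] = obj

    def fetch(self, key):
        return self._objects[key]


class Registry:
    """Holds searchable entries that point at objects in a repository."""

    def __init__(self, repository):
        self._repository = repository
        self._entries = []

    def register(self, name, capabilities, key, obj):
        # The object itself goes into the repository; the registry keeps
        # only the metadata needed to find it again.
        self._repository.store(key, obj)
        self._entries.append(
            {"name": name, "capabilities": set(capabilities), "key": key}
        )

    def discover(self, capability):
        """Return (name, object) pairs for every entry offering the capability."""
        return [
            (entry["name"], self._repository.fetch(entry["key"]))
            for entry in self._entries
            if capability in entry["capabilities"]
        ]


registry = Registry(Repository())
registry.register("Supplier One", ["purchase-order", "invoice"], "s1", {"endpoint": "s1.example"})
registry.register("Supplier Two", ["purchase-order"], "s2", {"endpoint": "s2.example"})

invoice_partners = registry.discover("invoice")
```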
7.8 Business content format definition
Business content includes everything that composes the payload of business transactions,
such as dictionary entries, composition of dictionary entries, special business documents and
attachments. Business content format definition is the specification of the data structures,
data types, constraints and code lists of all the items necessary to compose valid business
content (Kreger 2001).
7.9 Universal business content
This specifies business terminology and accepted values that may be universally used in
business messages that support a broad range of industries, business models and locales; the
vocabulary used to construct the business content of a message. This content covers many
domains of discourse, such as product, materials management, finance and quality (Hall
2001).
7.10 Specialized business content
This refers to industry or supply chain-specific technical lexicon (terms, properties, values,
taxonomic structures) to be used to extend and specialize the universal content, to construct
the content of an industry-specific business document (Hall 2001).
7.11 Service description language
This layer describes the tools and languages for service implementation and service
interface, which is key to achieving loosely coupled architecture and reducing the amount of
custom programming as well as the effort of integration between service requester and
service provider (Kreger 2001).
7.12 Process description language
This specifies the way in which any business process (whether universal or specific in
nature) is recorded, such that it is understood and executable in a repeatable fashion by a
wide array of humans and/or applications (Kreger 2001).
7.13 Universal business process
This specifies business processes that are applicable to a broad range of businesses, regardless of
the vertical industry or locale within which the business operates or of the specific
characteristics of the business. These processes cover many domains of activity that
businesses engage in, such as collaborative product development, request for quote, supply
chain execution, purchasing and manufacturing (UBL 2001).
7.14 Specialized business process
This specifies business processes that are not universally applicable but instead are specific to a
business operating within a specific industry or supply chain (such as electronic
components, pharmaceuticals or automotive) and locales or business models (UBL 2001).
7.15 Business transaction patterns
These describe allowable (legal) patterns and rules of business content exchanges following
certain business processes. The business content used in transactions should be composed of
terms defined in the universal business content and specialized business content layers,
following predefined patterns and rules. The same is true on the business process side; the
process portion of a business transaction is made up of components defined in the universal
business process and specialized business process layers, also following predefined patterns
and rules (RosettaNet 2001).
7.16 Business transaction instance
This describes an actual instance of a valid or legal business transaction following the
patterns defined in the Business Transaction Patterns layer. This is a real conversation and
information exchange between two trading partners (RosettaNet 2001).
7.17 Trading partner agreement (TPA)
This determines the dynamic creation and management of relationships between partners,
and includes profiles of trading partners' B2B infrastructure, protocols and contractual
agreements for transactions (UBL 2001).
7.18 Security
This layer spans a wide range of abstractions from basic encryption, authentication and
authorization on the core XML layer, to non-repudiation and security policies in the
business process layer. It includes the technologies used to implement both transient and
persistent security functions, and the policies that manage and apply these technologies
(W3C 2002).
8 Examples of some exciting new technologies enabled by XML
8.1 Internet search engines
Imagine a search engine that understands and uses contextual information when performing
a full-text search. Searching for information about the Java programming language would no
longer yield links to coffee sites or the island of Java. This is because searching for the term
'Java' is narrowed down to those fields tagged as a 'programming language'. As a result, the
speed and accuracy of the search are dramatically improved. Widespread use of XML
repository technology on Web servers will play a vital role in easing the 'information
overload' currently suffered by Internet users (Hogan 2002). For example, when searching
for information on a subject that is contained in a single chapter or even a single page within
a book, XML enables you to retrieve only that chapter or page, while HTML currently gives
you the entire book. Of course, all of these benefits require a sophisticated, scalable and fast
repository. This repository must be able to manage the rich XML links and understand XML
structure so that it indexes text based on its context and use in a document.
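The difference between a plain full-text search and a context-aware one can be sketched as follows. The three sample documents and the tag names (programming_language, beverage, island) are invented for illustration, and the linear scan stands in for the indexed repository the text describes.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection in which the same word appears in different contexts.
DOCUMENTS = """<library>
  <doc id="1"><programming_language>Java</programming_language><body>Threads and classes</body></doc>
  <doc id="2"><beverage>Java</beverage><body>Dark roast coffee</body></doc>
  <doc id="3"><island>Java</island><body>Travel guide</body></doc>
</library>"""


def full_text_search(root, term):
    """Match the term anywhere in a document, ignoring its tags."""
    return [
        doc.get("id")
        for doc in root.iter("doc")
        if any(term in text for text in doc.itertext())
    ]


def contextual_search(root, term, field):
    """Match the term only inside elements carrying the given tag."""
    return [
        doc.get("id")
        for doc in root.iter("doc")
        if any(term in (element.text or "") for element in doc.iter(field))
    ]


library = ET.fromstring(DOCUMENTS)
everything = full_text_search(library, "Java")
only_language = contextual_search(library, "Java", "programming_language")
```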
8.2 Self-describing binary large objects (BLOBs) and distributed object file systems
File systems today store files as BLOBs, whether they are word processing documents,
presentations, databases, pictures, CORBA objects, etc. The information is locked inside the
file and is not accessible to the file system. The file system has only minimal information
about the files. File systems could do no better with XML data. However, by leveraging a
sophisticated object-oriented structure, the contents and structure of XML data can be
indexed, searched and manipulated in a sophisticated and granular fashion. As a result, the
XML data can be managed in a distributed fashion, much like the Internet itself. A powerful
XML linking mechanism also makes it possible to tightly bind a BLOB with a set of XML
metadata describing it, sort of a machine-readable summary (Hogan 2002). Since an object
database management system (ODBMS) powered XML repository can natively manage
structured content, binary data types and arbitrary links between the two, it is the only
solution with the power to build distributed object repositories that offer scalable
management of legacy documents and XML data.
8.3 Electronic data interchange (EDI)
EDI is done today through secure, value-added networks (VANs) that map the data between
companies and their disparate applications. By leveraging XML, the applications can easily
broker information between themselves. Mapping data from one company's purchasing
system to another company's inventory is just a matter of understanding the XML tags on
the data (UBL 2001). XML then becomes the universal format for EDI, enabling companies
to create ad hoc secure extranets with other companies over the Internet transport, while
eliminating the VAN intermediary.
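A minimal sketch of such a tag-based mapping is shown below. The two vocabularies (PartNumber, Quantity and RequiredBy on the purchasing side; sku, units_reserved and ship_date on the inventory side) are hypothetical and would in practice come from an agreed standard between the trading partners.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from one company's purchase order tags to
# another company's inventory field names.
TAG_MAP = {
    "PartNumber": "sku",
    "Quantity": "units_reserved",
    "RequiredBy": "ship_date",
}


def map_order_line(order_line_xml, tag_map=TAG_MAP):
    """Translate one order line into an inventory record, skipping unmapped tags."""
    line = ET.fromstring(order_line_xml)
    return {tag_map[child.tag]: child.text for child in line if child.tag in tag_map}


record = map_order_line(
    "<OrderLine><PartNumber>A-100</PartNumber><Quantity>12</Quantity>"
    "<Note>fragile</Note></OrderLine>"
)
```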
8.4 Data repurposing
By breaking documents into discrete elements, it becomes very easy for individuals to
extract the truly relevant information from several sources and reassemble it into any format
(e.g. Web page, document or presentation). This helps to address the current information
overload, because the user receives only the relevant information. In fact, a personal agent
might even assemble the information (Business Internet Consortium 2002). This ability also
facilitates the acceleration of learning, since it becomes much easier to assemble the 'current'
body of work on a particular subject and then take it a step further, thus pushing the
development of human knowledge forward.
8.5 Content personalization (intelligent pull, agent accumulation and push)
Today, people use the Internet as a news service. By defining a few key words or general
topics of interest in specific industries, you can get a fairly good news service. However,
this type of service requires human interaction to determine what is actually new. The
alternative is to monitor a few news sites. Unfortunately, this approach limits the ability to
filter and personalize the information. However, XML, combined with a sophisticated
repository, can solve all of these problems. Using XML, you could create a very
sophisticated personal news filter that spans multiple sites or the entire Internet. The XML
repository would provide the date stamp, enabling agents or search engines to filter the
information to extract only the 'new' information. Then, the information could be easily
extracted, formatted and delivered in any way that you choose, whether it is an active
desktop, a personal news Web page, e-mail or pager (Hogan 2002). This capability will
allow all individuals to create 'custom' newspapers with the latest information formatted and
delivered any way they want it.
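A personal news filter of this kind can be sketched in a few lines. The feed layout, the published attribute and the topic and headline tags are assumptions for this example only.

```python
from datetime import date
import xml.etree.ElementTree as ET

# Hypothetical feed in which every item carries a date stamp and a topic tag.
FEED = """<news>
  <item published="2002-11-28"><topic>XML</topic><headline>Old repository release</headline></item>
  <item published="2002-12-02"><topic>XML</topic><headline>New schema tools announced</headline></item>
  <item published="2002-12-03"><topic>Sport</topic><headline>Cricket results</headline></item>
</news>"""


def new_items(feed_xml, topics, since):
    """Return headlines on the chosen topics published after the given date."""
    headlines = []
    for item in ET.fromstring(feed_xml).iter("item"):
        published = date.fromisoformat(item.get("published"))
        if published > since and item.findtext("topic") in topics:
            headlines.append(item.findtext("headline"))
    return headlines


fresh = new_items(FEED, {"XML"}, date(2002, 12, 1))
```

The date stamp lets the agent decide what is new without human intervention, and the topic tag handles the personalization.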
8.6 Customized bandwidth allocation
Allocation and prioritization of bandwidth is an increasingly important issue. The allocation
of bandwidth can be accomplished at a number of points in the flow of data, including the
information serving company, their access provider, the Internet backbone, the recipient's
local service provider, or the recipient's company (of course, this is an oversimplification).
At each of these points, bandwidth can be prioritized, but the problem is how to determine
the priority and then how to bill accordingly. Attaching the appropriate XML tags, or
simply reading the appropriate tags, can implement this process in an efficient and consistent
fashion (Hogan 2002). For example, 'urgent' e-mail messages might be tagged urgent and
therefore have certain additional rights, also determined by the recipient, such as superior
bandwidth, auto-routing to a pager, placing a call to the recipient's cell phone to give a
machine translation of the message over the phone, and more. The repository plays a critical
role in associating the appropriate routing tags, managing these tags throughout the system,
identifying and billing the appropriate customer and more.
8.7 Individual content cache (local cache)
A local XML cache provides a means for more efficient utilization of the facilities enabled
by the Internet. As individuals surf the Web, transact business, compile information,
communicate, etc., a local content cache could facilitate the process in many ways. For
example, as you or your automated agent transacts business, the local cache could provide a
dated receipt storage facility. This local cache might also generate an index on the fly as you
surf the Internet, enabling you to query your past findings for that gem of a site you now
need. A local cache could also store content that is incrementally updated, based on time
stamps and element-level delta synchronization. Alternatively, you might choose to cache
selected data for off-line browsing (W3C 2002). The exciting opportunities enabled by a
local content cache are boundless.
These are just a few of the exciting technologies enabled by XML, and it is easy to
understand why XML is creating such excitement in the Internet community. As software
developers begin to implement XML applications, however, they will have to address the
need to turn these ideas into reality, while keeping up with the ever-shortening development
cycles characteristic of Web development. In many cases, developers will find that their
prototypes work fine in the test laboratory but do not scale to address real world conditions
of concurrent usage and data volume. XML's rich interlinking and hierarchical naming
structure introduces a whole new set of requirements that bring solutions based on the file
system or relational architectures to their knees (Hogan 2002). An XML-wise object
repository, designed to be embedded in XML applications of all types, or the operating
system itself, is the only solution that provides the functionality and scalability required to
drive the realization of this vision of a new generation of networked applications.
9 XML repository requirements
All of the exciting applications described above require data persistence, and these
requirements are very different from the requirements of monolithic file storage that we are
used to. XML extends HTML's simple unidirectional linking, adding support for links to
multiple targets, indirect addressing and bi-directionality. Handling this rich linking requires
a storage mechanism with far more powerful management of references between objects
than that provided by the file system or relational databases. To effectively address the XML
opportunity, the storage mechanisms must also be able to understand the structure of XML
content, which is composed of a dynamic number of objects, while scaling effectively to
handle increased usage load and data volume.
Furthermore, according to the Business Internet Consortium (2002), XML's object-centric
focus will create the need for an API that enables rich object-centric manipulation from
object-oriented languages like C++ and Java. In other words, what is needed is an XML-
aware object repository. Only an ODBMS can maintain information about XML document
structure in a scalable manner while handling standard data types and BLOBs with rich
hyperlinking and navigation in the database.
9.1 File system storage of XML data
HTML storage management is usually implemented using flat file storage. This is because
HTML, lacking any definable structure, is stored as a monolithic block. Using the file
system this way provides acceptable functionality and therefore wins out because it is
extremely easy to implement. There are a number of tools that build on the file system's
functionality, and file systems are included in operating systems at no additional charge.
XML, however, has very different storage requirements. XML applications must store and
index the fine-grained elements as well as the document structure. In addition, they must be
capable of linking these fine-grained elements directly to each other and to a variety of data
types containing associated information (Business Internet Consortium 2002). The increased
demands implied by this functionality mean that additional care must be taken to build a
system that scales under increasing load. Attempts to parse XML data of any complexity
would overwhelm the capabilities of the file system to maintain the rich linking structure
and semantics of the data.
Of course, the alternative is to store the XML data as BLOBs and then parse them on the fly
each time they are used. This results in sub-optimal performance, because of repeated
parsing. In addition, once the data has been parsed, the file system cannot execute the
complex data manipulation required. This BLOB storage approach also undermines
the ability to link various disparate elements into a rich tapestry of information that models
real-world usage cases. Of course, implementing all of these features on top of the file
system is a possibility, through custom development, but this would essentially recreate
object database functionality from scratch.
9.2 Relational database storage of XML data
Relational database management systems (RDBMS) are the other plausible candidate.
Unfortunately, their table-based data model is very poorly suited to the hierarchical,
interconnected nature of structured XML content. Never the best systems for managing
variable length data and BLOBs, RDBMSs are further hampered by the fact that they must
represent the tree structure of XML content with an inefficient set of tables and joins.
Relational databases disassemble the XML objects in order to fit them into their tabular
architecture. As a result the XML object's structure and semantics are either lost,
minimizing its value, or must be duplicated in the design of the database.
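To make the point concrete, the sketch below disassembles a small XML tree into rows of a single table using Python's built-in sqlite3 module. The table layout (node with id, parent_id, tag, content) and the helper shred are assumptions chosen for illustration; real mappings are usually more elaborate.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(conn, element, parent_id=None):
    """Insert one row per element; each child row references its parent's id."""
    cursor = conn.execute(
        "INSERT INTO node (parent_id, tag, content) VALUES (?, ?, ?)",
        (parent_id, element.tag, (element.text or "").strip()),
    )
    node_id = cursor.lastrowid
    for child in element:
        shred(conn, child, node_id)
    return node_id


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, tag TEXT, content TEXT)"
)

root = ET.fromstring(
    "<book><chapter><title>Intro</title><paragraph>Hello</paragraph></chapter></book>"
)
shred(conn, root)

# Finding the paragraphs of the chapter titled 'Intro' already needs two
# self-joins; every extra level of hierarchy adds another.
rows = conn.execute(
    """
    SELECT p.content
    FROM node AS chapter
    JOIN node AS title ON title.parent_id = chapter.id AND title.tag = 'title'
    JOIN node AS p ON p.parent_id = chapter.id AND p.tag = 'paragraph'
    WHERE chapter.tag = 'chapter' AND title.content = 'Intro'
    """
).fetchall()
```

The tree structure survives only as parent_id references, so the application has to rebuild it through joins each time it is needed.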
Duplicating the structure and semantics of complex XML objects in the design of the
database is very difficult, particularly if the structure of the XML data is variable, as it
usually is. The rigidity of the relational design is a poor fit with the dynamic assembly and
manipulation of XML data (W3C 2002). Relational databases also cannot handle object-
level locking; the best they can provide is row-level locking. Since relational databases
decompose XML elements into various tables, linked via keys, it is very difficult to
implement an effective locking scheme that does not dramatically hinder concurrent use and
scalability.
In concurrent editing environments, there will be an increase in demand for related objects
from disparate users. Relational databases respond by locking entire rows across multiple
tables. This can cause unacceptable performance degradation if multiple users are requesting
different objects that are locked via this broad locking scheme. If the DBA responds by
separating the information into a larger number of more granular tables, the performance is
degraded by the number of joins required to model the richly linked structure of structured
XML content. In addition, relational databases are typically too weighty to form an
infrastructure for embeddable storage and require substantial development to adapt to the
complex structure of structured XML content. Quite simply, relational databases, while
excellent for many purposes, are not architecturally compatible with the storage needs of
XML data.
9.3 Object database storage of XML data
The architecture of object databases is ideally suited to handling XML data; in fact, the
adoption of XML data could be the 'killer application' for the object database market. Object
databases are designed to handle objects in their native forms. The objects then maintain
their own data, methods, relationships and the semantics of the whole model. This is ideal
for the creation and management of hierarchical XML trees, while providing both
hierarchical tree navigation and rich link traversal (Business Internet Consortium 2002).
For example, traversing to the other side of a tree in the two-dimensional relational model
forces the developer to climb up the tree, through joins, and then back down the other side.
The rich relationship linking of an object-oriented model enables both hierarchical
navigation and rapid branch traversal, reducing computation and increasing performance.
Object databases are also designed to handle arbitrary, variable-length data types and
interrelated data. This is critical due to the various data types linked within structured XML
content.
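The navigation style described here can be illustrated with plain in-memory objects. This is not the API of any object database; the Node class, its links attribute and the sample book are hypothetical and only show how direct references allow both hierarchical navigation and traversal across branches.

```python
class Node:
    """An element that keeps direct references to its parent and children."""

    def __init__(self, tag, text=""):
        self.tag = tag
        self.text = text
        self.parent = None
        self.children = []
        self.links = []  # arbitrary, non-hierarchical references to other nodes

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def path(self):
        """Walk up the parent references to build the node's position in the tree."""
        names = []
        node = self
        while node is not None:
            names.append(node.tag)
            node = node.parent
        return "/".join(reversed(names))


book = Node("book")
intro = book.add(Node("chapter", "Intro"))
appendix = book.add(Node("appendix", "Glossary"))
term = intro.add(Node("term", "DTD"))
definition = appendix.add(Node("definition", "Document type definition"))

# A direct link from one branch to another: no climbing up and back down.
term.links.append(definition)
linked = term.links[0]
```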
XML also enables an ever-changing Web of relationships between hyperlinked data
elements, such as on-the-fly creation of documents. Object databases, with their flexibility
and rich relationship management are ideal for managing this type of information.
Those object databases that allow object-level locking provide for much more granular
locking than relational or file system-based solutions. This granular locking is critical for
user scalability, since it limits the conflicts between user requests for data.
Object databases are also designed to handle larger-than-memory content, providing for
content scalability. Object databases also offer more simplified creation and management of
distributed partitions, which further addresses the issue of database volume scalability and
distributed implementation. In short, object databases were designed to address the very
requirements that XML is just now starting to force upon tomorrow's storage solutions.
Just as most software developers would never consider developing their own encryption
technology for a new product, licensing it instead from a third-party vendor selling best-of-
breed solutions, it makes sense for XML developers to seek a third-party storage engine that
instantly provides them with the feature set they require for effective XML data
management. These applications need a mature database engine well suited to the structure
and data types associated with XML.
9.4 Ideal repository for XML data
The ideal XML repository should address the needs of XML as well as the needs of the
associated applications. In evaluating the requirements of applications in this field, the
following criteria are critical:
Scalability
Language support
Ease of programming
Embeddability.
Scalability will be very important since the XML applications described above will run on
both the client and the server. It is important that the object database scale down as well as
up, while leveraging the same APIs, to simplify application development. Language support
is also important. Ease of programming is critical due to the compressed development
cycles, particularly on the Internet. Embeddability encompasses two criteria: zero-management
and a low memory footprint. Embeddability is an interesting issue since it has
both short and long-term ramifications (Business Internet Consortium 2002).
In the short term, embeddability is important because most initial XML applications,
lacking sufficient XML support in the file system, will build in this support. However, in the
long term, the object database will replace the standard file manager running on top of the
file system. This of course will make embeddability an absolute requirement.
Storing and retrieving XML data are only the start. In addition to standard XML data
storage, the ideal repository would offer tightly integrated XML-specific tree navigation,
versioning, management of arbitrary links, import and export, publishing of structured
content on the Web, support for object-oriented programming languages as well as common
scripting languages, and more. Building these facilities on top of an object database is
non-trivial, yet they are critical to the actual process of managing and manipulating the XML
data stored in the object database.
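Three of the facilities listed above (import and export, tree navigation and versioning) can be sketched in a few lines. The sketch assumes Python's standard `xml.etree.ElementTree` module as the parser and keeps whole-tree snapshots in memory; the `XmlRepository` class and its method names are hypothetical, and a real repository would persist the data and store versions far more economically.

```python
import copy
import xml.etree.ElementTree as ET

class XmlRepository:
    """Toy in-memory repository that keeps every version as a full snapshot."""

    def __init__(self):
        self._versions = []

    def import_xml(self, text):
        """Parse XML text and store it as a new version; return its number."""
        self._versions.append(ET.fromstring(text))
        return len(self._versions) - 1

    def find(self, path, version=-1):
        """Navigate the tree of a given version (latest by default)."""
        return self._versions[version].find(path)

    def update_text(self, path, new_text):
        """Copy the latest tree, change one element and store the copy."""
        tree = copy.deepcopy(self._versions[-1])
        tree.find(path).text = new_text
        self._versions.append(tree)
        return len(self._versions) - 1

    def export_xml(self, version=-1):
        """Serialise a stored version back to XML text."""
        return ET.tostring(self._versions[version], encoding="unicode")

repo = XmlRepository()
v0 = repo.import_xml("<order><buyer>Acme</buyer><total>10</total></order>")
v1 = repo.update_text("total", "12")

print(repo.find("total").text)              # value in the latest version
print(repo.find("total", version=v0).text)  # value in the original version
print(repo.export_xml(v1))
```

Even this small example shows why such facilities are non-trivial: naive snapshot versioning duplicates the whole tree on every change, which a production repository would need to avoid.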
10 Conclusion
E-commerce, enterprise resource planning (ERP), database, file sharing and similar
applications are mission critical. When the infrastructure breaks down, these applications
stop working, and the losses, both in revenue and productivity, start accruing (Morrissey
2001).
While XML has evolved from SGML and HTML, its impact will not be evolutionary – it
will be revolutionary! XML will transform the Internet from a massive collection of
unmanageable data into the intelligent transport we have all been waiting for. The
development community is just beginning to recognize the far-reaching benefits of XML as
a standard, flexible, structured content format. Existing XML DTDs, such as CDF for push
channel management and OSD for on-line software distribution, merely hint at the types of
applications that will benefit from XML's unique characteristics. The announcement by Bill
Gates that future versions of Microsoft Office will support XML virtually assures its broad
adoption (Business Internet Consortium 2002).
Because XML applications have a far wider scope than their HTML counterparts, in terms
of both application domain and type of data being managed, existing solutions for HTML
storage are not sufficient for XML. File system-based solutions do not support XML's rich
linking and hierarchical tree structure, negating XML's added value. Relational databases
are based on a two-dimensional, table-based architecture that is also ill suited to the needs of
XML. Attempts to force a relational database into storing XML will result in sub-optimal
performance, concurrent access and scalability due to the architectural mismatch between
XML content and relational databases.
The only database architecture that is suited to the demands of XML is the object database.
Only object databases support granular element access and locking for superior concurrent
access, rich high-performance hyperlinking and fast hierarchical navigation.
11 References
Bloomberg, J. 2002. XML and Web services unleashed. Wakefield, Massachusetts: SAMS
Publishing.
Business Internet Consortium. 2002. High-level conceptual model for B2B integration –
XML convergence workgroup. Version 2.0. 12 March 2002.
Freter, T. 2002. XML: document and information management. Sun Microsystems.
[Online]. Available WWW: http://www.sun.com/980908/XML/.
Gartner Group. 2000. OAGI: fostering standards evolution or revolution? A Gartner
advisory research note. 14 December 2000.
Goldfarb, C.F. 2002. XML in an instant: a non-geeky introduction. [Online]. Available
WWW: http://www.XML.org/XML/goldfarb.shtml.
Hall, R.E. 2001. W3C Web service position paper from Intel. W3C Workshop, April 12.
Hogan, M. 2002. XML: The foundation for the future. [Online]. Available WWW:
http://www.XML.org/XML/XML_foundation_future.shtml.
Interleaf. 2002. Putting XML to work: advantages of content management. [Online].
Available WWW: http://www.xml.org/xml/putting_xml_to_work.shtml.
Internet Indicators. 2001. January 2001 Internet economy indicators. [Online]. Available
WWW: http://www.Internetindicators.com/.
ISO/IEC 14662. 1997. Information technology – open EDI reference model. First edition.
December 15.
Kreger, H. 2001. Web service conceptual architecture 1.0. June.
Morrissey, P. 2001. Infrastructure – The survivor's guide to 2002. Network Computing
(December). [Online]. Available WWW:
http://www.networkcomputing.com/1226/1226f8.html.
RosettaNet. 2001. RosettaNet architecture conceptual model. July.
Timmers, P., Stanford-Smith, B. and Kidd, P. 1998. Opening up new opportunities for
business. Proceedings of the G8 Global Marketplace for SMEs Electronic Commerce
Conference. Manchester, September. Cheshire Henbury, Macclesfield, UK.
UBL. 2001. UBL: the next step for global e-commerce. UBL Marketing Subcommittee. 26
December.
W3C. 2002. XHTML™ 1.0 The extensible hypertext markup language (Second Edition).
[Online]. Available WWW: http://www.w3.org/TR/xhtml1/.
ISSN 1560-683X
Published by InterWord Communications for the Centre for Research in Web-based Applications,
Rand Afrikaans University