Beginning XML - Part III (Building Blocks)

By Amrit Hallan
Posted Tuesday, July 20, 2004

XML documents (and HTML documents) are made up of the following building blocks:

· Elements
· Tags
· Attributes
· Entities
· PCDATA
· CDATA

This is a brief explanation of each of the building blocks:

Elements

Elements are the main building blocks of both XML and HTML documents.

Examples of HTML elements are "body" and "table". Examples of XML elements could be "my-schedule" and "date". Elements can contain text, other elements, or be empty. Examples of empty HTML elements are "hr", "br" and "img".
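To make the three cases concrete (an element holding other elements, an element holding text, and an empty element), here is a minimal sketch using Python's standard xml.etree.ElementTree module. The names "my-schedule" and "date" come from the paragraph above; "entry", "note" and "break" are made up for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: only "my-schedule" and "date" appear in the article;
# "entry", "note" and "break" are invented for this example.
doc = """<my-schedule>
  <entry>
    <date>2004-07-20</date>
    <note>Write Part III</note>
    <break/>
  </entry>
</my-schedule>"""

root = ET.fromstring(doc)
entry = root.find("entry")

# An element that contains other elements
child_names = [child.tag for child in entry]

# An element that contains text
date_text = entry.find("date").text

# An empty element has neither text nor child elements
empty = entry.find("break")
is_empty = empty.text is None and len(empty) == 0

print(root.tag, child_names, date_text, is_empty)
```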

Tags

Tags are used to mark up elements.

A start tag like <element_name> marks the beginning of an element, and an end tag like </element_name> marks the end of an element.

Examples:

A body element: <body>body text in between</body>
A message element: <message>some message in between</message>
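A parser relies on the start and end tags to find where an element begins and ends. The sketch below, which assumes Python's standard xml.etree.ElementTree module, parses a "message" element and then shows that leaving out the end tag makes the parser reject the document.

```python
import xml.etree.ElementTree as ET

# Start tag, content, end tag
message = ET.fromstring("<message>some message in between</message>")
content = message.text

# Without the end tag the document is not well-formed,
# so the parser raises ParseError instead of returning an element.
try:
    ET.fromstring("<message>some message in between")
    well_formed = True
except ET.ParseError:
    well_formed = False

print(message.tag, content, well_formed)
```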

Attributes

Attributes provide extra information about elements.

Attributes are placed inside the start tag of an element and come in name/value pairs. The following "img" element carries additional information about a source file:

<img src="computer.gif" />

The name of the element is "img". The name of the attribute is "src". The value of the attribute is "computer.gif". Since the element itself is empty, it is closed by a " /".
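As a small illustration of how a program sees these pieces, the following sketch (again assuming Python's standard xml.etree.ElementTree module) reads the element name, the attribute name and the attribute value from the "img" element described above.

```python
import xml.etree.ElementTree as ET

# The empty "img" element with one attribute, closed by " /"
img = ET.fromstring('<img src="computer.gif" />')

element_name = img.tag              # name of the element
attribute_names = list(img.attrib)  # names of its attributes
src_value = img.get("src")          # value of the "src" attribute

print(element_name, attribute_names, src_value)
```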

PCDATA

PCDATA means parsed character data.

Think of character data as the text found between the start tag and the end tag of an XML element.

PCDATA is text that will be parsed by a parser. Tags inside the text will be treated as markup and entities will be expanded.
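The effect of parsing can be seen in a short sketch, assuming Python's standard xml.etree.ElementTree module. The "note" and "b" element names are hypothetical. The "b" tags inside the text become a child element, and the entity reference for the less-than sign is expanded.

```python
import xml.etree.ElementTree as ET

# Hypothetical element whose content is parsed character data
note = ET.fromstring("<note>Prices &lt; 10 are <b>cheap</b> today</note>")

# The entity reference was expanded to "<"
leading_text = note.text

# The <b>...</b> tags were treated as markup and became a child element
child = note.find("b")
child_text = child.text

# All character data, with the markup removed
all_text = "".join(note.itertext())

print(repr(leading_text), child_text, repr(all_text))
```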

CDATA

CDATA also means character data.

CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.
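One place where unparsed text shows up in a document is a CDATA section. The sketch below, assuming Python's standard xml.etree.ElementTree module and a hypothetical "note" element, wraps text in a CDATA section and shows that the tags and the entity reference inside it come through as literal characters.

```python
import xml.etree.ElementTree as ET

# Hypothetical element whose content is wrapped in a CDATA section
note = ET.fromstring("<note><![CDATA[Prices &lt; 10 are <b>cheap</b>]]></note>")

# Nothing inside the section was treated as markup or expanded
raw_text = note.text
child_count = len(note)

print(repr(raw_text), child_count)
```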

Entities

Entities are variables used to define common text. Entity references are references to entities.

Most of you will know the HTML entity reference "&nbsp;", which is used to insert an extra (non-breaking) space in an HTML document. Entities are expanded when a document is parsed by an XML parser.

The following entities are predefined in XML:

Entity reference    Character

&lt;                less than (<)
&gt;                greater than (>)
&amp;               ampersand (&)
&quot;              quotation mark (")
&apos;              apostrophe (')
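The five predefined entities can be checked in both directions with a short sketch. It assumes Python's standard library: xml.etree.ElementTree to expand the references while parsing, and xml.sax.saxutils.escape to produce them when writing text out. The "text" element name is hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Parsing expands each predefined entity reference to its character
doc = "<text>&lt; &gt; &amp; &quot; &apos;</text>"
expanded = ET.fromstring(doc).text

# Going the other way: escape() replaces &, < and > by default
escaped = escape("a < b & c > d")

print(expanded)
print(escaped)
```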

Since we do not plan to go very deep into XML coding right now, we'll leave the data definitions here and move on to the future implications of XML.

Extensible Markup Language (XML), which complements HTML, promises to increase the benefits that can be derived from the wealth of information found today on IP networks around the world. This is because XML provides a uniform method for describing and exchanging structured data. The ability to describe structured data in an open, text-based format and deliver this data using the standard HTTP protocol is significant for two reasons. First, XML will facilitate more precise declarations of content and more meaningful search results across multiple platforms. Second, once the data is located, XML will enable a new generation of ways to view and manipulate it.

Consider an industry where interchange of data is vital, such as banking. Banks use proprietary systems to track transactions internally, but if they used a common XML format over the Web, they'd be able to describe transaction information to another institution or to an application (like Quicken or MS Money). Of course, they'd also be able to present the data in a pretty Web page. FYI: This markup does exist. It's called OFX, the Open Financial Exchange format.

Under certain circumstances, if IE 4 on the PC comes across a tag with the proper contents, a function is started that gives the user the opportunity to update installed software. If you're using Windows 98, it's possible that you've seen this process in action without knowing it was an XML application.

About the Author
Amrit Hallan is a freelance web designer. For all web site development and web promotion needs, you can get in touch with him at (http://www.bytesworth.com). For more such articles, visit (http://www.bytesworth.com/articles) and (http://www.bytesworth.com/learn). You can subscribe to his newsletter [BYTESWORTH REACHOUT] on Web Designing Tips & Tricks by sending a blank email to bytesworth-subscribe@topica.com