Sunday, September 26, 2010

XML SCRIPTING

XML can be understood as a generic language used to describe other markup languages. XML makes a clear distinction between the markup and the content of a document. Here markup means the tags and attributes used in the XML document, and content means the information presented in the document.

E.g.
<p>XML is used to store data in a structured way</p>

In this example <p>…</p> is the markup used in the document, and the text written between these tags is the content of the document.

In HTML, markup mostly describes the presentation of the content, using the standard tags and attributes that HTML provides. XML markup, in contrast, is generally used to describe the content of the document and is not related to the appearance of the document.

E.g. <quiz answer="Qutab Minar">Can you name a famous monument in Delhi?</quiz>

In this example the <quiz> tag describes the type of content, and the answer attribute holds the answer to the question.
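To see the split between markup and content from a program's point of view, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name read_quiz is made up for this illustration; it is not part of any library.

```python
import xml.etree.ElementTree as ET

# The quiz element from the example above, written as well-formed XML.
QUIZ_XML = '<quiz answer="Qutab Minar">Can you name a famous monument in Delhi?</quiz>'


def read_quiz(xml_text):
    """Return (tag name, question text, answer attribute) for a quiz element.

    read_quiz is a hypothetical helper written only for this example.
    """
    element = ET.fromstring(xml_text)
    return element.tag, element.text.strip(), element.get("answer")


tag, question, answer = read_quiz(QUIZ_XML)
print(tag)       # the element name (markup)
print(question)  # the text between the tags (content)
print(answer)    # the value of the answer attribute (markup)
```

The tag name and the attribute come from the markup, while the question text is the content enclosed by the tags.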

To start using XML effectively you need to learn the terminology used in XML and understand the structure of an XML file. Consider the following example:

<catalog>
  <movie>
    <title>Jung</title>
    <duration>3 hrs</duration>
  </movie>
</catalog>

As you can see, XML files have a hierarchical structure. Each tag used in XML defines an element. Each element must have an opening as well as a closing tag, e.g. <catalog> has both an opening and a closing tag. Some elements are self-contained: they do not enclose any information. These can be considered empty elements, and such a tag can be made self-closing by adding "/>" at the end of the opening tag. The hierarchical structure enables easy parsing of the document. In the above example the catalog contains information about a movie, which in turn contains details about the title and duration of the movie.
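The rule that every opening tag needs a closing tag can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module; is_well_formed is a hypothetical helper name chosen for this example, and the three sample strings are variations on the catalog example above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a syntax error.

    is_well_formed is a hypothetical helper written only for this example.
    """
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Every element is opened and closed.
complete = "<catalog><movie><title>Jung</title><duration>3 hrs</duration></movie></catalog>"
# The title element is never closed, so the closing movie tag does not match.
unclosed = "<catalog><movie><title>Jung<duration>3 hrs</duration></movie></catalog>"
# An empty element written in self-closing form.
empty_element = "<catalog><movie/></catalog>"

for sample in (complete, unclosed, empty_element):
    print(is_well_formed(sample))
```

Only the sample with the missing closing tag is rejected; the self-closing empty element is accepted like any other element.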

XML Syntax

Check out the following example:

Line 1: <?xml version="1.0"?>
Line 2: <library>
Line 3: <book type="Operational Research" year="1992">
Line 4: <book1>Linear Programming</book1>
Line 5: <book2>Non Linear Programming</book2>
Line 6: <book3>Mathematical Programming</book3>
Line 7: </book>
Line 8: </library>

The first line is the XML declaration, which is written in the same syntax as a processing instruction. It states the XML version of the document. From Line 1 you can see that the example conforms to the 1.0 specification of XML:
<?xml version="1.0"?>

Line 2 defines the first element of the document, which is the root element:
<library>
The next lines define the child element of the root, i.e. book, which in turn has child elements (book1, book2, book3).
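The root, child and grandchild relationship described above can be listed by walking the parsed tree. This is a minimal sketch with Python's standard xml.etree.ElementTree module. It assumes the library example is written as well-formed XML with a book element on Line 3, and outline is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The library example, assumed here to be well-formed (book element on Line 3).
LIBRARY_XML = """<?xml version="1.0"?>
<library>
  <book type="Operational Research" year="1992">
    <book1>Linear Programming</book1>
    <book2>Non Linear Programming</book2>
    <book3>Mathematical Programming</book3>
  </book>
</library>"""


def outline(element, depth=0):
    """Return one (depth, tag, attributes) tuple per element, parents first.

    outline is a hypothetical helper written only for this example.
    """
    rows = [(depth, element.tag, dict(element.attrib))]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


root = ET.fromstring(LIBRARY_XML)
for depth, tag, attributes in outline(root):
    # Indent each tag by its depth so the hierarchy is visible.
    print("  " * depth + tag, attributes)
```

The root element library is at depth 0, book at depth 1 with its type and year attributes, and book1, book2 and book3 at depth 2.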

You can see that an XML document uses a self-describing syntax which is very simple to understand. Before you read further about XML scripting you need to be aware of the major components of an XML document. An XML document can be broadly divided into a set of components which describe its makeup. These components can be defined as follows:

1. Element Tag: An element is a logical piece of markup that is represented by a tag or a pair of tags in an XML document. E.g. in the quiz example above, ‘quiz’ is an element which appears as the <quiz> </quiz> tags in the document.

Note that an element needs to have both starting and ending tags, like <quiz>…</quiz> or <p>…</p>, or be a single empty tag like <img/>. In HTML an empty tag such as <br> does not need an end tag. In XML, however, you must close every tag.

2. Processing Instruction: Apart from markup and content you will find processing instructions in an XML document. A processing instruction can be understood as a special command passed along to the program which will process the document. A processing instruction is written as <?.....?>.
E.g. <?xml version="1.0"?>

This statement is the first statement of an XML document. Strictly speaking, the XML specification calls it the XML declaration rather than a processing instruction, but it uses the same syntax. It is similar to a tag: it includes a name and attribute/value pairs. It tells the processing program that the document adheres to the XML version 1.0 standard.

3. Comments in XML:

In an XML document comments can be written using the following syntax:

<!-- In this document you are learning about xml -->

Note: You can write comments in XML in the same way as you write them in HTML.

4. Document Type Declaration: It is used for describing the structure of an XML document. It identifies the external DTD (DTD stands for Document Type Definition) that defines the structure of the XML document. You need to put the Document Type Declaration at the top of the XML document, just below the <?xml version="1.0"?> statement. It performs the following basic tasks:

1. It identifies the root element of the document. An XML document has one root element, and all other elements are descendants of the root element.

2. It identifies the external DTD of the file. An XML file is created according to the document structure defined in the DTD.

E.g. Check out the XML below, which describes an audio/video collection:

<?xml version="1.0"?>
<!DOCTYPE entertainment SYSTEM "entertainment.dtd">
<entertainment>
  <Audio>
    <track1>Tara Rampam</track1>
    <track2>Let's go for party</track2>
  </Audio>
  <Video>
    <track1>Jumanji</track1>
    <track2>Home Alone</track2>
  </Video>
</entertainment>

In the above example the first line shows that this document should be processed according to the XML version 1.0 standard. The second line is the document type declaration, which states that the root element for this XML file is ‘entertainment’. It also identifies the external DTD, “entertainment.dtd”, against which the document needs to be verified. While processing this file, the processing program (for example a validating parser) needs to look for “entertainment.dtd” and then validate the document structure according to this file.
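The two pieces of information carried by the document type declaration, the root element name and the external DTD file, can be read back with Python's standard xml.dom.minidom module. This is only a sketch: describe_doctype is a hypothetical helper name, the sample is a shortened version of the example above, and minidom is a non-validating parser, so it reads the declaration but does not open entertainment.dtd or check the document against it.

```python
from xml.dom import minidom

# Shortened version of the entertainment example above.
ENTERTAINMENT_XML = """<?xml version="1.0"?>
<!DOCTYPE entertainment SYSTEM "entertainment.dtd">
<entertainment>
  <Audio><track1>Tara Rampam</track1></Audio>
  <Video><track1>Jumanji</track1></Video>
</entertainment>"""


def describe_doctype(xml_text):
    """Return (declared root name, DTD system identifier, actual root tag).

    describe_doctype is a hypothetical helper written only for this example.
    The parser does not validate; it only reports what the declaration says.
    """
    document = minidom.parseString(xml_text)
    doctype = document.doctype
    return doctype.name, doctype.systemId, document.documentElement.tagName


declared_root, dtd_file, actual_root = describe_doctype(ENTERTAINMENT_XML)
print(declared_root)  # root element named in the DOCTYPE
print(dtd_file)       # external DTD named after SYSTEM
print(actual_root)    # root element actually used in the document
```

Checking the document structure against the DTD itself would need a validating parser, which is outside this sketch.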

Just a quick recap

You should now be able to answer the following questions:

1. How does XML differ from HTML?
2. What is the advantage of XML?

Difference between HTML and XML

You need to understand that XML is not a replacement for HTML. XML and HTML have been designed for different goals, which can be summarized as follows:


a. XML is designed specifically for describing and structuring data, whereas HTML is used for formatting and displaying data.

b. XML is focused on defining data with its attributes. It basically tells what the data is all about. HTML is focused on the presentation of data and is used to customize how the data looks.

c. In HTML the tags to be used and the structure of the document are predefined. While using HTML you can only use tags which are defined in the HTML standards. In XML you can define your own tags and develop your own document structure.

d. An XML document is usually saved with the extension .xml, whereas an HTML document is usually saved as .html.

E.g. The following example is an e-mail from Shyam to Ram stored as XML

<email>
  <to>Ram</to>
  <from>Shyam</from>
  <subject>Hi how are you?</subject>
  <content>Let’s go for a New Year party</content>
</email>

In the above example an e-mail has been stored using XML markup. You can see that custom tags have been created to store the names of the sender and the receiver. Similarly, different tags have been created to store the subject and content of the e-mail.
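Because the tag names are your own, a program can create them as easily as it can read them. The following sketch builds the same e-mail with Python's standard xml.etree.ElementTree module. The function build_email and its parameter names are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def build_email(to, sender, subject, content):
    """Build an <email> element with one child element per field.

    build_email is a hypothetical helper; the tag names follow the example above.
    """
    email = ET.Element("email")
    fields = (("to", to), ("from", sender), ("subject", subject), ("content", content))
    for tag, text in fields:
        # SubElement creates the child and attaches it to the parent in one step.
        ET.SubElement(email, tag).text = text
    return email


email = build_email("Ram", "Shyam", "Hi how are you?", "Let's go for a New Year party")
# Serialize the element tree back to XML text.
print(ET.tostring(email, encoding="unicode"))
```

Nothing in the code knows about e-mail in advance: the custom tags exist only because the document defines them.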

CHARACTERISTICS OF XML

XML stands for ‘Extensible Markup Language’. It is a general-purpose specification which is commonly used for creating custom markup languages. It is an extensible language as it gives its users the ability to define their own elements. Thus it enables users to create custom tags that suit their requirements. XML has been developed primarily to help information systems share their structured data online. It can be used to encode documents as well as to serialize data so that it can be efficiently used. Some of the features of XML are summarized below:
1. XML can be understood as an extensible language which is freely available.
2. XML tags are user-made tags. They are not predefined. In HTML predefined tags are used (like <p>, <h1> etc.). While using XML users can define custom tags and develop a document structure as per their requirements.
3. XML is not a replacement for HTML. It is actually a complement to HTML. Both markup languages have their own purpose. As the web develops, XML is popularly used to describe and structure data, whereas HTML is used for formatting and displaying data.
4. XML is derived from SGML.


Let us define SGML


SGML: SGML stands for Standard Generalized Markup Language. SGML is an ISO standard that defines an extremely powerful markup language. It is popularly used in the publishing industry and in large manufacturing companies. It is a meta language used for creating other markup languages such as HTML. XML originated from it.




XML

XML can be understood as a markup language like the Hypertext Markup Language (HTML), which is commonly used for writing web pages. XML is specifically designed to describe data so that it can be effectively stored online. The web today contains a vast amount of information. XML enables structuring of data so that it can then be mined to get suitable information. Unlike HTML, XML has no predefined tags. XML can also be called a self-descriptive markup language, as users define their own tags.

For better understanding check out the example below:

Suppose you are storing information about a set of books. You may store the information in HTML as follows:

Book.html

<html>
  <head><title>Storing Information</title></head>
  <body>
    <p>Linear Programming by A.S. Bajaj</p>
    <p>Marketing Research by Kotler</p>
  </body>
</html>

Book.xml

<catalog>
  <book>
    <title>Linear Programming</title>
    <author>A.S. Bajaj</author>
  </book>
  <book>
    <title>Marketing Research</title>
    <author>Kotler</author>
  </book>
</catalog>

In the above example you can see that you can easily define data in an XML file. The file shows a catalog of books which contains the title and author of each book. You can see that the XML file is larger than the HTML file. You may feel that XML loses efficiency as a result of this increased size. However, XML makes up for this loss because a well-defined XML file is easier for a program to process. The way you interpret an HTML file depends on the predefined tags available in HTML. In contrast, the tags in an XML file are user defined and represent each piece of information in a hierarchical manner. Data of this kind, which describes other data, is also called metadata. It gives XML great strength, as it lets you create your own specifications and structure the data in the way you want it to be interpreted by any other system.
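As a rough illustration of why the labelled structure is easy to process, the sketch below pulls title and author pairs out of the catalog with Python's standard xml.etree.ElementTree module. The helper names list_books and titles_by are made up for this example. Getting the same pairs out of Book.html would mean splitting each paragraph on the word "by", which is an assumption about the text and not something the markup guarantees.

```python
import xml.etree.ElementTree as ET

# The Book.xml catalog from the example above.
BOOK_XML = """<catalog>
  <book><title>Linear Programming</title><author>A.S. Bajaj</author></book>
  <book><title>Marketing Research</title><author>Kotler</author></book>
</catalog>"""


def list_books(xml_text):
    """Return a list of (title, author) tuples, one per <book> element.

    list_books is a hypothetical helper written only for this example.
    """
    catalog = ET.fromstring(xml_text)
    return [(book.findtext("title"), book.findtext("author"))
            for book in catalog.findall("book")]


def titles_by(xml_text, author):
    """Return the titles whose <author> text equals the given name."""
    return [title for title, writer in list_books(xml_text) if writer == author]


for title, author in list_books(BOOK_XML):
    print(title, "by", author)
print(titles_by(BOOK_XML, "Kotler"))
```

The tags tell the program which piece of text is a title and which is an author, so no guessing about the wording of the content is needed.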

Introduction to XML

Dear Students


In this and subsequent posts we will read about XML. We will be covering the following topics in XML.



• List the applications and advantages of XML
• Create well-formed and valid XML documents.
• Make an XHTML document.
• Create an XML DTD
• Use XSL for transforming XML data and display it in a Web browser.
• Apply data binding and the Document Object Model for displaying dynamic XML data in a Web browser.