XML

One may ask what drawbacks people have found in HTML that forced them to seek new avenues. The following are some of the concerns raised by the Internet community.

1. HTML lacks syntactic checking. Browsers accept malformed markup without complaint, so in practice nothing enforces the correctness of the code.

2. HTML lacks structure. For example, HTML has an ordered set of heading tags (h1 to h6), but browsers do not care how the headings are nested; an h3 may directly follow an h1. That is, no hierarchical ordering is enforced for these tags, which makes the structure of a page unreliable.

3. HTML is not international. Early versions of HTML had no way to identify the language used in a document; the lang attribute was added only in HTML 4.

4. HTML is not object-oriented. The object-oriented approach offers many powerful tools for managing complex problems, and it is a pity that we cannot leverage them with HTML.

5. HTML lacks a robust linking mechanism. HTML’s links are strictly one-to-one, with the linking hard-coded in the source HTML files. If the location of one target file changes, a Webmaster may have to update dozens or even hundreds of other pages.

6. HTML is not reusable. HTML pages and fragments of HTML code can be extremely difficult to reuse because they are so specifically tailored to their place in the web of associated pages.

7. HTML is not extensible. Authors cannot define new tags of their own, so the language does not grow with the problem.

8. HTML tags are not content-aware. They describe how content should be presented, not what it means, so complex databases, mathematical equations, chemical formulae, etc. cannot be handled well.

XML.     XML stands for eXtensible Markup Language. It is hailed as the wonder child of the World Wide Web because of its ability to present a huge ocean of diverse knowledge in a simple, self-describing and self-validating structured form. It evolved from a more general markup language known as SGML (Standard Generalized Markup Language), which became an ISO standard (ISO 8879) in 1986. SGML is a very complex language used mainly by large government departments, companies and the military for storing and transferring large volumes of data in electronic form. Since SGML was too cumbersome for smaller establishments, XML was developed during 1996-97 to retain the salient features of SGML without its complexity.
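To make "self-describing" concrete, here is a minimal sketch of an XML document. Every element and attribute name in it (catalogue, book, price, currency) and all of the data are invented for illustration; only xml:lang is a standard XML attribute, used here to identify the language of each entry.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical vocabulary: the tags name what the data means,
     not how it should look on screen -->
<catalogue>
    <book xml:lang="en">
        <title>HTML programming</title>
        <price currency="INR">250</price>
    </book>
    <book xml:lang="fr">
        <title>Programmation HTML</title>
        <price currency="EUR">20</price>
    </book>
</catalogue>
```

A program reading this document can tell a title from a price without guessing from fonts or layout, which is the sense in which the tags are content-aware.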

Design Goals of XML:     The design goals for XML were set out as follows, and the language was developed along these lines:

1. XML shall be straightforwardly usable over the Internet.

2. XML shall support a wide variety of applications.

3. XML shall be compatible with SGML.

4. It shall be easy to write programs which process XML documents.

5. The number of optional features in XML is to be kept to the absolute minimum, ideally zero.

6. XML documents should be human-legible and reasonably clear.

7. The XML design should be prepared quickly.

8. The design of XML shall be formal and concise.

9. XML documents shall be easy to create.

10. Terseness in XML markup is of minimal importance.

Steps involved in creating XML:     Since XML is more complex than HTML, several steps are involved before we can create good XML.

1.     We must first learn about the basic building blocks of an XML document, such as elements and entities. Then we must learn how to build an XML document that is well-formed and also valid.

2.     In order to check the validity of an XML document, we must write another document called the DTD, which gives the grammar rules governing the construction and organization of elements and entities.

3.     XML documents by themselves carry no instructions for processing or display, so we have to write a supporting style sheet in either CSS or XSL, depending on the requirement, and check its correctness.

4.     The final step is to open the XML document in the browser window, which then shows the desired output. Please note that unless the XML document is accompanied by either a CSS or an XSL style sheet, nothing useful can be displayed: the browser shows only the XML document itself, and only if the document is well-formed. So if there is no accompanying CSS or XSL file, opening the document in the browser serves only as a check of the document's correctness.
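As a sketch of step 3, a style sheet is attached with an xml-stylesheet processing instruction placed before the root element. The file name booklist.css is a hypothetical example, and the CSS rules quoted in the comment are only one possible styling, not a required one.

```xml
<?xml version="1.0"?>
<!-- Attach a CSS style sheet; booklist.css is an assumed file name.
     It might contain rules such as:
         books  { display: block; margin-bottom: 1em; }
         title  { font-weight: bold; }
         author { font-style: italic; }
-->
<?xml-stylesheet type="text/css" href="booklist.css"?>
<booklist>
    <books>
        <code>isb5467</code>
        <title>HTML programming</title>
        <author>krish</author>
    </books>
</booklist>
```

For an XSL style sheet the same instruction is used with type="text/xsl" and the name of the XSL file in href.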

A simple XML Document:      Consider a simple XML document which gives information about two books, their titles, the authors’ names and their codes.

<?xml version="1.0"?>
<booklist>
    <books>
        <code>isb5467</code>
        <title>HTML programming</title>
        <author>krish</author>
    </books>
    <books>
        <code>idr5432</code>
        <title>DHTML programming</title>
        <author>karthik</author>
    </books>
</booklist>

Well-Formed XML Documents:     Before an XML document is used, we have to check that it follows the syntax rules, in other words that it is well-formed. An XML document made of elements, attributes and entities is said to be well-formed if:

1.  It contains one or more elements.

2.  It has just one element, called the root element, which contains all other elements.

3.  Its elements are properly nested inside each other.

4.  The names used in its element start tags and end tags match exactly.

5.  The names of attributes do not appear more than once in the same element start tag.

6.  The values of its attributes are enclosed in either single or double quotes.

7.  The values of the attributes do not reference external entities, either directly or indirectly.

8.  The replacement text for any entity referenced in an attribute value does not contain a < character (it can contain the string &lt;).

9.  Its entities are declared before they are used.

10.  None of its entity references contain the name of an unparsed entity.

11.  Its logical and physical structures are properly nested.
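Several of these rules can be seen at work in a short sketch. The attributes lang and binding, the publisher element and the entity pub with its replacement text are all invented for illustration; the rule numbers in the comments refer to the list above.

```xml
<?xml version="1.0"?>
<!-- The entity "pub" is declared here, before it is used (rule 9) -->
<!DOCTYPE booklist [
    <!ENTITY pub "Hypothetical Press">
]>
<!-- Exactly one root element contains everything else (rules 1 and 2) -->
<booklist>
    <!-- Attribute values are quoted and no name repeats in a tag (rules 5 and 6) -->
    <books lang="en" binding='paperback'>
        <code>isb5467</code>
        <!-- Start and end tag names match exactly, including case (rule 4) -->
        <title>HTML programming</title>
        <author>krish</author>
        <publisher>&pub;</publisher>
    </books>
</booklist>
<!-- Each of the following would NOT be well-formed:
     <title>HTML programming</Title>          end tag does not match
     <books><title>HTML</books></title>       elements overlap
     <books lang=en>                          attribute value not quoted
     <books lang="en" lang="fr">              attribute repeated
-->
```

Note that this document is well-formed but not yet valid, because its DTD declares no elements.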

 

Validating an XML Document using a DTD.     Even if a document is well-formed, it is not valid unless it passes a validating parser. For validation, an XML document must be accompanied by its DTD (Document Type Definition), against which the validating parser checks it. A DTD is something like the data declaration statements of a programming language: it declares what the document may contain.
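For illustration, a document usually names its DTD in a document type declaration placed before the root element. In this sketch the file name booklist.dtd is an assumption; the name that follows DOCTYPE must be the same as the name of the root element.

```xml
<?xml version="1.0"?>
<!-- booklist.dtd is a hypothetical file holding the grammar rules -->
<!DOCTYPE booklist SYSTEM "booklist.dtd">
<booklist>
    <books>
        <code>isb5467</code>
        <title>HTML programming</title>
        <author>krish</author>
    </books>
</booklist>
```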

Declarations in a DTD:     There are four kinds of declarations that one has to make about the XML in the DTD:

1.     Element declarations

2.     Attribute list declarations

3.     Entity declarations, and

4.     Notation declarations.
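The four kinds of declarations can be sketched together in one small document, with the DTD written inside the document type declaration rather than in a separate file. The publisher and cover elements, the lang and file attributes, the entities pub and cover1 and the file cover1.gif are all hypothetical additions made for this example.

```xml
<?xml version="1.0"?>
<!DOCTYPE booklist [
    <!-- 1. Element declarations: which elements exist and what they contain -->
    <!ELEMENT booklist (books+)>
    <!ELEMENT books (code, title, author, publisher?, cover?)>
    <!ELEMENT code (#PCDATA)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT author (#PCDATA)>
    <!ELEMENT publisher (#PCDATA)>
    <!ELEMENT cover EMPTY>

    <!-- 2. Attribute list declarations: attributes allowed on an element -->
    <!ATTLIST books lang CDATA "en">
    <!ATTLIST cover file ENTITY #REQUIRED>

    <!-- 3. Entity declaration: a named piece of replacement text -->
    <!ENTITY pub "Hypothetical Press">

    <!-- 4. Notation declaration: names a non-XML data format -->
    <!NOTATION gif SYSTEM "image/gif">
    <!-- An unparsed entity ties an external file to that notation -->
    <!ENTITY cover1 SYSTEM "cover1.gif" NDATA gif>
]>
<booklist>
    <books lang="en">
        <code>isb5467</code>
        <title>HTML programming</title>
        <author>krish</author>
        <publisher>&pub;</publisher>
        <cover file="cover1"/>
    </books>
    <books>
        <code>idr5432</code>
        <title>DHTML programming</title>
        <author>karthik</author>
    </books>
</booklist>
```

The second books element omits the optional publisher and cover elements and takes the default value "en" for its lang attribute.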

Extensible Style Sheet Language:     XML documents hold large volumes of structured data. In order to present the data in human-readable form on web pages, we need some sort of report generator program. Style sheet programs can be thought of as report generators for XML documents.

CSS and XSL Programs:     There are two major types of style sheet programs in use:

1.             Cascading Style Sheets (CSS)

2.             Extensible Stylesheet Language (XSL)

XSL.     We know that an XML source document is made up of a tree of nodes. An XSL program can select any input node by means of a selector, transform it as required and present the result as output nodes, again in a tree structure. The output tree can then be viewed in the browser, which presents it in human-readable form. The input, the output and the XSL program all have the same tree structure, and the tree structure of XSL programs makes validation easier.

An XSL program is a collection of rules. A rule consists of a pattern and a template. A template is something like a formula for a problem: a form without substance, a skeleton without flesh and blood. What XSL does is select the nodes to be transformed by matching the pattern of each rule against the input nodes, extract the data from each matched node and place it in the output as laid out by the rule's template. It then looks for the next matching node and repeats the process until all the input nodes have been handled. Thus both the input and the output are marked-up trees, of course with different tags and a different arrangement of the data. Finally we get the output tree of nodes, which gives the desired result when viewed in the browser.
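The idea of a rule as a pattern plus a template can be sketched as follows. This is an illustration only, written for an XSLT 1.0 processor and for a book list document like the one shown earlier; the choice of a ul/li list as output is an assumption made for the example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Rule 1: the pattern "/" matches the document root.
         Its template builds the outer list and hands the books nodes on. -->
    <xsl:template match="/">
        <ul>
            <xsl:apply-templates select="booklist/books"/>
        </ul>
    </xsl:template>

    <!-- Rule 2: the pattern "books" matches every books element.
         Its template is the skeleton; value-of fills in the data. -->
    <xsl:template match="books">
        <li>
            <xsl:value-of select="title"/>
            <xsl:text> by </xsl:text>
            <xsl:value-of select="author"/>
        </li>
    </xsl:template>

</xsl:stylesheet>
```

Applied to the two-book list, the second rule fires once per books element, so the output tree holds a list with two items, "HTML programming by krish" and "DHTML programming by karthik".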

Example:     The XML document below holds three short pieces of text, and the XSL program after it displays each of them in a different colour and style.

XML:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="id2.xsl"?>
<xsltest>
    <bold>Hello, world.</bold>
    <red>I am </red>
    <italic>fine.</italic>
</xsltest>

 XSL:

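<?xml version="1.0"?>
<!-- XSLT 1.0 namespace; the version attribute is required -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- A single rule: the pattern "/" matches the root of the input document -->
<xsl:template match="/">
<html><body>
<!-- Each value-of copies the text of the selected element into the output -->
<p style="color:green"><b><xsl:value-of select="//bold"/></b></p>
<p style="color:red"><b><xsl:value-of select="//red"/></b></p>
<p style="color:blue"><b><i><xsl:value-of select="//italic"/></i></b></p>
</body></html>
</xsl:template>
</xsl:stylesheet>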

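Output:     Three lines of text: "Hello, world." in bold green, "I am" in bold red, and "fine." in bold blue italics.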

 

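Another Example:     The XML document below stores information about two books, and the XSL program after it displays them in table form.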

XML:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="booklist.xsl"?>
<booklist>
    <books>
        <code>isb5467</code>
        <title>HTML programming</title>
        <author>krish</author>
    </books>
    <books>
        <code>idr5432</code>
        <title>DHTML programming</title>
        <author>karthik</author>
    </books>
</booklist>

 XSL:

<?xml version="1.0"?>
<!-- XSLT 1.0 namespace; the version attribute is required -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html><head><title>using xsl stylesheet</title></head>
<body>
<p>my personal library books are</p>
<table border="1">
<tr><th>code</th><th>title</th><th>author</th></tr>
<!-- One table row is produced for every books element in the input -->
<xsl:for-each select="//books">
<tr><td><xsl:value-of select="code"/></td>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="author"/></td></tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
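For reference, applying the style sheet above to the book list should produce a result tree along the following lines. This is a hand-derived sketch, not captured processor output: indentation has been added for readability, and a real processor may add details of its own, such as a META tag in the head.

```xml
<html>
    <head><title>using xsl stylesheet</title></head>
    <body>
        <p>my personal library books are</p>
        <table border="1">
            <tr><th>code</th><th>title</th><th>author</th></tr>
            <tr><td>isb5467</td><td>HTML programming</td><td>krish</td></tr>
            <tr><td>idr5432</td><td>DHTML programming</td><td>karthik</td></tr>
        </table>
    </body>
</html>
```

The header row comes straight from the template, while the two data rows come from the for-each loop, one per books element.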