We come across many documents in our day-to-day work. In fact, documents have existed since the beginning of civilization. It has therefore always been important to manage them so that their intended meaning is clear. Earlier, documents recorded with pen and paper were formatted by human typesetters who followed predefined markup instructions. For example, to provide extra information or to highlight the importance of some text in a document, typesetters used different colors or quotation marks. This process of adding quotation marks or color to a document, to give it special meaning, is referred to as markup.
Definition of Markup Language.
To ensure that the user understands the markups, we need to follow a set of rules. A markup language defines this set of rules, and helps add meaning to the content and structure of documents.
Markup can be classified as follows:
• Stylistic Markup
• Structural Markup
• Semantic Markup
1] Stylistic Markup
The presentation of a document is determined by this markup. In a word processor, applying stylistic markup would involve making the text bold or italic, or changing the font. In HTML, the bold and italic tags, <b> and <i>, are two of the many tags that provide stylistic markup.
2] Structural Markup
The structure of a document is determined by this kind of markup. It determines, for example, which text is a heading and which is a paragraph. In HTML, the heading and paragraph tags, <h1> and <p>, are two of the many tags that help to structure the document.
3] Semantic Markup
The content of the document is described by this kind of markup. In HTML, tags such as <title>, which names what the document is about, can be considered examples of semantic markup.
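The three categories can be illustrated with a short sketch. The Python code below uses the standard library's html.parser module to walk a small HTML fragment and label each tag as stylistic, structural or semantic. The fragment, the CATEGORY table and the MarkupClassifier class are assumptions made for illustration; HTML itself defines no such table, and real tags do not always fall neatly into one category.

```python
from html.parser import HTMLParser

# Hypothetical mapping from tag name to markup category. This table is an
# assumption made for illustration; it is not part of any HTML standard.
CATEGORY = {
    "b": "stylistic",
    "i": "stylistic",
    "h1": "structural",
    "p": "structural",
    "title": "semantic",
}


class MarkupClassifier(HTMLParser):
    """Collects a (tag, category) pair for every start tag encountered."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        self.found.append((tag, CATEGORY.get(tag, "unclassified")))


def classify(html_text):
    """Return the (tag, category) pairs for html_text, in document order."""
    parser = MarkupClassifier()
    parser.feed(html_text)
    parser.close()
    return parser.found


# Illustrative fragment that mixes all three kinds of markup.
SAMPLE = (
    "<title>Report</title>"
    "<h1>Summary</h1>"
    "<p>Sales were <b>strong</b> and <i>steady</i>.</p>"
)

if __name__ == "__main__":
    for tag, category in classify(SAMPLE):
        print(f"<{tag}>: {category}")
```

Tags missing from the table are reported as "unclassified" rather than guessed, which keeps the sketch honest about how rough this classification is.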
In the late 1960s, three researchers at IBM began working on the problem of handling documents created on disparate systems that used proprietary formats. Their research showed that three primary requirements were essential for an interoperable system:
* There should be a common document format supported by the document processing programs.
* The common format should be specific to the domain of the documents, for example, one domain for legal documents and another for medical documents.
* There should be specific rules for the format of the document.
The system of formatting documents was named Generalized Markup Language (GML). Over the years, GML was refined and standardized as the Standard Generalized Markup Language (SGML), published as ISO 8879 in 1986.
SGML is, therefore, considered to be the mother of all markup languages. SGML is a powerful language, but many of its features are rarely used because of its complexity. SGML describes markup languages, allowing authors to create their own tags that relate to the content. To be interpreted, an SGML document needs a file that contains all the rules for the language, known as a document type definition (DTD). That file needs to be sent along with the SGML document so that the tags can be interpreted. Markup languages derived from SGML are called SGML applications.
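The idea that a document's tags can be interpreted only together with a separate set of rules can be sketched in a few lines of Python. The sketch is not SGML and does not read a real rules file: the RULES table, the memo vocabulary and the check function are all hypothetical, and the standard library's xml.etree.ElementTree module is used only as a convenient parser for the well-formed samples.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule set standing in for the rules file: each element name
# maps to the set of child elements it is allowed to contain.
RULES = {
    "memo": {"to", "from", "body"},
    "to": set(),
    "from": set(),
    "body": {"para"},
    "para": set(),
}


def check(element, rules):
    """Return a list of rule violations in element and its descendants."""
    if element.tag not in rules:
        return [f"unknown element <{element.tag}>"]
    problems = []
    for child in element:
        if child.tag not in rules[element.tag]:
            problems.append(
                f"<{child.tag}> not allowed inside <{element.tag}>"
            )
        else:
            # The child is permitted here, so check its own content too.
            problems.extend(check(child, rules))
    return problems


# One sample that follows the rules and one that does not.
GOOD = "<memo><to>Ann</to><from>Raj</from><body><para>Hello</para></body></memo>"
BAD = "<memo><to>Ann</to><para>Hello</para></memo>"

if __name__ == "__main__":
    for label, text in (("GOOD", GOOD), ("BAD", BAD)):
        print(label, check(ET.fromstring(text), RULES))
```

Without the RULES table, the checker has no way to tell whether a para element directly inside memo is legitimate, which is why the rules must travel with the document.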
HTML is the most famous markup language derived from SGML. It has been extensively used on the web and is popular because of its simplicity.
HTML was created to mark up technical papers so that they could be transferred across different platforms and accessed by the scientific community. As the Internet became popular, non-scientific users became concerned about the presentation of their documents. Browser manufacturers started offering additional tags that allowed authors to display documents with more creativity.
The rapid growth in the number of tags created new problems. The same tag was implemented differently in different browsers. As a result, even today, we can find web sites that tell the user which browser the site is best viewed in.