HTML History
CS 4244 - Marc Abrams
Reference
SGML
First there was SGML (ISO 8879:1986): Standard Generalized Markup Language.
- Markup = insert tags into content (vs. word processing)
- Device-independent
- Well documented
- Specification in public domain
- SGML spec was not designed for easy implementation, hence not widely adopted
HTML
Then there was HTML: Hypertext Markup Language
- Today, HTML is defined by an SGML Document Type Definition (DTD). That's why you see HTML docs
starting with a document type declaration like...
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
- First HTML spec in summer 1991 from Tim Berners-Lee at CERN
- Here is the very first HTML description!
- Then came additions, debates, tests
- HTML 1.0 finalized in March 1993
- Describes "content, not format", just like SGML
- Very simple language
- NCSA produced Mosaic Web browser in November 1993, implementing HTML 1.0
- WWW Consortium wanted to add forms, tables, math tags, and more to HTML 2.
- There were only a few hundred Web sites at the time.
- HTML+ stayed in draft form for two years.
- Then HTML 3.0 was proposed, which simplified HTML+.
- There were many arguments, and HTML 3.0 was never standardized.
Netscape & Microsoft Extensions
- Meanwhile Netscape and Microsoft implemented some HTML 3.0 features
- They also responded to marketplace demands to allow authors to precisely format their pages
- The added formatting information violated the SGML rule of "content, not format"; for example, Netscape's <FONT> and <CENTER> tags describe appearance rather than content.
- Finally there was a new standard, which incorporated Netscape and Microsoft extensions: HTML 3.2.
- Dynamic HTML: two incompatible extensions to HTML by Netscape and Microsoft
- Allows scripting language to modify HTML (Microsoft treats HTML tags as objects)
- Allows positioning
- Uses style sheets (see CSS1 below).
- We'll discuss this later in detail.
HTML 4.0
- HTML 4.0 spec finished in Dec 1997
- Attempt to create standard out of the fragmented dynamic HTML world.
- What 4.0 adds to 3.2:
- Style sheets
- Internationalization support (right to left text, LANG attribute to
specify display appropriate for certain language)
- Accessibility for the visually impaired, etc.
- Richer tables and forms
- Ways to insert scripts and OBJECT tag for multimedia
- Example of how much a standard lags reality: JavaScript was never
legal in HTML until 4.0!
- Three flavors, indicated by line at beginning of document, so browser or
validator knows what to parse:
- Transitional - sprinkle a few 4.0 features in older html docs
- Strict - adheres to HTML 4.0 spec
- Frameset - document uses HTML frames
- HTML 4.01 spec finished in Dec 1999
- Revision of HTML 4.0 to prepare for XHTML
- Minor errors in HTML 4.0 fixed
- XHTML relies on HTML 4.01 spec for meaning of tags
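The "line at beginning of document" is the document type declaration. As a rough sketch of how a validator might use it, the snippet below maps the public identifiers published in the HTML 4.01 spec to the three flavors. The names FLAVORS, DoctypeSniffer, and html_flavor are made up for this illustration, and a real validator does far more than this lookup.

```python
from html.parser import HTMLParser

# Public identifiers from the HTML 4.01 spec, mapped to the flavor names above.
FLAVORS = {
    "-//W3C//DTD HTML 4.01//EN": "Strict",
    "-//W3C//DTD HTML 4.01 Transitional//EN": "Transitional",
    "-//W3C//DTD HTML 4.01 Frameset//EN": "Frameset",
}


class DoctypeSniffer(HTMLParser):
    """Records the first <!DOCTYPE ...> declaration seen in a document."""

    def __init__(self):
        super().__init__()
        self.doctype = None

    def handle_decl(self, decl):
        # decl is the text between "<!" and ">", e.g. 'DOCTYPE HTML PUBLIC "..."'.
        if self.doctype is None and decl.upper().startswith("DOCTYPE"):
            self.doctype = decl


def html_flavor(document):
    """Return "Strict", "Transitional", "Frameset", or None if not recognized."""
    sniffer = DoctypeSniffer()
    sniffer.feed(document)
    if sniffer.doctype is None:
        return None
    for public_id, flavor in FLAVORS.items():
        # Match the quoted identifier so Strict is not confused with the others.
        if '"' + public_id + '"' in sniffer.doctype:
            return flavor
    return None


page = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><p>Hello</p>'
print(html_flavor(page))
```

A document with an HTML 3.2 declaration, or with none at all, falls through to None here.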
Style Sheets
- CSS = Cascading Style Sheets
- CSS allows incorporation of page style (e.g., fonts) and layout (e.g., positioning) info
- We'll discuss this later.
CSS2 supports:
- media-specific style sheets (e.g. printers, braille devices, aural devices)
- downloadable fonts
- element positioning
- table layout
- internationalization
CSS3: improvements in...
- User interface enhancements: more control over look of form elements, new
cursors and colors to blend with user's desktop environment
- Enhancements to border properties -- rounded corners, shadows, complex borders
made up of tiled images
- Description of "box model" for the normal flow layout of text
on a Web page. Flow accounts for vertically (Chinese, Japanese) as well as horizontally
oriented languages
- Color models
- Support for non-Western text and "ruby" (small annotations on
top of words in Asian languages to give pronunciation or meaning)
- "Paged media properties" -- running headers, footers, page numbers
- Description of background images and colors, with ability to stretch backgrounds
HTML had inherent problems:
- HTML tags are for presentation only.
- Is <b>Apple</b> a fruit, a computer company, a record company, ...?
- Better: <fruit>Apple</fruit>, with style sheet mapping <fruit> to <b>
- Tags give semantic meaning for programs that process the documents (e.g.,
<company>Microsoft</company><price>106</price>)
- HTML has fixed tag set (endless debate to standardize on new tags)
- Web-based apps wound up using CGI-bin scripts, slowing Web servers; could
Web documents help more?
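To make the "semantic meaning for programs" point concrete, here is a small sketch using Python's standard xml.etree.ElementTree. The <quote> wrapper element is an assumption added only so that the fragment has a single root; it is not part of the example above.

```python
import xml.etree.ElementTree as ET

# The <company>/<price> tags come from the example above; <quote> is a
# hypothetical wrapper so the fragment has one root element.
record = "<quote><company>Microsoft</company><price>106</price></quote>"

root = ET.fromstring(record)
company = root.findtext("company")
price = float(root.findtext("price"))
print(company, price)

# The same kind of data marked up only for presentation tells a program
# nothing about what the text means.
bold = ET.fromstring("<b>Apple</b>")
print(bold.tag, bold.text)
```

With semantic tags the program can ask for the price by name. With <b> it only learns that some text is bold.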
XML's birth
- July 1996: Jon Bosak at Sun and small group of SGML experts proposed new W3C working group to devise modified SGML
- Three phases:
- XML: the syntax itself
- XLL (Extensible Link Language): the linking semantics of XML
- XSL (Extensible Stylesheet Language): the presentation of XML
- August 1996
- SGML experts met, led by Jon Bosak at Sun
- Wanted to know how to reinject SGML into Web
- Especially SGML's extensible tags
- XML 1.0 standard was published February 10, 1998
- XML requires you to:
- make tags case-sensitive
- include end tags, e.g. </p> and </li>
- add a / to empty tags, e.g. <br /> and <hr />
- quote all attribute values, e.g. <img src="karen.jpg" />
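The four requirements above can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree. The helper name is_well_formed and the sample fragments are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# One fragment that follows all four rules, then one violation per rule.
samples = {
    "all four rules followed": '<p>Line one<br /><img src="karen.jpg" /></p>',
    "tag case mismatch": "<P>text</p>",
    "missing end tag": "<ul><li>item</ul>",
    "empty tag without slash": "<p>Line one<br></p>",
    "unquoted attribute value": "<img src=karen.jpg />",
}

for label, markup in samples.items():
    print(label, "->", is_well_formed(markup))
```

An HTML browser of the time would display all five fragments. An XML parser rejects the last four.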
- XML is a meta-language, not a language.
- Can replace RTF, PDF, various proprietary doc formats (e.g., MS Word)
- Many companies are developing XML-compliant products
- Internet Explorer 5 has extensive XML support
- IBM's www.alphaworks.ibm.com has XML parser, other tools
- Commercial tools to edit XML, render XML, use XML for database queries,
etc.
- Industry-specific groups now formalize DTDs to standardize tag sets (e.g.,
for NASA's space instruments, telephone switches, user interfaces, etc.)
- Future of XML: multiple languages:
- XML Base
- XML Inclusions (XInclude)
- XML Fragment Interchange
- XML Query (query facilities to access XML docs like databases)
- XML Schemas (vocabularies for XML languages)
- XPath (to address a part of an XML document)
- XPointer (used with XLink, a more sophisticated form of hyperlink)
- XSL: alternative to CSS
- Use HTML-like syntax for style sheets (actually syntax is XML-compliant)
- Permits algorithms to be embedded in style sheets, making them very powerful
- Often used to transform XML to other XML-compatible languages (e.g., XHTML)
- Three languages:
- XSL Transformations (XSLT): for transforming XML to XML
- XML Path Language (XPath): expression language used by XSLT to refer
to a part of an XML doc
- XSL Formatting Objects: something like the properties from CSS to format
XML docs
(See also this link.)
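To give a feel for how XPath refers to a part of an XML doc, here is a sketch using Python's xml.etree.ElementTree, which implements only a limited subset of XPath and is not an XSLT processor. The catalog document is invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for this illustration.
catalog = ET.fromstring("""
<catalog>
  <item kind="fruit"><name>Apple</name><price>1</price></item>
  <item kind="company"><name>Apple</name><price>106</price></item>
  <item kind="fruit"><name>Pear</name><price>2</price></item>
</catalog>
""")

# Path expression: the <name> child of every <item> whose kind attribute is "fruit".
fruit_names = [e.text for e in catalog.findall("./item[@kind='fruit']/name")]

# Path expression: the <price> of the item whose kind attribute is "company".
company_price = catalog.findtext("./item[@kind='company']/price")

print(fruit_names, company_price)
```

In an XSLT style sheet, expressions of this kind select the parts of the source document that a template transforms.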
- Problems with HTML:
- Syntax is not XML-compliant, so HTML won't work with new generation of Web tools
- HTML has many complex tags; don't want to implement all of them on portable computing devices, cars, kiosks, ...
- HTML needs more specialized tags for specific devices, like televisions
- Want to improve the Web's incredibly boring forms
- XHTML characteristics...
- modular: defines different groups of tags, and a device can say what group(s) it implements
Here is a link to the current list of modules!
- interoperable: permits embedding of markup tags specific to vector graphics, multimedia, math, ecommerce, etc.
- XHTML 1.0 docs can still be read by HTML browsers, if a few conventions are followed
- Utility exists to convert HTML to XHTML
- Extensions to HTML frames underway for richer user interface
- Transformation used to go from XHTML doc to end device... (diagram is from here)
- XHTML Basic - for simple clients (PDAs, mobile phones, ...)
- XHTML 1.1 - requires new user agents (vs. XHTML 1.0 being compatible with
existing browsers)
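As a rough idea of what an HTML-to-XHTML conversion involves, the sketch below rewrites a fragment with Python's standard html.parser: tag and attribute names become lowercase, attribute values are quoted, and empty elements get a trailing " />". It is not the utility mentioned above, and it is only a sketch: it does not insert missing end tags or handle script and style content. The names XhtmlWriter and to_xhtml_fragment are hypothetical.

```python
from html import escape
from html.parser import HTMLParser

# Elements that HTML 4 declares as empty; XHTML writes them as <br /> etc.
EMPTY_ELEMENTS = {
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
}


class XhtmlWriter(HTMLParser):
    """Re-emits parsed HTML using XHTML syntax conventions."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def _open(self, tag, attrs, empty):
        # HTMLParser already lowercases tag and attribute names.
        parts = [tag]
        for name, value in attrs:
            # A valueless HTML attribute (e.g. CHECKED) is given its own name as value.
            value = name if value is None else value
            parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + (" />" if empty else ">"))

    def handle_starttag(self, tag, attrs):
        self._open(tag, attrs, tag in EMPTY_ELEMENTS)

    def handle_startendtag(self, tag, attrs):
        self._open(tag, attrs, True)

    def handle_endtag(self, tag):
        # Empty elements were already closed when they were opened.
        if tag not in EMPTY_ELEMENTS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def to_xhtml_fragment(html_text):
    writer = XhtmlWriter()
    writer.feed(html_text)
    writer.close()
    return "".join(writer.out)


print(to_xhtml_fragment("<P ALIGN=center>Hi<BR><IMG SRC=karen.jpg></P>"))
```

The space before "/>" is one of the conventions that let older HTML browsers read the result.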
- SMIL = Synchronized Multimedia Integration Language
- Enables authoring of TV-like multimedia presentations
- SMIL presentation can combine streaming audio, video, images, text, and other media
- Three-step process to create SMIL docs:
- Create areas for your media
- Fill in areas with media objects
- Determine sequence in which to play them (in parallel or sequentially)
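The three steps can be sketched by building a small SMIL document with Python's xml.etree.ElementTree. The element and attribute names follow SMIL 1.0. The region names, sizes, and media file names are invented for this illustration.

```python
import xml.etree.ElementTree as ET

smil = ET.Element("smil")

# Step 1: create areas (regions) for the media.
head = ET.SubElement(smil, "head")
layout = ET.SubElement(head, "layout")
ET.SubElement(layout, "root-layout", width="320", height="300")
ET.SubElement(layout, "region", id="video_area",
              top="0", left="0", width="320", height="240")
ET.SubElement(layout, "region", id="caption_area",
              top="240", left="0", width="320", height="60")

# Steps 2 and 3: fill the areas with media objects and say how they play.
# <par> plays its children in parallel; <seq> plays them one after another.
body = ET.SubElement(smil, "body")
par = ET.SubElement(body, "par")
ET.SubElement(par, "video", src="talk.mpg", region="video_area")
seq = ET.SubElement(par, "seq")
ET.SubElement(seq, "text", src="caption1.txt", region="caption_area", dur="5s")
ET.SubElement(seq, "text", src="caption2.txt", region="caption_area", dur="5s")

document = ET.tostring(smil, encoding="unicode")
print(document)
```

Here the video plays in parallel with a sequence of two captions, each shown for five seconds.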
Other Markup Languages
- SVG: Scalable Vector Graphics
- MathML
- User interface languages: VoiceXML, Wireless Markup Language, User Interface Markup Language
- Many more coming...
Proprietary formats...
Portable Document Format (PDF): Adobe's replacement for PostScript, which permits total control over page layout.
Macromedia Flash and Shockwave: Allow multimedia and vector graphics