Blog

May 2010

Introducing HTML5

HTML5 is a specification being formalized by the Web Hypertext Application Technology Working Group (WHATWG). It defines a concrete language syntax, and accompanying APIs, for describing documents and applications. The WHATWG specification incorporates the existing features of both HTML 4.01 and XHTML 1.0 and introduces new items, including:

  • new layout elements
  • programming changes to the Document Object Model (DOM)
  • updated Web Forms
  • server-sent DOM events
  • dynamic graphics capabilities

WHATWG is a growing community of browser vendors, web developers, and other parties interested in the design and implementation of the next generation of HTML and related technologies. WHATWG’s primary concern is to enable authors to write and deploy applications over the World Wide Web.

WHATWG seeks to invigorate web application development by extending HTML so that it is suitable for expressing the semantics of now-ubiquitous applications such as forum sites, auction sites, search engines, and online shopping sites. WHATWG plans to facilitate such development in two ways: by defining an abstract language for describing documents and applications, and by defining APIs for interacting with in-memory representations of instances of the abstract language.

The Need for HTML5
Apple, Mozilla, and Opera became increasingly concerned about the W3C’s direction (or lack of direction) with XHTML, its lack of interest in HTML, and its perceived disregard for the needs of web application developers. In response, these organizations took it upon themselves to address these concerns.

Markup for documents on the World Wide Web has always been some incarnation of HTML. Although it was originally designed as a language for semantically describing scientific documents, HTML was adopted for general use and was rapidly extended during the 1990s. It’s now used to describe most documents transmitted across the web.

HTML worked well for publishing static web pages. But many modern web documents aren’t individual static pages at all; instead, they’re partial pages or one page among many that, collectively, compose a web application. The current HTML specification does not adequately address web applications (session-oriented conversations between web clients and web server components). The WHATWG specification is an attempt to correct this situation and, at the same time, update the HTML specifications to address other issues that have annoyed web developers over the last few years.

Currently, Apple, Mozilla, and Opera are the only browser vendors supporting the development of HTML5. However, HTML5 is being developed with “IE compatibility in mind.”

Comet – A new approach to Ajax

In web development, Comet is a neologism to describe a web application model in which a long-held HTTP request allows a web server to push data to a browser, without the browser explicitly requesting it. Comet is an umbrella term for multiple techniques for achieving this interaction. All these methods rely on features included by default in browsers, such as JavaScript, rather than on non-default plugins.

In theory, the Comet approach differs from the original model of the web, in which a browser requests a complete web page or chunks of data to update a web page. In practice, however, Comet applications typically use Ajax with long polling to detect new information on the server.

Streaming

In an application using streaming Comet, the browser opens a single persistent connection to the server for all Comet events, which are handled incrementally on the browser side. Each time the server sends a new event, the browser interprets it, but neither side closes the connection. Specific techniques for accomplishing streaming Comet include the following.

Hidden IFrame

A basic technique for dynamic web applications is to use a hidden IFrame HTML element (an inline frame, which allows a website to embed one HTML document inside another). This invisible IFrame is sent as a chunked block, which implicitly declares it as infinitely long (sometimes called a “forever frame”). As events occur, the IFrame is gradually filled with script tags containing JavaScript to be executed in the browser. Because browsers render HTML pages incrementally, each script tag is executed as it is received.
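As a rough server-side sketch of the forever-frame idea, each event can be serialized as one script element and appended to the open IFrame response. The handler name onCometEvent and the function foreverFrameChunk are hypothetical, chosen only for illustration; how the chunk is written to the connection depends on the server.

```typescript
// Hypothetical name of a function the parent page exposes to receive events.
const PARENT_HANDLER = "onCometEvent";

// Serialize one event as a <script> element to append to the open IFrame response.
function foreverFrameChunk(event: Record<string, unknown>): string {
  // Escape "<" so a payload containing "</script>" cannot close the element early.
  const json = JSON.stringify(event).replace(/</g, "\\u003c");
  return `<script>parent.${PARENT_HANDLER}(${json});</script>\n`;
}

// Example: one chunk the server would write when a chat message arrives.
const exampleChunk = foreverFrameChunk({ type: "message", text: "hello" });
```

Because the browser executes each script element as it arrives, every chunk results in one call to the parent page's handler.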

One benefit of the IFrame method is that it works in every common browser. Two downsides of this technique are the lack of a reliable error handling method, and the impossibility of tracking the state of the request calling process.

XMLHttpRequest

The XMLHttpRequest (XHR) object, the main tool used by Ajax applications for browser–server communication, can also be pressed into service for server–browser Comet messaging, in a few different ways.

In 1995, Netscape Navigator added a feature called “server push”, which allowed servers to send new versions of an image or HTML page to that browser, as part of a multipart HTTP response (see History section, below), using the content type multipart/x-mixed-replace. Since 2004, Gecko-based browsers such as Firefox have accepted multipart responses to XHR, which can therefore be used as a streaming Comet transport.[9] On the server side, each message is encoded as a separate portion of the multipart response, and on the client, the callback function assigned to the XHR onreadystatechange property is called as each message arrives. This functionality is only included in Gecko-based browsers, though there is discussion of adding it to WebKit.
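To make the server-side encoding concrete, here is a minimal sketch of how one message might be framed as a part of a multipart/x-mixed-replace response. The boundary value, the JSON content type, and the function names are assumptions made for illustration.

```typescript
// Illustrative boundary; it must not occur inside any message body.
const BOUNDARY = "comet-boundary";

// Value of the Content-Type response header, sent once when the stream opens.
function multipartContentType(boundary: string): string {
  return `multipart/x-mixed-replace; boundary=${boundary}`;
}

// Encode one message as a single part of the multipart response body.
function multipartPart(boundary: string, json: string): string {
  return `--${boundary}\r\nContent-Type: application/json\r\n\r\n${json}\r\n`;
}

// Example: the header value and the first part written to the connection.
const exampleHeader = multipartContentType(BOUNDARY);
const examplePart = multipartPart(BOUNDARY, JSON.stringify({ n: 1 }));
```

The server would write one such part per event and leave the connection open between parts.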

Instead of creating a multipart response, and depending on the browser to transparently parse each event, it is also possible to generate a custom data format for an XHR response, and parse out each event using browser-side JavaScript, relying only on the browser firing the onreadystatechange callback each time it receives new data.
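One way to picture the custom-format approach is a small parser that remembers how much of the growing response text it has already consumed. The sketch below assumes a newline-delimited format, which is only one possible choice; the class name StreamParser is hypothetical.

```typescript
// Tracks how much of the accumulated response text has been consumed so far.
class StreamParser {
  private offset = 0;

  // Pass the full response text received so far (for example xhr.responseText).
  // Returns only the complete events that arrived since the previous call.
  read(responseText: string): string[] {
    const events: string[] = [];
    let end = responseText.indexOf("\n", this.offset);
    while (end !== -1) {
      const line = responseText.slice(this.offset, end);
      if (line.length > 0) {
        events.push(line);
      }
      this.offset = end + 1;
      end = responseText.indexOf("\n", this.offset);
    }
    // Anything after the last newline is an incomplete event; leave it for later.
    return events;
  }
}
```

In a browser, read would typically be called from the onreadystatechange callback whenever readyState is 3 (loading) or 4 (done), with each returned line then decoded by the application.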

Ajax with long polling

None of the above streaming transports works across all modern browsers without causing negative side effects in at least one of them. Comet developers are therefore forced to implement several complex streaming transports and switch between them depending on the browser. Consequently, many Comet applications instead opt for long polling, which is easier to implement on the browser side and works, at minimum, in every browser that supports XHR. As the name suggests, long polling requires the client to poll the server for an event (or set of events). The browser makes an Ajax-style request to the server, which holds the request open until it has new data and then sends that data to the browser in a complete response. The browser then initiates a new long polling request to obtain subsequent events.

Specific technologies for accomplishing long-polling include the following.

XMLHttpRequest long polling

For the most part, XMLHttpRequest long polling works like any standard use of XHR. The browser makes an asynchronous request to the server, which may wait for data to be available before responding. The response can contain encoded data (typically XML or JSON) or JavaScript to be executed by the client. Once it has finished processing the response, the browser creates and sends another XHR to await the next event. Thus the browser always keeps a request outstanding with the server, to be answered as each event occurs.
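The request loop described above might be sketched as follows. To keep the example independent of any browser API, the actual request is delegated to a transport function; the Transport type and startLongPolling are hypothetical names, and error handling and retry delays are left out.

```typescript
// A transport issues one request and later calls back once with the response body.
// In a browser it would wrap an XMLHttpRequest.
type Transport = (url: string, onResponse: (body: string) => void) => void;

// Keep one request outstanding: handle each response, then poll again right away.
// Returns a function that stops the loop once the in-flight request completes.
function startLongPolling(
  url: string,
  transport: Transport,
  onEvent: (body: string) => void
): () => void {
  let running = true;
  const poll = (): void => {
    transport(url, (body) => {
      if (!running) {
        return;
      }
      onEvent(body);
      poll(); // immediately wait for the next event
    });
  };
  poll();
  return () => {
    running = false;
  };
}
```

In a browser, the transport would create an XMLHttpRequest, call open and send, and invoke onResponse with responseText from the onreadystatechange callback once readyState reaches 4.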

Script tag long polling

While any Comet transport can be made to work across subdomains, none of the above transports can be used across different second-level domains (SLDs), due to browser security policies designed to prevent cross-site scripting attacks.[11] That is, if the main web page is served from one SLD, and the Comet server is located at another SLD, Comet events cannot be used to modify the HTML and DOM of the main page, using those transports. This problem can be side-stepped by creating a proxy server in front of one or both sources, making them appear to originate from the same domain. However, this is often undesirable for complexity or performance reasons.

Unlike IFrames or XMLHttpRequest objects, script tags can be pointed at any URI, and JavaScript code in the response will be executed in the current HTML document. This creates a potential security risk for both servers involved, though the risk to the data provider (in our case, the Comet server) can be avoided using JSONP.

A long-polling Comet transport can be created by dynamically creating script elements, and setting their source to the location of the Comet server, which then sends back JavaScript (or JSONP) with some event as its payload. Each time the script request is completed, the browser opens a new one, just as in the XHR long polling case. This method has the advantage of being cross-browser while still allowing cross-domain implementations.
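A minimal sketch of the two halves of a JSONP exchange is shown below: the URL the browser would assign to a dynamically created script element, and the body the Comet server would send back. The callback query parameter is a common convention assumed here, and the function names are hypothetical.

```typescript
// Callback names are limited to plain identifiers so that a request cannot
// inject arbitrary script through the callback parameter.
const CALLBACK_NAME = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

// Browser side: the URL to use as the src of a new script element.
function jsonpUrl(base: string, callback: string): string {
  const sep = base.indexOf("?") === -1 ? "?" : "&";
  return `${base}${sep}callback=${encodeURIComponent(callback)}`;
}

// Server side: wrap the event payload in a call to the requested callback.
function jsonpBody(callback: string, payload: Record<string, unknown>): string {
  if (!CALLBACK_NAME.test(callback)) {
    throw new Error("invalid callback name");
  }
  return `${callback}(${JSON.stringify(payload)});`;
}
```

When the script finishes loading, the page's callback runs with the event payload, and the browser can create the next script element to wait for the following event.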