Chapter 2. The Design Of Html5

Chapter 2. The Design Of Html5

THE FRENCH REVOLUTION was an era of extreme political and social change. Revolutionary fervor was applied to time itself: for a brief period, the French Republic introduced a decimal time system, with each day divided into ten hours and each hour divided into one hundred minutes. It was thoroughly logical and clearly superior to the sexagesimal system.

Decimal time was a failure. Nobody used it. The same could be said for XHTML 2. The W3C rediscovered the lesson of post-revolutionary France: changing existing behavior is very, very difficult.

DESIGN PRINCIPLES

Keen to avoid the mistakes of the past, the WHATWG drafted a series of design principles to guide the development of HTML5. One of the key principles is to “Support existing content.” That means there’s no Year Zero for HTML5.

Where XHTML 2 attempted to sweep aside all that had come before, HTML5 builds upon existing specifications and implementations. Most of HTML 4.01 has survived in HTML5.

Some of the other design principles include “Do not reinvent the wheel” and “Pave the cowpaths,” meaning that when there is a widespread way for web designers to accomplish a task—even if it’s not necessarily the best way—it should be codified in HTML5. Put another way: “If it ain’t broke, don’t fix it.”

Many of these design principles will be familiar to you 
if you’ve ever dabbled in the microformats community 
(microformats.org). The HTML5 community shares the same pragmatic approach to getting a format out there, without worrying too much about theoretical problems.

This attitude is enshrined in the design principle of “Priority of constituencies,” which states, “In case of conflict, consider users over authors over implementers over specifiers over theoretical purity.”

Ian Hickson has stated on many occasions that browser makers are the real arbiters of what winds up in HTML5. If a browser vendor refuses to support a particular proposal, there’s no point in adding that proposal to the specification because then the specification would be fiction. According to the priority of constituencies, we web designers have an even stronger voice. If we refuse to use part of the specification, then the specification is equally fictitious.

KEEPING IT REAL

The creation of HTML5 has been driven by an ongoing internal tension. On the one hand, the specification needs to be powerful enough to support the creation of web applications. On the other hand, HTML5 needs to support existing content, even if most existing content is a complete mess. If the specification strays too far in one direction, it will suffer the same fate as XHTML 2. But if it goes too far in the other direction, the specification will enshrine font tags and tables for layout because, after all, that’s what a huge number of web pages are built with.

It’s a delicate balancing act that requires a pragmatic, level-headed approach.

ERROR HANDLING

The HTML5 specification doesn’t just declare what browsers should do when they are processing well-formed markup. For the first time, a specification also defines what browsers should do when they are dealing with badly formed documents.

Until now, browser makers have had to individually figure out how to deal with errors. This usually involved reverse engineering whatever the most popular browser was doing—not a very productive use of their time. It would be better for browser makers to implement new features rather than waste their time duplicating the way their competitors handle malformed markup.

Defining error handling in HTML5 is incredibly ambitious. Even if HTML5 had exactly the same elements and attributes as HTML 4.01, with no new features added, defining error handling by 2012 would still be a Sisyphean task.

Error handling might not be of much interest to web designers, especially if we are writing valid, well-formed documents to begin with, but it’s very important for browser makers. Whereas previous markup specifications were written for authors, HTML5 is written for authors and implementers. Bear that in mind when perusing the specification. It explains why the HTML5 specification is so big and why it seems to have been written with a level of detail normally reserved for trainspotters who enjoy a nice game of chess while indexing their stamp collection.

GIVE IT TO ME STRAIGHT, DOCTYPE

A Document Type Declaration, or “doctype” for short, has traditionally been used to specify which particular flavor of markup a document is written in.

The doctype for HTML 4.01 looks like this (line wraps marked »):

<!DOCTYPE HTML PUBLIC » 
"-//W3C//DTD HTML 4.01//EN" » 
"http://www.w3.org/TR/html4/strict.dtd">

Here’s the doctype for XHTML 1.0:

<!DOCTYPE html PUBLIC » 
"-//W3C//DTD XHTML 1.0 Strict //EN" » 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

They’re not very human-friendly, but, in their own way, they are simply saying, “This document is written in HTML 4.01” and “This document is written in XHTML 1.0.”

You might expect that the doctype declaring “This document is written in HTML5” would have the number five in it somewhere. It doesn’t. The doctype for HTML5 looks like this:

<!DOCTYPE html>

It’s so short, anyone can memorize it.

But surely this is madness! Without a version number in the doctype, how will we specify future versions of HTML?

The HTML5 doctype appears at first to be the height of arrogance. Do they really believe that this will be the final markup specification ever written?”

It seems to be a textbook case of Year Zero thinking.

In fact, though, the doctype for HTML5 is very pragmatic. Because HTML5 needs to support existing content, the doctype could be applied to an existing HTML 4.01 or XHTML 1.0 document. Any future versions of HTML will also need to support the existing content in HTML5, so the very concept of applying version numbers to markup documents is flawed.

The truth is that doctypes aren’t even important. Let’s say you serve up a document with a doctype for HTML 4.01. If that document includes an element from another specification, such as HTML 3.2 or HTML5, a browser will still render that part of the document. Browsers support features, not doctypes.

Document Type Declarations were intended for validators, not browsers. The only time that a browser pays any attention to a doctype is when it is performing “doctype switching”—a clever little hack that switches rendering between quirks mode and standards mode depending on the presence of a decent doctype.

The minimum information required to ensure that a browser renders using standards mode is the HTML5 doctype. In fact, that’s the only reason to include the doctype at all. An HTML document written without the HTML5 doctype can still be valid HTML5.

KEEPING IT SIMPLE

The doctype isn’t the only thing that has been simplified in HTML5.

If you want to specify the character encoding of a markup document, the best way is to ensure that your server sends the correct Content-Type header. If you want to be doubly certain, you can also specify the character set using a <meta> tag. Here’s the meta declaration for a document written in HTML 4.01:

<meta http-equiv="Content-Type" »
content="text/html; charset=UTF-8">

Here’s the much more memorable way of doing the same thing in HTML5:

<meta charset="UTF-8">

As with the doctype, this simplified character encoding contains the minimum number of characters needed to be interpreted by browsers.

The script tag is another place that can afford to shed some fat. It’s common practice to add a type attribute with a value of text/javascript to script elements:

<script type="text/javascript" src="file.js"> »
</script>

Browsers don’t need that attribute. They will assume that the script is written in JavaScript, the most popular scripting language on the web (let’s be honest: the only scripting language on the web):

<script src="file.js"></script>

Likewise, you don’t need to specify a type value of text/css every time you link to a CSS file:

<link rel="stylesheet" type="text/css" »
href="file.css"> 

You can simply write:

<link rel="stylesheet" href="file.css">

SYNTAX: MARKING IT UP YOUR WAY

Some programming languages, such as Python, enforce a particular way of writing instructions. Using spaces to indent code is mandatory—the white space is significant. Other programming languages, such as JavaScript, don’t pay any attention to formatting—the white space at the start of a line isn’t significant.

If you’re looking for a cheap evening’s entertainment, get an array of programmers into the same room and utter the words “significant white space.” You can then spend hours warming yourself by the ensuing flame war.

There’s a fundamental philosophical question at the heart of the significant white space debate: should a language enforce a particular style of writing, or should authors feel free to write in whatever style they like?

Markup doesn’t require significant white space. If you want to add a new line and an indentation every time you nest an element, you can do so, but browsers and validators don’t require it. This doesn’t mean that markup is a free-for-all. Some flavors of markup enforce a stricter writing style than others.

Before XHTML 1.0, it didn’t matter if you wrote tags in uppercase or lowercase. It didn’t matter whether or not you quoted attributes. For some elements, it didn’t even matter whether you included the closing tag.

XHTML 1.0 enforces the syntax of XML. All tags must be written in lowercase. All attributes must be quoted. All elements must have a closing tag. In the special case of standalone elements such as br, the requirement for a closing tag is replaced with a requirement for a closing slash: <br />

With HTML5, anything goes. Uppercase, lowercase, quoted, unquoted, self-closing or not—it’s entirely up to you. This looseness in terms of style has been criticized; some have commented that it encourages bad markup. It’s as if a programming language that enforced significant white space suddenly changed over to a more forgiving rule set.

Many of us have been using the XHTML 1.0 doctype for years, and like the fact that we must write in one particular style, a style that the W3C validator enforces. Now that we have moved to using HTML5, it’s up to us to enforce the style we want. A non-theoretical issue with this looser syntax is the fact that it does leave teams—and individuals—to decide on their own style, and to stick to it.

A traditional HTML validator is unable to help with enforcing a house style. For example, an HTML5 document with two lists would still be completely valid if one list were closed with a </li> and the other left unclosed.

Instead, you can use a lint tool—a program that flags suspect coding practices, or those that are simply contrary to a house style—such as the htmllint project, which you can find on GitHub (http://bkaprt.com/html52e/02-01/). Ideally, you would use a validator to check for errors against your doctype, and then a lint tool to enforce your chosen coding style.

It’s a good idea to develop a style guide and enforce it; even if you work alone, your future self will thank you for it. The web is riddled with example style guides, such as the Google Style Guide (http://bkaprt.com/html52e/02-02/), to help you get started.

WE DON’T USE THAT KIND OF LANGUAGE

Deprecation is the process of removing a previously existing element or attribute from a specification. In past versions of HTML, web designers were advised not to use deprecated elements, or send them holiday cards, or even mention them in polite company.

There are no deprecated elements or attributes in HTML5. But there are plenty of obsolete elements and attributes.

No, this isn’t a case of political correctness gone mad. “Obsolete” has a subtly different meaning from “deprecated.”

Because HTML5 aims to be backward compatible with existing content, the specification must acknowledge previously existing elements even when those elements are no longer in HTML5. This leads to a slightly confusing situation where the specification simultaneously says, “Authors, don’t use this element” and “Browsers, here’s how you should render this element.” If the element were deprecated, it wouldn’t be mentioned in the specification at all; but because the element is obsolete, it is included for the benefit of browsers.

Unless you’re building a browser, you can treat obsolete elements and attributes the same way you would treat deprecated elements and attributes: don’t use them in your web pages and don’t invite them to cocktail parties.

If you insist on using an obsolete element or attribute, your document will be “nonconforming.” Browsers will render everything just fine, but you might hear a tutting sound from the website next door.

So long, been good to know ya

The frame, frameset, and noframes elements are obsolete. They won’t be missed.

The acronym element is obsolete, thereby freeing up years of debating time that can be better spent calculating the angel-density capacity of standard-sized pinheads. Do not mourn the acronym element; just use the abbr element instead. Yes, there’s a difference between acronyms and abbreviations—acronyms are spoken as single words, like NATO and SCUBA—but just remember: all acronyms are abbreviations, but not all abbreviations are acronyms.

Presentational elements such as font, big, center, and strike are obsolete in HTML5. In reality, they’ve been obsolete for years; it’s much easier to achieve the same presentational effects using CSS properties such as font-size and text-align. Similarly, presentational attributes such as bgcolor, cellspacing, cellpadding, and valign are obsolete. Just use CSS instead.

Not all presentational elements are obsolete. Some of them have been through a reeducation program and given one more chance.

TURN & FACE THE STRANGE (CH-CH-CHANGES)

The big element is obsolete but the small element isn’t. This apparent inconsistency has been resolved by redefining what small means. It no longer has a presentational connotation of “Render this at a small size.” Instead, it has a semantic value—“This is the small print”—used for legalese, or terms and conditions.

Of course, nine times out of ten, you will want to render the small print at a small size, but the point is that the purely presentational meaning of the element has been superseded.

The b element used to mean, “Render this in bold.” Now it is used for some text “to be stylistically offset from the normal prose without conveying any extra importance.” If the text has any extra importance, then the strong element would be more appropriate.

Similarly, the i element no longer means “italicize.” It means the text is “in an alternate voice or mood.” Again, the element doesn’t imply any importance or emphasis. For emphasis, use the em element.

These changes might sound like word games. They are—but they also help to increase the device-independence of HTML5. If you think about the words bold and italic, they only make sense for a visual medium, such as a screen or a page. By removing the visual bias from the definitions of these elements, the specification remains relevant for non-visual user agents, such as screen readers. It also encourages designers to think beyond visual rendering environments.

The a element on steroids

While the changes to previously existing elements involve creative wordplay, there’s one element that’s getting a super-charged makeover in HTML5.

Without a doubt, the a element is the most important element in HTML. It turns our text into hypertext. It is the connective tissue of the World Wide Web.

The a element has always been an inline element. If you wanted to make a headline and a paragraph into a hyperlink, you would have to use multiple a elements:

<h2><a href="/about">About me</a></h2> 
<p><a href="/about">Find out what makes me tick. »
</a></p>

In HTML5, you can wrap multiple elements in a single a element:

<a href="/about"> 
<h2>About me</h2> 
<p>Find out what makes me tick.</p> 
</a>

The only caveat is that you can’t nest an a element within another a element.

Wrapping multiple elements in a single a element might seem like a drastic change, but most browsers won’t have to do much to support this new linking model. They already support it, even though this kind of markup has never been technically legal until now.

This seems slightly counterintuitive: surely the browsers should be implementing an existing specification? Instead, the newest specification documents what browsers are already doing.

Out of cite—revisited

The first edition of this book explained:

The cite element has been redefined in HTML5. Where it previously meant “a reference to other sources,” it now means “the title of a work.” … Before HTML5, you could mark up [a] person’s name using cite. Now that’s expressly forbidden.
That’s just plain wrong. I’m in favor of HTML5 taking its lead from browsers, but this is a case of the tail wagging the dog.

The text went on to point out that, as there was no way for a validator to tell if what appeared between these tags was a name or not, there was nothing to stop web designers from implementing cite in this backward-compatible way.

At the time of writing, the WHATWG still explicitly restricts cite from being used to mark up names. The W3C, however, does permit this usage (http://bkaprt.com/html52e/02-03/). This is one of a number of places where the W3C version of the specification differs from the WHATWG version. We are with the W3C on this, and continue to use cite to mark up names where it feels appropriate.

SHINY NEW TOYS: JAVASCRIPT APIS

If you’re looking for documentation on CSS, you go to the CSS specifications. If you’re looking for documentation on markup, you go to the HTML specifications. But where do you go for documentation on JavaScript APIs, such as document.write, innerHTML, and window.history? The JavaScript specification is all about the programming language—you won’t find any browser APIs there.

Until now, browsers have been independently creating and implementing JavaScript APIs, looking over one another’s shoulders to see what the others are doing. HTML5 will document these APIs once and for all, which should ensure better compatibility.

It might sound strange to have JavaScript documentation in a markup specification, but remember that HTML5 started life as Web Apps 1.0. JavaScript is an indispensable part of making web applications.

Entire sections of the HTML5 specification are dedicated to new APIs for creating web applications. These include access to functionality defined in the Vibration API, allowing developers to access a mobile device’s built-in vibration hardware, should that be available. There is a Battery Status API—a web application could use this to detect low mobile device battery levels and disable elements of a page that drain power. The Geolocation API can determine a visitor’s location—perhaps to show them content relevant to where they are.

The APIs in HTML5 are very powerful; however, they deserve their own separate book. If the brief mention here interests you, there is a wealth of information online. Get started by taking a look at some HTML5 API demos (http://bkaprt.com/html52e/02-04/).

Meanwhile, there’s plenty more new stuff in HTML5 for web designers to get excited about. The excitement starts in the very next chapter.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset