How will the new HTML 5 impact SEO?

On 20 January (or thereabouts), YouTube launched a fascinating experiment, introducing a new video format for supported browsers that relies on the new HTML5 syntax.

It therefore appears that the new standard, which has been under development since 2007 (first draft in 2008), is going to be widely used.
OK, it may be too early to say, but considering that the newest browsers already support at least part of the new specification (Internet Explorer 8 included, although only to a limited extent), a new era is definitely coming.

The new HTML5 specification is almost complete, so it’s just a matter of time before it is officially released, and sooner or later every coder should start rethinking the way they code pages.

Are coders the only people who should be concerned about that?

I guess not. HTML5 will introduce a very new way to implement and represent data, and the syntax will be so different that every other party involved in the web industry will eventually have to deal with it.

From HTML 4 to HTML 5, a long story

HTML 4 is a very old-fashioned markup language. It first appeared in December 1997. One of its biggest limits was the absence of an enforced standard. Yes, there were rules, but browsers were also “smart” enough to understand badly written markup and render the page anyway.

This led the Consortium to release an intermediate layer just two years later. This new language was called XHTML, a hybrid between XML (well-formed documents) and HTML. It provided a new way to conceive pages, at least from a semantic perspective, forcing coders to write in a uniform way by enforcing rules such as closing every tag and writing tags in lower case.

In 1999, websites were almost entirely static; they were not media-rich and socially interactive as they are today, so nobody thought about specific tags for media embedding, nor about the more recent microformats that allow a computer to interpret the information with just a quick scan.

That is what HTML5 aims to do. HTML5 is a milestone, and it will mark the beginning of the standardization of websites.

Code, from now on, will be divided into several specific parts, and coders will be able to update web pages faster, even those they did not originally write, because the markup will be easier to read for humans and bots alike.

How will HTML 5 affect SEO?

HTML 5 will allow for better cross-browser compatibility between mobile, desktop, netbook, PDA, e-reader and whatever else can display a web page. The new HTML 5 markup will resemble an XML structure more than traditional HTML.

There will be new, more descriptive tags, which help spiders to distinguish more easily what content is on a page. In the past, generic elements such as div and span have been (ab)used; in HTML 5 a new array of elements will be available for specific document sections, such as the navigation menu, the article body and so on.

Let’s have a look at what is going to change. To do that I will borrow a couple of diagrams from an article on A List Apart.

Here is how a typical web page is structured today:

Old HTML 4 Layout

Currently the structure of HTML tags is not semantic and follows no particular order, which makes it challenging for a search engine to figure out what is actually important.
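To make the problem concrete, here is a minimal sketch of what a naive crawler sees on such a page. The markup and its id values are invented for illustration (they are not taken from the A List Apart diagrams), and the crawler is just Python’s standard html.parser, not how any real search engine works.

```python
from html.parser import HTMLParser

# Hypothetical HTML 4 style page; the id values are invented and have
# no standard meaning, so a machine can only guess what each block is.
PAGE = """
<body>
  <div id="header"><h1>My blog</h1></div>
  <div id="nav"><a href="/">Home</a></div>
  <div id="content"><p>The actual post.</p></div>
  <div id="sidebar">Related posts</div>
  <div id="footer">Copyright notice</div>
</body>
"""


class TagCollector(HTMLParser):
    """Collects the name of every opening tag, as a naive crawler might."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def collect_tags(html):
    """Return the list of opening tag names found in the markup."""
    parser = TagCollector()
    parser.feed(html)
    return parser.tags


tags = collect_tags(PAGE)
# Every structural block is an anonymous div; nothing says "this is the article".
print(tags.count("div"), "div elements,", tags.count("article"), "article elements")
```

The only hint about the role of each block is the id attribute, and every site is free to name those differently.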

This, instead, is the new structure provided by HTML 5:

New HTML5 layout

The improved sectioning could make it easier for search engines to understand the page structure, leaving the algorithm more time to concentrate on the relevant content. Everyone involved in the industry should do the same.
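As a rough illustration of why the new sectioning helps, the sketch below builds an outline of a page just by looking at the element names. The page is a made-up example, the SECTIONING set is my own shortlist of the new elements, and the parser is again Python’s standard html.parser, so take it as an assumption-laden toy rather than a description of a real spider.

```python
from html.parser import HTMLParser

# Hypothetical HTML5 page, invented for illustration only.
PAGE = """
<body>
  <header><h1>My blog</h1></header>
  <nav><a href="/">Home</a></nav>
  <article>
    <header><h2>Post title</h2></header>
    <section><p>First part of the post.</p></section>
    <footer>Posted by the author</footer>
  </article>
  <aside>Related posts</aside>
  <footer>Copyright notice</footer>
</body>
"""

# Shortlist of structural elements chosen for this sketch.
SECTIONING = {"header", "nav", "article", "section", "aside", "footer"}


class OutlineParser(HTMLParser):
    """Records each structural element together with its nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.outline.append((self.depth, tag))
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in SECTIONING:
            self.depth -= 1


def outline(html):
    """Return a list of (depth, element name) pairs in document order."""
    parser = OutlineParser()
    parser.feed(html)
    return parser.outline


for depth, tag in outline(PAGE):
    print("  " * depth + tag)
```

No guessing from id or class names is needed here: the element name alone tells the program which block is navigation, which is the article and which is secondary.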

New HTML5 Tags

Here are some (not all) of the tags that will matter most for SEO and for the categorization of a page:

Article: it marks the most important content on the page. It lets spiders know what topic you are talking about. This could be a forum post, blog post, newspaper article, a user comment, or any other independent item of content.

Section: this tag marks the separate sections of an article. This means that (hopefully) a search engine will be able to evaluate each section individually, according to its header.

Header: this tag holds the primary info about the content. It can be included more than once on a page but, as far as I understand, just once per article. This should allow search engines to rank pages with multiple topics more easily.

Footer: like the header, it can be used multiple times, and it should be the least important part of the page.

Aside: contains secondary info that is related to the page but tangential to the main topic. Just imagine the right (or left) column where related topics are generally listed.

Audio and Video: with these tags, you will be able to embed media content directly into HTML pages, with extra control over its appearance, something that is impossible today with third-party players.

Source: a child element of audio and video. It lets you specify multiple alternative sources for the media element, which is particularly useful when a browser does not support every format (e.g. Firefox currently can’t play MP3 audio files).

From an SEO perspective, this should help search engines connect the different locations of the same media content, each of which may be considered important and earn trust once indexed.
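To show how the source element works in practice, here is a small sketch that reads the alternative sources of a video and picks the first one a given client can play. The markup and file names are invented, the helper pick_source is hypothetical, and the “first supported source wins” rule is a simplification of what browsers do.

```python
from html.parser import HTMLParser

# Hypothetical markup; the file names are invented for illustration.
MARKUP = """
<video controls>
  <source src="clip.ogv" type="video/ogg">
  <source src="clip.mp4" type="video/mp4">
  Your browser does not support the video element.
</video>
"""


class SourceCollector(HTMLParser):
    """Collects (media element, src, type) for every source inside audio/video."""

    def __init__(self):
        super().__init__()
        self.media = None  # name of the media element we are currently inside
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag in ("audio", "video"):
            self.media = tag
        elif tag == "source" and self.media:
            attributes = dict(attrs)
            self.sources.append(
                (self.media, attributes.get("src"), attributes.get("type"))
            )

    def handle_endtag(self, tag):
        if tag == self.media:
            self.media = None


def collect_sources(html):
    """Return every alternative source declared in the markup, in order."""
    parser = SourceCollector()
    parser.feed(html)
    return parser.sources


def pick_source(sources, supported_types):
    """Hypothetical helper: return the first src whose type is supported."""
    for _media, src, mime in sources:
        if mime in supported_types:
            return src
    return None


sources = collect_sources(MARKUP)
print(sources)
print(pick_source(sources, {"video/mp4"}))
```

The same list that lets a browser fall back to a format it understands is also visible to a crawler, which is why the alternative locations can be tied together.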

So what?

At present, SEOers can’t do anything better than wait. As soon as HTML 5 is adopted on a wider scale, search engines will have no choice but to take the new syntax into account. At that point, using nice, clean HTML markup will become a far more important SEO factor than it is today (don’t get me wrong, content will still be important too).

Nothing is expected overnight but, as you can imagine, to ensure a smooth transition SEOers and webmasters should recommend and build sites that respect the XHTML standard, which will make the migration easier when the time is right (and spare them from pulling their hair out).