Working in web development, I often wonder which doctype is the best one to code a website in. At university I was taught to use XHTML, other articles suggest using HTML, and then there’s the need to figure out whether to go with transitional or strict. So many options, and just when you are close to making a decision, HTML 5 pops out of the woodwork to add to the woes. Where to begin?
I’m going to start with HTML 5 on this one and pretty much rule it out straight away. Not because it’s useless, pointless or plain crazy, but because it’s not finished. Sure, there is support out there for it in Firefox, Chrome and others, but no browser supports it fully, and Internet Explorer supports even less of it. I would love to ignore Internet Explorer completely, but as the vast majority of the web is still running a version of it, you can’t ignore it – sorry!
It is worth noting that HTML 5 is the way things are going and, once it has been fully ratified and set out as a standard, every website, web application or anything else web based should use it. Be prepared for it, especially as it has built-in functionality for playing videos without the need for a Flash plugin, which suits us Linux users who don’t always have a Flash plugin installed!
HTML 4.01 (or just HTML for the purposes of this) is the current, mature version of HTML out in the world at the moment, and has been around seemingly forever. It serves the purpose for which it was intended and, honestly, there is nothing wrong with it. It has a set standard out there, but there are some quirks in it which can lead to pages being rendered incorrectly in some cases. This isn’t much of an issue for very modern browsers, but older ones might struggle. For instance, not all tags need to be closed off. Both of the following lines are completely valid within HTML 4:
<p>This is a paragraph, but has no closing tag
<p>This is a paragraph with a closing tag</p>
Clearly the second one is a lot neater, as there is a definitive end to the paragraph. If you add up a lot of different tags which don’t close, then you might run into rendering issues in some browsers, which is clearly not desirable. The whole idea of a web page is that the information contained on it will be presented to each viewer in the same way.
XHTML is at the other end of the spectrum to HTML, in that everything opened needs to be closed at some point. In the examples above, the first would not register as being valid XHTML, only the second would. That’s not to say the page would not display. Modern browsers are wonderfully forgiving with these things, and will try to display things the way they think you intended, though not always as expected. However, the HTML validator from the W3C will tell you that you are incorrect and that what you have produced is, in essence, garbage. That is because XHTML is based on the XML standard, and XML is very fussy about what it likes and doesn’t. If tags are opened and not closed, it will not play ball.
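To see just how fussy XML is, here is a minimal sketch that feeds the two paragraph lines from earlier to Python's standard XML parser. The helper name `is_well_formed` is my own invention for illustration. Note that this only tests whether the markup is well-formed XML. It is not the same as the W3C validator, which also checks the markup against the rules of the chosen doctype.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML, False otherwise.

    Hypothetical helper for illustration: it checks well-formedness only,
    not validity against an XHTML doctype.
    """
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        # Raised for unclosed or mismatched tags, among other problems.
        return False


# The two paragraph examples from the HTML 4 section above.
unclosed = "<p>This is a paragraph, but has no closing tag"
closed = "<p>This is a paragraph with a closing tag</p>"

print(is_well_formed(unclosed))  # False: the <p> is never closed
print(is_well_formed(closed))    # True
```

A browser would happily show both paragraphs, but a strict XML parser gives up on the first one as soon as it runs out of input without finding the closing tag.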
I’ve put together a simple example web page in both an HTML and an XHTML version. The two will render the same in browsers, but neither will validate properly against the other’s doctype. Here is the HTML version:
<html>
  <head>
    <title>This is the page title</title>
    <meta name="description" content="this is the description of the content">
  </head>
  <body>
    <p>This is a paragraph with<br>some<br>line<br>breaks
  </body>
</html>
In that, the meta tag isn’t closed, and neither are the ‘br’ tags within the paragraph, which isn’t closed itself. Even so, it is totally valid HTML according to the online validator. Go ahead and check it, I’ll wait. If you put that code through the validator but set the doctype to be XHTML 1.0 transitional or strict, you’ll be given a list of 6 errors. That’s not good. It means the code isn’t well formed. That really bugs people like me who have to spend time pulling apart HTML which isn’t valid because someone insists, rightfully, that their website passes the validator checks!
The same page in XHTML would have the following code:
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>This is the page title</title>
    <meta name="description" content="this is the description of the content" />
  </head>
  <body>
    <p>This is a paragraph with<br />some<br />line<br />breaks</p>
  </body>
</html>
The code has all of the tags closed off, but also requires the xmlns="http://www.w3.org/1999/xhtml" part to be added to the html tag at the top. Run that through the validator as XHTML 1.0 strict or transitional and it will pass. Run it through as HTML and it will kindly inform you that there are 3 errors in the document. Not brilliant, but better than the 6 which were listed when the HTML code was checked as XHTML.
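As a rough illustration of what the xmlns attribute does, the sketch below runs both versions of the page through Python's standard XML parser. The names `parse_or_none`, `HTML_PAGE` and `XHTML_PAGE` are made up for this example, and the check only covers XML well-formedness, so it will not reproduce the validator's error counts.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# The HTML 4 version of the page: meta, br and p are left unclosed.
HTML_PAGE = (
    '<html> <head> <title>This is the page title</title> '
    '<meta name="description" content="this is the description of the content"> '
    '</head> <body> <p>This is a paragraph with<br>some<br>line<br>breaks '
    '</body> </html>'
)

# The XHTML version: everything is closed and the namespace is declared.
XHTML_PAGE = (
    '<html xmlns="http://www.w3.org/1999/xhtml"> <head> '
    '<title>This is the page title</title> '
    '<meta name="description" content="this is the description of the content" /> '
    '</head> <body> <p>This is a paragraph with<br />some<br />line<br />breaks</p> '
    '</body> </html>'
)


def parse_or_none(markup):
    """Hypothetical helper: return the root element, or None if not well formed."""
    try:
        return ET.fromstring(markup)
    except ET.ParseError:
        return None


html_root = parse_or_none(HTML_PAGE)
xhtml_root = parse_or_none(XHTML_PAGE)

print(html_root is None)  # True: the unclosed meta tag trips the parser
print(xhtml_root.tag)     # {http://www.w3.org/1999/xhtml}html

# Every element inherits the namespace declared on the html tag.
breaks = xhtml_root.findall(".//{%s}br" % XHTML_NS)
print(len(breaks))        # 3
```

The namespace shows up in front of every element name, which is how an XML tool knows these tags are XHTML rather than some other XML vocabulary that happens to use the same names.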
Based on that, you would expect me to say that XHTML is better than HTML 4, but I can’t. If done properly, they are both correct to use and will work properly. Now think back to the start of this, when I mentioned HTML 5. It’s what is coming and what will be used in the future, so really any web development should be focused around what will pass the tests for that. Both the HTML 4.01 and XHTML 1.0 versions of the page above will pass the HTML 5 validation rules, but the standard isn’t finished yet, so there may well come a time when one or both of them won’t validate. Until then, while both of them validate, there’s no real reason to use one over the other, so it comes down to personal preference. Use whichever one you want, but make sure it passes the validation, please!