This article gives you an introduction to why semantics are important. It doesn't contain many code samples, but it does give you an insight into why standards and CSS are important on the modern web.
When the first web sites were built, there were very few browsers. Internet Explorer, Firefox, Opera and Safari did not exist yet. All there was, was a text browser that simply displayed the page as it was put on the server.
Nowadays, things are different. There are a lot of browsers (IE, Firefox, Mozilla, Netscape, Opera, Safari, Konqueror, etc.) and they all have different ways of rendering content. Because most web designers and web site owners want to serve as many visitors as possible, standards were drawn up. Basically, a standard is an agreement between all parties, for example on which tag renders text as bold.
In the beginning, there were few browsers. Netscape was world-famous, until Microsoft pushed Internet Explorer through Windows 95 and later versions. Netscape was pushed out of the market, and Internet Explorer dominated. IE, however, sometimes ignored the standards or simply rendered things differently.
Because most web designers only tested in IE, other browsers were left out. This was fine at the time, as there were practically no other browsers. Various techniques were used: spacer GIFs (a transparent pixel stretched to align content), marquee tags, various other proprietary tags, and ActiveX. Many of these techniques were Microsoft-only and only rendered by IE.
Pages were aligned with tables and formatted with presentational tags such as <font>. It looked great on screen, and the source could look good too, but… other browsers came. And search engines. And mobile devices with web browsers. A human can look at text and tell what’s important and what’s not. Google doesn’t have eyes and needs other techniques. The screen resolution of a PDA is only 320×240 (or in rare cases 640×480); viewing a normal page in such a tiny window means the user has to scroll a lot and pay excessive bandwidth bills (when using WAP, GPRS, EDGE, UMTS, …).
Then, the modern day came. People started using tags for what they were meant for: hX-tags for defining headers (h1 through h6), strong for important text (usually shown bold), em for emphasized text (usually shown italic), tables for tabular data only, etc. All formatting was put into external files called “Cascading Style Sheets”, or CSS. Putting the formatting (how big should the menu be? What width and height?) into a separate file has a few advantages:
– It means less editing of files when a tiny bit of the layout changes. For example, you build a web site for a customer and give all documents (over 100!) a green background. You don’t work with a CMS or authoring tool, but with Notepad. If the customer wants the background to be red instead, you’d have to edit over 100 files. With a style sheet, you edit only one.
– It saves you bandwidth: the CSS file is cached on the client, meaning the browser won’t ask for it on every page visit, only on the first visit and when the cache “expires”. Depending on server configuration, such a file might be cached for a week, for example.
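The background example from the first point can be sketched as follows. The file name style.css is an assumption made for this illustration, and the CSS rule is shown inside an HTML comment only to keep the sketch in one piece; in practice it would live in the separate file.

```html
<!-- Placed in the <head> of each of the 100 pages.
     The file name "style.css" is a hypothetical example. -->
<link rel="stylesheet" type="text/css" href="style.css">

<!-- style.css itself would hold a single rule, for example:

     body { background-color: green; }

     Changing "green" to "red" in that one file restyles every page
     that links to it; none of the HTML files needs to be touched. -->
```

Because every page points at the same file, the browser can also reuse its cached copy, which is where the bandwidth saving in the second point comes from.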
Suppose you use 200 bytes of markup for aligning your menu, and the visitor browses all 100 pages. That generates 200 × 100 = 20,000 bytes, or about 20 kilobytes of data, just for aligning the menu. Now replace those 200 bytes with 20 bytes of HTML and 400 bytes of CSS. When the visitor requests the same 100 pages, the menu costs about 2 kilobytes of HTML plus 400 bytes of CSS, which is downloaded only once.
Now, say you have 50,000 visitors doing exactly the same thing. The old version generates roughly 1 gigabyte of data, while the CSS version generates only about 120 megabytes. That saves you about 880 megabytes of data traffic, reducing costs.
Google benefits from this too: when you use header tags (<h1>), Google can decide what’s the most important part of your page: the title. In an article, <h2> tags define subtitles. A paragraph (<p>) is meant for the text that goes with a header.
How can you tell whether the code you’re writing adheres to the standards? There is an online validator at http://validator.w3.org. You give it your HTML (by entering your URL, uploading your file or pasting your code) and it tells you exactly what you are doing wrong. If you make no mistakes, you get a happy green bar telling you your page is valid. If it’s invalid, fear the red bar of doom!
Bad use of tags:
<font face="Comic sans ms" size="5">My title goes here!</font><br>
Hello, I am your content.
Good use of tags:
<h1>My title goes here!</h1>
<p>Hello, I am your content.</p>
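As a slightly fuller sketch, here is the good example wrapped in a complete document, with a subtitle added. The doctype, the lang attribute, the character set and the subtitle text are assumptions made for this illustration; a page like this is the kind of input you could paste into the validator.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>My title goes here!</title>
</head>
<body>
  <!-- One h1 for the title of the page -->
  <h1>My title goes here!</h1>
  <p>Hello, I am your content.</p>

  <!-- h2 marks a subtitle; the paragraph below belongs to it -->
  <h2>A subtitle</h2>
  <p>More content that goes with the subtitle.</p>
</body>
</html>
```

Nothing in this markup says how the page should look. That is left entirely to the style sheet.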
Google (and other search engines) “read” your HTML and determine that “My title goes here!” is the header that applies to “Hello, I am your content.” The header tag can be formatted using CSS, for example:
h1 { font-family: "Comic Sans MS", sans-serif; font-size: xx-large; color: blue; }
This would render the header in Comic Sans MS, extra-extra large, in blue.
You could even define multiple style sheets: one for desktop computers, and another one for mobile devices. The browser looks at each style sheet and checks whether it’s meant for that particular session or not. A mobile browser, like Opera Mobile, would apply only the style sheet that targets mobile devices, while desktop IE renders the screen version.
A style sheet that targets print is especially useful for news sites: it is applied only when a user prints a page. A news website has a list of newly updated articles, which can be useful on screen. On paper, that list is a waste of ink, because it ages. The article might still be relevant 20 years later, but the list of what’s new is not: other articles have been written since, and besides, who cares about such information on paper? Using CSS, you can specify that such a block disappears when the page is printed. You could even define some more styles and have everything disappear from the page except the content area.
The reverse is possible too: in your page, you could define a block called “print” (or whatever you’d like to call it) in which you include the name and logo of your website, the time and date of printing, the URL of the article, and a copyright notice that applies to printed pages. In your screen CSS, you make that block invisible, while your print CSS displays it. You can’t do this with tables and such; it would require a special print-only version. But why do that? Why send the same data to the same user twice?
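The screen, mobile and print style sheets described above could be wired up roughly like this. It is only a sketch: the file names and the ids "whats-new" and "content" are hypothetical, and the CSS rules are shown in a comment to keep everything in one fragment. Note that "handheld" is the CSS 2 media type for mobile devices, and many later mobile browsers ignore it in favour of media queries.

```html
<head>
  <!-- File names are assumptions for this sketch. -->
  <link rel="stylesheet" type="text/css" media="screen" href="screen.css">
  <link rel="stylesheet" type="text/css" media="handheld" href="mobile.css">
  <link rel="stylesheet" type="text/css" media="print" href="print.css">
</head>
<body>
  <!-- Useful on screen, a waste of ink on paper -->
  <div id="whats-new">List of newly updated articles</div>

  <!-- The part that should always be visible -->
  <div id="content">The article itself</div>

  <!-- Only meant for the printed page -->
  <div id="print">Site name and logo, time of printing, article URL, copyright notice</div>
</body>

<!-- screen.css would hide the print-only block:

       #print { display: none; }

     print.css would hide the list and show the print-only block:

       #whats-new { display: none; }
       #print { display: block; }
-->
```

The same HTML document serves both views; only the style sheet that matches the current medium decides which blocks are shown.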
Sending the same content twice would waste precious bandwidth that you could do so much more with.
In this article, I have given a short introduction to why semantics can be important and can even save you money (the bandwidth, remember? ;)). If you want to learn more about writing semantic HTML, you could look in the tutorial section of this website or search Google, but there is one site I would like to recommend: www.alistapart.com. A List Apart describes a lot of interesting things that are possible using just HTML and CSS, no JavaScript required. For example, a hover effect on an image can be done entirely in CSS. Now, I leave you here, and wish you good luck with writing semantic code.