Semantics

November 20, 2003 | Category: Our Thoughts

Summary: Semantics, structure, content, presentation and how they relate.

If you are a web designer/developer you are well aware that HTML/XHTML has tags that describe what the different elements within an HTML document are. The <p> tag represents a paragraph and an <h1> tag represents a header. However, with the advent of web standards and the push to make compliant sites, these definitions somehow got lost and people started to use <div>'s when they could have used <p>'s or <h1>'s.

Any document, be it written or XML or HTML, has elements within it that help define parts of the content. A page title is simply a page title and should represent nothing else. A header of a section should only be depicted as a header of a section. When writing a paper we do not stick headers in the middle of paragraphs to place emphasis on text. That is not what headers are for. Headers are used to separate major sections of content within a text. The same should be applied to web documents. If you wish to emphasize some text in a paragraph then do so with a <span> class because that is the purpose they serve. Leave headers to just being headers.

Looking at my Google example you can see that there are no <div>'s because none were needed and the document does not really contain sections. However, each piece of content is surrounded by a tag that appropriately describes what it is within the document. Lists use either <ol> or <ul>. The logo is the header of the document so I was able to define it with an <h1> tag. Some people may not think that the logo should be the header of a document and to be honest my opinion changes with every site I create. That is one of the beauties of FIR (Fahrner Image Replacement): it allows documents to remain semantic while adding some flavor (ignoring all the screenreader issues). Even though the site is not 100% semantically correct (or maybe it is) it is pretty close.

Until we have more control with CSS over how elements work, and browsers that all behave the same, no website will be purely semantic. However, a major step would be to use the appropriate elements to describe the content in your document. You might have to surround a <ul> with a <div>, but there should be no need to use a <div> in place of the <ul> when describing a list, because that is not semantically correct.
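As a rough sketch of the difference (the class names, ids, and link targets are made up for illustration and not taken from any real site), compare a list faked with <div>'s against a real list that is merely wrapped in one:

```html
<!-- Not semantic: a list described only by generic <div>'s -->
<div class="nav">
  <div class="item"><a href="/about/">About</a></div>
  <div class="item"><a href="/contact/">Contact</a></div>
</div>

<!-- Semantic: the list is a <ul>; the <div> only marks off the section -->
<div id="nav">
  <ul>
    <li><a href="/about/">About</a></li>
    <li><a href="/contact/">Contact</a></li>
  </ul>
</div>
```

Both versions can be styled to look identical, but only the second still reads as a list when the stylesheet is removed.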

<div>'s should control sections of the content and not the content itself. <div>'s help to separate the header from the body and the body from the footer. They should not define a group of sentences because that is what the <p> tag is for. Many designers seem to forget that you can apply styles to HTML elements by sticking a class or id within their tags.
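Something like the following is what I have in mind. Treat it as a sketch: the ids "header", "body", and "footer" and the class "intro" are placeholder names I picked for the example, not a required vocabulary.

```html
<!-- <div>'s separate the sections; the content keeps its own tags -->
<div id="header">
  <h1>Site Name</h1>
</div>
<div id="body">
  <!-- A class on the <p> itself gives CSS a hook without an extra <div> -->
  <p class="intro">This paragraph is styled through its own class.</p>
  <p>This one is a plain paragraph.</p>
</div>
<div id="footer">
  <p>Footer text goes here.</p>
</div>
```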

If you plan out an XHTML document it is possible to come up with a fairly decent page without using <div>'s at all. If you do have to use <div>'s then do so appropriately. This may be a great exercise and I am not sure how many people do this already, but maybe it would be best to simply lay out the XHTML document without any <div>'s so you can get the semantic part right. I believe the greatest accomplishment behind the CSS Zen Garden is that all the sites are using the same XHTML document. This shows that a well designed document can help in future redesigns. There is no need to fear a redesign when all you are changing is the CSS file.
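A minimal sketch of that exercise, assuming an XHTML 1.0 Strict page. The titles, ids, link targets, and the style.css file name are all invented for the example.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>Example page</title>
  <link rel="stylesheet" type="text/css" href="style.css" />
</head>
<body>
  <!-- Every piece of content sits in the tag that describes it; no <div>'s -->
  <h1 id="logo">Example Site</h1>
  <ul id="nav">
    <li><a href="/">Home</a></li>
    <li><a href="/archive/">Archive</a></li>
  </ul>
  <h2>Latest entry</h2>
  <p>The entry text goes here.</p>
  <p id="footer">Footer text goes here.</p>
</body>
</html>
```

The ids give the stylesheet enough hooks for a simple design. <div>'s can be added later, and only where a section really needs a wrapper.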

Standards People Got It Backwards

When advocating standards compliant sites, you must realize that it is possible to create a table-based layout that is 100% compliant. However, table-based layouts do not work semantically because they use tables to define the layout of the content. As many designers know, that is not the purpose of tables within the HTML specification and therefore such sites lack semantic value. The W3C knows what all the HTML tags are for, but their documents are so uninviting for the non-scientific that designers begin to make their own interpretation of what a tag should do.

If we begin advocating semantics then standards will follow more readily because when you force a document to follow the semantics of the XHTML/HTML specs then there is a better chance that it will already be XHTML compliant. When you run a page through the W3C validator it isn't checking for standards, but for semantics. If you have an <h1> tag within a <p> it will throw an error because this is semantically incorrect. Again we are going back to the days of the beginning of the Internet when HTML documents were just documents meant for reading and not viewing pleasure. We now need to create these types of documents again, but now with CSS it is possible to add style without corrupting the semantic value. The difficulty in this is that it takes another mindset to start creating semantically structured documents and also some planning. I admit the sites I have created have been more planned from the design standpoint than from the semantic standpoint. However, I do believe if I put as much emphasis on the semantic structure of the document as I do the design then redesigns may come much easier (assuming the content doesn't change). This is equivalent to the shift in thinking that it took to replace table-based design with CSS-based design.
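To make the validator point concrete, here is a small illustration of the header-inside-a-paragraph case mentioned above (the text content is invented):

```html
<!-- Invalid: a <p> may only contain inline content, so a validator reports an error -->
<p>Some text <h1>A heading</h1> more text</p>

<!-- Valid: the heading and the paragraphs are siblings -->
<h1>A heading</h1>
<p>Some text</p>
<p>more text</p>
```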

A well structured, semantically accurate XHTML document helps browsers, search engines, and designers. The benefit of semantic documents is that they follow standards, not the other way around. It takes time and energy to develop correct documents, but this time and energy is well spent in the long run. Semantically structured XHTML documents are just another type of XML document and therefore the abilities to manipulate the content that the document holds are limitless.

Only the content of a document should be semantically correct. The structure of a site and the document are something completely different. The structure of the site needs <div>'s to give visual clarification of the different sections. The ideal situation to achieve semantics and standards is to use an XML document that contains the content and apply that to an XSL file and a CSS file. Then we can begin talking about semantically rich documents.

You cannot talk about CSS separating content from structure, because the content always dictates the structure of a document. CSS helps to separate presentation from content. Content and structure will always be intertwined. Now when more people start to realize that content dictates semantics and structure then maybe the presentation will come a little easier. However, this will take many more years because many of us are just starting to truly understand the power of CSS and therefore it will not be easy to force another way of thinking upon us.

Comments

If divs are to "control sections of the content and not the content itself" -- by which I assume you mean separate or denote the sections of a given document -- then how can you present the CSS Zen Garden as a shining example, when the page has no less than 22 divs? There's no way the Zen Garden page has 22 different sections. And what is 'semantic' about that kind of design? Nothing, unless the named sections are named according to some kind of controlled, shared vocabulary.

Now, I love the Zen Garden as much as the next guy, and I think that the 'magic div' school of design is vastly superior to table markup and/or tag soup. But I don't understand where the black magic is that makes a sane, compact XHTML document + CSS 'semantic.'

At a granular level, for sure, the vocabulary is rich: headings, quotations, links, and flow control (paragraphs, lists, etc.) are all adequately supported in (X)HTML. But page-level, overall organization is lacking.

We used to plug that gap with jiggered table cells.

We now plug that gap with a few strategically placed divs (or, in the case of a more general-purpose document a la the Zen Garden, a host of divs).

There's a palpable improvement there. But it's not semantic, is it?

A good XHTML document minus CSS linearizes well -- but that's just a way of saying that some of the overall structural information is lost, but without any great (non-presentational) cost.

So we have decent paragraph-level semantics (taken in the strange webby, rather than dictionary, sense of the term). But beyond that, your answer is to resort to div for structuring presentation.

Well, fine (and that is the best solution at the moment, if you want to use XHTML and style it with CSS), but I simply don't understand the "semantics" behind organizing a page's presentation by enclosing strategic portions in named divs.

I'd love to be proven wrong or ignorant. What am I missing?

Brian (http://joechip.net/brian/)

You are right in saying that I should not use the Zen Garden as an excellent example of a semantically rich document, but it is one where the document is structured so that it allows for easier redesigns.

Semantics have nothing to do with presentation and therefore that is why they are separated. div's are used for presentation and shouldn't be used for their semantic value. As I said with current technologies and browsers I do not see how we can possibly make semantically rich documents that include div's.

Scrivs (http://www.9rules.com/whitespace/)

Remove the stylesheet and the divs determine order of flow, which is structural, such as main styled navigation followed by subsidiary styled navigation. Is this not a semantic value?

Otherwise, I'm not sure I follow your objection to divs. How would you be able to style list navigation if you have 2 types without enclosing both in separate divs? If you have just one navigation system and you style by defining the ul tag etc in the stylesheet, how can you then present bullet points? If divs aren't necessary, why do we have them?

.... (http://biroco.com/journal.htm)

"I do not see how we can possibly make semantically rich documents that include div's"

I'll quibble with that. I think *well-named* divs can add to the semantic value of a document, for identifying the sections of a document which are larger than a list, paragraph, header, etc., but smaller than a page.

(X)HTML has a serious deficiency (IMHO) in not having any way to describe sections. how else do you know what paragraphs belong together?

if you name your divs/sections well (I find myself reusing "header", "nav", "content", and "footer" on a pretty regular basis), then there's something in the document that describes the *meaning* of its constituent pieces.

and that seems to at least be heading in the direction of semantics. a smart parser might even be able to get something useful out of the div ids.

or, on 2nd thought, what ellipses said, but with bigger words. ;)

Elaine (http://www.epersonae.com)

Elaine -- there's that controlled vocabulary! Now, imagine the dividends if it was shared across sites / projects / domains...

Brian (http://joechip.net/brian/)

So what is the difference between an essay or article and an XHTML document, besides links? There are paragraphs in an essay and those sections are clearly defined by headers and there is no need for div's as separators.

Scrivs (http://www.9rules.com/whitespace/)

I do not agree.

Where did you get all this about DIVs from? As far as I know, DIV carries ZERO semantic information (as the only block element - as well as SPAN as the only inline element). DIVs (and SPANs) are (the only) generic elements WITHOUT any other meaning. Nobody but you believes DIVs are meant to have some semantic information (dividing the document into sections or what). This is what the HTML4 specification says about them:

"The DIV and SPAN elements, in conjunction with the id and class attributes, offer a generic mechanism for adding structure to documents. These elements define content to be inline (SPAN) or block-level (DIV) but impose no other presentational idioms on the content. Thus, authors may use these elements in conjunction with style sheets, the lang attribute, etc., to tailor HTML to their own needs and tastes."

pixy (http://www.pixy.cz)

If I'm understanding you correctly, Paul, you're not complaining about the use of div's for positioning, but for instances where people use them instead of the appropriate tag... like div class=myparagraph instead of p class=myparagraph, to wrap around a block of text.

I tend to think of divs as sectional dividers. If you have a block of stuff that needs to stay together, wrap a div around it, whether it's a group of lists, a bunch of paragraphs, a block of navigation, whatever... I don't think I've ever run into a situation where I've wanted to be able to control the positioning of a single distinct HTML entity, unless maybe my nav bar was an unordered list, and at that point, if everything else is in divs, that should be, too, both for the sake of consistency and for output to old browsers and screen readers, so it'll be where you want it to be.

Not that I've done all that many sites of that nature anyway... but on the occasions when I do....

JC (http://www.thelionsweb.com/weblog)

"If you wish to emphasize some text in a paragraph then do so with a <span> class because that is the purpose they serve."

Actually, to be semantically correct, you would use <em> to emphasize some text. <span>s and <div>s are nothing more than generic "grouping" tags. They're there to help you structure your page, not to replace pre-defined tags, such as <h1>, <p>, <strong>, or <em>.

By all means, do use these tags for semantical structure, ie:
--code--
<div id="header">
<img src="myimage.png" alt="My image" />
</div>
<div id="body">
<p><span lang="es">hola</span> - <span lang="en">hello</span></p>
</div>
--/code--

...but don't use a styled span merely to emphasize content. That's just plain wrong. Use CSS to style an <em> tag instead. Screen-readers/older browsers will still get the message, newer browsers will still get their style, and everyone's happy!

For more information, visit:
http://www.w3.org/TR/REC-html40/struct/global.html#edef-SPAN

dysfunksional.monkey (http://dysfunksion.co.uk)

#10

You are correct in the use of <em>'s for emphasis. I guess I myself was suffering from a case of spanitis. Thank you for reminding me that most of the things we wish to do in an HTML document already have a tag available.

Scrivs (http://www.9rules.com/whitespace/)

#11

Except group pieces of a document into logical sections.

Why not just say that a section starts at the beginning of a new header and ends when the next header comes along? Because you might have multiple headers inside of a section -- maybe those are subsections, I don't know....

Maybe the div isn't a semantic tag per the W3C, but that doesn't mean that it doesn't or can't accrue semantics in practice.

"authors may use these elements in conjunction with style sheets, the lang attribute, etc., to tailor HTML to their own needs and tastes."

One of our "needs and tastes" is for increased semantic value, more than what's available through the existing tag set.

Theoretically, we could include other XML dialects in our XHTML pages to indicate those sections or other logical chunks. Alas, at this point that doesn't necessarily get us much of anything in the real world.

As Brian points out, the key is to share amongst one another what semantics we try to put into our documents when the W3C's built-in semantics let us down.

Elaine (http://www.epersonae.com)

#12

Of course, the ultimate semantics would be using true XHTML... [introduction] tags around your introductions... [article] around your articles and whatever else... I've only ever seen one site that did the entire site like that, and I can't find the link now, but the source code was a beautiful thing.

It was a weblog... I think you might have linked it from pseudo, Paul... maybe I found it elsewhere though.

JC (http://www.thelionsweb.com/weblog)

#13

I think I know which one you are talking about JC, I will have to search for it when I get home.

Scrivs (http://www.9rules.com/whitespace/)

#14

If you find it please post it here - I'd like to see that site!

dysfunksional.monkey (http://dysfunksion.co.uk)

#15

Well I think I found the site, but it looks like he did a redesign and in the process killed all the beautiful XML code he was using and replaced everything with <div>'s. Now the site is loaded with them. For future reference the site was http://www.adventcode.net

Scrivs (http://www.9rules.com/whitespace/)

#16

yeah, that's it, I remember all the beer stuff in his source code.
Too bad he changed the coding... no complaints on the redesign, though, IIRC the site was really ugly before. But the code was beautiful.

JC (http://www.thelionsweb.com/weblog)

#17

The code has not been changed... as a matter of fact I use tags and support pseudo elements (IE has a hard time with lists in a for-each statement). I now have fewer divs, not more. I now transform that site to HTML locally before posting it; it produces an HTML doc of about 15k. The files you are looking for are here:
www.adventcode.net/beer.xml
www.adventcode.net/main.xsl
What you viewed before was the actual XML doc... your browser was doing the client-side transform. Beer.xml has been improved and refined to type class html as an XML attribute. I will post current builds of beer.xml and main.xsl with links in the index... glad you like the code... and the redesign, you are right, it was ugly.

Don (http://www.adventcode.net)
