Using RDFa in XHTML 1
HTML and metadata
The main reason that RDFa fits neatly into XHTML is that it builds on the already existing semantic features available in HTML 4.01. For example, it is already possible in HTML to indicate that a document was written by some author, indicate the document's licensing level, and so on, as shown here in the XHTML 1 equivalent:
<head>
<meta name="author" content="Mark Birbeck" />
<link rel="license"
href="http://creativecommons.org/licenses/by/2.5/" />
</head>
RDFa builds on this syntax, first by making it clear what the syntax means, and second by extending it to allow namespace-prefixed properties. For example, we might use Dublin Core and Creative Commons as sources of well-known properties for the mark-up we just saw:
<head>
<meta name="dc:creator" content="Mark Birbeck" />
<link rel="cc:license"
href="http://creativecommons.org/licenses/by/2.5/" />
</head>
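To make the idea of a prefixed property concrete, here is a minimal Python sketch of how a consumer might read the head above and expand each prefixed name into a full URI. The prefix bindings in PREFIXES are an assumption on my part (the mark-up above doesn't show where dc and cc are declared), and expand and head_metadata are hypothetical helper names, not part of any RDFa library.

```python
import xml.etree.ElementTree as ET

# Assumed prefix bindings; the mark-up above does not declare them.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "cc": "http://creativecommons.org/ns#",
}

HEAD = """
<head>
  <meta name="dc:creator" content="Mark Birbeck" />
  <link rel="cc:license"
        href="http://creativecommons.org/licenses/by/2.5/" />
</head>
"""

def expand(name):
    """Expand 'prefix:local' to a full URI; return other names unchanged."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return name

def head_metadata(xhtml):
    """Return (property URI, value) pairs from meta and link elements."""
    root = ET.fromstring(xhtml.strip())
    pairs = []
    for el in root.iter():
        if el.tag == "meta" and "name" in el.attrib:
            # meta carries a literal value in @content
            pairs.append((expand(el.get("name")), el.get("content")))
        elif el.tag == "link" and "rel" in el.attrib:
            # link points at another resource via @href
            pairs.append((expand(el.get("rel")), el.get("href")))
    return pairs

for prop, value in head_metadata(HEAD):
    print(prop, "=", value)
```

An unprefixed name such as author is passed through untouched, which is roughly the situation we were in before the prefixes were added.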
The link element uses the rel and rev attributes to indicate the nature of a connection between two documents--often things such as stylesheets, Atom end-points, and so on:
<link
rel="service.post"
type="application/atom+xml"
title="XForms and Internet Applications - Atom"
href="http://www.blogger.com/feeds/8029070/posts/default"
/>
<link rel="stylesheet"
type="text/css"
href="http://www2.blogger.com/css/blog_controls.css"
/>
But what is often forgotten is that in HTML these attributes are also perfectly valid on the a tag. For example, if we wanted to indicate that an anchor in a calendar item was actually providing a description of the event, we might do this:
<a
rel="cal:description"
href="http://www.vecosys.com/2007/02/14/what-comes-next-web-30-preparing-for-semantic-web/"
>
What comes next Web 3.0 - preparing for semantic web?
</a>
You'll see from these examples that there is a large amount of similarity between link and a, and that although link is used in the head of the document and a in the body, both use rel, rev and href to specify much the same thing--a relationship with another document. Interestingly, although the main difference between the two elements would appear to be that a allows a user to navigate from the first document to the second by clicking the label, even this is not so clear-cut; browsers such as Opera, for example, will display the labels from links that are defined in the head if they use a @rel value that the browser recognises. An example of such mark-up might be:
<head>
<link rel="next" href="next.html" />
<link rel="prev" href="prev.html" />
</head>
The class attribute
In addition to the meta and link elements, and the rel/rev/href combination of attributes, HTML also provides the class attribute as a semantic hook.
The original intention of @class was as both a style selector and a general-purpose 'tag' for the element. Although good programming style has always been to use values for @class that reflect the purpose of an element, this has come much more to the fore in recent years, with a clearly discernible vogue for 'semantic mark-up'.
RDFa also uses the class attribute, but extends it by allowing namespace-prefixed values, just as in @rel, @rev and @property. Continuing our calendar entry, the class attribute can be used to indicate the 'type' of the item that we're dealing with:
<div class="cal:Vevent">
<a rel="cal:description"
href="http://www.vecosys.com/2007/02/14/what-comes-next-web-30-preparing-for-semantic-web/"
>
What comes next Web 3.0 - preparing for semantic web?
</a>
</div>
Properties for text
Whilst a can be seen as the equivalent in the body of the link element in the head, there is no equivalent to meta for use in the body.
It may not be immediately obvious why we would want such a feature, i.e., why we would want to be able to indicate textual metadata in the body of our documents. And it's certainly true that we don't gain extra functionality; in fact we could achieve everything we need using meta in the head. But one of the key ideas behind RDFa is the notion that if the Semantic Web is ever to become a reality, it needs to be easy for people to create metadata, and perhaps the easiest way to do that is to augment an ordinary XHTML 1 page. So if we're going to get people to add metadata to their pages, we need to tackle one of the most annoying aspects of current mark-up, and that is the constant need to repeat content that is playing the role of both data and metadata.
Don't repeat yourself
For example, say we have a blog entry that has the following metadata:
<head>
<meta name="dc:creator" content="Mark Birbeck" />
<meta name="dc:date" content="2007-02-15" />
</head>
The chances are that the post itself will begin with exactly the same information, but in a more human-friendly style:
<head>
<meta name="dc:creator" content="Mark Birbeck" />
<meta name="dc:date" content="2007-02-15" />
</head>
<body>
Posted by Mark Birbeck on February 15th, 2007.
</body>
If we could 're-use' the string "Mark Birbeck", then we could avoid the unnecessary repetition in the author's metadata. The RDFa property attribute provides just such a way to create the same data in the body that a meta tag would have created in head:
<head>
<meta name="dc:date" content="2007-02-15" />
</head>
<body>
Posted by <span property="dc:creator">Mark Birbeck</span> on February 15th, 2007.
</body>
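As a rough illustration of what 're-using' the visible text means to a consumer, the following Python sketch walks the body above and reports each element that carries @property together with its text content. The function name body_properties is hypothetical, and this is only a sketch of the idea, not a conforming RDFa parser.

```python
import xml.etree.ElementTree as ET

BODY = """
<body>
Posted by <span property="dc:creator">Mark Birbeck</span> on February 15th, 2007.
</body>
"""

def body_properties(xhtml):
    """Return (property, text) pairs for every element carrying @property."""
    root = ET.fromstring(xhtml.strip())
    return [
        # The element's own text plays the role that @content plays on meta.
        (el.get("property"), "".join(el.itertext()).strip())
        for el in root.iter()
        if "property" in el.attrib
    ]

print(body_properties(BODY))
```

The string "Mark Birbeck" now appears once in the document, yet it is available both to the reader and to anything that consumes the metadata.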
RDFa actually mirrors the way that the attributes from link can be used in the body, by making the content attribute (from meta) available throughout the document. However, it was thought best not to take @name along too, since when that attribute appears on an element other than meta it is difficult to know what it means. Instead we created a new attribute called property.
If we look at the steps we have gone through so far, prior to the addition of this attribute everything we have seen has been standard XHTML 1 (a, link, @rel, @rev, @href, @class, etc.). In other words, RDFa simply draws attention to the semantic features that we already had in HTML, as well as giving an interpretation to those things that are unclear. But to gain an equivalent for meta in the body--that is, something that reflects how it is used in the head--new rules were needed. It's these extra rules that sometimes lead people to think that RDFa is a language that can only be used with XHTML 2, but in fact the presence of the extra attributes doesn't stop RDFa being used in today's XHTML 1, even when viewed in today's browsers.
To illustrate the use of RDFa in XHTML 1--i.e., independent of XHTML 2--we'll use a previous post describing a forthcoming meeting about the Semantic Web. In it I've embedded some RDFa that describes the meeting itself. We saw a cut-down version of this RDFa earlier, when looking at @rel with @href on an a tag, and it looked like this:
<div class="cal:Vevent">
<a rel="cal:description"
href="http://www.vecosys.com/2007/02/14/what-comes-next-web-30-preparing-for-semantic-web/"
>
What comes next Web 3.0 - preparing for semantic web?
</a>
</div>
But the real version does exactly what we have been discussing, and re-uses the text inside the anchor to provide the cal:summary value, by using the new attribute property:
<div class="cal:Vevent">
<a rel="cal:description" property="cal:summary"
href="http://www.vecosys.com/2007/02/14/what-comes-next-web-30-preparing-for-semantic-web/"
>
What comes next Web 3.0 - preparing for semantic web?
</a>
</div>
This post lives on Blogger.com and is delivered to all sorts of browsers, yet the use of the additional RDFa attribute causes no problem at all. And the advantage of being able to augment the mark-up in this way is that an RDFa parser--whether running on a server or inside the user's browser--can make use of this information in all sorts of ways.
We've seen how adding the property attribute allows us to 'not repeat ourselves' when used with XHTML 1, and still allows the document to function normally. Now we'll see another attribute that RDFa adds to solve the problem of adding metadata about many different resources at the same time.
Multiple resources on a page
RDFa adds another attribute to HTML, called about. The reason it's needed is that many documents on the web today contain multiple 'items' and a great deal of metadata about those items. This means that simply using link and meta in the head of the document is not enough for these kinds of documents, since placing metadata in head only tells us about the document itself, and nothing about the various pieces of content within the page.
A simple example would be a blog page, where each individual post needs to have its own metadata. Although the basic metadata features in HTML aren't sufficient, by using the RDFa about attribute we can change the 'target' for some of the embedded metadata. For example, we might have two blog entries on the same page, like this:
<div>
<h2 property="dc:title">What I did on my holidays</h2>
Posted by <span property="dc:creator">Mark Birbeck</span>
<div>
First, we got on the plane...
</div>
</div>
<div>
<h2 property="dc:title">Looking forward to going on holiday</h2>
Posted by <span property="dc:creator">Mark Birbeck</span>
<div>
It will be lots of fun...
</div>
</div>
As things stand, both sets of metadata--dc:title, dc:creator, and so on--will only tell us something about the document as a whole. But by using @about we can 'localise' the metadata to refer to only one part of the document:
<div about="#post-2">
<h2 property="dc:title">What I did on my holidays</h2>
Posted by <span property="dc:creator">Mark Birbeck</span>
<div>
First, we got on the plane...
</div>
</div>
<div about="#post-1">
<h2 property="dc:title">Looking forward to going on holiday</h2>
Posted by <span property="dc:creator">Mark Birbeck</span>
<div>
It will be lots of fun...
</div>
</div>
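One way to picture this 'localising' is as a tree walk in which the nearest enclosing @about becomes the subject of every property found beneath it. The Python sketch below does just that for a trimmed copy of the mark-up above; the statements function is a hypothetical name, the wrapping body element is added only so that the snippet has a single root, and an empty subject is my own shorthand for 'the document itself'.

```python
import xml.etree.ElementTree as ET

PAGE = """
<body>
<div about="#post-2">
  <h2 property="dc:title">What I did on my holidays</h2>
  Posted by <span property="dc:creator">Mark Birbeck</span>
</div>
<div about="#post-1">
  <h2 property="dc:title">Looking forward to going on holiday</h2>
  Posted by <span property="dc:creator">Mark Birbeck</span>
</div>
</body>
"""

def statements(el, subject=""):
    """Yield (subject, property, text); '' stands for the document itself."""
    # An @about on this element replaces the inherited subject.
    subject = el.get("about", subject)
    if "property" in el.attrib:
        yield (subject, el.get("property"), "".join(el.itertext()).strip())
    for child in el:
        yield from statements(child, subject)

triples = list(statements(ET.fromstring(PAGE.strip())))
for triple in triples:
    print(triple)
```

If the two about attributes are removed, every statement comes out with the empty subject, which is the 'document as a whole' situation described earlier.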
(Bob DuCharme has a discussion of his use of RDFa with Movable Type.)
Another example would be a site that has lots of different media on one page; for example, Flickr has many images per page, and each image can have different licensing information, a different photographer, different tags, and so on.
The main point is that the HTML metadata story has needed 'beefing up' for a while if it is to cope with the changing nature of the web and handle complex pages of the kind we've just illustrated; RDFa provides the additional metadata features needed to do this.
To close our running example of a calendar entry, the finished mark-up would look like this:
<div about="#talk" class="cal:Vevent">
<a rel="cal:description" property="cal:summary"
href="http://www.vecosys.com/2007/02/14/what-comes-next-web-30-preparing-for-semantic-web/"
>
What comes next Web 3.0 - preparing for semantic web?
</a>
</div>
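Putting the pieces together, the Python sketch below shows the statements a consumer might draw out of this finished entry: one from @class, one from @rel with @href, and one from @property. It is only an illustration under stated assumptions; in particular, reporting a prefixed class value under the name rdf:type is my own shorthand for 'the type of the item', and event_statements is a hypothetical helper rather than a real RDFa API.

```python
import xml.etree.ElementTree as ET

EVENT = """
<div about="#talk" class="cal:Vevent">
  <a rel="cal:description" property="cal:summary"
     href="http://www.vecosys.com/2007/02/14/what-comes-next-web-30-preparing-for-semantic-web/"
  >
    What comes next Web 3.0 - preparing for semantic web?
  </a>
</div>
"""

def event_statements(el, subject=""):
    """Yield (subject, predicate, value) for @class, @rel/@href and @property."""
    subject = el.get("about", subject)
    for token in el.get("class", "").split():
        if ":" in token:
            # Assumption: a prefixed class value names the item's type.
            yield (subject, "rdf:type", token)
    if "rel" in el.attrib and "href" in el.attrib:
        yield (subject, el.get("rel"), el.get("href"))
    if "property" in el.attrib:
        yield (subject, el.get("property"), "".join(el.itertext()).strip())
    for child in el:
        yield from event_statements(child, subject)

result = list(event_statements(ET.fromstring(EVENT.strip())))
for statement in result:
    print(statement)
```

All three statements share the subject #talk, so a second event on the same page would simply need its own about value.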
With the about attribute, this pattern could be repeated many times within a page, once for each available event.
Schemas for validation and document creation
Perhaps the most interesting development for RDFa in the context of its relationship to XHTML 1 is the forthcoming release of some XML schemas that conform to the guidelines in XHTML Modularization 1.1. These will be used to create a dialect of XHTML that includes both XHTML 1 and RDFa, which means that a document that uses RDFa can be validated. The schemas will even allow XML editors to guide the editing process.
Conclusion
RDFa can be used now in XHTML 1, and provides an efficient, easy to learn, generic solution to the problem of embedding metadata in XHTML documents. By leveraging the already existing HTML features found in XHTML 1, RDFa provides a gentle on-ramp. But by adding to these basic features support for namespaces, and some additional attributes to cope with documents that contain many resources, RDFa provides everything that is needed for incredibly complex documents. If you're interested in making a start on RDFa, there are a number of entries in this blog that are tagged with 'rdfa'. You'll also find rdfa.info a useful resource.
Labels: declarative programming, knowledge, metadata, programming, rdfa, role, semanticweb, semweb, standards, w3c, web2.0, webapps, xhtml, xhtml2








