Thursday, April 23, 2009

More RDFa goodness from UK government web-sites

With my semweb consultancy hat on, I've been working for a few months now on a number of RDFa projects with the UK's Central Office of Information. These projects have generally followed the same pattern:
  • define a vocabulary for some specific area of interest, such as job vacancies or government consultations;
  • use that vocabulary in HTML pages, via RDFa;
  • get my colleagues at webBackplane to build a prototype application using Drupal and ARC2, that both publishes and consumes pages in the right format;
  • add an application to Yahoo!'s SearchMonkey to process the RDFa pages.
A couple of days ago the UK Civil Service web-site was updated with a new look and some new features, some of which stem from the projects I've been involved in. There is still some more testing to do, so there haven't been any firm announcements yet, but I'm allowed to talk about one particularly exciting feature: the presence of RDFa in each of the job vacancies.

Read more at More RDFa goodness from UK government web-sites on my webBackplane blog.


Monday, April 20, 2009

RDFj: Semantic objects in JSON

"Essentially, what we've done with RDFj is to map JSON to RDF -- with a few extra tweaks thrown in -- rather than simply mapping RDF to JSON. (RDFa took the same approach, starting with HTML, and then working out what RDF various patterns might represent.)

"There are of course many uses for the straightforward serialisation approach, taken by RDF/JSON. But we're finding that as our applications increasingly use both JavaScript and RDF, it's very useful to blur the lines between the two. RDFj takes us an important step towards that."


Read more at RDFj: Semantic objects in JSON on my webBackplane blog.


Saturday, April 18, 2009

Update to 'Getting started with RDFa: Creating a basic FOAF profile'

Just over a year ago I wrote a blog post that showed how to create a FOAF profile on a web page, using RDFa. The idea was not only to show how easy it was to do in terms of the markup, but also to illustrate that once you are able to publish RDF via a web page, you need nothing more than a blog page to join the semantic web.

This blog post updates that old post, by first adding some guidance on how to check your document (using the Ubiquity RDFa parser), and then proceeding to add more features to your blog page.

Read more at Getting started with RDFa: Creating a basic FOAF profile, on my webBackplane.com blog.


Friday, November 07, 2008

Compact HTML: A mark-up language for micro-blogging

This post also appears on Mark Birbeck's webBackplane blog.

When sending small comments via services such as Twitter, it's pretty straightforward to add links to other documents. The general pattern is to abbreviate the link using an online service, and then paste the shortened link into your post. Software that displays your posts can then replace any string that begins with http: with a real link.
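
As a rough illustration of that last step, the following Python sketch wraps anything that looks like a URL in a real link. The function name linkify and the regular expression are assumptions made for this example, not taken from any particular Twitter client; the pattern is deliberately naive and would, for instance, swallow trailing punctuation.

```python
import html
import re

# Naive pattern: a run of non-space characters starting with http:// or https://
URL_PATTERN = re.compile(r"https?://\S+")


def linkify(post: str) -> str:
    """Return the post as HTML, with each bare URL wrapped in an <a> element."""
    parts = []
    last = 0
    for match in URL_PATTERN.finditer(post):
        # Escape the plain text that sits between two URLs.
        parts.append(html.escape(post[last:match.start()]))
        url = html.escape(match.group(0), quote=True)
        parts.append(f'<a href="{url}">{url}</a>')
        last = match.end()
    # Escape whatever follows the final URL.
    parts.append(html.escape(post[last:]))
    return "".join(parts)
```

Everything that is not a URL is escaped, so a post cannot smuggle its own mark-up into the page.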

However, there are many occasions where a link is just not good enough. Sometimes you'd like to embed an image, or even a video. But if we start trying to add HTML mark-up, we'll pretty soon hit the character limit imposed by micro-blogging platforms.

Enter compact HTML, or CHTML...for short.

Compact HTML

The simple idea of CHTML is that we use keywords to indicate the mark-up. For example, to add an image you would ordinarily write:
<img src="http://www.fineart.ac.uk/images/works/Dundee/90/du0008.jpg" />
Which would give you this:



Of course, we can shorten the fragment by using a URL service:
<img src="http://snurl.com/252rj" />


But we can go further if we make the mark-up compact:
img=http://snurl.com/252rj
That's a pretty efficient way to transmit an image in a post.

Why HTML?

You might well ask how this relates to HTML.

The idea is that all elements and attributes from HTML can be used in a generic way. So the full version of the image we just saw, would actually be:
img(src=http://snurl.com/252rj)
We could also write:
img(src=http://snurl.com/252rj,alt=A picture of Stooky Bill)

Each element would have a 'default attribute' that anonymous values would set. In the case of img it would obviously be @src, so:
img=http://snurl.com/252rj
is equivalent to:
img(src=http://snurl.com/252rj)
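
The equivalence between the short and full forms can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used by any real CHTML processor: the names parse_chtml and to_html are made up for this example, the only default attribute known from the text above is @src for img, and the naive split on commas means that attribute values cannot themselves contain commas.

```python
import html
import re

# Default attribute for each element; only img -> src is given in the post.
DEFAULT_ATTRIBUTE = {"img": "src"}

SHORT_FORM = re.compile(r"(\w+)=(\S+)")   # e.g. img=http://snurl.com/252rj
FULL_FORM = re.compile(r"(\w+)\((.*)\)")  # e.g. img(src=...,alt=...)


def parse_chtml(token: str) -> tuple[str, dict[str, str]]:
    """Split a compact token into an element name and its attributes."""
    full = FULL_FORM.fullmatch(token)
    if full:
        element, body = full.groups()
        attributes = {}
        for pair in body.split(","):
            # Split on the first '=' only, so URLs may contain '='.
            name, _, value = pair.partition("=")
            attributes[name.strip()] = value.strip()
        return element, attributes
    short = SHORT_FORM.fullmatch(token)
    if short and short.group(1) in DEFAULT_ATTRIBUTE:
        element, value = short.groups()
        # An anonymous value sets the element's default attribute.
        return element, {DEFAULT_ATTRIBUTE[element]: value}
    raise ValueError(f"not a recognised compact token: {token!r}")


def to_html(token: str) -> str:
    """Expand a compact token into the equivalent empty HTML element."""
    element, attributes = parse_chtml(token)
    rendered = "".join(
        f' {name}="{html.escape(value, quote=True)}"'
        for name, value in attributes.items()
    )
    return f"<{element}{rendered} />"
```

With this sketch, both img=http://snurl.com/252rj and img(src=http://snurl.com/252rj) expand to the same img element.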

Other tokens


Since all we're really doing is looking for patterns of the form:
token=value
then we needn't limit ourselves to HTML in our compact mark-up.

For example, we could express a YouTube video like this:
tube=http://snurl.com/25300
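
Spotting those patterns in a post might look something like the following Python sketch. The function name find_tokens and the set of known tokens are assumptions for the purposes of illustration (img and tube are simply the two tokens used above); what a client then does with each pair, such as embedding a video player, is left open.

```python
import re

# Tokens this sketch knows about; an assumption based on the examples above.
KNOWN_TOKENS = {"img", "tube"}

# A token must start at the beginning of the post or after whitespace, so
# that something like "?a=b" inside a URL is not mistaken for a token.
TOKEN_PATTERN = re.compile(r"(?<!\S)(\w+)=(\S+)")


def find_tokens(post: str) -> list[tuple[str, str]]:
    """Return (token, value) pairs for every known token in the post."""
    return [
        (match.group(1), match.group(2))
        for match in TOKEN_PATTERN.finditer(post)
        if match.group(1) in KNOWN_TOKENS
    ]
```

Anything of the form token=value that uses an unknown token is simply left alone, so ordinary text in the post is unaffected.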

An example

To see this in action, take a look at the Compact HTML Twitter sample from the Ubiquity RDFa library. Note how two tweets from Stooky Bill are shown, one containing an image, and one containing a video. As you can see from Stooky's Twitter page, the actual tweets contain simple Compact HTML.


Tuesday, June 24, 2008

Microformats and RDFa are not as far apart as people think

The BBC have caused a bit of a storm recently by announcing that they won't be using the hCalendar Microformat on their pages. The reason is the well-known problems with accessibility. One of the proposed solutions is to look at RDFa.

Confusing syntax with vocabulary

Whilst the BBC obviously understand their web-pages, and are conscious of the issues of accessibility, they may have demonstrated here a bit of a misunderstanding of the semantic nature of the web. And in my view, most of the follow-up discussions I've seen fall into the same trap.

The main point is not to confuse the syntax used to convey information with the information itself--the vocabulary. I think this is what is happening here, because when the BBC say "we'll be looking at the possible use of RDFa", they may have missed the point that when you use RDFa, you still need a vocabulary, and if one doesn't exist, you'll need to create one.

It would be far better to look at the Microformat that is already in use (in this case, hCalendar), and see if it can be tweaked to make use of the generic nature of RDFa--but that requires a new approach to both Microformats and RDFa.

So before panic takes hold of the Microformats community, or smugness grips the RDFa one, let's try to get underneath what these two technologies are actually about.

Microformats are about vocabulary

Microformats were devised as a way to let authors add little pieces of semantics to their documents, in such a way that applications could make use of them. An application might be a search engine that can improve indexing and search, or a browser that can display extra information based on the embedded values.

There are three key advances that Microformats made. The first was to say that you can still do useful things, even if you only have a small amount of information. This was pretty radical, since up until that point the semantic web seemed to be an 'all or nothing' proposition. (For many people it still is.)

The second was to say that we should be able to publish semantic information in the same easy way that we publish web-pages--through content-management systems, blogs, and so on.

But the third advance is perhaps the most important; Microformats essentially said that we can teach end-users about a specific set of terms that help them do something useful, without having to teach them 'big picture' stuff. This meant that authors could be taught just enough mark-up to add contact details, events, licensing information, geo locations...and so on.

Problems with Microformats

Microformats opened the way for a new approach to the semantic web, but they do of course have their weaknesses.

The first is that they 'overload' many of the HTML attributes to carry semantic information, in such a way that this can interfere with the normal use of the attribute. This is why people are now having accessibility problems: some of the attributes are trying to play two or three roles.

The second weakness is that mixing vocabularies starts to get messy; because each Microformat is a combination of vocabulary and syntax, each one is quite specific, and different Microformats can therefore interfere with each other.

The third weakness is that because each vocabulary also requires 'syntax', then even if a perfectly usable vocabulary exists, it has to be 'converted' to be a Microformat.

RDFa is about syntax

RDFa started life at around the same time as Microformats, and holds with many of the same philosophies--that the semantic web is never going to happen unless metadata is as easy to publish as a web-page, and that authors should have an easy way to add information without having to understand the 'big picture'.

But RDFa set out to address a slightly different set of problems; instead of defining new vocabularies, it sought to create a generic syntax that could accommodate any vocabulary, allow multiple vocabularies on a page, and do so in any mark-up language.

And whereas the Microformat community took as a fundamental principle that they wouldn't modify HTML at all, we took advantage of one of the key extension mechanisms of HTML which is that unrecognised attributes should simply be ignored by an HTML parser (i.e., they should not throw an error). So by adding a handful of new attributes to HTML and XHTML, RDFa doesn't interfere with the existing uses of attributes, such as for accessibility.

Microformats and RDFa

Ultimately, if people can put their prejudices to one side, there should be no reason why these two ground-breaking technologies can't work together.

To illustrate, take the rel-license Microformat. It's a nice, simple, self-contained document about how to use the license value in @rel and @rev. To specify the license of a current document, an author can add something like this:
<a href="http://creativecommons.org/licenses/by/2.0/" rel="license">cc by 2.0</a>
This is perfectly valid HTML, making use of @rel with @href, and it's nice and easy to explain--the key component of Microformats.

But it's not widely known that this is also perfectly valid RDFa.

However, RDFa takes this a step further, and provides authors with the ability to talk about other things, not just the current document.

Individual licenses for images

A common situation is to have a number of images on a page, and to want to indicate the license of each of them. With RDFa you now can.

Imagine we have this image in our page:
<img src="http://www.flickr.com/photos/detached/2529217704/" />
Our previous example of a license link needs only minor modification to make it refer to the image:
<a about="http://www.flickr.com/photos/detached/2529217704/"
   href="http://creativecommons.org/licenses/by-nc-sa/2.0/deed.en_GB"
   rel="license">
  cc by-nc-sa 2.0
</a>
It's difficult to argue that this is complex, since it's only a minor change for authors.
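
To make the difference concrete, here is a minimal Python sketch of how a parser might work out what a license link is about: if @about is present it supplies the subject, otherwise the subject is the current document. The class and function names are invented for this example, the predicate URI assumes the XHTML vocabulary namespace that RDFa uses for link types such as license, and a real RDFa processor does a good deal more (resolving relative URLs and inheriting subjects from parent elements, for example).

```python
from html.parser import HTMLParser

# Assumed predicate: the 'license' link type in the XHTML vocabulary namespace.
LICENSE = "http://www.w3.org/1999/xhtml/vocab#license"


class LicenseExtractor(HTMLParser):
    """Collects (subject, predicate, object) triples for rel="license" links."""

    def __init__(self, document_url: str):
        super().__init__()
        self.document_url = document_url
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").split()
        href = attributes.get("href")
        if "license" in rel_values and href:
            # @about overrides the default subject, which is the document.
            subject = attributes.get("about") or self.document_url
            self.triples.append((subject, LICENSE, href))


def extract_licenses(markup: str, document_url: str):
    """Return the license triples found in the given mark-up."""
    parser = LicenseExtractor(document_url)
    parser.feed(markup)
    parser.close()
    return parser.triples
```

Fed the plain rel-license link, this yields a statement about the page itself; fed the version with @about, it yields a statement about the image.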

Extending rel-license

But there is no reason why this formulation shouldn't also be included in rel-license; after all, the key purpose of a Microformat is to promote re-use and provide bite-size pieces that authors can quickly learn and put to work.

rel-license could remain 'the last word' on the use of license in @rel and @rev, but it could also provide information about how to use license in HTML, XHTML and RDFa. That way there is only one place that authors need to go.

Solving the BBC problem

I said at the beginning that the BBC still needed to create a vocabulary, even if it adopted RDFa. In other words, it may gain a solution to the accessibility problem by dropping hCalendar, but it loses the advantage of having a community-maintained syntax.

That doesn't mean we have the answer straight away, but I strongly suggest that we don't throw the baby out with the bathwater here.

The Microformats community is responsible for getting many people to look again at the semantic web, after having been put off by the more theoretical approach. But it needs to build on this success, and look at updating its formats to make use of the new attributes and generic parsing algorithm provided by RDFa. That way it can solve its technical problems, and continue to lead the way in defining 'agile' vocabularies for the semantic web.

And whilst the RDFa community has shown that it is possible to have a syntax that supports the 'full' semantic web, without it needing to exist in a separate space of complex mark-up and extra documents, it should remember that a generic syntax is nothing without vocabularies.

I've just started work on an exciting project to mark up job vacancies in the UK public sector, using RDFa. And although the use of RDFa will make it very easy for departments to publish the metadata, it's still going to require the creation of a vocabulary of terms. I'm going to be looking all over for suitable terms and vocabularies, and I'll certainly be looking at how Microformats might fit in.


Thursday, May 01, 2008

Upcoming talks on RDF, RDFa and XForms

May is going to be pretty busy with talks about XForms, RDF and RDFa coming up.

First up is my talk XForms, REST, XQuery...and skimming at XTech 2008. The talk embraces themes I've been pursuing for a couple of years now; that as we put more functionality into the client, and servers get 'cleverer', it becomes much easier to build sophisticated web applications. Of course server technologies are moving so fast now that this whole approach is making more and more sense, so I'm looking forward to taking in recent developments in my talk. For example, both Amazon and Google effectively have 'databases in the cloud' that can be used to store data and query it, via APIs, with literally no configuration.

A few weeks later I'm going to be giving a tutorial on RDF at SemTech. This is an interesting development for me, because RDF and the semantic web were always my first interests--before XForms and before XHTML 2. (As well as writing RDF parsers, and designing applications, I also contributed chapters on RDF and RDFS to a couple of books on metadata and XML.)

But one problem I always had when trying to build semantic-web applications was that defining the user interface was pretty hairy. This was partly because RDF Schema is tricky to process, but also because HTML was insufficiently powerful in its core feature-set, so the translation from RDF to HTML involved a lot of work.

The need for a user interface language that was much richer than HTML was therefore why I got involved in the XForms standard (and worked with a team of people to produce the first fully conforming XForms processor, formsPlayer). So although it may not seem directly connected to the semantic web, I believe that in the coming period XForms will start to become a key part of the semantic web's architecture.

Another problem I kept coming up against whilst developing for the semantic web was the difficulty in actually publishing metadata. In particular I always found it frustrating that there was a lot of really useful metadata just sitting in ordinary web pages, and no-one could get at it. Attempting to resolve this problem gave rise to RDFa, and I'm excited that the RDFa in XHTML working draft is extremely close to becoming a stable recommendation. And as interest in RDFa grows, I'm pleased to say that some of my other presentations in May will be 'tech talks' on RDFa at Yahoo!, eBay and Google. (I'm really excited that I might be getting to meet some of the guys behind Yahoo!'s SearchMonkey.)

My final talk of the month will be at the excitingly-named Kings of Code, and I'm looking forward to talking about XHTML, XHTML 2, HTML 5, XAML, and anything else I can think of in relation to web languages.


Friday, March 21, 2008

So how about using RDFa in Microformats?

Yes, I know...everyone seems to think that RDFa and Microformats are at war. And maybe some would prefer it to be that way. But whatever way you look at it, the work of the Microformats community has been key in getting people fired up about what they might do with metadata that is placed in HTML and XHTML pages. And RDFa is benefiting from the vibrant atmosphere which they have created.

That doesn't mean I'm saying we can ignore the problems and limitations that Microformats have, such as the difficulty in mixing different formats in one document, the work involved in creating new formats (often wastefully duplicating work that has already been done, since specialist formats invariably already exist), or perhaps most significantly, the inability to refer to things that are not 'the current document'.

So I have a suggestion.

Why don't we emphasise the 'micro' in Microformats?

Emphasise the 'micro'

One of the original motivations for Microformats was that they were small, self-contained sets of rules that authors could apply to their documents, which would give the author some kind of benefit. Nothing in that broad definition says 'so therefore steer clear of all other formats or you'll catch the plague'.

So why not use RDFa features within Microformats, as appropriate, whilst still retaining the compactness of each particular format?

I'll use rel-license as an example.

rel-license

The rel-license microformat is simply the use of the value "license" in the @rel attribute, in some mark-up. For example, if we want to say that the current document is licensed under one of the Creative Commons licenses, we might use the following mark-up in our document:
<a rel="license" href="http://creativecommons.org/licenses/by/2.0/">cc by 2.0</a>

However, what if we have a search page that returns lots of images or videos? What if each image or video is available under a different license to other images or videos on the page, or to the license for the page itself? Trying to solve this problem within the framework set by Microformats is proving quite difficult (see the open issues on the rel-license issues page), so at some point we need something more than rel-license.

Reusing the @src attribute

The mark-up we gave earlier to indicate that 'this document' is available under a certain license, amounts to two attributes, @rel and @href:
<a rel="license" href="http://creativecommons.org/licenses/by/2.0/">cc by 2.0</a>
RDFa lets us take that little 'package' (or 'microformat') and use it anywhere that we can set a 'subject'. Of course normally the subject is the current document, which is why the rel-license microformat looks just the same to an RDFa parser as it does to a Microformats parser. But RDFa allows this 'package' to be used with the @src attribute, giving us the following possibilities:
<img src="my-picture.png" rel="license" href="http://creativecommons.org/licenses/by/2.0/" />

<object src="my-video.mov" rel="license" href="http://creativecommons.org/licenses/by/2.0/"></object>

As you can see, this straightforwardly solves the problem that rel-license is presenting to those who want to put many items in a page, so all that would be required to take this solution into the Microformats world would be to slightly extend the rel-license microformat to allow the src attribute as a subject.

RDFa or Microformats?

This example illustrates that if we see RDFa and Microformats as playing slightly different roles, then there is no need to take an 'either/or' approach to the two techniques.

Since RDFa is a general-purpose syntax that is capable of supporting any vocabulary that anyone comes up with, now or in the future, then we are often tempted to talk about the general rules. But for new users who are looking to achieve a specific goal, seeing documentation about a specific usage pattern (such as the one I've shown here with licensing), will almost certainly be more useful to begin with.

But that is what Microformats is all about--documenting common usage patterns in such a way that people can re-use them in their own mark-up. So what would be wrong with enhancing rel-license to allow the use of @src?

Note that people wanting to parse this microformat could either create a specific parser in the way that they need to do now for other microformats, or they could use a general-purpose RDFa parser.
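
A format-specific parser for this extended rel-license need not be large. The following Python sketch is one hypothetical way to do it, using only the standard library. The class and function names are assumptions for this example, base_url stands for the address of the page being parsed, and only @src is considered as a way of setting the subject.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RelLicenseParser(HTMLParser):
    """Maps each licensed item (the page, or an @src target) to its license."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.licenses = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "license" not in (attributes.get("rel") or "").split():
            return
        href = attributes.get("href")
        if not href:
            return
        # With @src the item is the image or video; without it, the page.
        # Joining an empty string onto the base URL gives the base URL back.
        subject = urljoin(self.base_url, attributes.get("src") or "")
        self.licenses[subject] = urljoin(self.base_url, href)


def find_licenses(markup: str, base_url: str) -> dict:
    """Return a dictionary of item URL -> license URL."""
    parser = RelLicenseParser(base_url)
    parser.feed(markup)
    parser.close()
    return parser.licenses
```

Relative values such as my-picture.png are resolved against the page address, so each item in a page of search results ends up with its own entry.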

The key thing is that the mark-up is exactly the same, whether you arrived at the use of @rel="license" via the Microformats page, or an RDFa page.

And that seems to me to be a good result all round, for the authors, for the programmers, for the search engines, and especially for those of us who want to see a more dynamic, usable, and semantic, web.


Monday, March 03, 2008

First steps in RDFa: Creating a FOAF profile

Please note that this article has been updated to Getting started with RDFa: Creating a basic FOAF profile.




Now that the RDFa syntax document is in last call, and people like Yahoo! are starting to index the data, it's worth putting more of your own data into your web-pages, using RDFa. A simple place to start is to modify your home-page or blog profile so that it includes FOAF information.

FOAF

If you're not familiar with FOAF, or Friend-of-a-friend, it's a set of terms that can be used to describe people, organisations, and their relationships to each other. For example, we can mark up our names, point to our home-pages, indicate the companies and projects we work on (and point to their home-pages), and so on.

Since this vocabulary is gaining in popularity, and since RDFa allows us to use any vocabulary we like without having to re-write it (or ask anyone), we'll use FOAF via RDFa to mark up our pages. We won't use every part of the vocabulary, so if you want to find further properties, or more detail on the properties used below, look at the full FOAF specification.

Creating a person

The first thing we need to do is to create a person object that will hold our information. This is done using the RDFa typeof attribute, which is much like @class in HTML. The type of the object we want to add is a Person and since 'person' comes from the FOAF vocabulary, we write it like this:
<html xmlns:foaf="http://xmlns.com/foaf/0.1/">
<head>
<title>Mark Birbeck's profile</title>
</head>
<body>
<div typeof="foaf:Person">
...
</div>
</body>
</html>
Now we're ready to add our personal information to this block.
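
Behind the scenes, a value such as foaf:Person is shorthand for a full URI: the prefix is looked up in the xmlns declarations and the rest of the value is appended to the result. A minimal Python sketch of that expansion, with a function name (expand_curie) chosen just for this example, might be:

```python
# Prefix mappings, as declared by xmlns:foaf on the html element above.
PREFIXES = {"foaf": "http://xmlns.com/foaf/0.1/"}


def expand_curie(curie: str, prefixes: dict) -> str:
    """Expand a prefixed name such as foaf:Person into a full URI."""
    prefix, separator, reference = curie.partition(":")
    if not separator or prefix not in prefixes:
        raise ValueError(f"no mapping declared for {curie!r}")
    return prefixes[prefix] + reference


print(expand_curie("foaf:Person", PREFIXES))
# prints http://xmlns.com/foaf/0.1/Person
```

This is why the prefix itself is unimportant; any prefix would do, as long as it is mapped to the FOAF namespace.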

Adding personal information

The FOAF vocabulary is packed with useful properties that we can set, so let's start with some basics such as our name and the URL for our blog.

We can add our name using the foaf:name property, which is set via the new RDFa property attribute:
<div typeof="foaf:Person">
  <span property="foaf:name">Mark Birbeck</span>
</div>
Our blog is indicated using the foaf:weblog property. However, unlike foaf:name which is simply a string of text, the item we're going to refer to is a URL, so we must use the HTML rel attribute instead of @property:
<div typeof="foaf:Person">
  <span property="foaf:name">Mark Birbeck</span>
  <a rel="foaf:weblog" href="http://internet-apps.blogspot.com/">XForms and Internet Applications</a>
</div>

Creating a profile

We now have a person with some properties about them, but in FOAF terms that's slightly different to having a profile about the person. If that seems a little subtle, it is, but what it amounts to is that the document that contains information about me, is not actually me. In many situations it won't appear to make any difference, but unfortunately it can cause a lot of problems. For example, if the document is used to represent both my profile and me at the same time, what would be the result of adding some information about when the document was created? How do we know whether the information is indicating when I was born, or when the document was?

FOAF allows us to prise these two things apart with the foaf:primaryTopic property, so we're going to use this to say that the main subject-matter of our profile (a web document) is me (a person):
<html xmlns:foaf="http://xmlns.com/foaf/0.1/">
<head>
<title>Mark Birbeck's profile</title>
<link rel="foaf:primaryTopic" href="#me" />
</head>
<body>
<div about="#me" typeof="foaf:Person">
<span property="foaf:name">Mark Birbeck</span>
<a rel="foaf:weblog" href="http://internet-apps.blogspot.com/">XForms and Internet Applications</a>
</div>
</body>
</html>
Whilst we're here, it might be useful to use the foaf:maker property to indicate who created the profile; in this case it's the same as the subject of the profile (i.e., the person who the profile is about), so it's easily set as follows:
<html xmlns:foaf="http://xmlns.com/foaf/0.1/">
<head>
<title>Mark Birbeck's profile</title>
<link rel="foaf:primaryTopic foaf:maker" href="#me" />
</head>
<body>
<div about="#me" typeof="foaf:Person">
<span property="foaf:name">Mark Birbeck</span>
<a rel="foaf:weblog" href="http://internet-apps.blogspot.com/">XForms and Internet Applications</a>
</div>
</body>
</html>
Now we can read this whole thing as follows:
  • we have a person that is identified as #me;
  • this person has a name of "Mark Birbeck";
  • this person has a blog at <http://internet-apps.blogspot.com/>;
  • this person is the creator of the current document;
  • this person is the main subject of the current document.
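
The same reading can be produced mechanically. The following Python sketch is a deliberately tiny reader that understands only the attributes used so far (@about, @typeof, @property, and @rel with @href) and is nowhere near a conforming RDFa parser; the class name, the function name and the document URL passed in are all assumptions made for illustration. It also leaves prefixed names and relative URLs such as #me unexpanded.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not be pushed on the stack.
VOID_ELEMENTS = {"link", "img", "meta", "br", "hr", "input"}


class TinyRDFaReader(HTMLParser):
    """Reads a small subset of RDFa into (subject, predicate, object) triples."""

    def __init__(self, document_url: str = ""):
        super().__init__()
        self.subjects = [document_url]  # stack of subjects; document by default
        self.pending = None             # (subject, property) waiting for its text
        self.text = []
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # @about sets a new subject; otherwise inherit from the parent element.
        subject = attributes.get("about") or self.subjects[-1]
        if attributes.get("typeof"):
            self.triples.append((subject, "rdf:type", attributes["typeof"]))
        if attributes.get("rel") and attributes.get("href"):
            # @rel may hold several values, each giving one triple.
            for predicate in attributes["rel"].split():
                self.triples.append((subject, predicate, attributes["href"]))
        if attributes.get("property"):
            self.pending = (subject, attributes["property"])
            self.text = []
        if tag not in VOID_ELEMENTS:
            self.subjects.append(subject)

    def handle_data(self, data):
        if self.pending:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        if self.pending:
            # Assumes an element with @property contains only text.
            subject, predicate = self.pending
            self.triples.append((subject, predicate, "".join(self.text).strip()))
            self.pending = None
        if len(self.subjects) > 1:
            self.subjects.pop()


def read_triples(markup: str, document_url: str = ""):
    """Return the triples found in the given mark-up."""
    reader = TinyRDFaReader(document_url)
    reader.feed(markup)
    reader.close()
    return reader.triples
```

Run over the profile above, this gives two statements whose subject is the document (its primary topic and its maker) and three whose subject is #me (its type, name and weblog).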

Adding friends and colleagues

Now that we have our basic framework in place, it's pretty easy to drop more and more FOAF properties in. Perhaps the most commonly used is foaf:knows which is used to indicate the people that you know. Since the target of this property is once again a URL, we'll need to use the HTML rel attribute again:
<div about="#me" typeof="foaf:Person">
  <span property="foaf:name">Mark Birbeck</span>
  <a rel="foaf:weblog" href="http://internet-apps.blogspot.com/">XForms and Internet Applications</a>
  <a rel="foaf:knows" href="http://www.w3.org/People/Ivan/#me">Ivan Herman</a>
</div>

Adding a picture

The FOAF vocabulary also allows us to indicate pictures that we appear in, and pictures that we might want others to use to represent us. To set a picture of yourself that other software might use to represent you, use the foaf:img property:
<div about="#me" typeof="foaf:Person">
  <span property="foaf:name">Mark Birbeck</span>
  <a rel="foaf:weblog" href="http://internet-apps.blogspot.com/">XForms and Internet Applications</a>
  <a rel="foaf:knows" href="http://www.w3.org/People/Ivan/#me">Ivan Herman</a>
  <span rel="foaf:img">
    <img src="http://www.formsplayer.com/files/pictures/picture-11.jpg" alt="Picture of Mark Birbeck" />
  </span>
</div>

Linking to a Twitter account

The final illustration we'll show is how to use the FOAF vocabulary to point to your Twitter account, which will make it easy to build tools that will allow people to follow you with one click. The first thing to do is create a relationship called foaf:holdsAccount, which will connect our 'person object' with an online account object. To connect two objects we use the HTML rel attribute again:
<div about="#me" typeof="foaf:Person">
  <span property="foaf:name">Mark Birbeck</span>
  <a rel="foaf:weblog" href="http://internet-apps.blogspot.com/">XForms and Internet Applications</a>
  <a rel="foaf:knows" href="http://www.w3.org/People/Ivan/#me">Ivan Herman</a>
  <span rel="foaf:img">
    <img src="http://www.formsplayer.com/files/pictures/picture-11.jpg" alt="Picture of Mark Birbeck" />
  </span>
  <span rel="foaf:holdsAccount">
    ...
  </span>
</div>
Next we create an object of type foaf:OnlineAccount, in exactly the same way that we did when creating a person earlier:
<span rel="foaf:holdsAccount">
  <span typeof="foaf:OnlineAccount">
    ...
  </span>
</span>
Finally, we indicate that the particular type of account we're dealing with is a Twitter account (using foaf:accountServiceHomepage), and also provide our account name (using foaf:accountName):
<span rel="foaf:holdsAccount">
  <span typeof="foaf:OnlineAccount">
    <a rel="foaf:accountServiceHomepage" href="http://twitter.com/">Twitter</a>
    <span property="foaf:accountName">markbirbeck</span>
  </span>
</span>
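
As a hint of the kind of tool this enables, the Python sketch below builds a link to the account from the two values in the mark-up. It assumes that the account page lives at the service home-page followed by the account name, which is a guess about the service's URL layout rather than anything FOAF defines, and the function name account_url is made up for this example.

```python
from urllib.parse import quote


def account_url(account: dict) -> str:
    """Build a link to an online account from its FOAF properties.

    Assumes (hypothetically) that the account page is found at
    <service home-page>/<account name>.
    """
    homepage = account["foaf:accountServiceHomepage"]
    name = account["foaf:accountName"]
    # Avoid a doubled slash, and escape any unusual characters in the name.
    return homepage.rstrip("/") + "/" + quote(name)


account = {
    "foaf:accountServiceHomepage": "http://twitter.com/",
    "foaf:accountName": "markbirbeck",
}
print(account_url(account))
# prints http://twitter.com/markbirbeck
```

A 'follow' button would need more than this, but the point is that both pieces of information are now available to software without any screen-scraping.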

Human and machine-readable

Everything we've marked up so far is human and machine readable, but the layout is not great for a human. Although the links to the blog will work, and the text will show names and accounts correctly, there is no context information. However, additional mark-up can be placed in the document, and as long as it is outside of the scope of the RDFa attributes it won't be classed as metadata. For example:
<html xmlns:foaf="http://xmlns.com/foaf/0.1/">
<head>
<title>Mark Birbeck's profile</title>
<link rel="foaf:primaryTopic foaf:maker" href="#me" />
</head>
<body>
<div about="#me" typeof="foaf:Person">
<span property="foaf:name">Mark Birbeck</span> writes a blog called
<a rel="foaf:weblog" href="http://internet-apps.blogspot.com/">XForms and Internet Applications</a>.
He knows
<a rel="foaf:knows" href="http://www.w3.org/People/Ivan/#me">Ivan Herman</a>.
<span rel="foaf:img">
<img src="http://www.formsplayer.com/files/pictures/picture-11.jpg" alt="Picture of Mark Birbeck" />
</span>

His inane comments are available on his
<span rel="foaf:holdsAccount">
<span typeof="foaf:OnlineAccount">
<a rel="foaf:accountServiceHomepage" href="http://twitter.com/">Twitter</a>
account. His ID is '
<span property="foaf:accountName">markbirbeck</span>'.
</span>
</span>

</div>
</body>
</html>

The whole shebang

If you want to use the mark-up as a template, then here is everything that we've seen, above. Just replace the values (the names, the URLs, the picture and the account name) with your own details, add any human-readable text you want around it, and you are off and running:
<html xmlns:foaf="http://xmlns.com/foaf/0.1/">
<head>
<title>Mark Birbeck's profile</title>
<link rel="foaf:primaryTopic foaf:maker" href="#me" />
</head>
<body>
<div about="#me" typeof="foaf:Person">
<span property="foaf:name">Mark Birbeck</span>
<a rel="foaf:weblog" href="http://internet-apps.blogspot.com/">XForms and Internet Applications</a>
<a rel="foaf:knows" href="http://www.w3.org/People/Ivan/#me">Ivan Herman</a>
<span rel="foaf:img">
<img src="http://www.formsplayer.com/files/pictures/picture-11.jpg" alt="Picture of Mark Birbeck" />
</span>

<span rel="foaf:holdsAccount">
<span typeof="foaf:OnlineAccount">
<a rel="foaf:accountServiceHomepage" href="http://twitter.com/">Twitter</a>
<span property="foaf:accountName">markbirbeck</span>
</span>
</span>

</div>
</body>
</html>

Publishing your FOAF page

Since our FOAF profile is embedded in an HTML page, you can publish it pretty much anywhere that you are able to publish HTML or XHTML. Unfortunately, I was not able to update my Blogger profile to include RDFa, so instead I've created a new blog page, which contains my profile. You'll see a few minor changes to the structure described above, to take into account that we're not creating the entire page, but essentially it's the same.


My Profile

The following is a sample profile, created using the techniques described in the post First steps in RDFa: Creating a FOAF profile:

Mark Birbeck has spent the last 7 years designing, building and thinking about a framework that enables dynamic user interfaces, driven by data content. Such a framework can dramatically increase programming productivity, and open up the world of application-building to many more people. Since he believes that the framework should be built on open standards, Mark is heavily involved in the W3C, as an invited expert with the XForms and XHTML 2 Working Groups, and also as the designer of RDFa.

His companies created the formsPlayer XForms processor, and Sidewinder, an open source, next-generation semantic web browser.

His blog focuses on building a new generation of internet applications, and a number of entries relate to Ajax, XForms, the semantic web, and the use of declarative mark-up. You can find him on Twitter, where his ID is markbirbeck.


Thursday, February 21, 2008

RDFa is now in last call

After a great deal of effort by some very dedicated people, the RDFa in XHTML: Syntax and Processing document has entered last call. The full story is on the front-page of the W3C site, or at this permalink. The status of 'last call' means that as far as the people who have been involved in creating the spec are concerned, any technical problems that might have existed have been resolved. Now it's the turn of the wider community to make comments about the document, and even try to implement RDFa processors.

If you want to read more about RDFa, I've blogged a lot about it, written an Introduction to RDFa, and presented on it. Bob DuCharme's Introducing RDFa is, as we'd expect from him, a great introduction, and Manu Sporny uses his vocal and artistic talents to provide RDFa Basics in video form.


Wednesday, February 13, 2008

Google's Social Graph API, RDFa and the future of web search

One of the main goals of RDFa (syntax and primer), and its precursors, is to provide richer semantic information for search engines. In one of my early drafts I gave the example of a news story that referred to the British Prime Minister:
Yesterday in Parliament the Prime Minister said that we will fight them on the beaches.
Such an article would be easily understood on the day of its publication, and even years later, someone reading this would probably know who the Prime Minister in question was, and possibly even be able to decipher 'yesterday'.

But what about trying to find this article? If I search in Google for articles about Prime Ministers I'll retrieve articles about the Australian and Malaysian PMs, both of whom are in the news this week, as well as links to the 10 Downing Street web-site, stories about Gordon Brown, and so on. And if I search for "Winston Churchill" the article we're using as an example won't appear, since his name does not occur in the text.

So the original proposals for RDF in XHTML--now called RDFa--suggested that one way to solve this problem would be to allow metadata about anything to be added to HTML and XHTML documents. The example given above was extended as follows:
Yesterday in Parliament the <span resource="p:WinstonChurchill">Prime Minister</span> said that we will fight them on the beaches.


The identifier p:WinstonChurchill was a made-up placeholder, indicating simply that we need a unique identifier that tells us that this particular person is Winston Churchill. Since I wrote that early draft, DBPedia has come along, which means that assigning a unique identifier to the person Winston Churchill is easy.

So if we have the identifier, and we have a way of adding the mark-up, all we need now is to get the data into the search engines, so that we can use them to find the article.

And Google's Social Graph API shows that this is already possible.

Google have released an API that allows us to search for people and their relationships to each other. The data is gathered during the normal indexing process, as long as it is provided to the indexer using the XFN microformat or via an external link to a FOAF document. This shows some interesting developments; first, that the Google indexing process is flexible enough to pick up different sorts of information and tie it back to a URL, and second that the search process is flexible enough to allow users to search for these different types of information in different contexts.

So the logical next step is to add an RDFa processor to the Google indexing pipeline. Since RDFa is a generic language, able to express not just FOAF but any RDF vocabulary, Google will only ever need to add one RDFa processor to their system. That processor will allow them to index all current and future RDFa, even if documents use vocabularies that hadn't been invented at the time the processor was installed.

This is a very different approach to the Microformats technique, since with Microformats, each format is unique and has its own parsing rules, which would mean that Google would have to add a different processor for each format. Not only that, it's very difficult to make multiple Microformats in the same document play together, and since each Microformat has to be centrally approved, there is no great momentum being built towards formats for disciplines such as engineering or chemistry.

None of which is to belittle the fact that getting XFN indexed by Google is a great start. But whilst I've given examples here that use people, since that is what the Social Graph API does, my point is that everything is in place for something far broader; imagine that chemists could search for any document about a particular compound or molecule (see RDFa, chemistry and the sharing of knowledge), or that doctors were able to find all references to a certain disease. In any number of different specialisms, experts have already created the vocabularies that are relevant to their communities, and RDFa has finally made it possible for these vocabularies to be used directly in news stories, peer-reviewed papers, blog posts and more, via HTML and XHTML. (RDFa is in the final stages of becoming a W3C-approved specification, but is already in use in many different environments.)

To complete the triangle, we now need the search engines to start indexing this data, and Google's Social Graph API initiative shows that this is finally a technical possibility.


Monday, December 10, 2007

RDFa, @profile, and following your nose

One of the issues that people continually return to in relation to metadata in HTML (and with RDFa in particular), is how to know when it is safe to interpret some mark-up as 'additional' metadata. This often gets termed 'following your nose', since the idea is that you should be able to obtain from the document you are processing the information you need to add an extra layer of interpretation to the mark-up.

(This shouldn't be confused with the other form of 'following your nose', practised by Gogol's Major Kovalyov, which is an altogether more fraught experience.)

For example, if we were processing an HTML document that contained the hCard microformat, we might have:

<span class="fn">Mark Birbeck</span>
The nose-related question being posed is, how do we know that it is legitimate to interpret the value called out by "fn" as a full-name? Is it not possible that on some occasions an author has simply created a @class value of "fn" to help their CSS?

@profile

One way this question is answered in the world of microformats is to use the profile attribute to 'scope' these values; if the author has added the following, to the top of their document:

<head profile="http://www.w3.org/2006/03/hcard">
then it is pretty clear that they want "fn" values to be interpreted as a 'full name', and it would be quite legitimate for a parser to extract this value and do something with it. The complete mark-up might look like this:

<html>
<head profile="http://www.w3.org/2006/03/hcard">
.
.
.
</head>
<body>
<span class="fn">Mark Birbeck</span>
</body>
</html>
This is a good solution for microformats, since there are quite a few of them. Each microformat has different rules for parsing, so indicating the presence of one or more microformats in the profile attribute works well, and is very much in the spirit of the original meaning of @profile in HTML.
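To make the 'scoping' idea concrete, here is a minimal sketch in Python of a parser that only treats "fn" as a full name when the hCard profile has been declared. It is an illustration rather than a real microformat parser: the class and function names are my own inventions, and it makes the simplifying assumption that an "fn" element contains only text.

```python
from html.parser import HTMLParser

HCARD_PROFILE = "http://www.w3.org/2006/03/hcard"  # profile URI from the example


class HCardNameParser(HTMLParser):
    """Collect class="fn" text, but only when <head profile> names hCard."""

    def __init__(self):
        super().__init__()
        self.profile_ok = False   # set once the hCard profile is seen
        self._in_fn = False       # currently inside a class="fn" element
        self.names = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head":
            # @profile is a space-separated list of URIs
            profiles = (attrs.get("profile") or "").split()
            self.profile_ok = HCARD_PROFILE in profiles
        elif self.profile_ok and "fn" in (attrs.get("class") or "").split():
            self._in_fn = True

    def handle_endtag(self, tag):
        # Simplification: any end tag closes the current "fn" element
        self._in_fn = False

    def handle_data(self, data):
        if self._in_fn and data.strip():
            self.names.append(data.strip())


def full_names(html):
    """Return the full names found in an HTML string (hypothetical helper)."""
    parser = HCardNameParser()
    parser.feed(html)
    return parser.names
```

Without the profile, the same class="fn" mark-up is left alone, which is exactly the caution the question of ambiguity demands.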

But what of a generic solution like RDFa? Does the same situation arise?

RDFa is unambiguous

RDFa uses new attributes, as well as full identifiers for properties and values, so the problem of ambiguity does not arise. To make use of the FOAF value name an author would declare a namespace prefix, and then use any values that are enabled by that prefix, as follows:

<html xmlns:foaf="http://xmlns.com/foaf/0.1/">
.
.
.
<body>
<span property="foaf:name">Mark Birbeck</span>
</body>
</html>
This is about as good as it gets, since it makes use of existing, well understood XML techniques to define properties unambiguously. And since it is using the new property attribute, it is nigh on impossible that we would happen upon this mark-up in a situation where the author intended some non-RDFa meaning. In other words, RDFa provides a far better solution than the usual 'follow your nose' approach, which would have us load some kind of profile (as practised by microformats and GRDDL), since it requires no other indications in the document to help discern meaning, other than a few simple, generic, rules.
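As a rough illustration of those 'few simple, generic rules', the following Python sketch expands a prefixed property value into a full identifier using the namespace declarations in the document. It is nowhere near a conforming RDFa processor: the class name is hypothetical, prefix declarations are treated as global to the document rather than scoped to an element, and only plain text content is picked up.

```python
from html.parser import HTMLParser


class PropertyParser(HTMLParser):
    """Expand property="prefix:term" values using xmlns:prefix declarations."""

    def __init__(self):
        super().__init__()
        self.prefixes = {}      # prefix -> namespace URI
        self._pending = None    # full property URI awaiting its text
        self.pairs = []         # (property URI, literal text)

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value
        prop = dict(attrs).get("property")
        if prop and ":" in prop:
            prefix, term = prop.split(":", 1)
            if prefix in self.prefixes:   # undeclared prefixes are ignored
                self._pending = self.prefixes[prefix] + term

    def handle_endtag(self, tag):
        # Simplification: a property with no text of its own is dropped
        self._pending = None

    def handle_data(self, data):
        if self._pending and data.strip():
            self.pairs.append((self._pending, data.strip()))
            self._pending = None
```

Note that nothing in the sketch knows about FOAF; any vocabulary declared with a prefix would be handled by the same code, and a stray class="fn" produces nothing at all.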

Indicating the presence of metadata

However, there is a separate problem that is often raised in conjunction with RDFa, which is how to know whether to actually parse the document or not; if an author has not included any RDFa, why spend time processing it?

This is not a problem unique to RDFa, being constantly raised in relation to all sorts of document types, since there are of course many situations where it would be preferable to know something about the content of a document before opening it and processing it.

But that should flag up to us that just because it would be desirable in some situations to avoid the extra processing, it does not mean that a flag to indicate the presence of RDFa should be hard-coded into the syntax and made mandatory, as many are suggesting. In fact, some argue that without such an indicator, the whole structure of RDFa will collapse, which is simply not the case. Since RDFa is unambiguous, detecting its presence is merely about saving on processing time, and is therefore best seen as a use-case...and as a use-case it is no more or less important than other use-cases, such as those that might involve the processing of every single HTML document.

The confusion that has arisen around the use of @profile in RDFa is caused by merging two distinct issues: that of ambiguity of meaning, and that of 'detecting' the presence of RDFa.

Confusion

When using microformats, the question of ambiguity and the 'presence' of microformats are one and the same. This is necessarily the case because microformats make use of existing HTML attributes, such as class and abbr, but populate them with values that are indistinguishable from the ordinary use of such attributes. It is therefore imperative that a microformat parser (or indeed a GRDDL processor) does not process the values unless it is given some indicator that it is safe to do so.

RDFa on the other hand uses new attributes, so the presence or otherwise of RDFa is clear. And in the cases where existing HTML attributes are used (namely @rel and @rev), RDFa values are 'scoped' anyway, so the ambiguity question does not arise.

This means that if an RDFa processor were to process every document it found, it would be extremely unlikely to come to any 'conclusions' that were invalid. That doesn't mean that every processor should parse every document, but it does show that the question of 'presence' relates to optimisation rather than being a fundamental issue.

Conclusion

RDFa does not have the same problem as microformats in relation to ambiguity. So whilst the use of @profile in microformats is a good way to resolve the ambiguity problem in that context, it is simply irrelevant for RDFa. That doesn't mean we might not wish to indicate when it is worth parsing a document for RDFa, and one way to do this may be to use @profile. But we have to be clear that ambiguity of terms and detecting the presence of content, are two separate problems.


Sunday, December 02, 2007

Embedding OWL-RDFS syntax in XHTML with RDFa

David Decraene is looking for feedback on his article Embedding OWL-RDFS syntax in XHTML with RDFa, which is a "Short introduction to RDFa, OWL and Microformats", and aims to come up with:
a solution that reconciles the ease of use of Microformats with the expressivity of a language like OWL. Some problems hindering OWL adoptation will be highlighted, and a first experiment with the use of RDFa mark-up to embed OWL data directly into an XHTML page will be demonstrated, a solution that can be considered as a step higher than Microformats on the evolutionary ladder of the web.
He argues that OWL has a poor presence on the web, attributing this mainly to the fact that no-one has explored approaches to "align / integrate OWL with current web content":
For ontological data to truly be useful, you need to somehow tie current web content with semantic classes and instances. OWL has failed miserably in this respect so far. It is much like a far away island of Eden, and a man without a canoe. All the important data could be there but no-one knows how to reach it. Granted, OWL does define a standard format for data interchange between applications, but this limited scope cannot be what the semantic vision is about.
Interestingly enough, if you replaced the word 'OWL' with the word 'RDF', you'd have pretty much the main motivations that spurred the development of RDFa in the first place. David goes on to give an example:
What we need is semantic annotation. You need to be able to tag sections of your content with explicit ontology classes and even relations, without it hindering the display of your content. If you wrote a piece on a certain bordeaux for example, you could mark up the section as being about an instance of wine, perhaps even with some properties defined (or even more amazing, just by knowing it is about wine properties can be extracted automatically, a mention of red is bound to be about the wine color).
Whilst there is no doubt that the kind of complex RDFa David uses in his post is not for the everyday user of RDFa, the fact that RDFa can be used for these kinds of complex examples shows that the architecture is solid.


Wednesday, November 21, 2007

Once more on information resources and RDFa

Benjamin Nowack has a great explanation of how RDFa deals with the very tricky issue of information resources, and indeed, offers a unique solution to the problem. (His post is in response to Ian Davis's post, Is the Semantic Web Destined to be a Shadow?.) I agree with all of Benjamin's post, and in particular it's very impressive that he draws attention to some very subtle issues, such as why it was important not to include @id processing in RDFa.

However, there is one point where I think I would be more positive than Benjamin--or perhaps a better word would be 'optimistic'. Either way, he says towards the end of his post:
One practical issue remains, though: Current browsers don't (natively) support navigating to RDF identifiers encoded in RDFa-, microformats-, or GRDDL-enabled HTML pages.
That is true, but given the points that Benjamin has made earlier in the post about how we should be careful when using URIs with fragment identifiers, this navigation question might be a non-issue. After all, if we want to navigate to some point in another document, then by definition we are dealing with an information resource anyway, and so it's legitimate to use an @id.

RDFa cleanly copes with the difference between resources that you might want to navigate to, and those that you might only want to refer to, by supporting the use of two different attributes--@href and @resource. In terms of the triples generated there is no distinction between the two, but there is a big difference in terms of their behaviour--only @href yields a clickable link, in the normal HTML manner.

Illustration

To illustrate, let's say that in my profile I indicate an identifier for me, as well as my name:
<div about="#mark" instanceof="foaf:Person">
<span property="foaf:name">Mark Birbeck</span>
</div>
This means that anyone can now refer to my 'identifier', perhaps to indicate that they know me. But it would be wrong to create a clickable link to that identifier, since as Benjamin is correctly implying--and this lies at the heart of the debate about information resources and ordinary resources--I am a person and not a web-page. (If you are interested in this topic, I wrote a longer post about some of these issues back in May last year, called The Information Resource Debate, and RDFa.)

However, whilst it would not be a good idea to create a clickable link that points at my identifier (or 'me', to all intents and purposes), there would be nothing wrong with creating a clickable link that refers to my home-page.

So, to illustrate all of this, let's say that Benjamin used the following mark-up in his web-page, to both indicate that he knows me, and to link to my home-page:
<div about="#benjamin" instanceof="foaf:Person">
<span property="foaf:name">Benjamin Nowack</span>
knows
<div rel="foaf:knows" resource="http://some.profiles.com/mb#mark">
<span property="foaf:name">Mark Birbeck</span>
(<a rel="foaf:homepage" href="http://internet-apps.blogspot.com">home page</a>)
</div>
</div>
As you can see, the distinction has been made between a link to a resource that is another web-page (known as an information resource) and a link to a resource that is a person--one uses @href and the other uses @resource.
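The difference in behaviour can be sketched in a few lines of Python. This is only an illustration, with a hypothetical class name, and it ignores the subject of each triple entirely; the point is simply that both attributes supply the object of a @rel, whilst only @href also gives the reader something to click on.

```python
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Record the object of each @rel, and whether it is a clickable link."""

    def __init__(self):
        super().__init__()
        self.links = []  # (predicate, object URI, clickable?)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = attrs.get("rel")
        if not rel:
            return
        if "href" in attrs:
            # @href: a triple AND an ordinary navigable HTML link
            self.links.append((rel, attrs["href"], True))
        elif "resource" in attrs:
            # @resource: the same kind of triple, but nothing to click on
            self.links.append((rel, attrs["resource"], False))
```

Fed the mark-up above, this records foaf:knows pointing at my identifier as a non-clickable reference, and foaf:homepage pointing at my blog as a normal link.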

Intelligent links

In the future I see this being optimised in a very interesting way, as follows.

I already have a client-side RDFa parser that whilst parsing will load many of the external documents it comes across, in the hope of finding even more triples. Once parsing is complete, the document being viewed in the browser is 'augmented' with some of the additional information that was found in the triples.

This behaviour means that, in their mark-up, someone only needs to indicate that they 'know' me, and the RDFa parser does the rest, providing other information gathered from my profile. For example, continuing with the above example, Benjamin could change his mark-up to this:
<div about="#benjamin" instanceof="foaf:Person">
<span property="foaf:name">Benjamin Nowack</span>
knows
<div rel="foaf:knows" resource="http://some.profiles.com/mb#mark">
<span property="foaf:name">Mark Birbeck</span>
</div>
</div>
The parser would see the reference to my profile and load it, looking for more triples, and one of the triples that would be gained would be the URL for my home-page:
<div about="#mark" instanceof="foaf:Person">
<span property="foaf:name">Mark Birbeck</span>
<a rel="foaf:homepage" href="http://internet-apps.blogspot.com">home page</a>
</div>
The final step is for the parser to add a navigable link to the div or span associated with me in the document being parsed, using of course the URL of my home-page. And what's really neat is that if I were to change my home-page and re-publish my profile, anyone using the reference to my profile wouldn't need to do a thing to get the updated link to my home-page.
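The look-up step can be sketched as follows, in Python. This is not the code of my client-side parser; it assumes the triples have already been extracted, and it replaces the loading of external documents with a hypothetical in-memory table so that the example is self-contained.

```python
FOAF_KNOWS = "foaf:knows"
FOAF_HOMEPAGE = "foaf:homepage"

# Hypothetical stand-in for fetching and parsing a remote profile:
# maps a document URL to the triples an RDFa parser would find there.
REMOTE_TRIPLES = {
    "http://some.profiles.com/mb": [
        ("http://some.profiles.com/mb#mark", "foaf:name", "Mark Birbeck"),
        ("http://some.profiles.com/mb#mark", FOAF_HOMEPAGE,
         "http://internet-apps.blogspot.com"),
    ],
}


def homepage_links(local_triples, fetch=REMOTE_TRIPLES.get):
    """Map each known person to the home-page found in their own profile."""
    links = {}
    for _subject, predicate, person in local_triples:
        if predicate != FOAF_KNOWS:
            continue
        document = person.split("#", 1)[0]      # strip the fragment
        for s, p, o in fetch(document) or []:
            if s == person and p == FOAF_HOMEPAGE:
                links[person] = o
    return links
```

Because the home-page is looked up each time, changing the entry in the remote profile changes the link that is added, without any change to the referring page.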

Conclusion

In my view this kind of functionality shows the benefit of keeping the world of resources and information resources apart; in the latter examples you can see that what the author is marking up is in many ways more 'correct', since they are expressing a relationship between people (i.e., that they know each other) rather than the merely technical notion of 'here is someone's web-page'.


Tuesday, November 20, 2007

Using URLs to pass parameters to web applications, widgets and gadgets

The Sidewinder Viewer contains a growing number of innovative features, and one I'd like to highlight here is the ability to pass run-time parameters for a web application, via a URL.

meta and link values

You'll probably already know that Sidewinder can make use of values in the HTML meta element to set things like its initial position, the height and width of the initial window, opacity and transparency, and so on. The ability to place important information like this into the document itself is a simple but powerful technique, since it means that you don't need to create manifests containing configuration information, giving you one less file to maintain and deploy.

However, whilst this technique is simple and straightforward when you are building your own application, what happens when you want to run someone else's web application on your desktop? Or what should you do if you want to take the map display you've created and re-use it as a large desktop application one minute and a small gadget the next? Surely we don't have to create two documents that are exactly the same except for the height and width?

The answer to both of these questions is to move some (or all) of the run-time parameters that are in the head of the document, out to the end of the URL. To do this we use the meta and link XPointer schemes.

Google Reader

For example, let's say that you wanted to load Google Reader into a window that was 900 by 500, positioned at the top of the display, and that would autohide when you moved the mouse away; this is easily achieved by slightly modifying the URL used to open Google Reader in Sidewinder, as follows:
swviewer2 http://www.google.com/reader#meta(width=900,height=500,autohide,position=top)


The result would be something like this:

Screenshot of Google Reader running in Sidewinder

Similarly, what if you wanted to run the Facebook iPhone application as a gadget on your desktop, with no chrome, docked to the side, and using the same dimensions as an iPhone? As before, take the base URL of the iPhone application, and then add the relevant meta parameters before passing the whole thing to Sidewinder:
swviewer2 http://iphone.facebook.com#meta(width=320,height=480,chrome=false,autohide,position=right-top)


The result would be something like this:

Screenshot of Facebook iPhone application running in Sidewinder

Command-line parameters

Putting parameters into the URL like this essentially places the command-line parameters that one would ordinarily expect to use when running an application--such as blah.exe -height=500 -width=900--into the URL. This in turn has the important effect of 'factoring out' the application that acts on the URL, by which I mean that the application is now transparent to the whole process of running our web application.

That might seem a little obtuse, but the point I'm making is that by using XPointer frameworks like this, we have leveraged perhaps the most important invention of the web, the URI. And just as URIs are often used as a universal document identifier, independent of any browser used to view that document, so we use URIs as a 'universal command-line', independent of any web applications processor that might process that command-line.
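To show how little machinery the 'universal command-line' needs, here is a minimal Python sketch that unpacks a meta(...) fragment of the kind used in the examples above. It is not how Sidewinder itself does it; the function name is hypothetical, the grammar is inferred only from the two URLs shown in this post, and the link scheme is not handled.

```python
from urllib.parse import urldefrag


def parse_meta_fragment(url):
    """Split a URL into its base and the parameters in a meta(...) fragment."""
    base, fragment = urldefrag(url)
    params = {}
    if fragment.startswith("meta(") and fragment.endswith(")"):
        for item in fragment[len("meta("):-1].split(","):
            if not item.strip():
                continue
            name, sep, value = item.partition("=")
            # a bare name such as "autohide" is treated as a boolean flag
            params[name.strip()] = value.strip() if sep else True
    return base, params
```

Any processor that understands the fragment can act on the parameters, and any that doesn't can still use the base URL to load the application as normal.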

User control

But this only tells half the story. For me there is something even more exciting, which is that this technique puts control over the use of an application into the hands of the people running it. For example, we just saw how the size and position of the Google Reader web application can be set in the URL with the meta XPointer framework, but we could go further than that and use the link XPointer framework so that when we run the application we use a different stylesheet. We could even refer to a script, some RDFa, and so on.

Next steps

The next step with this work is to start to formalise the concept, so that it can be used in many different circumstances. For example, although we've seen here how the technique can be used when running web applications with the Sidewinder Viewer, it would also be possible to use run-time parameters when loading widgets into frameworks such as iGoogle, passing in parameters such as preferred colour-scheme, nearest city, and so on. By having some kind of specification it will also make it possible for others to comment and provide input on how to take this work forward.

Further reading

There are more examples of how the XPointer frameworks can be used, in The 10 minute guide to Sidewinder (or 'How to turn a web app into a desktop app without programming'). For a more in-depth look at running Sidewinder as a web applications viewer, see Sidewinder as a web applications viewer.

Thursday, October 25, 2007

Converting HTML to RDF

An interesting discussion came up recently on the 'RDF in XHTML' taskforce mailing-list. The question was asked:
I was wondering whether anyone has tried to transform HTML into RDF.
The obvious answer was to describe how the content of the HTML document could be extracted, using RDFa, for example, or XSLT as used by GRDDL and the guys at SIMILE. But it seemed to me that the question had a slightly deeper meaning.

To convert HTML to RDF would not simply be to take out the content, but to actually convert the document's structure to RDF. For example, at the moment if an RDFa parser sees this mark-up:

<img src="http://...holiday.png" />
it will not generate anything. If it sees this:

<img rel="foaf:depiction" src="http://...holiday.png" />
then it will generate a triple that establishes a relationship between some object and a resource, stating that the resource is a 'depiction' of the object:

<> foaf:depiction <http://...holiday.png> .
However, you could not round-trip the document from this information. I.e., you could not take the triples and reconstruct the HTML document, or construct some other document in a different language (say Docbook) based on the higher level metadata you've stored. For that you would need to store triples that relate to the elements in the mark-up. In other words, even the first example, with no @rel value, would need to generate some triples, even if it's as simple as:

<> xh:img <http://...holiday.png> .
Obviously this is a discussion for the future, but I believe this to be an important scenario for the 'next generation web'. By storing the 'intent' of your mark-up rather than the mark-up itself, you would be insulated from changes to rendering language, it would be much easier to be device-independent, you could deliver different versions of a document to different audiences, and so on.
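As a very rough sketch of what such a 'structural' parser might do, the Python below emits a triple for every img element, whether or not it carries a @rel. The xh:img predicate is the placeholder from the example above, not part of any real vocabulary, the class name is hypothetical, and the choice to emit the structural triple alongside the @rel triple is my own reading of what round-tripping would require.

```python
from html.parser import HTMLParser


class ImgTriples(HTMLParser):
    """Emit a structural triple per <img>, plus one for @rel if present."""

    def __init__(self):
        super().__init__()
        self.triples = []  # (subject, predicate, object)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag != "img" or "src" not in attrs:
            return
        # Always record that the document contains this image...
        self.triples.append(("<>", "xh:img", attrs["src"]))
        # ...and additionally record any relationship the author stated.
        if attrs.get("rel"):
            self.triples.append(("<>", attrs["rel"], attrs["src"]))
```

With only the foaf:depiction triple you know what the image means, but not that the document contained an img element at all; it is the extra structural triple that would make reconstruction possible.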


Saturday, September 22, 2007

One step closer to bridging the clickable and semantic webs

We've reached a point where new editors' drafts of the syntax and processing document, and an introductory primer are available, ready to be reviewed by the W3C's Semantic Web Deployment Group at their next face-to-face meeting.

Take a look. RDFa is pretty close to completion...and now the fun can really begin.


Tuesday, August 28, 2007

Sidewinder and the need for a semantic web applications framework

A couple of recent discussions in the RDFa and microformat communities concern areas of particular interest to those of us working on Sidewinder, a semantic web applications framework.

The initial discussion is taking place on the microformats lists, and concerns how to allow authors to indicate what actions are available to be performed on items appearing in a document. The second discussion is taking place on the 'RDF in XHTML Task Force' list; this post provides a good summary of some of the issues.

All in all I find these very exciting discussions, because they concern exactly the types of use-case that prompted me to get involved with the XHTML work at the W3C a number of years ago.

This was because I'd been trying to create the kind of flexible user interface that these threads are describing--no doubt just as lots of other people had--and in my own endeavours I ran up against a number of very serious problems that made me conclude that it was pretty much impossible with the technologies available at the time. And since I still haven't seen a convincing solution to the problem of creating extremely flexible user interfaces, I've concluded that the issues I ran up against are of quite a fundamental nature.

Some of the problems that seem to me to be absolutely necessary to solve are:
  • an HTML page contains insufficient metadata about what its content 'means', making it difficult to work out what kind of UI constructs to render;
  • even if you are able to work out what the data means, HTML is not itself powerful enough to express the kinds of complex user interfaces that you would want to 'bind' to this underlying data;
  • and even if you work out what the data means and define complex UI components, you still can't define binding rules that indicate what widget to use with what data;
  • the browser offers only one 'paradigm' for interacting with information of interest, whilst we often want to create applications that can make use of the same rich features.

For every problem...

The first problem--that HTML is not 'rich' enough--is now largely solved by RDFa. It has taken quite a long time to get here, but I think the effort that has gone into getting RDFa right is going to be worth it. The key thing that RDFa does is to get RDF into HTML--once you've got that, all sorts of possibilities open up.

The second problem--that it's difficult to define rich user interfaces with only HTML--is largely solved by XForms. That XForms is a solution is currently not obvious to most people, so XForms remains peripheral in application development at the moment. Of course it is possible to define widgets using script, but it quickly gets very messy. There is also an enormous problem of re-use, in the recursive sense; most Ajax libraries are works of art, but very few can support the kind of complexity we need. Take the example shown here:

Metabar showing metadata for a BBC news story

Here we have widgets that are made up of other widgets to an arbitrary depth, but the key thing is that the binding mechanism is based on abstractions; it could be the run-time data type or an abstract widget type, and in both cases it means that at any point in the hierarchy a different widget could be swapped in without disturbing anything above or below. This kind of complexity is extremely difficult to achieve with procedural languages. (See also Introduction to custom controls and Understanding the MVC separation.)

The third problem--the use of binding rules to indicate which widget should be connected to what data--is still a little in the air. One part of it we've solved by specifying binding 'rules' using XPath selectors that are data type aware. This means that we can indicate that data of type 'geo location' should have one set of behaviour bound to it, whilst data of type 'time' can have a different set. These are essentially binding stylesheets and I think this is where we can answer the question posed on the RDFa list as to whether it is the author, the end-user or the browser vendor that should be in control of the widgets--the answer is all of them! By allowing users to express binding rules that override those from authors, we can achieve the best of both worlds.
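The layering of rules can be sketched very simply in Python. To be clear about the assumptions: this swaps the XPath selectors for a plain look-up by data type name, and the rule sets and widget names are invented for the illustration rather than taken from Sidewinder or formsPlayer.

```python
from collections import ChainMap

# Hypothetical rule sets: data type -> widget name. None of these names
# come from Sidewinder or formsPlayer; they only illustrate the layering.
browser_rules = {"time": "text-field", "geo-location": "text-field"}
author_rules = {"time": "svg-clock", "geo-location": "map"}
user_rules = {"time": "speaking-clock"}

# Earlier maps win, so user rules override author rules, which in turn
# override the browser defaults.
bindings = ChainMap(user_rules, author_rules, browser_rules)


def widget_for(data_type):
    """Return the widget bound to a data type, or None if nothing matches."""
    return bindings.get(data_type)
```

Here the user's choice of clock wins for 'time', the author's map is used for 'geo-location' because the user expressed no preference, and the browser defaults only apply when neither has said anything.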

Finally, the problem that the only paradigm we have for interacting with web-based data is 'browsing' is what we are trying to solve with Sidewinder. The ultimate goal is to have a framework that can be used to build any type of internet-facing application, whether web or desktop based. By allowing all applications to make use of the same semantic functionality and the same binding rules we can allow the users themselves to get control of their data and their applications. (See also Web 2.0, Copernicus and Sparticus: Moving the centre of the web.)

Silverlight clock as a custom control in XForms/formsPlayer

Once you have this kind of 'platform' (see Platform 2.0) then things really start to open up. I could build a clock widget using Silverlight, perhaps, and have that appear anywhere that the time data type is used--whether in my messaging client, my email client, my Twitter gadget, my Facebook desktop notification system, and so on. But you might choose a clock built using SVG, whilst someone else might decide to have a clock that speaks. But any of us could swap one of these behaviours for another at will, for a component we have created or one that we have downloaded from elsewhere--the ultimate, flexible, programming platform.


Wednesday, August 22, 2007

Fixing the web, Part 1

The XHTML.com site have published the first part of a two-part series on what might be wrong with the web, and how we might fix it. The idea is that the first part consists of short opinion pieces by people involved in various ways with the web, and the second part is a collection of responses from anyone who wants to send something in.

Contributors are Chris Wilson, Daniel Glazman, Joe Clark, Doug Geoffray, Roberto Scano, Jeffrey Veen, Dave Raggett, Mike Andrews, James Pearce, Nova Spivack and myself.

In my comments I focused on the need for a better way to create standards:
I actually have a very positive view of the transition to the 'future Web', and I see many of the things that one might say we need for a better Web already emerging. For example, we need the differences between browsers to be removed, but due to the high quality of libraries such as YUI, Dojo, and Prototype this is well underway. We also need there to be a convergence of application development techniques, so that the same languages can be used for the Web, desktop applications, gadgets, widgets, and so on; but with an explosion of interest in HTML, JavaScript and XForms--as well as the appearance of multi-language platforms like Silverlight--this trend also looks set to continue.

So whilst I might say that these things are crucial to the future of the Web, they are also very much underway, and barring a collective overnight loss of memory by the development community, look certain to continue. But there is one thing that to me doesn't look so assured, and that is whether standards are agreed upon around Ajax, and more generally, the whole approach to standards development.

In general, standards (at least those from the W3C) are developed in a kind of 'kitchen-sink' style, where some specification tries to include just about everything that can be devised on a particular topic--and therefore takes years to write. A good example is SVG, and even, to some extent, my own favourite, XForms. In both cases there are many useful things that could be factored out of these specifications to be made available elsewhere as smaller, more manageable components, but it is only recently that this approach has started to be tried. Good examples of the 'bite-size' approach are the 'role attribute', 'access', and RDFa, which provide techniques for adding metadata to web documents as well as making them more accessible. (See also The XHTML role attribute: small and perfectly formed and Using RDFa in XHTML 1.)

A good measure that some are learning that this is a promising approach to writing specifications is the backplane initiative at the W3C. But unfortunately, an illustration that lessons haven't been learned is the W3C's embracing of HTML 5, which is not only a case study in kitchen-sink design--let's throw everything in there--but a case study too of the problems caused by the 'not invented here' mindset.

If the W3C continues to support the creation of enormous specifications that take years rather than months to complete, then it will almost certainly become increasingly irrelevant to the 'future Web'. Of course, based on the W3C's lack of coherent leadership around the whole HTML 5 question, some might not see that as necessarily a bad thing, but it does raise the question as to whether other organisations can fill the space with a better process. For example, can the Open Ajax Alliance take a much more focused approach to standards writing, creating small specifications that can be easily combined? If so, it's possible that the vacuum would be filled, and the future of the 'future Web' would stand on a firmer footing.

Whilst I'm not quite sure where the future of standards creation will come from, the sheer dynamism we see in the Web development space makes me certain that it is imminent. If we can devote even a fraction of that creative energy to evolving a new approach to standards development, the future of the 'future Web' looks very bright indeed. But if we don't, then for the foreseeable future we are looking at increasing fragmentation, and the lack of standardisation in the browser merely being transferred to a 'lack of standardisation in the libraries'.
Anyone is free to respond to the discussion, but responses need to be sent in by September 10th. XHTML.com will then use the responses to create 'part 2'.


Tuesday, July 17, 2007

Ajax and progressive browser enhancement

Progressive enhancement is an approach to web development that has been around for a few years now. (More on PE at Wikipedia.) At its most basic it suggests building our web pages in a 'clean' and uncluttered way, and then layering functionality onto the mark-up using various mechanisms, such as stylesheets and scripts. The idea is that browsers without the additional capabilities can still render the page in some fashion, but those with additional power--such as scripting support--can add more enhanced features.

This principle actually underpins much of the work we've been doing on XHTML 2, which has involved taking HTML back to its semantic roots. In particular the work on RDFa has been to a large extent motivated by providing more semantic 'hooks' on which to attach increasingly focused functionality. (The work pre-dates the term progressive enhancement, so it isn't generally called that.)

But there is a new phenomenon afoot, which I'd like to call progressive browser enhancement. It's something we've been doing for a while with our work on XForms and formsPlayer, but it also has wider applicability.

DOM Events and standards

To illustrate the idea of PBE, let's look at eventing in the browser. Many Ajax libraries have their own eventing architecture, and they are without exception non-standard. This is a shame, since the W3C has a standard for DOM events (DOM 2 Events) which has been around for years and is very clearly defined. Of course, part of the problem is that whilst it has been implemented in Firefox, Safari and Opera, it's not available in Internet Explorer. This meant that when we began our work on formsPlayer, our XForms processor plug-in for Internet Explorer, we had to implement a DOM 2 Events component ourselves.

This does however mean that DOM 2 Events support is potentially available for all browsers, but of course the problem now is that in order to make use of it on IE, people would have to install formsPlayer. And even if we made the module available separately (which we will be doing shortly), the problem for the programmer is that they can't assume that it is installed; there is still an annoying rift between those who have browsers with DOM 2 Events support, and those that don't.

Using JavaScript to enhance the browser

To fill the gap we simply implemented a DOM 2 Events library in JavaScript. Whilst most Ajax libraries went the route of creating their own non-standard eventing architectures, we went the other way and implemented the standard. The advantage of our approach is that if our end-user is running a browser with reasonable standards support such as Firefox, Opera or Safari, they'll get native support for DOM 2 Events, and hopefully a faster experience. Similarly, if our user is on IE and has installed the formsPlayer DOM 2 Events component, they will likewise get a faster experience. It's only the middle group using the scripted version of DOM 2 Events who will have a slightly slower experience.
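The shape of this can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions, not the code of our library: the names `hasNativeEvents`, `listen` and `fire` are hypothetical, and a real DOM 2 Events implementation also needs capture and bubbling phases, `stopPropagation`, `preventDefault` and so on. The only calls taken from the standard are `addEventListener` and `dispatchEvent`.

```javascript
// Scripted fallback store: target -> (event type -> list of listeners).
const scriptedListeners = new WeakMap();

// Feature detection: does this target already support DOM 2 Events?
function hasNativeEvents(target) {
  return typeof target.addEventListener === 'function' &&
         typeof target.dispatchEvent === 'function';
}

// Register a listener, using the native implementation when there is one.
// Returns which path was taken, purely so that the sketch is easy to inspect.
function listen(target, type, listener) {
  if (hasNativeEvents(target)) {
    target.addEventListener(type, listener, false);
    return 'native';
  }
  let byType = scriptedListeners.get(target);
  if (!byType) {
    byType = new Map();
    scriptedListeners.set(target, byType);
  }
  if (!byType.has(type)) {
    byType.set(type, []);
  }
  byType.get(type).push(listener);
  return 'scripted';
}

// Dispatch an event object (anything with a 'type' property in this sketch).
function fire(target, event) {
  if (hasNativeEvents(target)) {
    return target.dispatchEvent(event);
  }
  const byType = scriptedListeners.get(target);
  const listeners = (byType && byType.get(event.type)) || [];
  // Copy the list so that listeners added during dispatch wait until next time.
  listeners.slice().forEach(function (listener) {
    listener.call(target, event);
  });
  return true;
}
```

The application author only ever calls `listen` and `fire`; whether the native or the scripted path is taken depends entirely on what the user's browser provides.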

But the key point is that the end-user now has some control, since they can add a DOM 2 Events component if they want to--most likely as part of some other package--independent of the actual web applications they are using. And for the authors of web applications life is much easier, since they are coding to a standard, rather than being forced to code to the specific event architecture of YUI, Dojo, or whatever Ajax library they want to use.

This is the key idea of progressive browser enhancement.

Threading in Google Gears

Another example of PBE is threading in Google Gears. There are a number of JavaScript libraries around that create simple threading by using the timer in JavaScript. This is fine for a lot of uses but can affect performance. However, we can follow the same approach as I described above for DOM 2 Events; if the function calls that the programmer makes to create a thread are 'standard', then although in normal operation the less-efficient JavaScript version would be used, with no change to the code the web application can take advantage of 'proper' threading, should a user have Google Gears installed. (See also: Ajax makes browser choice irrelevant...but we still need standards.)
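As a rough sketch of what such a 'standard' entry point might look like, the JavaScript below runs a task as a series of small steps. By default it yields between steps using the timer, but if an enhancement has registered a more capable backend, the same call uses that instead. Everything here is an assumption made for illustration: `runTask`, `registerBackend` and `timerBackend` are hypothetical names, and no part of the Google Gears API is shown; a Gears-based backend would have to be written separately and registered in the same way.

```javascript
// Timer-based backend: run one step, then yield before running the next.
// The 'schedule' argument is optional and defaults to setTimeout; it is
// injectable only so that the behaviour can be exercised without a real timer.
function timerBackend(schedule) {
  const defer = schedule || function (fn) { setTimeout(fn, 0); };
  return {
    name: 'timer',
    run: function (steps, done) {
      let index = 0;
      function next() {
        if (index >= steps.length) {
          done();
          return;
        }
        steps[index]();
        index += 1;
        defer(next); // give control back to the browser between steps
      }
      defer(next);
    }
  };
}

// An enhancement, if installed, registers a better backend here.
let installedBackend = null;

function registerBackend(backend) {
  installedBackend = backend;
}

// The one function the application programmer calls. It returns the name
// of the backend that was used, which is handy when experimenting.
function runTask(steps, done) {
  const backend = installedBackend || timerBackend();
  backend.run(steps, done);
  return backend.name;
}
```

The application code calls `runTask` in exactly the same way in both cases, which is the whole point: the enhancement arrives on the user's side, not in the application.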

Of course, most users are unlikely to be looking for a threading library to install, just as they will not be downloading a DOM 2 Events module. But if these components are small enough to be bundled up with other pieces of software they will gradually find their way onto users' machines as part of other, useful, packages.

However the distribution takes place, it's the end-users of our web applications that are in control of progressively enhancing their browser, not the web application programmers themselves.

These principles have underpinned the design of Yowl, a centralised notification system for web applications.

Yowl centralised notification system

Yowl is a way of providing notifications to users of web applications, but in a way that the user themselves can control. It was inspired by Growl, which runs on the Mac, and which is used by applications such as Skype and Adium; by passing all notifications through a central system users are given the power to decide which messages they are interested in, and how they want to see--or even hear--them.

Progressive browser enhancement allows Yowl notifications to make use of advanced features in the browser if they are available, but to fall back to basic notifications if not. For example, we've created a Windows component that allows messages to be displayed above the system tray. The component is invoked from script, and works with both Firefox and Internet Explorer:

Yowl notifications in IE with Sidewinder system tray component

However, if the systray component is not installed, the Yowl notification system will simply use JavaScript to show messages within the browser window. You can see a sample of the non-enhanced notifications here--it will run in all the major browsers:

Yowl notifications running in Firefox

The programmer need change nothing in their code to make use of the systray component if it is present, since as with the DOM 2 Events example, or Google Gears and threading, the user is in control of the browser enhancement, not the programmer.
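The combination of central dispatch, user control and fallback can be sketched as follows. To be clear, this is not the Yowl API: `createNotifier`, the `displays` list and the `preferences` map are hypothetical names, and the sketch assumes that each display can report whether it is available.

```javascript
// displays: ordered list, most capable first; each entry is assumed to have
//           a name, an isAvailable() check and a show(text) function.
// preferences: user-controlled map of notification type -> boolean,
//              where false means the user has muted that type.
function createNotifier(displays, preferences) {
  return {
    notify: function (type, message) {
      if (preferences[type] === false) {
        return 'muted'; // the user, not the application, decides
      }
      for (const display of displays) {
        if (display.isAvailable()) {
          display.show(type + ': ' + message);
          return display.name;
        }
      }
      return 'dropped'; // nothing was able to show the message
    }
  };
}
```

An application would simply call `notify('new-mail', 'You have 3 new messages')`. Whether that appears above the system tray, inside the browser window, or not at all is decided by what the user has installed and which notification types they have muted.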

(For more on Yowl see the Yowl open source project page, Yowl: Design principles and how to use it and Yowl: An open source centralised notification system.)

Where does this leave the browser?

There is still plenty of scope for the browser to evolve and make available the kind of features we're talking about here. But the truth is that progressive browser enhancement makes browser evolution far less significant than it has been until now. That's probably not a bad thing, given the chaos that has surrounded the development of HTML since the W3C gave its support to HTML 5; the spec itself is becoming increasingly top-heavy, and bears little relationship to the early principles of HTML. Far from relying on enormous 'kitchen-sink' specifications, PBE looks to add new features in a more nimble way, via small and focused modules.
