dan forys


Microformats and me

Posted on 2007-06-11

Having attended the hugely motivating @media conference last week, I’ve come away feeling more enthusiastic about the up-and-coming microformats movement.

Last year, during @media 2006, I was introduced to the concept for the first time. In its simplest form, it’s a way of wrapping web content in a set of standardised class names. The idea is that if everyone uses the same class names, other systems can very easily parse the page and pluck out the data. The example given was marking up personal details as an hCard, which could then be linked to the Technorati contact generator. The generator parses the page and offers you a vCard file for download, which you can then import into Outlook or some other address book software.

For example, I may have some contact details about myself:

Name: Dan Forys, website: http://dan.forys.co.uk/

To “microformat” it, I could do this:

<div class="vcard">
Name: <span class="fn">Dan Forys</span>,
website: <a class="url" href="http://dan.forys.co.uk/">http://dan.forys.co.uk/</a>
</div>

Simply by applying the classes “vcard”, “fn” and “url”, I have specified that the information contained within is in the “hCard” format. Now, the browser, another website, or some other software could read this page and fetch that information – perhaps offering to put it in the user’s address book. Simple!
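To give a feel for how little a consumer of that markup has to do, here is a minimal sketch in Python of reading the hCard back out and turning it into vCard text. It is an illustration only, not how Operator or the Technorati generator actually work. The names `HCardParser` and `to_vcard` are made up for this example, only the "fn" and "url" properties are handled, and a strict vCard would also need an N property and value escaping, which are left out here.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HCardParser(HTMLParser):
    """Hypothetical helper: collects "fn" and "url" from each "vcard" element."""

    def __init__(self):
        super().__init__()
        self.cards = []          # one dict per hCard found
        self._depth = 0          # current element nesting depth
        self._card = None        # the hCard being read, if any
        self._card_depth = None  # depth of the element carrying class "vcard"
        self._fn_depth = None    # depth of the element carrying class "fn"

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self._depth += 1
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self._card is None:
            if "vcard" in classes:
                self._card = {"fn": "", "url": None}
                self._card_depth = self._depth
            return
        if "fn" in classes and self._fn_depth is None:
            self._fn_depth = self._depth
        if "url" in classes and tag == "a" and self._card["url"] is None:
            self._card["url"] = attrs.get("href")

    def handle_data(self, data):
        # Only text inside the "fn" element contributes to the name.
        if self._fn_depth is not None:
            self._card["fn"] += data

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._depth == self._fn_depth:
            self._fn_depth = None
        if self._depth == self._card_depth:
            # Closing the "vcard" element: tidy whitespace and store the card.
            self._card["fn"] = " ".join(self._card["fn"].split())
            self.cards.append(self._card)
            self._card = None
            self._card_depth = None
        self._depth -= 1


def to_vcard(card):
    """Build simplified vCard text from a parsed card (no N property, no escaping)."""
    lines = ["BEGIN:VCARD", "VERSION:3.0", f"FN:{card['fn']}"]
    if card["url"]:
        lines.append(f"URL:{card['url']}")
    lines.append("END:VCARD")
    return "\r\n".join(lines) + "\r\n"


page = (
    '<div class="vcard">Name: <span class="fn">Dan Forys</span>, '
    'website: <a class="url" href="http://dan.forys.co.uk/">'
    'http://dan.forys.co.uk/</a></div>'
)
parser = HCardParser()
parser.feed(page)
print(to_vcard(parser.cards[0]))
```

The parser never looks at the visible labels ("Name:", "website:"), only at the class names, which is the whole point of the format.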

At the time, I was somewhat skeptical – since it seemed very much a developer-centric technology. Unless the site developer had the foresight to embed something in the page (like a link to the Technorati generator), the user would not be able to do anything with the data. Basically, microformats depend 100% on the site developers marking up their code appropriately.

Fast forward to 2007, and I’d more-or-less forgotten about microformats until the @media talk by the massively intelligent Tantek Çelik. This time, I’m much more enthusiastic about the whole idea, and this is why:

Firefox 3 will have support for microformats built in (and this is mainly what’s converted me to the idea…).

The Operator extension for Firefox automatically highlights microformatted information and offers you ways of working with it (downloading a vCard, for example). No linking necessary!

There’s a set of open-source tools for interpreting microformats that you can install on your own site, or you can link to externally hosted ones like those at Technorati.

There are tools that let you perform SQL-like queries on microformatted pages (I guess through XSLT or just parsing the pages…). Although rather complex, they let developers run wonderful queries on pages to extract exactly the data they require. Effectively, they let developers mash up microformatted websites.

Because microformats use (X)HTML class names, there is no need to change the meaning (semantics) of the markup used. The approach also lends itself very well to applying CSS to the information (of course, CSS has its own class selectors for you to use…).

I firmly believe that once the browser support is there, this is an excellent way of doing things. I wonder if Microsoft will follow Firefox’s lead with this?
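As a rough idea of what an SQL-like query over microformatted data might look like once the hCards have been extracted, here is a small Python sketch. It does not represent any of the real query tools. The `select` function and the sample records (including the "source" field and the second person) are invented purely for illustration.

```python
def select(records, fields, where=lambda record: True):
    """Return the requested fields of every record that satisfies `where`."""
    return [
        {field: record.get(field) for field in fields}
        for record in records
        if where(record)
    ]


# Hypothetical hCard data, as if already extracted from two different sites.
cards = [
    {"fn": "Dan Forys", "url": "http://dan.forys.co.uk/", "source": "site-a"},
    {"fn": "Example Person", "url": None, "source": "site-b"},
]

# Roughly: SELECT fn, url FROM cards WHERE url IS NOT NULL
with_url = select(cards, ["fn", "url"], where=lambda c: c["url"] is not None)
print(with_url)
```

Because every site uses the same property names, records from different sites can be mixed in one list and queried together, which is what makes the mash-up idea workable.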

Working in an academic institution, I can see microformats being a great tool for us: we have hundreds of personal profile pages for the staff here. It would be great if they could be marked up to let visitors easily add the information to their address book. It just so happens that we need to rewrite the profile pages anyway to organise them better. We also have an events system, which we are looking at rewriting too, that would lend itself very well to the hCalendar format.
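For the events side, a sketch of how an events system could emit hCalendar markup might look like the following Python. The function name `to_hcalendar` and the sample event are hypothetical. The class names "vevent", "summary", "dtstart" and "location" come from hCalendar, and the machine-readable date is carried in the title attribute of an abbr element.

```python
from datetime import date
from html import escape


def to_hcalendar(summary, start, location):
    """Render one all-day event as an hCalendar "vevent" fragment."""
    return (
        '<div class="vevent">\n'
        f'  <span class="summary">{escape(summary)}</span>,\n'
        f'  <abbr class="dtstart" title="{start.isoformat()}">'
        f'{start.strftime("%d %B %Y")}</abbr>,\n'
        f'  <span class="location">{escape(location)}</span>\n'
        '</div>'
    )


# Hypothetical event, invented for this example.
print(to_hcalendar("Staff seminar", date(2007, 6, 20), "Room 1"))
```

Only the wrapper changes: the visible text of the event stays whatever the page already shows, so a rewrite of the templates is mostly a matter of adding class names.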

In our case, there are a couple of teeny barriers to implementation that need some thought, though:

Staff are (understandably) twitchy about publishing their personal information on the web, and microformats make it much easier to harvest that information. In reality, their information is already on the web, but the perceived problem is that they might get pestered more by the outside world (eek!).

Calendar events are not implemented consistently across the major players (MS Outlook, Google Calendar, etc.). After some initial experimentation, my line manager found that a different event download needs to be offered for each calendar application. Of course, this is certainly not the fault of the microformats spec. Can we have a consistent and quirk-free calendar file format, please?

Still, these are relatively minor issues, and microformats have been developed to encompass a lot of useful data types (events, contacts, resumes and reviews, to name a few). So I think this is something we’ll be incorporating here as much as we can.