Who Dictates The Future of XML?

We are on the brink of a very exciting time. The buzzword-friendly “Web 2.0” is here, and it’s most punctuated by three terms: social networking, AJAX, and RSS. Nothing about these things is inherently new – AJAX existed as an ActiveX control present in Microsoft’s Outlook Web Access long ago, social networking has existed for some time via sites like Friendster, and RSS is just a style of XML, which has been floating around in mainstream tech circles for about 10 years. But Web 2.0 is here, like it or not. The question is, as use of these technologies begins to become more widespread, how are we going to shape these technologies, and who is going to make those decisions?

Some time ago, I started playing with styled RSS on my website. I used standard CSS rather than the more advanced XSLT (“Extensible Stylesheet Language Transformations”), because all I needed was style, not the amazing transformations that XSL can accomplish, which include introducing dynamic elements that appear to add JavaScript-like functionality. The purpose was twofold. The first reason was to prove I could do it. This is very often the motivation behind things I code: to do it just to have done it. It is the entire reason my website has an API: I wanted to experiment with handling Blogger/MetaWeblog API requests.
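For readers who haven't tried it, attaching CSS to a feed takes a single xml-stylesheet processing instruction placed ahead of the root element. The following is only a minimal sketch in Python, not the code behind my site: the helper name `styled_rss`, the stylesheet name `feed.css`, and the sample URLs are all assumptions made for illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def styled_rss(title, link, description, items, css_href="feed.css"):
    """Return an RSS 2.0 document that asks the browser to apply a CSS file.

    `items` is a list of (title, link) pairs. `css_href` is a hypothetical
    stylesheet location; point it at wherever the CSS really lives.
    """
    parts = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        # The processing instruction must appear before the root element.
        '<?xml-stylesheet type="text/css" href=%s?>' % quoteattr(css_href),
        '<rss version="2.0">',
        "<channel>",
        "<title>%s</title>" % escape(title),
        "<link>%s</link>" % escape(link),
        "<description>%s</description>" % escape(description),
    ]
    for item_title, item_link in items:
        parts.append(
            "<item><title>%s</title><link>%s</link></item>"
            % (escape(item_title), escape(item_link))
        )
    parts += ["</channel>", "</rss>"]
    return "\n".join(parts)


doc = styled_rss(
    "Example feed",
    "http://example.com/",
    "A feed used only for illustration",
    [("First post", "http://example.com/1")],
)
print(doc)
```

The CSS file itself then targets feed elements by name (for example `item` or `title`) exactly as it would target HTML elements.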

The second reason for styled RSS was a proof of concept for an internal application for my company. The idea was a search engine on top of some of our internal databases that returned styled XML (either RSS or Atom) to the client. This way, our other internal apps could query the exact same page that people read in a browser and still parse the data reliably. What a novel concept: interacting applications, user-friendly search, and half the code, all from a single interface. Perfect.
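The "same page, two audiences" idea works because an XML parser simply skips the stylesheet instruction that a browser acts on. Here is a rough sketch of the consuming side, using Python's standard library; the sample feed, the `intranet.example` URLs, and the `parse_results` helper are hypothetical and stand in for whatever the internal search page would actually return.

```python
import xml.etree.ElementTree as ET

# Hypothetical search-results feed: styled for browsers, parseable by apps.
SAMPLE = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="results.css"?>
<rss version="2.0"><channel>
<title>Search results</title>
<link>http://intranet.example/search</link>
<description>Hypothetical results feed</description>
<item><title>First hit</title><link>http://intranet.example/doc/1</link></item>
<item><title>Second hit</title><link>http://intranet.example/doc/2</link></item>
</channel></rss>"""


def parse_results(xml_text):
    """Return (title, link) pairs for every <item> in an RSS document."""
    root = ET.fromstring(xml_text)
    # The xml-stylesheet instruction is not part of the element tree,
    # so the styling has no effect on what a client application sees.
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


for title, link in parse_results(SAMPLE):
    print(title, link)
```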

Things were coming along nicely, until I tried Firefox 2. I realized very quickly that Firefox 2 decided to override my style in favor of its own stylesheet. As far as I could tell, there was no way to shut it off. So I opened a bug in Mozilla’s Bugzilla database. As you can see, the developers marked it WONTFIX, meaning, they didn’t consider it a bug and won’t fix it. Luckily, after some time and several angry posts, it was reopened.

Now, lest anyone think I’m taking advantage of my status at OSNews to gripe publicly about my personal bug, there’s a point to this beyond this bug, so bear with me.

The developers had made their stance on this bug known and suggested moving the conversation to USENET. Before moving the discussion there, I decided to check in with other popular browsers. I fired up IE6 to test. As expected, IE6 displays the RSS feed properly styled. Next I tried the new Internet Explorer 7 and found that, sure enough, IE7 overrides my style as well. IE7, however, is different in three major ways:

  1. Like Safari, IE7 is a full featured feed reader: you can subscribe in the browser and manage read/unread feeds. This shifts the action from viewing a single feed to a secondary goal of the application, feed management.
  2. IE7 adds functionality beyond what RSS alone offers, such as the ability to organize and reorder entries by date or title, and to search within the active feed and the feed history (XSLT can also accomplish some of these things, but not all of them).
  3. Most importantly, IE7’s so-called “feed view” can be turned off in the browser preferences (Tools > Internet Options > Content) and you can view the feed with the intended style.

An RSS feed in IE7

IE7’s default feed view, and the same page with the xml-stylesheet applied properly
(note that some CSS2 is still missing, such as :after { content: }).

Next, I checked out Opera to see how it handles XML feeds. Opera is also a full-featured feed reader that allows you to subscribe to and read feeds from within the browser. When you visit a feed, an alert window, much like a JavaScript alert, pops up and asks if you’d like to subscribe to the feed. In the meantime, the feed is rendered in the background with the XML style properly applied. Unfortunately, if there is no stylesheet, Opera renders the raw feed as unstyled text, which is the ugliest result of the three browsers. It doesn’t even present a DOM tree for unstyled XML, which was the default behavior of Firefox and IE before their current versions.

An RSS feed in Firefox 2

Lastly, I searched for workarounds for this behavior in FF2, and found one that amounts to little more than a dirty hack. Firefox determines whether a file is a feed by sniffing its first 512 characters and searching for the string “<rss” or “<feed”. If either is found, the XML is presented as a feed with the feed stylesheet. The unofficial workaround is referred to as “byte-stuffing,” and boils down to adding at least 512 characters (spaces included) to your feeds before the opening <rss> or <feed> tag. In my case, I’ve used an XML comment on my site’s feed, because the default Firefox stylesheet can be confusing. It can be deceiving, and it’s hard to even tell where each entry ends: take this feed (pictured at right), for example, if you’re using Firefox 2. Who can understand what’s going on there?
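To make the hack concrete, here is a small Python sketch of both halves: a check that mimics the sniffing as described above, and a padding step that pushes the root tag past the sniffed window. This models the behavior as I understand it rather than Firefox's actual source, and the function names `looks_like_feed` and `byte_stuff` are made up for this example.

```python
SNIFF_WINDOW = 512  # leading characters inspected, per the description above


def looks_like_feed(document, window=SNIFF_WINDOW):
    """Simplified model of the sniff: is a feed root tag near the top?"""
    head = document[:window]
    return "<rss" in head or "<feed" in head


def byte_stuff(document, window=SNIFF_WINDOW):
    """Insert a padding comment just before the root tag.

    The comment holds `window` spaces, so the root tag ends up beyond
    the first `window` characters. An XML comment is legal at that
    position, so the padded feed stays well-formed for real parsers.
    """
    positions = [
        p for p in (document.find("<rss"), document.find("<feed")) if p != -1
    ]
    if not positions:
        return document  # nothing that looks like a feed root; leave as-is
    root_at = min(positions)
    padding = "<!--" + " " * window + "-->\n"
    return document[:root_at] + padding + document[root_at:]


feed = (
    '<?xml version="1.0"?>\n'
    '<?xml-stylesheet type="text/css" href="feed.css"?>\n'
    '<rss version="2.0"><channel><title>t</title></channel></rss>'
)
stuffed = byte_stuff(feed)
print(looks_like_feed(feed), looks_like_feed(stuffed))
```

The padding goes after the XML declaration and the stylesheet instruction on purpose: the declaration has to stay at the very start of the file, and only the root tag needs to move out of range.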

Knowing how feeds are generally handled, which is vastly different from browser to browser, I posted my piece to the USENET group mozilla.dev.apps.firefox. It has become the longest thread in this group in quite some time. There, we’ve had some very interesting conversation. The Mozilla developers argue that RSS and Atom are intended to be read by a feed reader, and therefore presenting a consistent and easy-to-use interface to users is the smartest path and an intentional design decision. The opponents argue that Mozilla has overridden developers, deprecated an existing valid XML construct, and in the meantime prevented any Firefox user from ever seeing styled XML (at least, RSS and Atom) as intended. In short, a feed can no longer be read by a human with its intended style, because the style is not accessible: not via a preference, not via about:config, not via a menu. There is no way to view a feed as designed without hacks, even if the client wants to. Firefox thinks it knows better.

For the record, similar discussions have occurred in different venues regarding IE7. As seen here, it’s been the same conversation, although over on Channel 9, most seem to agree this behavior is not acceptable.

Either way, the USENET discussion appears to be at a stalemate. My final arguments included the notion that as a technology becomes popular, its designers often tend to cater to a wider userbase and alienate the very users who helped grow the application to begin with. Several other respondents had similar viewpoints. Ultimately, right or wrong, the developer standpoint seems to be that RSS and Atom are not intended for human consumption and should be styled in a manner that makes it easier for a user to subscribe to a feed automagically. All of this is fine, except it breaks the system for anyone who has come to use and rely upon XML as it has been used for the last several years, in a way that is still valid according to W3C standards. Heavily styled sites that have worked for years will no longer work without a hack.

Which brings us to the underlying question, one that goes beyond this particular debate: when common use begins to define a course of action, but there is already a “de facto” standard (that is, one accepted as the common manner of use), how can we, and how should we, go about changing it? Furthermore, who gets to do it?

It appears the Mozilla Foundation is going to push XML subscription forward kicking and screaming, and in doing so, they are going to anger at least some group of developers and webmasters. In the process, they could potentially thrust a new technology onto a whole new group of users – a net positive, I’m sure we’d all agree. However, it’s equally likely that RSS will never catch on with average users, while the same content providers remain angry about this type of decision.

Perhaps this toe-the-line kind of decision making is a necessary sacrifice. Perhaps it’s just the price of moving technology forward. I see it as a shame, because technology advocates, an admittedly small group, tend to be very vocal, and more importantly, they are, in large part, the people who delivered the 10-15% of the browser market that Firefox has.

Here’s the catch: if Mozilla developers don’t dictate the future of common XML use, who does? Microsoft? The W3C, whose role is supposed to be writing and maintaining standards? Is it users, who will employ technology as they wish, but ultimately, can’t contribute back code to make things work as they think it should?

An RSS feed in IE7

A padded RSS feed in Firefox 2
Everything aside, there are still major differences in the way browsers handle and display RSS. This is the exact same styled feed in IE7 with feed view off and Firefox 2 with the byte-stuffing hack.

XML is going to play a large part in data exchange in the future. For about 8 years now, the idea of exchanging information over the internet in a verbose, validatable manner has been growing. The numerous website APIs, and even the mere existence of sites such as digg.com, technorati, and reddit (social bookmarking sites that can drive large amounts of traffic to your site), are proof that data syndication is not only viable but is likely where the next wave of growth will be. Businesses that are sending spreadsheets around will soon find that XML is capable of helping them automate much of their workflow. However, in doing their best to make XML more friendly, browser makers have crippled technologies such as Atom.

When technical folks are left out in the cold with no solutions, they tend to migrate to a new product or a new technology that can accomplish what they are trying to do without dirty, unreliable workarounds. The future of XML has not been cemented. It remains to be seen whether or not RSS ever catches on in a meaningful way or remains another niche tool for a small group of advanced users. In the meantime, who gets to steer the ship? And moreover, who should it be?
