October 5th, 2006
(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: firstname.lastname@example.org
Syndication feeds such as RSS and Atom have the power to automate the delivery of all forms of digital content. The word “content” can refer to weblog posts or MP3s, the U.S. President’s last speech or the photos of your last family vacation that you’ve uploaded to Flickr. If you subscribe to a podcast (perhaps using iTunes) then you’re using an RSS feed. Any form of content that can be digitized can be delivered through a syndication feed, and therefore that delivery can be automated, and secondary events can be triggered because of that automation. As such, syndication feeds are enormously important, and the full elaboration of their potential is still in an early phase.
However, the most popular syndication feed is RSS, and its development has been marked by in-fighting among those most interested in it. The conflict has left important issues unresolved, and those unresolved issues have slowed the pace at which this technology becomes stable. There are real economic costs associated with this in-fighting. For instance, companies such as Netvibes and Bloglines, which offer RSS reading services, have struggled to support the many competing interpretations of RSS. Sam Ruby has documented many of Bloglines’ frequent failures to handle the variations that an ambiguous specification has inflicted upon it. Personally, I’m unable to use Netvibes to read some of my favorite weblogs, because Netvibes chokes on certain kinds of feeds.
These and many other companies are struggling to deal with a chaotic situation. Because the specification for RSS remains ambiguous, these companies must spend money trying to develop a patch for every possible interpretation of what an RSS feed can be. The additional expense is delaying the moment when syndication feeds can be viewed as reliable. As such, anyone with an interest in the Web, or an interest in any form of content that can be delivered over a network, should be concerned with the troubles afflicting RSS.
RSS grew out of earlier attempts to offer some kind of online syndication, such as Microsoft’s Channel Definition Format and the ScriptingNews format developed by Dave Winer. The first actual RSS spec was developed by Dan Libby at Netscape in 1999. Almost immediately, Dave Winer realized the potential of RSS, and began to promote it. He was eventually joined by many others, people such as Sam Ruby and Mark Pilgrim, who jointly wrote a validator so that developers could test their RSS code. However, Mark Pilgrim and Sam Ruby later fell out with Dave Winer, whose leadership of the RSS effort was eventually called into question.
The most important recent expansion of RSS was the use of enclosures to reference multimedia files from inside the feed. Dave Winer developed the idea in 2001 and incorporated it into the 2.0 version of RSS, but the public didn’t become aware of the implications until 2004, when podcasting began to take off (a podcast is an RSS feed that references multimedia files, like MP3s).
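To make the idea concrete, here is a minimal sketch in Python of what an enclosure looks like and how a feed reader might pull one out of a feed. The feed itself is made up (the example.org URLs, the titles, and the file size are all hypothetical), and list_enclosures is my own illustrative helper, not part of any RSS library. The one detail taken from the RSS 2.0 spec is that an enclosure carries three attributes: url, length, and type.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: the URLs, titles, and file size are invented for illustration.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <link>http://example.org/</link>
    <description>An illustrative feed</description>
    <item>
      <title>Episode 1</title>
      <link>http://example.org/episodes/1</link>
      <enclosure url="http://example.org/audio/episode1.mp3"
                 length="12345678" type="audio/mpeg" />
    </item>
  </channel>
</rss>"""


def list_enclosures(feed_xml):
    """Return (item title, url, length, type) for every enclosure in the feed."""
    root = ET.fromstring(feed_xml)
    found = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        # findall rather than find: since the spec is silent on whether an
        # item may carry several enclosures, a cautious reader accepts them all.
        for enc in item.findall("enclosure"):
            found.append(
                (title, enc.get("url"), int(enc.get("length", "0")), enc.get("type"))
            )
    return found


ENCLOSURES = list_enclosures(FEED)
for title, url, length, mime_type in ENCLOSURES:
    print(title, url, length, mime_type)
```

A podcast client does little more than this: it finds the enclosure, downloads the file at the url, and uses the type to decide what to do with it.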
In the 2.5 years I’ve been a member of the RSS Advisory Board, three questions have been asked most often by programmers having difficulty interpreting the RSS 2.0 specification:
1. Can an item contain more than one enclosure?
2. What elements are allowed to contain HTML?
3. How do I deal with relative URLs?
I think it’s time that the board answered them.
The doubts surrounding the issues raised by these three questions have led to buggy, incompatible implementations of RSS. Sadly, whenever anyone, or any organization, tries to resolve these three questions, they tend to get sued by Dave Winer. Winer argues that no changes should be allowed to the 2.0 specification because resisting changes makes it a stable platform. But most programmers ask, how can it be stable if it has unresolved issues that lead to innumerable bugs? As Shelley Powers wrote:
There is a belief that if it weren’t for the fact that the earliest versions of HTML were unstructured – full of proprietary idiosyncrasies and ill-formed markup indulged by too-loose browsers – the web wouldn’t have grown as fast as it did. Somehow, we’ve equated growth with bad and imprecise specifications rather than the more logical assumption that the growth was due to interest in an exciting new medium.
As such, we’ve carried forward into this new era in web development an almost mythical belief in bad specifications. If we wish to have growth, we think to ourselves, we mustn’t hinder the creative spirit of the users by providing overly rigorous specifications. Because of this belief, we’re still battling ill-formed, inaccessible web pages created by a legion of web page designers who picked up some pretty bad habits: namely the use of deprecated attributes and proprietary elements, as well as the use of HTML tables for everything. …
In the end, rather than aid the growth of the web, bad specifications slowed it down as a new generation of web pages had to be created out of the ashes generated by burning the old. …
That’s why I look with some confusion at the backlash against efforts to clarify the RSS 2.0 specification. There is no doubt-none whatsoever-that the RSS 2.0 specification, as currently written, is ambiguous; from what we’re hearing now, in comments and email lists, it is being kept deliberately so. I don’t understand this. This would be no different than to ask Microsoft not to follow standardized use of CSS in the new IE 7.x. Why on earth would anyone want this?
There are, of course, multiple versions of RSS. The most important early split was between those who came up with RSS 1.0 and Dave Winer, who came up with another version of RSS he called “RSS 2.0”. Although the original RSS specification was developed at Netscape, Dave Winer played a large role in the early evolution of RSS and he was the first to copyright a version of it. Winer feels it is unfair that others sometimes use the word RSS to refer to versions of RSS that were not created by him. Mark Pilgrim has carefully documented how this split first arose in the RSS community.
What developers need most, regarding this or any format, is open access to all possible information that might affect the code they write. To achieve that, the development of many formats is often turned over to standards bodies (such as the World Wide Web Consortium), which then guide the development of the format in an open and democratic way. The problem with RSS 2.0 was that all the important information about how it worked was stored in Winer’s head. Every once in a while, Winer would make a pronouncement, revealing facts about RSS 2.0 that no one had known before. Consider the rage that was provoked when, one day in 2003, Winer revealed that the “link” element should only refer to the weblog post whose text is being syndicated in the RSS feed. Said Mark Pilgrim:
If this is the case, I’m shocked. It’s the first I’ve heard of it, and I wrote the f*ing validator, which is not to say that I’m god, but merely to say that I’ve volunteered to dive deeper into RSS arcana than almost anyone else. And this is news to me.
According to this new (COMPLETELY UNDOCUMENTED BEFORE TODAY) rule, UserLand’s recently-announced New York Times RSS feeds uses LINK incorrectly.
– [Mark Pilgrim]
A few months later (February, 2004), Mark Pilgrim set out to document the many versions of RSS that had sprung up:
I have often stated that there are 7 different and incompatible versions of RSS. This was based on an embarrassingly simple formula: I counted the version numbers in use (0.90, 0.91, 0.92, 0.93, 0.94, 1.0, and 2.0) and came up with the number 7. But recently some people have taken to claiming that there are not 7 versions (despite obvious evidence to the contrary), and even if there are, that they are somehow compatible with each other so it doesn’t really matter. So I dug a little further to precisely document the incompatible changes in each version of RSS.
I would like to publicly apologize for my previous misstatements. There are not 7 different and incompatible versions of RSS; there are 9.
Oddly enough, Dave Winer argues that RSS is simple, and that by freezing its development he is defending it from developers who wish to make it complicated. Yet many developers have argued that the ambiguity surrounding RSS actually makes it quite complex because, not knowing the right way to do things, developers must try to catch every possibility. Mark Pilgrim, in the comments to a post by Phil Ringnalda, recounts the very complex steps his feed parser must undertake to deal with a guid (that is, a globally unique identifier) and then remarks:
I hope this clarifies the situation for other aggregator developers. Remember: RSS is simple! Rejoice, relax, and bask in the radioactive glow of simplicity!
In another comment, Pilgrim added:
RSS 2.0 is not “frozen”. It has never been “frozen”. It has been capriciously and backwardly incompatibly changed multiple times without community involvement or discussion. Examples that were in the original version have been silently removed when their existence was deemed politically inconvenient. Exactly one man (who, ironically enough, claims not to “control” RSS) has been responsible for all of these changes.
Mike Davies, in a post titled “RSS’ troubled past, present and future”, lamented the in-fighting that was damaging the community of those interested in RSS, and was saddened to see Dave Winer attack Shelley Powers’s new book:
Dave Winer is running amok again, railing at OReilly because they didn’t acknowledge him as the co-creator of RSS. …
The fractured relationships and emotional handcuffs attached to RSS are damaging its acceptance to the web community. For my part, I see RDF as the basis of the Semantic Web, so counter-productive specifications like RSS 2.0 that scrap the RDF fundamentals re-introduced in RSS 1.0 do more to divide RSS into an unworkable specification.
…Looks like the O’Reilly book in question is Shelley Powers’ Practical RDF – coincidentally a title I’ve already pre-ordered from Amazon.
In the summer of 2003, those who disliked Dave Winer’s leadership on RSS issues decided to start a new syndication format, called Atom. By creating a competitive alternative to RSS, the Atom effort forced Winer to act slightly more responsibly, and to make some efforts towards cleaning up his RSS spec. As Mark Pilgrim put it on July 1, 2003:
Apparently Dave is cleaning up his specs in response to the community pressure created, in part, by the recent effort to create a new weblogging format. This has Dare Obasanjo publicly wondering (1, 2, 3) whether this will cause the new project to stall. I predict it will not, and here’s why:
Whatever Dave wants to do with his specs, that’s fine. In fact, it’s great. The community felt there was a need for errata, and now we have errata. A clear win for everyone who uses those technologies. But I don’t see how that affects this new project.
We’re doing exactly what Dave told us to do last fall: creating the SOAP of syndication (and the SOAP of editing APIs as well). By that I mean we’re making a next-generation format, with a new name, founded on what we hope will be a good mixture of best practices and current practice.
By this point, many developers had grown wary of RSS because they were worried that it was too much under the control of Dave Winer, and they did not trust him. Historically, other technologies have arisen that have been controlled by a single individual (Linux is a good example), but in those cases the individual was trusted by the community and so no one minded their power over the direction of the technology. In such situations, it’s important for the individual who holds that power to be honest, humble, and open. Winer gave the impression that he lacked those qualities. As soon as the Atom effort was underway, Google decided to use it, and not RSS, as its main syndication format:
The search giant, which acquired Blogger.com last year, began allowing the service’s million-plus members to syndicate their online diaries to other Web sites last month. To implement the feature, it chose the new Atom format instead of the widely used, older RSS.
The battle between RSS and Atom has divided the blogging world since the summer, when critics of RSS came together to create an alternative format. Since then, a raft of blog sites and individuals have lined up behind Atom, while Yahoo has thrown its considerable weight behind RSS.
The Blogger decision to offer Atom only has angered supporters of RSS, who accuse Google of helping to splinter a wide network of RSS-using bloggers.
And yet, the Atom effort began three years ago, and despite the pressure to clean up the RSS 2.0 spec, Winer has stubbornly resisted all efforts to clarify the three main ambiguities that still plague RSS. Rogers Cadenhead has tried to resolve the ambiguity that surrounds RSS, but Winer has sued Cadenhead once and may sue him again in the future, which is perhaps odd since it was Winer who first chose Cadenhead to defend RSS and to be Chairman of the RSS Advisory Board. Shelley Powers put this in colorful terms:
In an effort to defuse what could only be termed mutiny in the ranks, otherwise known as the ‘Atom Effect’, Dave Winer turns the copyright of the RSS 2.0 specification over to Harvard, attaching a Creative Commons License reflecting something about share and share alike. The nobility of the act stuns people – well other than those who questioned how much of the specification he was entitled to claim as his copyright. Oh, and those people who kept insisting that Creative Commons licenses were not designed to cover something such as software or specifications.
Accepting the accolades as only what he was due, the Big Dog then anoints a committee of three to watch over our sleeping beauty, the little syndication feed that was. But these caretakers take little care, and run for the hills-whether of gold or sanity, only they can say. Poor little feed lies there, alone and vulnerable, while its bastard cousin, Atom, is fed care and attention and grows up to be a big, strapping specification that can bite through ambiguity and confusion, like Jaws bit through surfer girls.
It is then, when our precious little orangy bundle of joy is at its most aloneness that even Bigger Dogs enter the picture: Apple and Microsoft, seeing the light (or, more likely, seeing a potential new profit stream) embrace RSS and in the process, fracture, bruise, and even somewhat maim it. “The problem is,” the masses cry out, “the specification is too open, too ill-defined.”
Enter now, a new hero: Rogers Cadenhead. Stalwart defender of Popish dignity and bearer of thick, wavy, locks of silver. Big Dog taps Rogers on the shoulder with his sword and says to him, “You shall be my defender, the RSS Champion”.
-curtain closes for intermission, while scenery is changed-
After the intermission, Dave Winer stunned everyone by attacking Rogers Cadenhead, and then Winer announced that the RSS Advisory Board was disbanded:
1. The spec is owned by Harvard. 2. The RSS Advisory Board, when it existed, performed a support function. Later, in case anyone was still confused, we disclaimed: “It does not own RSS, or the spec, it has no more or less authority than any other group of people who wish to promote RSS.”
To which Cadenhead replied:
As a member of the RSS Advisory Board for the past 21 months and the current chair, I am surprised to learn that the organization doesn’t exist. I joined the board at Winer’s invitation in May 2004, not long before he resigned. The group operated in private without a charter, and as I said at the time, the reason I joined was to help guide Really Simple Syndication to a public, participatory model like that enjoyed by Atom and RDF Site Summary (a.k.a. RSS 1.0).
Winer didn’t like that, so he posted the following email to the mailing list of the RSS Advisory Board:
And with that, I am banging the gavel and ending this experiment of Rogers’s.
Tomorrow I will talk individually with all the corporate members of the “board” and ask them to resign.
Rogers may then wish to propose a new structure, one that is consistent with the “come back to earth” message.
They may wish to join with him, or they may not.
If anyone else decides to join up with him on the terms of the old “advisory board” I will talk with each of them individually, until they see that it serves no purpose.
This process will go on until Rogers gets the idea that it isn’t going to work.
I may at some time send him a bill for all of my time that he is wasting.
Winer often feels that people are unfairly attacking him, or cheating him out of something that is rightfully his. He has had a long-standing argument with Tim O’Reilly, who owns the great tech publishing house O’Reilly. On the controversial subject of who invented RSS (was it Netscape or Dave Winer?), the O’Reilly company has always said, and written in the books it publishes, that RSS was invented at Netscape. Winer has long complained bitterly about this. Said Winer:
Yesterday I received an email from O’Reilly and Associates CEO Tim O’Reilly. In the message he says that if I don’t stop saying things that he considers inaccurate, he will punish me. I responded saying Tim should say what he has to say, publicly, and I will continue to say what I have to say. His threats of punishment do scare me, but I have to go through that fear if I can look at myself in the eye.
To my readers, threats like this are an invitation to probe more deeply. I believe I can substantiate O’Reilly’s corporate involvement in Web syndication standards. At the time I received this email I was trying to work privately with O’Reilly editors on ethical questions relating to this.
Winer is also bitter that O’Reilly doesn’t invite him to speak at the conferences that O’Reilly organizes. O’Reilly defended one such incident like this:
Why didn’t I invite Dave? I was looking for people who I thought would work well together in an unstructured way, without grand standing or insulting other participants if they happen to disagree. My experience in working with Dave is that you never know what you’re going to get. He can be a great contributor, but he can also decide, for no apparent reason, that someone is somehow on “the other side,” at which point he becomes disruptive and abusive.
I know Dave claims he doesn’t like personal statements (except the ones he makes, of course), but he suggested that his readers ask, and you’ve done so. I’ve given Dave this feedback privately, and each time he’s said it’s inappropriate to tell him such things, that he believes his behavior is above reproach, and that I’m out of line for giving him any personal feedback.
When someone reserves for himself the right to “flame at will,” and claims that his flames are only his quest for truth, in spite of feedback to the contrary from many people, he should expect that those people will not invite him to their meetings or discussions. I completely grant that Dave has the right to remain on the outside, to critique anyone he likes, and to crusade for whatever causes he believes in, but if he wants to be included in events that I organize, he’ll have to behave more politely. He may consider that censorship; I consider it etiquette. No one disputes his right to his views–in fact, we all still read him because his views and ideas are so interesting–but I think he needs to recognize that his social habits will, from time to time, lead him to be left out of events and discussions to which he might otherwise be invited.
So, did my personal feelings about Dave (or more precisely, my personal experience working together with Dave in the past) influence my decision not to invite him? Absolutely. Did it influence me to “exclude his voice” — absolutely not. I’ve regularly read what he’s written, passed along his ideas to others, even invited him to write a chapter in an upcoming anthology on P2P that we’re planning to publish. (Note that he turned us down, which I quoted in the linked message.)
In 2003, the fighting intensified, and many people began to feel that it would no longer be possible to work with Dave Winer on the future development of RSS. Tim Bray remarked:
“Dave Winer has done a tremendous amount of work on RSS and invented important parts of it and deserves a huge amount of credit for getting us as far as we have,” Tim Bray, a member of the World Wide Web Consortium’s (W3C) influential Technical Architecture Group, wrote in a June 23 Web log entry. (Bray is also a co-creator of Extensible Markup Language (XML), a (W3C)-recommended language on which RSS is based.) “However, just looking around, I observe that there are many people and organizations who seem unable to maintain a good working relationship with Dave.”
To which Dave Winer responded:
“Why has my personality become the issue? They’re using that to try to get me to shut up,” Winer said in an interview. “I think most people don’t have a difficult time working with me. It’s unfair. It’s untrue. And it’s unbecoming of someone of (Bray’s) stature to make statements like that. You can’t create things with flames–you can only tear things down with flames. If they want to create things, they can’t do it with the dislike of one person.”
It was at this time, the summer of 2003, that a number of people, including Mark Pilgrim and Sam Ruby, decided they could no longer work with Dave Winer. Tim Bray announced the beginning of the Atom specification:
Why Are We Doing This?
Too Many Versions · Speaking personally, when I first put [my blog] ‘ongoing’ on the air I got several emails, one or two of them quite aggressive, asking why I was generating RSS 2.0 instead of the “technically superior” 1.0 version. I at least knew enough about the issues to have an opinion; someone who’s not an insider would probably have suffered severe angst over this. And I bet that if I’d started with 1.0 instead, I would have received mirror-image email from the other side. This sucks.
Political Realities · Too many versions, you say. Well, why don’t we get together and agree on how to merge them? Except for, the interested parties have a track record of inability to get along and work things out and make progress. To the extent that in some circles “RSS” has become a synonym for “Reliably Spiteful Squabbling.” Kofi Annan and the Dalai Lama might be able to achieve consensus, particularly if they could get Don Rumsfield to credibly threaten peacemaking backup with the 3rd infantry, but life’s too short, I’m tired of it, unless we can get consensus without further argument by this time next week, it may be more cost-effective to start over.
As to who the motivating force behind the new effort was, Tim Bray said:
Although Sam Ruby is carefully scattering sand over his tracks, I get the impression that he’s the one who’s responsible for this flurry of activity. I like this for the following reasons:
* I called up Sam to talk to him about this and said “This might be worth doing, but do you have any idea how much time it could take?” Sam said “I got sign-off from my boss to work on it full-time.” I can’t emphasize how important that is, having someone focused on a project and not fitting it in between their other work. One of the main reasons XML worked out so well is that Jon Bosak’s boss turned him loose to work on it, and I’d just quit my job, and James Clark has never had a job, and Michael Sperberg-McQueen routed around his boss for a few months, so we had four people grinding away pretty well full-time.
* Sam works for IBM, which is fine by me, because I think IBM is well-positioned to make a ton of money from applications that have RSS in their infrastructure. But, I don’t think IBM gives a rat’s ass what the format looks like as long as it works and can’t be hijacked by Microsoft. So to use business-speak, Sam’s corporate motives are well-aligned with the interests of the stakeholders.
Creating a competing syndication format possibly delayed the day when programmers might have an easy-to-use, universal way of formatting syndication feeds, but giving RSS some competition may force improved behavior from all participants.
What are the real chances of cleaning up RSS and resolving the ambiguities that plague it? What Cadenhead is suggesting is ambitious, given the likelihood of lawsuits from Winer:
In February, work began on a new, written-from-scratch draft of the specification, with each revision announced and vetted on the RSS-Public mailing list. The main contributors to the draft are four members of the board and one of the lead developers of the Feed Validator: James Holderness, Randy Charles Morin, Sam Ruby, Greg Smith and myself.
The new draft documents the same elements and attributes described in RSS 2.0 (version 2.0.8), the current spec, making no changes to the requirements upon which RSS creators Dan Libby and Dave Winer sparked the incredibly successful RSS boom. No elements have been added or removed.
It does clarify the RSS specification in the three areas mentioned above, based on our interpretation of the current spec and its predecessors:
1. An item cannot contain more than one enclosure. The only RSS element that can be present more than once in an item is category.
2. The only RSS element that can contain HTML is an item’s description.
3. Relative URLs are not allowed. When they’re encountered in an item’s description — which is not recommended — the feed’s link element should be used as the base URL.
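The third clarification is the easiest to picture in code. What follows is a rough sketch, in Python, of how an aggregator might apply it: relative URLs found in an item's description are resolved against the feed's link element. The function name, the regular expression, and the example.org addresses are my own assumptions for illustration, and a real aggregator would more likely use a proper HTML parser than a regular expression.

```python
import re
from urllib.parse import urljoin


def resolve_description_urls(description_html, feed_link):
    """Rewrite href/src values in a description against the feed's link.

    Hypothetical helper: it only handles quoted href and src attributes,
    which is enough to illustrate the rule but not to parse arbitrary HTML.
    """
    def fix(match):
        attr, quote, url = match.group(1), match.group(2), match.group(3)
        # urljoin leaves absolute URLs alone and resolves relative ones
        # against the base, here the feed's link element.
        return f"{attr}={quote}{urljoin(feed_link, url)}{quote}"

    return re.sub(r'\b(href|src)=(["\'])(.*?)\2', fix, description_html)


# Made-up feed link and description, for illustration only.
FEED_LINK = "http://example.org/weblog/"
DESCRIPTION = '<a href="/archives/42">more</a> <img src="pics/cat.jpg">'

RESOLVED = resolve_description_urls(DESCRIPTION, FEED_LINK)
print(RESOLVED)
```

The point of the exercise is that, once the spec names a base URL, every aggregator can arrive at the same absolute address. Without that rule, each one has to guess.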
What are the chances that this will be approved without more lawsuits? Keep in mind that Dave Winer’s attacks on people are so frequent that a game has been invented about them: Six Degrees Of Dave Winer. Mark Pilgrim introduced the game in this manner:
Paul Erdos was a prolific mathematician who co-authored papers with hundreds of people. The Erdos Number is part of mathematics folklore; it measures the six degrees of separation distance between you and Paul, using co-authorship as the measurement.
In this spirit, and in recognition of the fact that Dave Winer is the center of the weblogging universe, I propose a new designation: The Winer Number.
Here’s how you can determine your Winer Number:
1. Dave Winer has Winer Number 0.
2. If you have been personally abused by Dave Winer, your Winer Number is 1.
3. If you have been abused by someone who has been abused by Dave Winer, your Winer Number is 2.
For the purposes of this designation, personally abused means an ad hominem attack directed at a single person.
While a game such as this is probably an unnecessarily harsh personal attack against Winer, it grew out of the reality that Winer himself tends to use a harsh tone when expressing disagreement with someone. For instance, though Winer once said that Sam Ruby was an important part of the RSS community, after Ruby suggested a clarification of the data-types-url portion of the 2.0 specification, Winer then said that Ruby was an outside agent manipulating RSS for the benefit of IBM:
Here’s an illustration of tech industry interference with RSS. That’s Sam Ruby, the lead of the Atom working group, an employee of IBM, trying to rewrite the rules of RSS 2.0. Do you understand what he’s saying? I don’t. Assuming he means well, which I think is a stretch (he’s got a huge conflict of interest) he surely doesn’t understand the phillosophy of RSS 2.0. Does management at IBM know he’s doing this, is this part of a strategy to keep their lock on the enterprise software business, which RSS clearly is a threat to? Like Sam, they have a conflict of interest too. In the tech world, I’ve learned that if you think the worst of people’s motives you’re usually right. IBM doesn’t generally go for the high road. In any case, IBM should call him off, now. Atom is fine, let people use that if they want, but if you screw with RSS, we’re going to shine the light on you.
It is odd that Winer felt the need to write “Do you understand what he’s saying? I don’t” since Ruby’s point is intelligible to anyone who has followed the development of RSS, which Winer obviously has.
The economic costs of all this infighting continue to accumulate, but various companies are investing the money needed to fix the most urgent problems that their feed parsers face. Sam Ruby has recently been complimentary of Bloglines’ efforts:
The only charitable way to put it is that the current parser inside of Bloglines evolved over time. It had (and has) to deal not only with multiple, incompatible, and underspecified specifications, but also with multiple, incompatible, and often non-compliant implementations of these specifications. Previously, I’ve cataloged a few of the most common errors.
Along the way, the Bloglines aggregator has become a part of the feed eco-system. Simply put, many people design their feeds not according to any specification, but rather to make sure it works with Bloglines. To be clear, Bloglines is not unique in this regard, others do the same with NetNewsWire, and I expect many to do the same with IE7.
The inevitable result is calcification. Software that was once, well, soft and pliable, has since become something you daren’t touch as you risk affecting the way that untold millions of feeds are interpreted. You don’t touch it even if it means that Bloglines differs from either the specifications themselves or the way tools like NNW or IE7 handle these same edge cases.
With Atom, Bloglines has decided to pursue a fresh beginning. A beginning free from the tyranny of the past. The new status quo is that if you have a test case which is based on real world usage, and can point to the section of the spec which indicates how this test is to be interpreted, then the Bloglines development team will not only address the issue forthwith, but they will also add a test case to their regression test suite so that the same issue will never reoccur.
A clear spec + a regression test suite + Red/Green/Refactor =>
a parser which is not only maintainable,
but also one that can remain so indefinitely.
This post by Ruby implies two possibilities for the future of RSS. Either a single company, perhaps Bloglines, shall gain such market share that it can force a de facto resolution of the ambiguities pertaining to the 2.0 format, or companies will eventually abandon RSS in favor of Atom. Atom, after all, gives companies a chance to start over fresh and do everything right. Of these two possibilities, the latter seems more likely.
There is a third possibility that is not implied in Ruby’s post. Perhaps Cadenhead can succeed in his effort to clarify RSS. Given the growth of the community around Atom, Cadenhead’s current effort is likely the last chance RSS has to see its ambiguities resolved through an open, public process. If these issues aren’t resolved this year, no one will give a damn about RSS a year from now.