It’s just data

A New Butler For Jonas

It has been two weeks since Jonas Luster fired his butler.  The primary reason cited was gross insubordination: an unwillingness on the part of the butler to do as he was asked.  Perhaps the fact that the butler doesn’t help with any of Jonas’s real needs (the ones behind the statement “I am busy still trying to sift through all those advertisements”) was also a factor.

On the surface, it seems that Google’s AutoLink meets the letter of each of Yoz’s three rules for determining if such a tool is within the spirit of the Web.  However, upon closer examination, it turns out that there may be some wiggle room in the first rule.  Today the Google Toolbar for Internet Explorer is closed source.  The skills and techniques for creating such a thing are well beyond the abilities of most users.  Heck, they are black magic to most seasoned programmers.

Simply put, if you can’t see the definition (and furthermore don’t trust the author to the point of being willing to write what amounts to a blank check), then how can a closed source toolbar ever pass the “completely understood” test?

What if such a tool were open source?  Furthermore, suppose the tool were well commented, and structured in such a way that people who have even the most tenuous grasp on the concepts of HTML could reasonably modify it to remove the links and content that they don’t want, and add or reorder the links that they do want?

Today Mark Pilgrim made available a new butler that is all this and more.  Made available under the GPL.  Share and enjoy.

Now, Jonas, would you fire this butler too?


That’s a very nice application of evil genius.

Posted by Rogers Cadenhead at

Mark’s Butler is a nice little app — much more useful than the Google Toolbar.

Shame it fails rule #2.

Posted by Billy at

Why is Mark not announcing this himself?

It seems to me that if you’ve publicly retired from blogging, you shouldn’t then privately slip out sly jabs by backchannel.

Or is that just me?

Posted by James Kew at

Stripping Google ads seems to be a popular use for Greasemonkey!

@Kew
It would seem to me that retiring from blogging does not imply that you retire from computing. And is he now responsible for other people’s blog posts?

Posted by Darryl at

Sam: Thanks for the link. I note you italicised my phrase “within the spirit of the web”. I can’t tell whether you mean it earnestly or sarcastically, but either is fine, as the phrase sucks. I was feebly groping around for a phrase to communicate my concept of “validity” on this topic, and haven’t been bothered to come up with a better one. If anyone else has an easy one that springs to mind, please leave it in the comments on that entry.

Billy: You appear to be right. However, I think this is mostly due to unclear writing on my part rather than malice on Mark’s. I will endeavour to rewrite and clarify later on tonight unless I... ooh, something shiny.

Posted by Yoz at

Yoz: Actually, I wasn’t trying to imply malice. It really is a nifty hack — I have no use for it, but I can see where others might. Sam seemed to be saying that, if we use your rules as a baseline, Butler is analogous to AutoLink only better because the GPL makes it more compatible with rule #1. My point was the analogy doesn’t hold because Butler fails rule #2 — it’s on automatically from the time you install it. I have no doubt that Mark could fix that in about five minutes.

Even if he did, I don’t think Sam’s point is valid. Sam asks how a closed source toolbar can pass the “completely understood” test. My question is, if being able to read the code is a requirement for understanding, how can an open source toolbar pass the “completely understood” test for the average user? The answer is, it can’t, and we’ve now redefined rule #1 so that, for the average user, all content modification is evil.

As far as I’m concerned, neither Butler nor AutoLink is good or evil. They just are. They’re tools, given freely to users to do with as they see fit.

Posted by Billy at

Sam, your flame bait hrefs are broken. They should be “/blog/flamebait.html” instead of “flamebait.html”.

Posted by Gary Burd at

I “announced” Butler on the Greasemonkey script repository, which is linked from the Greasemonkey homepage.  It seemed appropriate at the time; I’m sorry that doesn’t meet with your approval.

Posted by Mark at

“Even if he did, I don’t think Sam’s point is valid. Sam asks how a closed source toolbar can pass the ‘completely understood’ test. My question is, if being able to read the code is a requirement for understanding, how can an open source toolbar pass the ‘completely understood’ test for the average user? The answer is, it can’t, and we’ve now redefined rule #1 so that, for the average user, all content modification is evil.”

Does anyone read the code to every application they use?

The advantage of open-source is that you have good reason to assume that someone has read the source code and can pass on their understanding of how the program works.

Whether you, yourself, have the ability (or inclination) to plow through the source is not an insuperable barrier to obtaining the desired level of trust and understanding.

Conversely, with a closed-source program, it’s hard to see how you could ever fully trust it.

Posted by Jacques Distler at

“Does anyone read the code to every application they use?”

Theo does.

Posted by Mark at

Gary: fixed.  Thanks!

Posted by Sam Ruby at

“My point was the analogy doesn’t hold because Butler fails rule #2: it’s on automatically from the time you install it.”

Yes, I realise that. But then, if a user specifically wishes to have something run automatically on every page they visit, I don’t think there’s anything wrong with that - what matters is that the user specifically requests every action the tool takes. But they should be allowed to request all future actions at installation time. The “automatically” of which I’m scared is the one where the tool does something without the user’s choice, either because they didn’t know it was going to do that action or because, when they installed the tool for a different feature (such as, in this case, a handy search bar) they could not disable other automated features which they had not chosen (hence the "in isolation").

(I realise, by the way, that my rules probably still have enough holes to drive a truck through. IASNAL. I purely created them to describe my view of a situation, for one blog entry. I’m amazed they’ve lasted this long in people’s minds, but I’m glad they’re useful)

“As far as I’m concerned, neither Butler nor AutoLink is good or evil. They just are. They’re tools, given freely to users to do with as they see fit.”

Don’t really buy this one for much larger moral reasons. If someone created a button that ran a DoS on a random victim’s site, it would just be a tool that did nothing until activated, but it would still be evil (in my eyes, at least). I realise this is a rather contentious belief. The evils that my rules are intended to rule out, however, are more in the spyware/malware domain, where the user’s consent is highly questionable at best.

Posted by Yoz at

Greasemonkey and applets like Butler are brilliant - more so for the hint of what is at hand rather than specific functionality.

Butler is a statement - and an example - of why the “Simple” World Wide Web must continue to be simple, generally understandable, and OPEN.

USers (capitalization intended) win - we retain control, which is no small feat in a world where micro-marketing strives to exploit our every foible, whether it’s picking cereal for breakfast or choosing politicians.

Posted by Mike Watkins at

Jacques: I agree that open source can be trusted more than closed source. But, rule #1 isn’t about trust, it’s about understanding. Unless I’m missing his point, Sam seems to be asserting that Butler fulfills rule #1 better than AutoLink does because it’s GPL. Let’s read the rule again:

“has a definition that is completely understood by the user”

“...understood by the user.” Not by the user’s geeky friend, but by the actual user. So, the question isn’t whether anybody can read the code, it’s whether an individual user can read it and whether or not that helps him or her understand what the program does. For the average user, it doesn’t. It increases trust, but not understanding. So, being open source does not give a tool an advantage when it comes to rule #1.

Of course, all this arguing hinges on us accepting Yoz’s rules as law, and there is no particular reason that you have to. I think it’s a fine definition, but I don’t have a problem with AutoLink, so I’m probably biased.

Posted by Billy at

“It would seem to me that retiring from blogging does not imply that you retire from computing.”

He’s clearly being sucked back into the blogosphere. It reminds me of how the other prisoners felt when Cool Hand Luke was recaptured.

Posted by Rogers Cadenhead at

Yoz: By “tools” I meant “content modification tools that fulfill Yoz’s rules” like we’ve been talking about. I should have been more clear. Your DoS button example would definitely not fulfill rule #3.

Posted by Billy at

“what matters is that the user specifically requests every action the tool takes”

In which case Butler is evil; it adds a banner to the top of all Google search results pages which doesn’t seem to be mentioned anywhere in the feature list and, as far as I can tell, can’t be disabled by any means other than editing source code (at which point you don’t really have the same tool anymore). It’s not actually a malicious bar but it is a little irritating :)

Of course, I don’t actually think that Butler is evil but, then again, I don’t think that Google’s Autolink is even remotely deserving of the controversy it has caused. Which isn’t to say that there aren’t content modifying programs that do qualify as evil but maybe to suggest that it’s difficult to sum up the criteria for evilness in a succinct set of rules; a good categorisation should be based on the benefit to the user rather than on the method of invocation of the program.

Posted by jgraham at

If we’re going to judge Butler on the good/evil scale, the place where it is most nefarious — aside from removing all ads — is on the Google Toolbar main page.

It adds an entire paragraph that appears to be part of the page, aside from a star bullet at the beginning:

You may also be interested in these Google-related Firefox extensions:

SearchStatus displays PageRank and Alexa information in status bar

BetterSearch adds thumbnails and other features to search results

Search Keys adds keyboard shortcuts to search results



Posted by Rogers Cadenhead at

“The advantage of open-source is that you have good reason to assume that someone has read the source code and can pass on their understanding of how the program works.”

I think this is a highly questionable and even dangerous assumption.

For something like OpenBSD, where you have hundreds of volunteers continually re-examining code, it’s pretty damn safe - though there’s still an incredibly small chance that they either might miss something or that Theo & Co. are part of a massive conspiracy to spread malware.

Butler is hosted on Mark’s website. It may be under GPL, but the only person who can alter the code on Mark’s website is Mark, no matter how many other people find problems with it.  If, while browsing Mark’s website, I come across Butler and decide to install it, at no point has anyone else’s opinion of the code been allowed to intrude without Mark’s permission. So I have three options:
1. I can examine the code myself.
2. I can search the rest of the net for external opinions about it.
3. I can just throw caution to the wind and install without doing any checking.

Option 1 depends on my technical abilities. Option 2 depends on a sufficient number of other people having been bothered to check the code, which is far more likely to happen with, say, pf than it is for a three-page browser extension.

As it is, by the simple matter of being referenced by several A-list bloggers, it’s already been more thoroughly examined in a single day than at least 50% of the source code on the net, and way more than most of the contributed extensions on Mozilla Update, hardly any of which (AFAIK) went through any kind of audit process by Mozilla staff before they were made available on the public site. (This has already been raised as a problem in several places, I believe)

In theory, open source Mozilla extensions are a lot safer than Google Toolbar. In practice, the opposite is true:

In summary: slapping an open source licence on code does not automatically remove any malicious features. Yes, the “with many eyeballs” saying does apply, but there is a continual tendency to over-estimate the number of eyeballs available.

Posted by Yoz at

Arse, my bullet list got eaten. (Sam - try doing bullet lists without indenting the asterisks. You’ve got a greedy bug there)

Let’s try that again...

In theory, open source Mozilla extensions are a lot safer than Google Toolbar. In practice, the opposite is true:

Oh, and just to fend off anyone who thinks I’m launching an attack on open source: I’m a dedicated Firefox advocate running all kinds of freaky extensions, mainly because I think the risk of malicious code, while higher than with Google Toolbar, is still low enough for me. That’s the kind of reckless, carefree daredevil I am!

Posted by Yoz at

We’ll discuss the salary later…

I just hired a new butler, he’s not as much of a snob as the old one. And he speaks a language I speak. He’s a bit rude, and I might have to train him to be nicer in the future, but all in all, I think this is a better way to [...]...

Excerpt from jluster.org's webvergnügen at

And yet... the Google Toolbar phones home over SSL when you click the AutoLink button. It’s done so ever since it was first released, I’ve read a ton of ranting about the Toolbar, and I’ve seen not one word about what it says when it phones home. I tried to find out myself, but I’m not smart enough to be able to catch it unencrypted. Of course, since Google’s Toolbar auto-updates, just knowing what it said on one person’s computer on one day doesn’t mean it will say the same thing everywhere for all time.

Posted by Phil Ringnalda at

Yoz said:

“what matters is that the user specifically requests every action the tool takes”

“It looks like you’re trying to view a web page with images.  Would you like the Web Browser Assistant to prompt you for each image?”

You were saying?

Posted by Mark at

Phil: Oh bugger. I guess that makes the snooping rather harder, then. (But still not impossible. Just, um, way way hard.)

Mark: I’m probably arguing myself into a hole here, but what I said on my original entry still applies: The point of the “rules” is to mark out what is definitely good/valid/"in the spirit of the web" rather than what is definitely not. So even if the user doesn’t request every action, it’s not necessarily bad. Also, I did say that the user could request all such actions in advance, and one could argue that knowing that this is what a web browser is meant to do is part of that consent. This is why I think Butler fulfils rule #2 okay.

We’re already heading into areas of defining “user consent” where I was incredibly wary to tread.

Posted by Yoz at

Butler shows that the AutoLink “controversy” is inane. How people can get so worked up about users marking up stuff in their browsers is crazy.

Posted by pb at

I don’t think Butler shows that at all; it just demonstrates that the issue is bigger than the toolbar. The Document Object Model’s one big fat API for modifying web content when it arrives in the browser.

I didn’t realize this until I tried to write an HTML obfuscator, just to see if it would confuse autolink. It failed. I wasn’t thinking of my web content as an object that arrives at a client to be transformed at will.

If you think there are any legal or ethical boundaries that should apply to user agents and what they can do to web content, I think DOM’s the big threat, not Google, although autolink encourages everybody and their brother to get into the page rewriting business.

Posted by Rogers Cadenhead at

The stupidity continues

Dave Winer points to an anti-AutoLink article by Danny Sullivan, where the lack of awareness continues full throttle: All-in-all, Butler is just the latest example of the “mess” AutoLink created when it was released, as I wrote earlier....

Excerpt from Smalltalk Tidbits, Industry Rants at

Suppose I buy a book from Amazon. If, when I receive that book, I open it up and find that someone from Amazon went through and scribbled a bunch of stuff in the margins (e.g. references to other books, ads for other products available through Amazon, etc) I would be quite ticked off and justifiably so. The book publisher and author would also rightfully have something to say about it and Amazon would/should get in trouble for doing such evil things. However, if after receiving the book I decide to scribble notes in the margins, underline passages, rip out pages, etc, that’s a perfectly acceptable thing to do. It’s not evil.

(more here: [link])

Posted by James Snell at

Rogers wrote:

“If you think there are any legal or ethical boundaries that should apply to user agents and what they can do to web content, I think DOM’s the big threat, not Google, although autolink encourages everybody and their brother to get into the page rewriting business.”

You say that like it’s a bad thing, Rogers.

Posted by Brian Carnell at

Maybe it isn’t. But I have trouble believing that most web publishers would be motivated by the view that their work is just raw material to be reprocessed by one or more scripts, toolbars, or browsers — especially if any of those agents makes money from doing so.

Posted by Rogers Cadenhead at

“I have trouble believing that most web publishers would be motivated by the view that their work is just raw material to be reprocessed by one or more scripts, toolbars, or browsers”

Allow me to show you my latest invention, sure to revolutionize the industry.  It’s called a news aggregator.

Posted by Mark at

“I have trouble believing that most web publishers would be motivated by the view that their work is just raw material to be reprocessed by one or more scripts, toolbars, or browsers”

But, if I’m applying some transformation to content from your site, I’m probably doing it because it makes that content more useful to me. There’s no sensible reason to object to one’s audience adapting your material so that it’s in a form that they find more useful except, perhaps, where it affects your business model. But web users showing little regard for business models when they have a negative impact on the browsing experience has a history; everybody blocks popups, people disable animated gifs, use tools like flashblock or even (shock horror) don’t install flash to avoid the most irritating ads, and do dozens of other content-altering  things all the time.

If anyone believed that html provided any sort of assurance that the content the user sees is exactly what the author would have them see, they’ve been totally blind to the reality of html client applications for much longer than the two weeks that this has been a hot topic. Possibly some of those people will decide that html authoring’s not for them. I’m not sure it will be a great loss.

Posted by jgraham at

People are so quick to write off content creators who are disturbed by [InsertDisruptiveTechnologyHere]. But sometimes you do lose things when those people go away.

“Allow me to show you my latest invention, sure to revolutionize the industry.  It’s called a news aggregator.”

Funny. But there’s an important difference. An aggregator is opt-in. Anyone who wants out can stop providing their feeds.

Where does someone opt-out of the DOM?

Posted by Rogers Cadenhead at

“Where does someone opt-out of the DOM?”

Where does someone opt out of popup blocking?  Or voice browsers that don’t display banner ads?  Or users making printed copies?  Or users copying text into an OpenOffice document and editing it?

Usually, when someone attempts to “opt out” of allowing users to control content on their own computers, it’s called DRM, and it doesn’t work.

Posted by M. Brubeck at

Rogers,

Last time I looked, no one forces anyone to install the Google Toolbar.  Last time I checked, it only worked in IE.  Further, it can be uninstalled.  I fail to see any force here.

Furthermore, say you get Google to agree on some opt-out tag you can use.  Do you seriously believe that the avalanche of custom AutoLink-like things that get written over the next few months will respect it?  If you want real opt-out for this sort of thing, there’s only one path:  a fully DRM-aware browser that can prevent you (the end user) from doing things the producer doesn’t want you to do.

Next time you hit a PDF file that turned copy off, and you are ready to yell - bear in mind that you now claim to want that.

Posted by James Robertson at

“Next time you hit a PDF file that turned copy off, and you are ready to yell - bear in mind that you now claim to want that.”

I’m not claiming anything. Ever since I took a better look at the DOM to understand what Google’s toolbar is doing, I’ve been backing away from the ledge I was going to throw myself off of when autolink hit full release.

I think we’re stuck with beloved and not-so-beloved butlers, whether Google dumps autolink or not.

Posted by Rogers Cadenhead at

Rogers, my point above was that these problems have no special relation to the DOM.  They are the same with any format that can be modified, period.

Plain text RFC?  Use a client-side Perl script to cross-reference it with another document.  MP3 podcast?  Run it through a batch ‘sox’ command to cut off the annoying intro.  PNG stock market graph?  Write a web app that uses libgd to serve up cropped versions of the images, with alternate captions inserted.

This is not a problem with the DOM; it’s a problem with all formats that aren’t protected by unbreakable DRM.  What you are (were?) asking for is exactly what the MPAA wants: the ability to control what people can do with the content after they receive it.

Posted by M. Brubeck at

Will

Trying to watch that Googley SSL conversation, aren’t you? Previous versions of Fiddler did show HTTP/S, but that was removed in the current release....

Excerpt from phil ringnalda dot com: : Comments at

I Am User, Hear Me Roar

Apparently some folks are in a tizzy over Google’s autolink doohickey. Phil Ringnalda pointed out a substantive difference between this thing and the Microsoft thing, SmartTags, over which everybody got their knickers in a twist when IE6 was in...

Excerpt from Cox Crow at

Comment on My year in 12 copy-and-paste comments by: Mark

January Pedants! February I’ve been active; you’re just not looking in the right places. March “It looks like you’re trying to view a web page with images. Would you like the Web Browser Assistant to prompt you for each...

Excerpt from phil ringnalda Comments at
