intertwingly

It’s just data

Business and Open-Source

David Shields: having worked for almost five years now as a member of the team that manages IBM’s open-source strategy and its execution, I can claim some expertise in this area.  There are no vast secrets here, no grand plan. Here is the strategy as I understand it, and as I have worked to implement it.


The Right Ones in the Right Order

Leonard Richardson: If you liked RESTful Web Services but thought the words were in the wrong order, you’ll like Services Web RESTful.


Blogger in Draft Support for OpenID

David Recordon: Awesome, see their post! OpenID commenting as a beta feature on Blogger, way to go guys! I just tried it out and as you’d expect, it works great. Also really nice to see another site first accepting OpenID instead of providing it.

Seconded.


1984 + 4K

Damien Katz: Yay! CouchDB has an official IANA port number … This must be how Steve Martin felt in The Jerk when he got his name in the phone book


HTML5 needs a CarterPhone

Brendan Eich: Standards often are made by insiders, established players, vendors with something to sell and so something to lose. Web standards bodies organized as pay-to-play consortia thus leave out developers and users, although vendors of course claim to represent everyone fully and fairly.  I’ve worked within such bodies and continue to try to make progress in them, but I’ve come to the conclusion that open standards need radically open standardization processes.

The W3C HTML Working Group needs a CarterPhone.  Clearly, Brendan is talking about ES4, but the issues he brings up are general.

...


Bash Here

I’m posting this in case I’m not the last person to realize this.  While I’ve used the Unix sh longer than DHH has been alive, I either never realized or have long since forgotten that it supports here documents.  Example:

ruby <<EOD | sort | uniq -c | sort -n
  Dir['*'].each do |name|
     puts name.split('.',0)[1] || '<null>'
  end
EOD

Full Text Search — SQLite

SQLite is part of Android and Gears.  Despite being under development for over a year, and part of an actively developed code base, and included in Gears, full text search isn’t integrated into the build system just yet.  Because of its non-standard “virtual” tables, it can’t be used directly with encapsulation layers like Python DB-API, but it can be used through minimal wrappers like APSW.  Such a build also requires some manual steps, but the end result is a single shared library that contains SQLite, FTS, and APSW.

patch | build steps | demo


Meme Tracker in IronPython

Dare Obasanjo: My weekend project was to read Dive Into Python and learn enough Python to be able to port Sam Ruby’s meme tracker (source code) from CPython to Iron Python. Sam’s meme tracker, shows the most popular links from the past week from the blogs in his RSS subscriptions.

More recent code can be found here.  Fetches titles from HTML, handles etags, matches both www. and non-www. versions of a URI.  Handles people who point to things multiple times.  Allows you to group people who tend to all “vote” in bulk.  Note: I consider the alternate link to be a vote too, which gives a small bump to people who post original content vs links.
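
To make the www./non-www. matching and the one-vote-per-person rule concrete, here is a minimal sketch in Python.  It is not the meme tracker's actual code: the names canonical and tally, and the exact folding rule, are assumptions made for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def canonical(uri):
    """Fold www. and non-www. forms of a URI into one key (hypothetical rule)."""
    parts = urlsplit(uri)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Drop the fragment and treat an empty path as "/".
    return urlunsplit((parts.scheme, host, parts.path or "/", parts.query, ""))


def tally(links_by_blog):
    """Count at most one vote per blog for each canonical link."""
    voters = defaultdict(set)
    for blog, links in links_by_blog.items():
        for link in links:
            # A set absorbs repeated links from the same blog.
            voters[canonical(link)].add(blog)
    return {link: len(blogs) for link, blogs in voters.items()}


votes = tally({
    "blog-a": ["http://www.example.com/x", "http://example.com/x"],
    "blog-b": ["http://EXAMPLE.com/x"],
})
print(votes)
```

Grouping people who vote in bulk could be layered on top by mapping several blog names to one voter key before calling tally.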

I’d also recommend that you invest some time into converting from a simple regular expression to a real HTML parser.  You’ll need it anyway for titles.
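
As a rough sketch of what "a real HTML parser" could look like in CPython, the following uses the standard library's html.parser to pull out both the links and the title in one pass.  The class and function names are made up for this example, and the IronPython port may need a different parser.

```python
from html.parser import HTMLParser


class LinkAndTitleParser(HTMLParser):
    """Collects href values from <a> tags and the text of <title>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract(html):
    """Return (title, links) for a page of HTML."""
    parser = LinkAndTitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.links


page = ("<html><head><title> Hello </title></head><body>"
        "<a href=\"http://a.example/\">one</a>"
        "<A HREF='http://b.example/'>two</A>"
        "<a name=\"anchor\">no link</a></body></html>")
title, links = extract(page)
print(title, links)
```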


Deconstructing Facebook Beacon

Jay Goldman: On November 6th, 2007, Facebook launched a series of new tools to help advertisers target the 54 million people now regularly using their site. They’re still throwing around a 3% weekly growth rate and have a target of 60 million active users by the end of the year, so it’s not hard to picture the day in the not-so-distant future when hospitals Facebook babies before handing them over and the little bundle of joy comes with a neural implant that pokes their parental units when the diaper is full. [via Simon Willison]


DEX File Format

Michael Pavone: I’ve started another little reverse engineering project. Google hasn’t released any documentation on their new VM so I decided to get some the hard way. Well, hard is relative here. A decompiled Java class is a bit easier to read than a disassembled 68K binary. Anyway, I’ve managed to write some documentation on the dex file format used by the VM. I hope to have some documentation on the actual instruction set used by the VM in a few days


RFC: FeedBurner Namespace Documentation

Does anybody know where I can find any documentation on the http://rssnamespace.org/​feedburner/​ext/​1.0 namespace?

While the URI appears to be owned by a domain squatter, luckily the domain is owned by the company that became feedburner.

I’d like to make feedburner a namespace known to the feed validator without marking any of the elements as undefined.


Whitelisting

From time to time, the subject of whether to use whitelists or blacklists comes up.  As an example, originally when Mark Pilgrim wrote How To Consume RSS Safely (way back in 2003!), he described a list of elements that needed to be blacklisted, and mentioned, almost in passing, that whitelisting may be a reasonable alternative.  Over time, Mark came to realize that there really isn’t any contest: A Whitelist is the best way to validate input.  It basically comes down to a sense of what kind of errors you are willing to tolerate.
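
To illustrate the trade-off, here is a minimal whitelist sketch in Python.  The allowed elements, attributes, and URI schemes are placeholders chosen for this example, not the lists used by the Universal Feed Parser or any other tool.  The error it tolerates is dropping harmless markup it has never heard of, whereas a blacklist errs by passing harmful markup it has never heard of.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelists; a real deployment would choose its own.
ALLOWED_TAGS = {"a", "b", "em", "i", "p", "strong"}
ALLOWED_ATTRS = {"a": {"href"}}
ALLOWED_SCHEMES = ("http://", "https://")


class WhitelistSanitizer(HTMLParser):
    """Re-emits only whitelisted tags and attributes; text is escaped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # unknown element: drop the tag itself
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, ()):
                continue
            if name == "href" and not (value or "").lower().startswith(ALLOWED_SCHEMES):
                continue  # unknown scheme: drop the attribute
            kept.append(' %s="%s"' % (name, escape(value or "", quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Text inside dropped elements survives, but only as escaped text.
        self.out.append(escape(data))


def sanitize(html):
    parser = WhitelistSanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


dirty = '<p onclick="x()">Hi <script>alert(1)</script><a href="javascript:x">l</a></p>'
clean = sanitize(dirty)
print(clean)
```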

...


Astral-Plane Characters in Json

In Characters vs. Bytes, Tim Bray mentions the Gothic letter faihu.  Whether such a character will display properly in your browser depends on what operating system you use and what fonts you have installed.  Whether or not you can handle such characters programmatically, however, depends on what programming language you use.
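
As one data point, here is how Python 3 (3.3 or later is assumed; older narrow builds behaved differently) treats faihu, which is U+10346.  Because JSON's \u escape only carries four hex digits, a character outside the Basic Multilingual Plane has to be written as a UTF-16 surrogate pair when escaped.

```python
import json

faihu = "\U00010346"  # GOTHIC LETTER FAIHU, outside the Basic Multilingual Plane

code_points = len(faihu)                           # one character
utf16_units = len(faihu.encode("utf-16-le")) // 2  # two 16-bit code units
utf8_bytes = len(faihu.encode("utf-8"))            # four bytes

# With the default ensure_ascii=True, json.dumps escapes the character
# as a surrogate pair; json.loads joins the pair back into one character.
encoded = json.dumps(faihu)
decoded = json.loads(encoded)

print(code_points, utf16_units, utf8_bytes)  # prints: 1 2 4
print(encoded)                               # prints: "\ud800\udf46"
print(decoded == faihu)                      # prints: True
```

A language whose strings are sequences of UTF-16 code units would report a length of two for the same character, which is where the trouble tends to start.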

...


Phantasmagoric

Kevin Lawver: This was a surreal experience... Dan led a sing-along with a bunch of W3C folks, including Tim Berners-Lee, the inventor of the web, and lots and lots of folks who invented important pieces of it (like CSS, HTML, XHTML, etc). Fun, fun, fun.

Not quite as surreal as finding out that you were one of the subjects.


Dark Side of Postel’s “law”

Simon Fell’s weblog contains the following line:

<link rel="alternate" type="application/atom+xml" title="Simon Fell > Its just code" href="http://www.pocketsoap.com/weblog/feed.atom">

feedfinder.py, atomautodiscovery.py, and feedparser.py version 4.1 will fail to pick it up.
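
One plausible culprit (an assumption, not something stated above) is the unescaped > inside the title attribute.  The sketch below shows how a naive regular expression can stumble on it while a tokenizing parser does not.  The pattern is hypothetical and is not taken from any of the three libraries.

```python
import re
from html.parser import HTMLParser

LINE = ('<link rel="alternate" type="application/atom+xml" '
        'title="Simon Fell > Its just code" '
        'href="http://www.pocketsoap.com/weblog/feed.atom">')

# Hypothetical naive pattern: a tag ends at the first ">".
NAIVE_LINK = re.compile(r"<link[^>]*>", re.I)
HREF = re.compile(r'href="([^"]*)"', re.I)


def naive_feed_links(html):
    """Find href values inside whatever the naive pattern thinks is a tag."""
    found = []
    for tag in NAIVE_LINK.findall(html):
        found.extend(HREF.findall(tag))
    return found


class FeedLinkParser(HTMLParser):
    """Collects href from <link rel="alternate"> using real attribute parsing."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "alternate" and a.get("href"):
            self.feeds.append(a["href"])


def parsed_feed_links(html):
    parser = FeedLinkParser()
    parser.feed(html)
    parser.close()
    return parser.feeds


print(naive_feed_links(LINE))   # the match stops at the ">" in the title
print(parsed_feed_links(LINE))
```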

...


Gate

com.​google.​android.​xmppService.​IXmppService.​createXmppSession: Creates a XMPP session to the server, using username and password for the login. createXmppSession starts a new XMPP session if there isn’t one for the username, connects to and logs into the GTalk server. If there is already a running XMPP session for the username, then createsXmppSession just returns the running session.

Why can’t username contain an @?


Out of the Frying Pan

Don Box: I have to say that the authentication story blows chunks.  Having to hand-roll yet another “negotiate session key/sign URL” library for J. Random Facebook/Flickr/GData clone doesn’t scale.  Personally, my dream stack would be ubiquitous WS-Security/WS-Trust over HTTP GET and POST and tossing out WSDL

I’d suggest that the root problem here has nothing to do with HTTP or SOAP, but rather that the owners and operators of properties such as Facebook, Flickr, and GData have vested interests that need to be considered.

...


Making Rights Declarations Easier To Find

Planet CreativeCommons is based on Venus.  Unsurprisingly, given their mission, they visibly highlight the license under which each of the entries is published.  The Universal Feed Parser and Venus take great care to ensure that license and rights information is present in the Atom feeds that are produced, but this is the first time that I’m aware of this data being exposed in the HTML page itself.

...


TCO

Mark Pilgrim: What follows are instructions for building and installing MySQL 5 on Ubuntu. These instructions should work perfectly on both Feisty (7.04) and Gutsy (7.10).

Priceless.

From what I hear, people have had trouble with Leopard and Vista.  By contrast, and like others, I found that the default font for Firefox wasn’t to my liking on one of the three machines I installed Gutsy on.


Caja: Capability Javascript

Ben Laurie: I’ve been running a team at Google for a while now, implementing capabilities in Javascript. Fans of this blog will remember that long ago I did a thing called CaPerl. The idea in CaPerl was to compile a slightly modified version of Perl into Perl, enforcing capability security in the process.

Hopefully like the work of Douglas Crockford [via Patrick Logan], the parser itself is (or will be) written in Simplified JavaScript.

This could be useful, as an option, for CouchDB.  I don’t yet see the value in allowing even a sanitized subset of scripts through the UFP to Venus.


SSE+5005

Steven Lees: We will remove the “unpublished” element from the spec, i.e. we will remove sections 1.2.9, 2.6 and all of section 4. We decided that the concept of unpublished belongs at the application level, rather than the base SSE specification. We will include information in the SSE implementer’s guide that describes how applications can implement “unpublished” behavior on top of SSE.

It seems to me that SSE + RFC 5005 complement each other.  RFC 5005 can help you identify which entries have changed, and SSE can help you identify what changes were made to those entries.


If It Hurts When You Do That...

Sanjiva Weerawarana: Are you smart enough to build a RESTful application? … Programming XML in Java still sucks

Patrick Mueller: Moved my content over with a simple matter of programming.


Pluggable Feed Format

Sometime yesterday Jay Young's default feed switched back to RSS 2.0.  The world didn’t end, and not everybody cares about such minutiae, but Jay clearly does.  Jay may be in a minority, but this enhancement would enable Jay and others like him to simply drop in a plugin such as this one, activate it, and be on their way.

The patch does not change the default feed format from RSS 2.0.  Perhaps that could be considered for a release like WordPress 4.0, and a plugin could be provided at that time to enable users to select the venerable RSS 2.0 feed format, but in any case such a change would require a separate ticket, as this patch does not do that.


Matryoshka Dolls

Tim Berners-Lee: HTML is a big community, but there are others communities. Smaller communities are more in need of uri-extensibility than bigger ones.


bzr-feed updated to support bzr 0.90.0

My branch is here.  If all goes as it should, this change should be reflected shortly in the global bzr-feed feed.

...


Poisoned Cache

For the past month, eight feeds hosted by blogs.sun.com were not updated on planet.intertwingly.net, a victim of a poisoned httplib2 cache.  A victim of a permanent redirect.  The evidence can be found here.  Eventually, such feeds would have been viewed as inactive for 90 days, but luckily in this case I caught the problem earlier.

...


ECMAScript round-up

Round-up of ES4 discussions for the past few days: fragmenting, supersetting, civility, and secrecy.

Did I miss anything?

...


MonkeyPatch for Ruby 1.8.6

There is a bug in Ruby 1.8.6 that affects documents with a default namespace (even a vestigial one, like those sported by WordPress weblogs) which prevents non-namespace qualified attribute names from working in XPath expressions.  The following monkey-patch fixes this:

...


Apache2, https, and Gutsy Gibbon

Ideally, reconfiguring your Apache installation under Ubuntu to support TLS/SSL (a.k.a. https) would be as easy as:

sudo a2enmod ssl
sudo apache2ctl restart

Unfortunately, there are additional steps involved.

...


Nebulous Recalcitrance

Brendan Eich: The small-is-beautiful generalization alternates with don’t-break-the-web, again without specifics in reply to specific demonstrations of compatibility.

It is interesting how the don’t-break-the-web meme means different things to different organizations: Mozilla, Microsoft.