intertwingly

It’s just data

Porting REXML to Ruby 1.9

Unicode changes:

Other language changes:

REXML changes:

Outputs of running bin/suite.rb:

["3.1.7.2", "1.9.0", "2007-12-31"]
REXML version = 3.1.7.2
Loaded suite REXML
Started
............................................................................................................................................................................................................................................................................................................................................................
Finished in 12.893064488 seconds.

348 tests, 1252 assertions, 0 failures, 0 errors
["3.1.7.2", "1.8.6", "2007-06-07"]
REXML version = 3.1.7.2
Loaded suite REXML
Started
............................................................................................................................................................................................................................................................................................................................................................
Finished in 34.733291 seconds.

348 tests, 1252 assertions, 0 failures, 0 errors

Ticket, patch, Update: revision


Ruby 1.9 Strings — Updated

My confusion from yesterday was due to a bug, which was promptly fixed — test case, fix.

Now that I understand what is intended, the situation is a lot clearer.  The net result is that any sequence of operations that produces a runtime exception in Ruby 1.9 would also produce a runtime exception in Python 3.0.  Some use cases that are entirely safe will not produce an exception in Ruby 1.9 when they would in Python 3.0.  Such an approach is entirely consistent with a dynamic language.
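A minimal sketch of that difference, assuming a current Ruby rather than the 1.9.0 snapshot discussed here (the variable names are mine, purely for illustration).  Concatenation only raises when both strings actually carry non-ASCII data in incompatible encodings:

```ruby
# -*- coding: utf-8 -*-
utf8   = "caf\u00e9"                 # UTF-8, one non-ASCII character
latin1 = utf8.encode("ISO-8859-1")   # same text, Latin-1 bytes
ascii  = "cafe".encode("ISO-8859-1") # Latin-1 label, but 7-bit content only

# Safe case: the Latin-1 string is pure ASCII, so Ruby allows the mix.
mixed = utf8 + ascii
puts mixed.encoding

# Unsafe case: both sides carry non-ASCII bytes in different encodings.
begin
  utf8 + latin1
  outcome = "no exception"
rescue Encoding::CompatibilityError => e
  outcome = e.class.name
end
puts outcome
```

The loosely analogous mix in Python 3.0, str plus bytes, raises in both cases, which is the stricter behavior described above.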

...


3 + 1 = 2

I’ve got portions of HTML5lib working on Ruby 1.9, enough to pass Mars's unit tests.  My initial reaction to Ruby 1.9’s support isn’t favorable.  I definitely like Python 3K's Unicode support better.  This feels closer to Python 2.5.  In fact, I think I prefer Ruby 1.8’s non-support for Unicode over Ruby 1.9’s “support”.

The problem is one that is all too familiar to Python programmers.  You can have a fully unit tested library and have somebody pass you a bad string, and you will fall over.
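To make the failure mode concrete, here is a small sketch, assuming a current Ruby.  The helper `escape_amp` is hypothetical and not taken from HTML5lib or Mars; it passes its tests on well-formed input and falls over when handed Latin-1 bytes mislabeled as UTF-8:

```ruby
# -*- coding: utf-8 -*-
# Hypothetical library helper: escape ampersands in text content.
def escape_amp(text)
  text.gsub(/&/, "&amp;")
end

good = "fish & chips"
# Latin-1 bytes labeled as UTF-8: the kind of "bad string" a caller might pass.
bad = "R\xE9sum\xE9 & more".b.force_encoding("UTF-8")

puts escape_amp(good)
puts bad.valid_encoding?

begin
  escape_amp(bad)
  result = "no exception"
rescue ArgumentError => e
  result = e.class.name
end
puts result
```

Checking `valid_encoding?` at the library boundary is one way to fail early with a clearer message, at the cost of scanning every input.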

...


Two Steps Forward...

Another version of Ruby, a different set of REXML bugs.

Test case.

Sigh.

...


No Tweets

Russell Beattie: The stampede of unsubscribers has begun! The unbearable pain of Twittering is too much for them.

I have my own approach.


Addressable

Installation:

sudo apt-get install libidn11-dev
sudo gem install idn addressable

Usage:

...


Truthiness

Rafe Colburn: Rogers Cadenhead looks at Dave Winer’s long bet with New York Times executive Martin Nisenholtz on whether blogs or the Times would reign supreme by 2007. The winner: none of the above. Wikipedia outranks them both.

Perhaps the requirement to cite sources trumps the requirement to provide credentials.


Yet Another Planet Refactoring

A little over a month ago, I outlined how I would like to see the feed parser reorganized.  I’ve now put a little meat on the bones, in the form of running code.  Not just for the feed parser, but also for Planet.  I also did it all in Ruby, so I named this little experiment Mars.  Warning: this version is 0.0.1.  It just barely runs end-to-end.  Feed it real data, and it will choke on some of it.  But it can now produce partial results.

All in all, I’m pleased with how compact this code is.  If anybody wants to join in on the fun, it is a bzr repository and there are plenty of test cases ready to be ported.

...


Standards that Matter are Standards that Ship

Fundamentally, Microsoft’s strategy is sound.  Ignore standards that you find inconvenient, and focus on producing and enabling the production of content people want.  While my humble site can’t compete with the likes of Jackass 2.5, I do have a few people who follow my site.  I’ve switched my front page to HTML5 despite the fact that this means that MSIE7 will therefore ignore virtually all the CSS styling rules that apply to the page.  The page validates modulo an acknowledged bug in the validator.

...


Eventual Consistency

Amazon SimpleDB [via Simon Willison].  Erlang.  Schemaless.  Cool.

Amazon seems to really get the “Getting Started should be free” aspect of the web, and is clearly targeting the “She has the idea on Wednesday and gets the script working next Monday, and one quarter later, either gives up on the idea or is incredibly rich. Both are good outcomes” developer market.

Meanwhile, Mark Nottingham of Yahoo! is proposing standards for caches which prefer to serve slightly stale content fast in lieu of providing late or broken results.

Update: Keith Gaughan: That ain’t REST. They screwed up the “REST” interface for it exactly the same way as they screwed up the one for SQS and FPS.


REXML and Mangled Text

Rick Blommers: ReXML seems to escape items very nicely when setting values.  But it doesn’t unescape the values with REXML::Document.new( … )

A bare minimum amount of functionality that one would expect from an XML parsing library is the ability to round-trip data.
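A round-trip check along these lines can be sketched as follows.  This is an illustration only: the sample text and variable names are my own assumptions, and a current REXML may well behave differently from the 3.1.7 series discussed here.

```ruby
require 'rexml/document'

original = "fish & chips < steak"   # sample text, chosen for illustration

# Build a document and set the text; REXML escapes it on output.
doc = REXML::Document.new("<item/>")
doc.root.text = original
xml = doc.to_s

# Parse the serialized form back and read the text again.
reparsed = REXML::Document.new(xml)
roundtrip = reparsed.root.text

puts xml
puts roundtrip == original
```

If the library round-trips correctly, the last line reports that the text read back matches what was written.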

The two things I have yet to find are where I can check out the latest code from SVN, and how to run the existing set of tests.  I would like to submit new tests which expose the problems I have found so far, and patches to correct these issues.  Ideally in time for 3.1.8.

...


HTML5 As Viewed By IE8

Dean Hachamovitch: You will hear a lot more from us soon on this blog and in other places. In the meantime, please don’t mistake silence for inaction.

I’d love to see a screenshot of this page using MSIE8.  While supporting HTML5 would be nice, the fact that MSIE7 won’t allow CSS rules to apply to any of the new HTML5 elements will significantly inhibit adoption of this standard.


phpMyId 0.7

Since I last looked at phpMyId, it has progressed from version 0.3 to version 0.7.  A number of changes occurred.  Here’s how I dealt with them.

...


FeedSync

Steven Lees: Today we published the final v1 spec for Simple Sharing Extensions, under a new name, FeedSync. The new name is a little simpler than the old one (kind of ironic!) and it captures the intent pretty well.

James Snell has some good comments.


HTML5 Deployment Considerations

Lachlan Hunt: HTML 5 introduces and enhances a wide range of features including form controls, APIs, multimedia, structure, and semantics

In the interest of getting practical deployment experience with these specifications, I plan to explore exploiting these new tags on both my weblog and my planet.  Two issues immediately come to mind, and I’m sure I’ll encounter more.

...


Resource Oriented Registry

Paul Fremantle: fundamentally the approach we have taken is to build a registry/repository based on REST concepts. And as we looked at the REST space, we kept noticing how close the Atom Publishing Protocol (APP) is to our needs, so we’ve made that the public remote API to access the repository. Of course, if you are just browsing the registry, you only need a browser - APP is mainly there to support updating resources.  Of course, using Atom and APP gives some really nice benefits too - like being able to subscribe a feed of new resources that meet your search criteria.


Little Details

Anil Dash: We announced Beacon support on both LiveJournal and TypePad as initial launch partners. But we worked really hard with the Facebook team on one really important detail — making sure our implementations are completely opt-in.  Not to put too fine a point on it, but this was kind of a no-brainer.

Kudos to 6A on getting the more important “unfixable” detail correct.

...


DIS29500 Comments

Alan Bell: To get the data into the site the documents were opened in OpenOffice.org Writer, then copied to Calc, tidied up manually (merged cells are evil) then imported to Lotus Notes 8 via the built in Symphony spreadsheet (I had been working on some code to import Calc into Notes so this was easy) exported to XML then imported to WordPress. The import file was just over 2Mb in size. ... The main difficulty in importing the data was smartquotes and em dashes that Word had autocorrected.

Ad Hoc, Situated Software at its finest.  There are even feeds for comments on the comments.

Rob Weir has an analysis of the resolutions proposed to date.