MonkeyPatch for Ruby 1.8.6
One of the joys of Ruby and HTML5 is that one can easily extract data from a web page with an XPath expression. For example, the following extracts the URI of the RSD document from a weblog that supports RSD:
require 'open-uri'
require 'html5/html5parser'

doc = HTML5::HTMLParser.parse(open(ARGV[0]))
rsd = doc.elements['//link[@type="application/rsd+xml"]/@href'].to_s
Unfortunately, a bug in Ruby 1.8.6 prevents attribute names that are not namespace-qualified from working in XPath expressions. It affects any document with a default namespace, even a vestigial one like those sported by WordPress weblogs.
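To see whether a given Ruby install is affected, a minimal probe along the following lines may help. It is only a sketch: the sample XML is made up for illustration, and the `status` variable is my own naming, not part of REXML.

```ruby
require 'rexml/document'

# Hypothetical sample: a default namespace on the root and a plain,
# unqualified attribute on the child element.
xml = '<doc xmlns="ns"><item name="foo"/></doc>'
doc = REXML::Document.new(xml)

# XPath predicate that tests an attribute name with no namespace prefix.
item = doc.root.elements["item[@name='foo']"]

# An affected REXML returns nil here; otherwise the <item> element comes back.
status = item ? 'found' : 'not found (bug present)'
puts "item[@name='foo']: #{status}"
```

Running this under each Ruby you care about shows quickly whether a workaround is needed there.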
The following monkey-patch fixes this:
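require 'rexml/document'

# Probe for the bug: an unqualified attribute test in a default-namespaced document
doc = REXML::Document.new '<doc xmlns="ns"><item name="foo"/></doc>'
if not doc.root.elements["item[@name='foo']"]
  class REXML::Element
    def attribute( name, namespace=nil )
      prefix = nil
      # Map the namespace URI back to the prefix declared for it
      prefix = namespaces.index(namespace) if namespace
      # The default namespace is declared as 'xmlns'; look the attribute up unprefixed
      prefix = nil if prefix == 'xmlns'
      attributes.get_attribute( "#{prefix ? prefix + ':' : ''}#{name}" )
    end
  end
end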
As I am bound to hit this issue frequently, I’ve added it to my monkey_patches file, which RUBYOPT loads into every Ruby process:
export RUBYOPT='-rubygems -r/home/rubys/bin/monkey_patches'
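The monkey-patch above is wrapped in a probe, so it only takes effect when the bug is actually observed. Here is that probe-then-patch structure sketched in isolation. The `Greeter` class and its "bug" are entirely hypothetical stand-ins for a library class, used only to keep the example self-contained.

```ruby
# Hypothetical stand-in for a library class with a buggy method.
class Greeter
  def greet(name)
    "Hello, #{name}"   # pretend bug: the trailing "!" is missing
  end
end

# Probe first; reopen the class only if the bug is observed.
unless Greeter.new.greet('world').end_with?('!')
  class Greeter
    def greet(name)
      "Hello, #{name}!"
    end
  end
end

puts Greeter.new.greet('world')   # prints "Hello, world!"
```

Guarding the patch this way means it quietly becomes a no-op once the underlying library is fixed, which matters for a file that is loaded into every Ruby process.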