Last twenty weblog entries which:
Last 20 with a comment by me:
http://www.intertwingly.net/blog/?q=//atom:feed[contains(atom:entry/atom:author,%20'Obasanjo')]
Not bad. It seems your XPath engine doesn't support multiple boolean expressions in the predicate. I tried the following query but kept getting 404s
Wow, nice stuff. You're doing this with a lots-of-little-files XML repository?
Keep doing this, and I might just have to clean up my blogging data and start playing with this. :)
Les: yes, lots of little files.
Danny: care to identify a tangible limitation? I like a good challenge...
Thanks for the info Sam.
To pick up your challenge to Danny about relational vs tree:
How about: show all the entries that have the same first word (i.e., without specifying what that word is).
This is a kind-of relational (recursive) join query.
But, in general, I bet that searching for matches on blog entries isn't a case where the limitations of relational vs tree structures can really be explored.
The blog entries are, in one sense, essentially a single table (see my "table" syndication format view of my blog at: http://icite.net/blog/200309/really_tabular_synidcation.html ).
Jay, I may be misunderstanding what you are suggesting, but that sounds to me like something XSLT excels at. In pseudo-code, what one can do with XSLT is:
foreach entry
  $id=id
  $word=substring-before(entry.content,' ')
  foreach preceding::entry(word=$word)
    print match($id,id)
Sam,
I'm trying to figure out why your query is so complex. Why not:
for $e in collection("atom-files-directory")//atom:entry
let $word := substring-before($e/atom:content/text(), ' ')
where $word = $id
return $e
My XQuery is a little rusty since I haven't kept up with the spec drafts but that should work.
PS: You aren't accepting posts sent to your blog via the CommentAPI. Is this a bug on your end or mine?
Dare, I may not fully understand Jay's example, but he did indicate that a join was required.
P.S. I just tried a few test posts via the Comment API, and they appeared to work. Can you capture a wire trace?
Sam, your XSLT pseudo-code looks like it will handle the case I thought it couldn't, so I was wrong to offer this as an example of a limitation of a tree structure.
For my example, I was thinking of a query in SQL like:
select a.id, b.id from entries a join entries b on substr(a.entry, 1, locate(' ', a.entry)) = substr(b.entry, 1, locate(' ', b.entry)) and a.id <> b.id
And I was thinking that this couldn't be expressed in a single XPath statement. But, SQL vs XPath is not the same issue as graph vs tree anyway.
OK, I see where I misunderstood his example. The XQuery should be
for $e in collection("atom-files-directory")//atom:entry
let $word := substring-before($e/atom:content/text(), ' ')
where exists(
  for $e2 in collection("atom-files-directory")//atom:entry
  let $word2 := substring-before($e2/atom:content/text(), ' ')
  where not($e2 is $e) and $word = $word2
  return true()
)
return $e
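For comparison, the same "entries that share a first word" self-join can be sketched outside any query language. This is a minimal Python illustration, not anything running on the site: the (id, content) pairs, the sample data, and the function names are all assumptions made up for the example. It also splits on any whitespace, whereas substring-before(..., ' ') returns an empty string when the content has no space.

```python
from collections import defaultdict


def first_word(text):
    """Return the first whitespace-delimited word, or '' for empty text."""
    parts = text.split(None, 1)
    return parts[0] if parts else ""


def entries_sharing_first_word(entries):
    """Group entry ids by first word; keep only words used by 2+ entries.

    `entries` is an iterable of (entry_id, content) pairs (hypothetical shape).
    """
    groups = defaultdict(list)
    for entry_id, content in entries:
        word = first_word(content)
        if word:
            groups[word].append(entry_id)
    return {word: ids for word, ids in groups.items() if len(ids) > 1}


# Hypothetical sample data, for illustration only.
sample = [
    ("e1", "Wow, nice stuff."),
    ("e2", "Thanks for the info."),
    ("e3", "Wow, that was fast."),
]
matches = entries_sharing_first_word(sample)
```

Grouping by the computed key does one pass over the entries, instead of comparing every entry against every other entry as the nested loops in the queries above do.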
The short answer is: if you try it, I'll block your IP address. ;-)
The longer answer is: there are many ways to do a DoS against my site or any site. As you note, I do have a cache, so I can easily put a per-day cap on the number of unique queries I will serve (effectively disabling new unique queries for a day or so) while the rest of the site continues to be served.
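The cache-plus-cap idea can be sketched roughly as follows. This is only an illustration under assumptions: the QueryGate class, its method names, and the choice to return None for a refused query are hypothetical, and it says nothing about how the site is actually implemented.

```python
import datetime


class QueryGate:
    """Serve cached queries freely; cap new unique queries per day."""

    def __init__(self, daily_cap):
        self.daily_cap = daily_cap
        self.cache = {}        # query string -> cached result
        self.day = None        # day the counter below refers to
        self.new_today = 0     # unique queries executed on self.day

    def serve(self, query, run_query, today=None):
        today = today or datetime.date.today()
        if today != self.day:
            # A new day has started: reset the counter, keep the cache.
            self.day = today
            self.new_today = 0
        if query in self.cache:
            return self.cache[query]   # repeat queries cost nothing
        if self.new_today >= self.daily_cap:
            return None                # cap reached: refuse new queries
        self.new_today += 1
        result = run_query(query)
        self.cache[query] = result
        return result


# Hypothetical usage: str.upper stands in for running a real query.
gate = QueryGate(daily_cap=2)
day_one = datetime.date(2004, 1, 1)
day_two = datetime.date(2004, 1, 2)
results = [gate.serve(q, str.upper, today=day_one) for q in ["a", "b", "a", "c"]]
next_day = gate.serve("c", str.upper, today=day_two)
```

Cached queries never count against the cap, so pages that link to already-served queries keep working even after new unique queries have been cut off for the day.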
The DoS issue is a little overstated. The example that was given for Syncato exploited a bug in Pathan that results in an infinite query execution time. This is fixed in a newer Pathan release that the site hasn't yet been upgraded to. Syncato also has a cache that would mitigate any repeat requests on the same query (assuming it's not exploiting a bug). Regardless, it is always possible to DoS a site that generates content dynamically.
XPath may be slightly worse than previous tools, but this should not in any way dissuade anyone from exploring its potential. I can think of dozens of reasons why you "shouldn't" be doing this kind of thing, but I made an explicit decision to shove those aside and focus on exploring what power this kind of thing brings.