Thu 27 Apr 2006

Hankering

I have an inexplicable desire to play Syndicate Wars. Solid gold from 1996.

(Add to the list, as I think of them, Magic Carpet 2, Lander, Terra Nova….)

Posted at 2006-04-27 15:32:50 by RichardLink to Hankering
Comments, trackbacks.

Vim 7

I'm now using Vim 7 from MacVim.org. I use the ps_color scheme. Unfortunately, it seems that there's no paren match colour in ps_color, so I found out a little about Vim's color system:
  • To list all set highlight colours, try :highlight.
  • Thusly, the paren match colour is (obviously enough) MatchParen.
  • A good one for me: highlight MatchParen guifg=#e0e0ff guibg=#3040a0.
I'd also like to have a play around with lispParen09 sometime….

Posted at 2006-04-27 13:10:57 by RichardLink to Vim 7
Comments, trackbacks.

Nasty bug

Last night and this morning I fixed a very sly little issue in twinql. As with most bugs of its nature it was caused by a subtly invalid assumption. This particular assumption was that at any stage in the evaluation of a query all of the bindings would have resulted from the same touched variables. twinql maintained one hash (termed a ‘touched’ hash) of variables that had been encountered in the query, which alters its behaviour when evaluating triple patterns.

This assumption holds just fine in almost every instance — every normal, nested, or filter pattern. However, it's technically possible for a UNION pattern to ‘fork’ the query such that some of the returned results have ‘seen’ a variable, and some have not.

Given the query:
PREFIX ex: <http://ex/>
SELECT * WHERE {
{
{ ?x ex:p ?r }
UNION
{ ?y ex:q ?v }
}
{ ?s ex:s ?v }
}
and the data
  ex:one ex:p "one"
ex:two ex:q "two"
ex:thr ex:s "two"
The UNION pattern yields two solutions: ((x ex:one) (r "one")) and ((y ex:two) (v "two"))
xrysv
1:ex:one"one"
2:ex:two"two"
We then move onto the next graph pattern, { ?s ex:s ?v }.
For the first UNION binding, ?s and ?v are unbound, so we have the pattern
(nil ex:s nil)
to match:
xrysv
1:ex:one"one"ex:thr"two"
2:ex:two"two"
The second binding bound some variables used in the second pattern, yielding:
(nil ex:s "two")
and binding to:
xrysv
1:ex:one"one"ex:thr"two"
2:ex:twoex:thr"two"
twinql works up until this point:
 (run-sparql "PREFIX ex: <http://ex/>
SELECT * WHERE {
{
{ ?x ex:p ?r }
UNION
{ ?y ex:q ?v }
}
{ ?s ex:s ?v }
}
")
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2001/sw/DataAccess/rf1/result2">
<head>
<variable name="x" />
<variable name="r" />
<variable name="y" />
<variable name="s" />
<variable name="v" />
</head>
<results ordered="false" distinct="false">
<result>
<binding name="x"><uri>http://ex/one</uri></binding>
<binding name="r"><literal>one</literal></binding>
<binding name="y"><unbound/></binding>
<binding name="s"><uri>http://ex/thr</uri></binding>
<binding name="v"><literal>two</literal></binding>
</result>
<result>
<binding name="x"><unbound/></binding>
<binding name="r"><unbound/></binding>
<binding name="y"><uri>http://ex/two</uri></binding>
<binding name="s"><uri>http://ex/thr</uri></binding>
<binding name="v"><literal>two</literal></binding>
</result>
</results>
</sparql>
T
… but we can break it by adding one triple:
  ex:fou ex:s "fou"
Our UNION patterns give us:
xrysv
1:ex:one"one"
2:ex:two"two"
Apply the first solution to the second pattern:
xrysv
1a:ex:one"one"ex:thr"two"
1b:ex:one"one"ex:fou"fou"
Apply the second solution to the second pattern:
xrysv
2:ex:twoex:thr"two"
You'll see that in the first solution, ?v had not been touched, leaving two possible binding pairs for ?s and ?v. The second UNION branch touches ?v, leaving only one possible binding for ?s. If ?v had been bound to another value other than "two" or "fou", no results would be survived from the second UNION branch.

Unfortunately, twinql returns (or, rather, returned) a spurious binding:
xrysv
Spurious:ex:twoex:fou"fou"
This comes out because the evaluation of the second pattern cannot be simultaneously aware of ?v being touched and not being touched, depending on the set of results. An earlier version of twinql shared touched hashes between UNION branches, so horrible cross-mojination took place. The version in which I encountered this error forks the touched hash for each fork, which meant that variables encountered in UNIONs were out of sight of later patterns. Not good.

This spurious result is obviously impossible, because
(ex:two ex:q "fou")
does not appear in the graph. The definition of a solution to a SPARQL query is a set of bindings such that the query pattern, filled in with those bindings, appears in the graph.

How can we solve this issue?
  • Remove touched hashes, and recalculate the touched hash from each binding hash as we go.
  • Cluster binding hashes with their associated touched hash (optimisation).
  • Go through a refinement stage, removing all hashes which are invalid.
The first is the right solution. As we operate on bindings, we check their touched status. When evaluating individual triples, we check that none of the bindings are covered. Simple! Remember: premature optimisation is the root of all evil, and it turns out that touched hashes were premature optimision, though I didn't know it at the time.

The interesting thing about software like query engines is that it's difficult to thoroughly test them — the functions are very dependent on their input data, rarely actually breaking, instead returning subtly wrong answers. This instance illustrated another difficulty: you can use a query engine for months without hitting one of the ‘wrong’ cases, with correct results being produced for all of your other queries. I only noticed this one because I was investigating a problem I thought was caused by a UNION clause in a query, and I had to write out a situation and deliberately try to break twinql in order to expose the bug.

Posted at 2006-04-27 10:41:42 by RichardLink to Nasty bug
Comments, trackbacks.

Early

I like getting to work early (even 8am is fine when you're an engineer). It's quiet when the building's that empty, and if I put my headphones in soon enough it's like I can catch the quiet inside my head.

Posted at 2006-04-27 09:51:07 by RichardLink to Early
Comments, trackbacks.

Google
Web holygoat.co.uk
  • richard is: