twinql — a SPARQL processor for Wilbur

Current version

The current version of twinql is 0.1.4, released on 2005-08-04. Download it, or preferably use ASDF-INSTALL.

Introduction

twinql is a parser and query engine for the SPARQL RDF query language, built on the Wilbur Semantic Web toolkit for Common Lisp. It is capable of the following:

twinql has been developed against the rq23 grammar, with one minor change to remove ambiguity: dots after query elements (such as triples) are not optional. This restriction will hopefully be removed in future.
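
As a rough illustration of this restriction, the sketch below writes the same query both ways. The variable names are made up for this example, and that the second form fails to parse is inferred from the description above rather than verified.

```lisp
;; Form that twinql accepts: every triple, including the last one
;; in the group, is followed by a dot.
(defparameter *query-with-dots*
  "PREFIX foaf: <http://xmlns.com/foaf/0.1/>
   SELECT ?name
   WHERE { ?x foaf:name ?name . }")

;; Form with the final dot left out. Per the restriction described
;; above, twinql is expected to reject this (assumption, not verified).
(defparameter *query-without-final-dot*
  "PREFIX foaf: <http://xmlns.com/foaf/0.1/>
   SELECT ?name
   WHERE { ?x foaf:name ?name }")
```

If a query fails to parse, a missing dot before a closing brace is the first thing to check.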

twinql is developed by Richard Newman (FOAF), who henceforth will refer to himself in the first person!

You are advised that twinql is under development, and as such might not work, might return the wrong results, or might even fail to load on any machine but mine. Bug reports are welcome; patches more so. Support for SELECT statements is generally good; the other verbs will work from s-expressions, but the parser elements are untested. Filters, ordering, and limits are not finished, so don't expect them to work!


Examples

twinql can be used to generate s-expressions to be run, or simply to execute queries directly. For example:


(parse-sparql 
  "PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  SELECT ?x ?name 
  WHERE { ?x foaf:name ?name . 
          ?x foaf:knows ?y . }")

(SPARQL :SELECT
        :VARS (|x| |name|)
        :WHERE
        '(GRAPH-PATTERN (TRIPLE |x| !"foaf:name" |name|)
                       (TRIPLE |x| !"foaf:knows" |y|)))

One can also directly run a query (using run-sparql), returning the results in the desired format, or indeed query using the s-expression syntax:

;; Load my FOAF file and get all the people who know me, 
;; returning their names and mailbox hashes.
(sparql :select :distinct t :vars '(name mbox)
        :from "http://www.holygoat.co.uk/foaf.rdf"
        :where
        '(graph-pattern
          (triple x !foaf:knows !rich:RichardNewman)
          (triple x !foaf:name name)
          (triple x !foaf:mbox_sha1sum mbox)))
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2001/sw/DataAccess/rf1/result2">
  <head>
    <variable name="NAME" />
    <variable name="MBOX" />
  </head>
  <results>
    <result>
      <binding name="NAME"><literal>Stephen Charles Cook</literal></binding>
      <binding name="MBOX"><literal>8a8d159b39d8789fa93c9d000b75f395a723343d</literal></binding>
    </result>
    <result>
      <binding name="NAME"><literal>Timothy Archer Millea</literal></binding>
      <binding name="MBOX"><literal>e9e511fb6acd12db93f899c9e28249b782c73f8a</literal></binding>
…

;; Describe me and get a DB back.
(sparql :describe :targets '(!rich:RichardNewman) :out-format :ignore)
#<DB size 73 {41540381}>

Documentation is below.


Packages

twinql is dependent on the following packages:

CL-YACC is redistributed with twinql. The others are available through ASDF-INSTALL. I highly recommend that you extend Wilbur's support for querying literals, or any queries against literals will not work. A file that does this is included with twinql and loaded by default; you should see no difference in your system, except that queries against literals will work.

The next version of Wilbur will intern and compare literals correctly by default.


Download and installation

twinql is available through ASDF-INSTALL. You can also download a tarball here. A changelog is available as part of the distribution.

If you don't wish to use ASDF-INSTALL, you can manually load the ASDF system definition, or just load the files in the appropriate order (which should be obvious by inspection, or by looking at the ASDF file).
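
A minimal sketch of the manual ASDF route follows. The directory path is a placeholder, and the system name `:twinql` is an assumption based on the project name; check the .asd file in the distribution for the actual name.

```lisp
;; Assumes ASDF itself is already loaded, and that Wilbur and the
;; other dependencies can be found by ASDF as well.

;; Tell ASDF where the unpacked twinql sources live.
;; The path below is a placeholder; substitute your own directory.
(push #p"/path/to/twinql/" asdf:*central-registry*)

;; Compile and load the system. The name :twinql is an assumption.
(asdf:operate 'asdf:load-op :twinql)
```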

There is currently no public version control repository for twinql, though I hope to remedy that soon.


Documentation

Once twinql has been loaded, the following functions (and a number of others — see defpackage.lisp and the sources) become available to you. I'll document the others over time.

function parse-sparql (string &optional (default-prefixes nil) (default-base nil)) => s-expr prefix-hash base-uri

Parameters
string: the string to parse.

Return values
s-expr: the parsed s-expression.
prefix-hash: a hash table of namespace prefixes.
base-uri: the base URI for the query (often nil).

Take a piece of text and parse it as SPARQL according to the rq23 grammar. If parsing fails, the parser will signal a yacc:yacc-parse-error condition.

parse-sparql optionally takes a hash of prefixes and a default base URI. These are overridden by the contents of the query itself.
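
The following sketch shows one way to receive the three return values and to handle a parse failure. `describe-parse` is a hypothetical helper, not part of twinql, and the code assumes twinql is loaded with its exported symbols accessible in the current package, as in the examples earlier on this page.

```lisp
(defun describe-parse (query)
  "Hypothetical helper: parse QUERY, print a short summary, and return
the parsed s-expression, or NIL if the parser signals an error."
  (handler-case
      (multiple-value-bind (s-expr prefix-hash base-uri)
          (parse-sparql query)
        (format t "~&Parsed form: ~S~%" s-expr)
        (format t "Prefixes known: ~D~%" (hash-table-count prefix-hash))
        (format t "Base URI: ~S~%" base-uri)
        s-expr)
    ;; The condition type is the one named in the documentation above.
    (yacc:yacc-parse-error (condition)
      (format t "~&Parse failed: ~A~%" condition)
      nil)))

(describe-parse
 "PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  SELECT ?name
  WHERE { ?x foaf:name ?name . }")
```

Catching the condition at this level keeps a malformed query from unwinding the caller, which matters if queries arrive from user input.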

function run-sparql (sparql &key (result-format :sparql-xml) (stream *standard-output*) (default-base nil) (default-prefixes nil)) => results-list | t

Parameters
sparql: the string to parse and run.
result-format: the format in which to return results. One of :ignore, :sparql-xml, or :plain.
stream: the stream on which to serialise the results (only applicable to :sparql-xml and :plain).
default-base: the base URI to use for relative URIs in the query. This should be a string, and is overridden by any BASE declaration in the query body.
default-prefixes: a hash of prefix to URI mappings (both represented as strings). The contents of this hash will be added to the prefix lookup table before the query is parsed; the query PREFIX declarations take precedence.

Return values
t: If serialising to a stream, run-sparql returns t.
results-list: If :ignore is provided as the result format, the list of binding hashes is returned.
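
A hedged sketch of both calling styles follows. The variable names are illustrative, and the exact form of the prefix keys (a bare "foaf" with no trailing colon) is an assumption; the documentation above only says that both keys and values are strings. As before, twinql's symbols are assumed to be accessible in the current package.

```lisp
;; String keys need an EQUAL hash table to be found by GETHASH.
;; Assumption: prefixes are keyed by their bare name, without a colon.
(defparameter *my-prefixes*
  (let ((table (make-hash-table :test #'equal)))
    (setf (gethash "foaf" table) "http://xmlns.com/foaf/0.1/")
    table))

;; No PREFIX declaration here; foaf: is expected to come from *my-prefixes*.
(defparameter *names-query*
  "SELECT ?name
   FROM <http://www.holygoat.co.uk/foaf.rdf>
   WHERE { ?x foaf:name ?name . }")

;; Serialise the results as SPARQL XML to *standard-output*.
(run-sparql *names-query* :default-prefixes *my-prefixes*)

;; Skip serialisation and work with the list of binding hashes instead.
(let ((results (run-sparql *names-query*
                           :result-format :ignore
                           :default-prefixes *my-prefixes*)))
  (format t "~&~D result(s)~%" (length results)))
```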

function run-sparql-to-string (sparql &key (result-format :sparql-xml) (default-base nil) (default-prefixes nil)) => string

Parameters
sparql: the string to parse and run.
result-format: the format in which to return results. One of :ignore, :sparql-xml, or :plain.
default-base: the base URI to use for relative URIs in the query. This should be a string, and is overridden by any BASE declaration in the query body.
default-prefixes: a hash of prefix to URI mappings (both represented as strings). The contents of this hash will be added to the prefix lookup table before the query is parsed; the query PREFIX declarations take precedence.

Return values
string: run-sparql-to-string returns whatever is printed by the presentation engine, given the requested result-format.
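
As a small usage sketch, the string result can be written to a file. `save-query-results` is a hypothetical helper, not part of twinql, and twinql's symbols are again assumed to be accessible in the current package.

```lisp
(defun save-query-results (query pathname)
  "Hypothetical helper: run QUERY and write its SPARQL XML results
to PATHNAME, replacing any existing file. Returns PATHNAME."
  (let ((xml (run-sparql-to-string query :result-format :sparql-xml)))
    (with-open-file (out pathname
                         :direction :output
                         :if-exists :supersede)
      (write-string xml out))
    pathname))
```

Passing an open file stream as the stream argument of run-sparql should achieve the same thing without the intermediate string; the string variant is more convenient when the output needs further processing first.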

function build-matcher (sparql var &key default-prefixes default-base) => function

Parameters
sparql: the string to parse.
var: the symbol used in the query to which node (see later) should be bound.
default-prefixes: a hash of prefixes to URI mappings (both represented as strings). The contents of this hash will be added to the prefix lookup table before the query is parsed; the query PREFIX declarations take precedence.
default-base: the base URI to use for relative URIs in the query. This should be a string, and is overridden by any BASE declaration in the query body.

Return values
function: build-matcher returns a function of two arguments, (node db), that substitutes node into the compiled query, runs it against db, and returns the number of results, or nil if there are no results.

Take a SPARQL query and produce an anonymous function. This function can be saved and funcalled with a Wilbur node to test if that resource matches the query.
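
A sketch of using a matcher as a filter predicate follows. Several details are assumptions: that the variable is named by the case-preserved symbol `|person|` (mirroring the symbols in the parse-sparql output shown earlier), that it can be given from the current package, and that the database of interest is Wilbur's default `wilbur:*db*`. `named-nodes` and the example.org URI are made up for illustration.

```lisp
;; Build the matcher once; it can then be funcalled many times.
(defparameter *has-name-p*
  (build-matcher "PREFIX foaf: <http://xmlns.com/foaf/0.1/>
                  SELECT ?name
                  WHERE { ?person foaf:name ?name . }"
                 '|person|))   ; assumed spelling of the variable symbol

(defun named-nodes (nodes db)
  "Hypothetical helper: return the members of NODES for which the
matcher finds at least one result in DB."
  (remove-if-not (lambda (node) (funcall *has-name-p* node db))
                 nodes))

;; Usage, with a placeholder URI:
(named-nodes (list (wilbur:node "http://example.org/people#alice"))
             wilbur:*db*)
```

Because the matcher returns a count or nil, it works directly as a generalised boolean in functions such as remove-if-not.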


Support

I am often to be found on #lisp and #swig on irc.freenode.net as rich_holygoat. You can also use the contact form on my homepage to get in touch, or figure out one of my email addresses.

Feedback is incredibly welcome. This software will only continue to improve with motivation, fixes, bug reports, and good wishes. Do send them along!


License

I haven't settled on a license yet; I have yet to fully examine dependencies. However, I'm most likely to use my good old BSD-with-modifications, just like the rest of my code.

As always, if these terms are not acceptable to you, get in touch and I'll see what I can do.


Acknowledgements

Thanks go to Andy Seaborne for his useful and stimulating replies to my emails; to afa and others for discussion; and to Ora for Wilbur and rewarding dialogue. Further thanks, of course, go to the developers of the libraries on which twinql depends.

Thanks to Leigh Dodds for forcing me to stumble across the FROM bug that is fixed in 0.1.4.