Sat 02 Apr 2005

SMS parsing

I'm back on my old hobby-horse, parsing arbitrary SMS files (though in future I'm not going to use C!). This time I'm having a go at the ones generated by the Series 60 app MessageStorer (aka MsgStorer). This (rather good) program outputs text files with a few annoyances:
  • The date given is that of the SMS file, not of the message — a message that is delivered late will have a time that's between a few minutes and a few days too late. There's nothing I can do about this unless I can fetch the files off the device.
  • The times aren't given with time zones or even AM/PM indicators, which is troublesome.
  • The names are as shown in the Messaging application (e.g. “Newman Richard”), so they need to be looked up.
  • The file is laid out as arbitrary plain text rather than in any structured format.
As I enjoy a challenge, I threw the excellent CL-PPCRE at the problem, and now have a tool that will parse the file, convert the dates, prompt me to link up names to foaf:Persons, and hand me back a Wilbur database full of RDF that describes the messages.
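
To give a flavour of that pipeline, here is a minimal sketch in Python. It is not the CL-PPCRE and Wilbur code described above, and it makes several loud assumptions: the `From:`/`Date:`/`Text:` record layout is invented for illustration (MsgStorer's real layout isn't reproduced in this post), and the `PEOPLE` table and its example.org URI are hypothetical stand-ins for the name-to-foaf:Person linking step.

```python
import re
from datetime import datetime

# ASSUMPTION: an invented record layout, not MsgStorer's real output format.
SAMPLE = """\
From: Newman Richard
Date: 02/04/2005 14:58
Text: Running late, see you at eight.

From: Smith Jane
Date: 01/04/2005 09:15
Text: OK!
"""

# One record is three consecutive lines; named groups keep the fields readable.
RECORD = re.compile(
    r"^From: (?P<sender>.+)\n"
    r"Date: (?P<date>\d{2}/\d{2}/\d{4} \d{2}:\d{2})\n"
    r"Text: (?P<text>.*)$",
    re.MULTILINE,
)

# ASSUMPTION: hypothetical mapping from Messaging display names to person URIs.
PEOPLE = {"Newman Richard": "http://example.org/people#richard"}


def parse_messages(raw):
    """Return one dict per message found in the raw text."""
    messages = []
    for match in RECORD.finditer(raw):
        sender = match.group("sender")
        messages.append({
            "sender": sender,
            # None means the name still has to be linked to a person by hand.
            "person": PEOPLE.get(sender),
            # The source has no time zone, so this datetime is naive.
            "when": datetime.strptime(match.group("date"), "%d/%m/%Y %H:%M"),
            "text": match.group("text"),
        })
    return messages
```

Calling `parse_messages(SAMPLE)` yields two messages. The second has `person` set to `None`, which is the point at which a real tool would prompt for a match.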

This is a rather specialised application (number of Mac-using Lisp hackers who want to parse SMS messages from text files produced by MsgStorer: 1), but I might put the source up anyway. It taught me a lot about Perl-compatible regular expressions, and it's a practical application of CL-PPCRE and Wilbur.

Anyway, that's another datasource for my Semantic Web stuff!

Posted at 2005-04-02 14:58:30 by Richard

Bowie and Eno

Last night I ran across another marvellous interview with David Bowie and Brian Eno, dated September 1995. It's an interesting and amusing read.

They both guffaw. In my transcript, “laugh” is the word that will occur most often, after the obvious ones like “I” and “the” and “concept”.


Posted at 2005-04-02 01:27:43 by Richard
