SMS parsing
I'm back on my old hobby-horse, parsing arbitrary SMS files (though in future I'm not going to use C!). This time I'm having a go at the ones generated by the Series 60 app MessageStorer (aka MsgStorer). This (rather good) program outputs text files with a few annoyances:
- The date given is that of the SMS file, not of the message; a message that is delivered late will have a time that's between a few minutes and a few days too late. There's nothing I can do about this unless I can fetch the files off the device.
- The times aren't given with time zones or even AM/PM indicators, which is troublesome.
- The names are as shown in the Messaging application (e.g. “Newman Richard”), so they need to be looked up.
- The file is formatted in an arbitrary plain-text way.
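To make the name and time zone annoyances concrete, here is a minimal sketch in Python (not the Lisp code this post is about). Everything in it is an assumption for illustration: the `CONTACTS` table, the example URI, the timestamp format, and the fixed UTC+1 offset are all hypothetical, and none of it reflects MsgStorer's actual output format.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical address book, keyed by the name as the Messaging
# application displays it ("Surname Forename").
CONTACTS = {
    "Newman Richard": "http://example.org/people/richard-newman",
}

# Assumed fixed offset for the phone's clock. A real solution would need
# proper daylight-saving rules rather than a constant offset.
ASSUMED_ZONE = timezone(timedelta(hours=1))


def lookup_contact(display_name, contacts=CONTACTS):
    """Return the identifier for a display name, or None if it is unknown."""
    # Collapse runs of whitespace so "Newman  Richard" still matches.
    normalised = " ".join(display_name.split())
    return contacts.get(normalised)


def attach_zone(text, fmt="%Y-%m-%d %H:%M", zone=ASSUMED_ZONE):
    """Parse a zone-less timestamp and label it with an assumed time zone."""
    naive = datetime.strptime(text, fmt)
    # replace() only attaches the zone; it does not shift the clock time.
    return naive.replace(tzinfo=zone)


if __name__ == "__main__":
    print(lookup_contact("Newman Richard"))
    print(attach_zone("2005-04-02 14:58").isoformat())
```

The lookup returns `None` for unknown names rather than guessing, so unmatched senders can be flagged for manual attention. Attaching a zone this way records an assumption about the phone's clock; it does not recover information the file never contained.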
This is a rather specialised application (number of Mac-using Lisp hackers who want to parse SMS messages from text files produced by MsgStorer: 1), but I might put the source up anyway. It taught me a lot about Perl-compatible regular expressions, and it's a practical application of CL-PPCRE and Wilbur.
Anyway, that's another datasource for my Semantic Web stuff!
Posted at 2005-04-02 14:58:30 by Richard
