
Some of you may be aware that I’ve started programming for iOS in recent weeks. What I’ve been doing so far, a native iOS interface to the Perseus Online Latin Word Tool, has really been a warm-up (to build up my Objective-C and Cocoa Touch chops) for the thing that I really want to build. I won’t go too far into that because it would be too boring to explain in detail. Let’s just call it “The Livy Electronic Reader”. Think of it as an iPad app that lets you build your own translation and commentary of Livy (or some subsection of it) and, as an ancillary, publish the data to a shared Dropbox directory (or maybe iCloud, or possibly a shared publishing mechanism, perhaps something like a “Livy wiki”). My plan, once I’ve done enough for Livy, is perhaps to extend it to other authors, e.g. Caesar and Tacitus. I picked Livy because that’s my research interest. I’m really building a tool for myself to use for my PhD.

However, if anyone can answer the following questions about Perseus, its data format and URL scheme, or knows where I can find answers, I’d be much obliged.

The first question I have is: why does the XML interface behave inconsistently in the data it returns? Can it be made consistent? More importantly, can it be made predictable and therefore computable?

Here are some examples of what I mean.

First, something that behaves reasonably predictably. These first links are to Caesar’s de Bello Gallico.

Clearly, the “document name” is “1999.02.0002” and by adding arguments :book=n :chapter=n and :section=n you can select more or less of the content as you wish. Perfect! Give me a reference to Caesar and I can retrieve the text in an easily transformable XML format.
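As a rough illustration of what “computable” means here, the following Python sketch (not the app’s Objective-C) assembles the reference from the document name plus the optional :book, :chapter and :section arguments. The function name `build_passage_ref` is made up for this example, and it only builds the document-reference part; the Perseus host and path that go in front of it are deliberately left out.

```python
def build_passage_ref(doc_name, book=None, chapter=None, section=None):
    """Join a document name with optional book/chapter/section arguments.

    Hypothetical helper: it mirrors the ':book=n:chapter=n:section=n'
    pattern described above and nothing more.
    """
    levels = [("book", book), ("chapter", chapter), ("section", section)]
    parts = [doc_name]
    missing = None  # name of the first level that was left out
    for key, value in levels:
        if value is None:
            if missing is None:
                missing = key
            continue
        if missing is not None:
            # e.g. a section without a chapter makes no sense in this scheme
            raise ValueError(f"{key} given without {missing}")
        parts.append(f"{key}={value}")
    return ":".join(parts)


print(build_passage_ref("1999.02.0002", book=1, chapter=1, section=1))
```

Leaving out the deeper arguments selects more of the text, so `build_passage_ref("1999.02.0002", book=1)` would refer to the whole of book 1.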

In contrast to the former logical behaviour, consider the following links to Weissenborn and Müller’s 1898 edition of Livy’s text.

However, if you look at those two links, you’ll see the text is not the same text. That’s because what I labelled above as “1.1.1” is really book 1, praefatio 1. You can access it directly with this URL:

OK, so maybe the Livy text is thrown by the presence of the special “preface” chapter.

Additionally, if you want, say, book 22 chapter 1, you might predict that this could work:

But sadly, no. That’s not even a valid document. I guess the later books are in different editions, and thus to get to book 22, it’s an entirely different document URL.

Let’s try to get the whole book contents:

Predictability is one of the greatest virtues of a URI scheme, and this seems to break it. The data inside the documents suggests the text is divided into book, chapter and section, and the URL retrieval scheme suggests it can be retrieved as such, but the behaviour differs depending on the document content (Livy or Caesar).

So it looks like in my app I’ll have to build a static tree of the document URLs, rather than being able to compute them on the fly, which is usually a much better way of doing things.
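To make the “static tree” idea concrete, here is a minimal Python sketch of a hand-maintained lookup table. Everything in it is an assumption for illustration: the work keys, the book ranges, and especially the Livy document names, which are placeholders and not real Perseus identifiers. Only “1999.02.0002” comes from the Caesar example above.

```python
from typing import NamedTuple, Optional


class DocRange(NamedTuple):
    first_book: int
    last_book: int
    doc_name: str


# Hand-built table. The ranges and the LIVY-* names are placeholders;
# the real values would have to be looked up and entered manually.
STATIC_DOCS = {
    "caesar-gallic-war": [DocRange(1, 8, "1999.02.0002")],  # range assumed
    "livy": [
        DocRange(1, 2, "LIVY-DOC-A"),    # placeholder name
        DocRange(21, 22, "LIVY-DOC-B"),  # placeholder name
    ],
}


def lookup_doc(work: str, book: int) -> Optional[str]:
    """Return the document name covering `book`, or None if unknown."""
    for entry in STATIC_DOCS.get(work, []):
        if entry.first_book <= book <= entry.last_book:
            return entry.doc_name
    return None


print(lookup_doc("livy", 22))
```

The obvious cost is that every new author or edition means another hand-entered row, which is exactly what a predictable URL scheme would avoid.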

Does anyone have any insight into this behaviour of Perseus? Suggestions? Comments?