Welcome to part one of my multi-part series “Building a Library Online Public Access Catalog (OPAC) with Drupal 7”.
The first chapter in this page-turner of a saga is all about connecting to an existing ILS, or Integrated Library System. Most libraries employ a somewhat monolithic ILS, a single system that handles nearly all library functions: maintaining patron accounts; handling circulation operations like checking items in and out; cataloging the collection and making it searchable; and cutting purchase orders for new items. I need to connect to the ILS for two main reasons: 1) to harvest item availability and bibliographic metadata and 2) to query and update patron balances as part of a fine payment feature.
What follows is basically a summary of my research so far, and I hope it helps others who dive into this strange and magical world of library systems.
If you are working with an ILS that was released in the last five years, chances are it provides a web services API that exposes nearly all the functionality most projects will ever need. Most vendors implement their web services RESTfully or using SOAP, making it a real cinch to grab item data or update patron accounts. Because these services are exposed over HTTP, you can communicate with them securely using SSL without the kind of special setup and considerations you would have to employ with other methods (see below).
One downside to the ILS APIs that I have worked with is that the methods for querying the catalog generally employ a search paradigm: they are really just the machine equivalent of typing in a keyword and getting back pages of results. If your plan is to grab all the item information and implement your own search engine in Drupal (using Apache Solr, for instance), you need to find some way to discover and grab all the records in the catalog. Since most APIs don’t have functionality for spitting out the unique IDs of every item in the collection, you’ll probably need to do one of two things: either use a batch process to crawl a range of sequential IDs from X to Y, essentially guessing at possible IDs, or crawl a known set of IDs that you’ve extracted through some other, possibly more manual, means. The second option is clearly more efficient, but it does require that you have some access (or the participation of someone who has access) to the ILS’s underlying database.
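To make the first option concrete, here is a minimal sketch of the batch crawl, written in Python for brevity. Everything named here is an assumption for illustration: `fetch` stands in for whatever lookup-by-ID call your ILS API offers, and `FAKE_CATALOG` is made-up data so the example runs without a real ILS.

```python
from typing import Callable, Iterator, Optional


def id_batches(start: int, stop: int, size: int) -> Iterator[range]:
    """Yield consecutive ranges covering start..stop inclusive, at most `size` IDs each."""
    for lo in range(start, stop + 1, size):
        yield range(lo, min(lo + size, stop + 1))


def crawl(batch: range, fetch: Callable[[int], Optional[dict]]) -> list:
    """Try every ID in the batch; keep the hits and skip the guesses that miss."""
    records = []
    for item_id in batch:
        record = fetch(item_id)
        if record is not None:
            records.append(record)
    return records


# Hypothetical stand-in for the real API call: only a few IDs actually exist.
FAKE_CATALOG = {1001: "Moby-Dick", 1004: "Dracula", 1009: "Emma"}


def fake_fetch(item_id: int) -> Optional[dict]:
    title = FAKE_CATALOG.get(item_id)
    return {"id": item_id, "title": title} if title else None


found = []
for batch in id_batches(1000, 1010, size=4):
    found.extend(crawl(batch, fake_fetch))
```

Working in small batches means each run does a bounded amount of work, so a long crawl can be spread over many cron runs and resumed from the last batch if something fails. Crawling a known set of IDs is the same loop with the guessed range swapped for your extracted list.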
NISO Circulation Interchange Protocol (NCIP)
NCIP, or the NISO Circulation Interchange Protocol, attempts to apply some standard methods to allow applications to “exchange data about library users, the items they wish to use, the owners of the items, and the relationships among these three entities”. In the latest iterations it provides a “schema” (including not just data definitions but method/transaction specs as well) for all sorts of ILS information. NCIP has gained traction and a number of ILSs support it, usually as a web service.
In this world of tight library budgets, it may be that your institution’s library management system is getting a bit long in the tooth and doesn’t have a fancy web services API. Or perhaps it does, but you can’t afford the expense of another proprietary plug-in. Luckily, even many older ILSs provide some sort of web-based, searchable portal. While it may feel icky, you can always have PHP act as a web browser, query the portal, and extract the data you need from the resulting HTML. There are good PHP libraries for parsing HTML, even imperfect HTML (and even better ones for Ruby), and since the HTML generated by these portals is pretty basic and consistent, it is completely possible to harvest a catalog this way. The Millennium module for Drupal 6 works this way with decent results. Unfortunately, scraping doesn’t help at all with patron operations.
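As a rough illustration of the scraping approach, here is a small Python sketch that pulls title and status out of a results table. The markup and the `title` and `status` class names are invented for the example; a real portal will have its own structure, and you would fetch the page over HTTP instead of using a hard-coded string.

```python
from html.parser import HTMLParser


class ResultParser(HTMLParser):
    """Collect the text of every <td> whose class is in WANTED, one dict per <tr>."""

    WANTED = {"title", "status"}  # hypothetical class names

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None     # dict for the <tr> currently being read
        self._field = None   # wanted class of the <td> currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {}
        elif tag == "td" and self._row is not None:
            cls = dict(attrs).get("class")
            self._field = cls if cls in self.WANTED else None

    def handle_data(self, data):
        if self._field and self._row is not None:
            self._row[self._field] = self._row.get(self._field, "") + data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows have no wanted cells, so they are skipped
                self.rows.append(self._row)
            self._row = None
            self._field = None


# Made-up portal output, including one unquoted attribute as mild sloppiness.
SAMPLE = """
<table>
<tr><th>Title</th><th>Status</th></tr>
<tr><td class=title>Moby-Dick</td><td class="status"> AVAILABLE </td></tr>
<tr><td class="title">Dracula</td><td class="status">DUE 05-01</td></tr>
</table>
"""

parser = ResultParser()
parser.feed(SAMPLE)
```

After `feed()`, `parser.rows` holds one dictionary per result row. The fragile part is the set of selectors, so it pays to keep them in one place where they are easy to update when the portal’s templates change.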
Standard Interchange Protocol (SIP2)
For patron account operations (recording fines, placing holds) and circulation operations (checking items in and out), there’s the tried-and-true SIP2 protocol. (Cue the old-timey music and black-and-white film reel.) Originally developed by 3M in the late ’90s to support connecting their self-check systems to various library backends, SIP2 continues to be supported by a number of proprietary ILS vendors. Typically, you connect to a SIP2 responder using basic sockets, sending and receiving fixed-width text messages with control character prefixes and suffixes (see 3M’s SIP2 specification [PDF]). SIP2 definitely resembles traditional EDI, and working with it would be a daunting task if it weren’t for some good resources for using PHP with SIP2. Jason Sherman has published this Drupal sandbox project that uses SIP2 for authentication. It leverages this GPL’d PHP-SIP2 class, which wraps all the esoteric socket messaging into easy-to-use methods.
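To give a feel for what those wrapper methods hide, here is a minimal Python sketch that builds a Patron Status Request and its checksum. The field layout follows my reading of the 3M SIP2 specification (command 23, a 3-character language code, an 18-character timestamp, then AO/AA/AC/AD fields and the AY/AZ error-detection fields), so check it against the spec before relying on it. The institution and patron values are made up, and no socket is opened.

```python
from datetime import datetime


def sip2_checksum(message: str) -> str:
    """Four hex digits: two's complement of the low 16 bits of the character sum."""
    total = sum(message.encode("ascii"))
    return format(-total & 0xFFFF, "04X")


def patron_status_request(institution, patron_id, when, seq=0,
                          terminal_pw="", patron_pw=""):
    """Build a Patron Status Request (command 23) with error-detection fields."""
    timestamp = when.strftime("%Y%m%d    %H%M%S")  # 18 chars, blank time zone
    body = (
        "23" + "001" + timestamp    # command, language (001 = English), date
        + f"AO{institution}|AA{patron_id}|AC{terminal_pw}|AD{patron_pw}|"
        + f"AY{seq}AZ"              # sequence number, then checksum field ID
    )
    return body + sip2_checksum(body) + "\r"  # carriage return ends the message


def checksum_ok(message: str) -> bool:
    """Valid if the characters through 'AZ' plus the checksum sum to 0 mod 2**16."""
    text = message.rstrip("\r")
    body, received = text[:-4], text[-4:]
    return (sum(body.encode("ascii")) + int(received, 16)) & 0xFFFF == 0


# Hypothetical institution ID and patron barcode, purely for illustration.
msg = patron_status_request("MAIN", "21234000123456",
                            datetime(2012, 5, 1, 14, 30, 0))
```

The same `checksum_ok` check applies to responses coming back from the ILS, which is a cheap way to catch truncated or garbled replies before you try to parse them.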
There are some architectural things to consider when using SIP2, since it was never meant to be a web backend. First, you should consider limiting, queuing and batching your messaging to the SIP2 responder. It may not be able to handle a flood of messages. Second, you’ll need to implement some security. SIP2 does allow for encryption between host and client, but it’s up to the ILS vendor to provide this capability and then for individual libraries to implement it. You may consider sending all your requests from your web server to the SIP2 server through an SSH or VPN tunnel.
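One simple way to picture the limiting and queuing is a queue that releases messages no faster than a fixed interval. The Python sketch below is only an illustration under my own assumptions: `ThrottledQueue`, `min_interval` and `fake_send` are names I made up, and `fake_send` stands in for whatever function actually writes to the SIP2 socket and reads the reply.

```python
import time
from collections import deque


class ThrottledQueue:
    """Queue messages and release them no faster than `min_interval` seconds apart."""

    def __init__(self, send, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._send = send
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._pending = deque()
        self._last_sent = None

    def enqueue(self, message):
        self._pending.append(message)

    def flush(self, max_messages=None):
        """Send up to `max_messages` queued messages in order; return the replies."""
        replies = []
        while self._pending and (max_messages is None or len(replies) < max_messages):
            if self._last_sent is not None:
                wait = self._min_interval - (self._clock() - self._last_sent)
                if wait > 0:
                    self._sleep(wait)  # pause so the responder is not flooded
            replies.append(self._send(self._pending.popleft()))
            self._last_sent = self._clock()
        return replies


def fake_send(message):
    """Hypothetical stand-in for the socket round trip to the SIP2 responder."""
    return "reply to " + message


queue = ThrottledQueue(send=fake_send, min_interval=0.01)
for text in ["first", "second", "third"]:
    queue.enqueue(text)
replies = queue.flush(max_messages=2)  # "third" stays queued for the next run
```

Capping each flush with `max_messages` gives you batching for free: whatever is left over waits for the next run. This version lives in a single process, so on a real site the queue would need to be persisted somewhere that survives between page requests.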
Direct SQL Database Connections
If you are a real nerd like me, you may think you’ve lucked out if you have a secure and workable way to connect your Drupal site directly to an ILS’s own SQL database. You’ll be able to grab any and all of the data you want, relatively efficiently, using a methodology and protocol you’re probably already very familiar with. But don’t get too excited about this option. While it’s good for grabbing that known set of IDs I discussed above, you don’t really want to have to learn and figure out some new and esoteric schema. I mean, it’s about as good an idea as writing your own SQL queries against a Drupal database with no knowledge of the application (good luck). And of course, there’s no way you would think about actually trying to write to the ILS’s database. Make sure you exhaust all the other options before resorting to this.
In the end, I have had to cobble together a solution that relies on both web services and SIP2. How I got them to work well together, in a very Drupal-ly way, will be the subject of my next post. I will also discuss how I hope our work dovetails with existing Drupal OPAC projects like Alex Arnaud’s D7 OPAC sandbox project and the eXtensible Catalog project for Drupal 6.
This post was written by former Isoveran Kelly Lucas.