
How OPACs Suck, Part 3: The Big Picture

Submitted by Karen G. Schneider on May 20, 2006 - 9:57am

In my two earlier pieces on this topic (Part 1, Part 2), I focused narrowly on some fairly obvious limitations of online catalogs: weaknesses in OPAC searching from the user's point of view.

[Image: a tag cloud generated by this post.]

There are other issues with online catalogs much bigger and more problematic than search results—problems that can't be addressed by improving relevance ranking or adding spell-check (however valuable those features are to OPACs).



The fundamental problem with today's library catalog is that it suffers from severe literalism. Even with a few bells and whistles, today's OPAC is a doggedly faithful replica of the card catalog of yore. This isn't a failure of any one vendor; by and large they're delivering what librarians think they want. It's a larger failure of vision.



First Literalism: The OPAC Is a Citation Index
One major problem is that the online catalog is merely a citation index. It doesn't index the book itself—only a mere handful of terms in its metadata. As librarians, we're accustomed to this. But our users aren't. The user of tomorrow grew up in a full-text world. For that user, the limitations of the online catalog make no sense.



In the pre-digital days, it was logical that the catalog was strictly a citation index. But no matter how much some might wish it otherwise, every book published today is born as a digital object. A book moves from creation to publication as a computer file. The final product—a paper-based book—is an elegant but technically unnecessary anachronism.



I'm not debating the future of paper-based books—that's a discussion for another time—but I am saying that the content of many of the books in online catalogs—including, I would have to think, nearly every book published in the last fifteen years—has digital correlatives that should be foundational to retrieval in a modern library catalog. In other words, make it easy to search "good night mittens" in an OPAC. It's not dumbing the catalog down—it's smartening it up to where users are today.
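
The gap between a citation index and full-text retrieval can be sketched with a toy inverted index. Everything here—the records, the text snippets, the code—is invented for illustration, not any vendor's implementation:

```python
# Toy sketch: why metadata-only search misses queries that
# full-text search catches. Illustrative only.
from collections import defaultdict

def build_index(docs):
    """Map each lowercased word to the set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return doc ids containing every word of the query (AND search)."""
    sets = [index.get(w, set()) for w in query.lower().split()]
    return set.intersection(*sets) if sets else set()

# A metadata-only "catalog record" vs. the book's actual text.
metadata = {"book1": "Goodnight Moon Brown juvenile fiction bedtime"}
fulltext = {"book1": "goodnight kittens and goodnight mittens "
                     "goodnight clocks and goodnight socks"}

meta_index = build_index(metadata)
text_index = build_index(fulltext)

print(search(meta_index, "goodnight mittens"))  # set() -- no hit
print(search(text_index, "goodnight mittens"))  # {'book1'}
```

The citation index fails the query not because the book is irrelevant, but because the queried words never made it into the handful of metadata terms.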



Second Literalism: A Data Set for Every Library
In the pre-digital era, every library maintained its own card catalog. That was absolutely logical: how else would users find a book in the library? But in the digital era, every library still maintains its own card catalog—at least in the sense of having a unique database with a unique record for each bibliographic entity. Step back and ponder how odd that is. It would be as if Amazon created a local data set for each customer, with a separate installation of Amazon as well.



Librarians might rush in to tell you how much value customization has for their local records. First, tell me: just how much do you have to tweak out a record for The Da Vinci Code to meet local needs? Even if you do want to customize a record, wouldn't it make more sense if the record stayed central and the customization stayed local? Think about every other Internet database you interact with—YouTube, MySpace, Bloglines, just about any online bookstore, even ALA's nifty new Booklist Online. Why is the OPAC different?
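One way to picture "the record stays central and the customization stays local" is an overlay model: a single shared master record, plus a thin per-library layer holding only what differs. The field names and values below are invented for illustration:

```python
# Sketch: one shared bibliographic record, with per-library
# customizations stored as small local overlays rather than
# full local copies. Data is hypothetical.

CENTRAL = {
    "isbn": "0385504209",
    "title": "The Da Vinci Code",
    "author": "Brown, Dan",
}

def local_view(central_record, overlay):
    """Merge a library's overlay onto the shared record,
    without mutating the central copy."""
    view = dict(central_record)
    view.update(overlay)
    return view

# Each library keeps only its local differences.
anytown_overlay = {"call_number": "FIC BRO", "local_note": "Book club kit"}

record = local_view(CENTRAL, anytown_overlay)
print(record["title"])        # comes from the central record
print(record["call_number"])  # comes from the local overlay
```

A correction to the central record then propagates to every library automatically, while local notes and call numbers stay local.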



It's not as if that information isn't available centrally—particularly with OCLC gobbling up, er, I mean, merging with, RLG. Speaking of booksellers, the American Booksellers Association sells software to bookstores that is used to search, view, sell, and manage book inventories. If booksellers can use a central database, why can't we?



Third Literalism: It's Still about Books
The online catalog is still about books or book-like objects such as DVDs—in other words, full-length physical manifestations. Everything else gets thrown on top like pineapple on a pizza. There have been some philosophical strides forward since the days when many librarians refused to catalog "non-print" media (yes, some of your ancestors scorned anything that wasn't a book), but for the most part, the central software used to manage the library's content falls down when the content is electronic and/or far more atomized—such as the typical journal article. It doesn't matter how good your search function is if your content is elsewhere.



Fourth Literalism: The OPAC as Central Library Function
Have you ever owned a tool that did too many things, none of them well? Perhaps the greatest literalism of all, with respect to integrated library systems, is that they are one continuous product to begin with. Lorcan Dempsey, VP of OCLC, has been making the case on his blog that the next-generation integrated library system should be dis-integrated. He points out that the modern ILS weds an inventory system with a discovery system—in the end doing poorly at both.



The online catalog functions reasonably well for locating known book-like items cataloged within the library's collection, which complements the traditional roles of the library. But library usage is changing, with libraries reporting higher foot traffic and more use of meeting rooms and special space, even as book checkouts remain relatively stagnant.



Often librarians find themselves shoehorning user authentication, tools for journal article discovery, network and equipment usage management, and payment tools onto ILS software—as if the ILS were the car driving library services and these other tools its accessories. So much attention is paid to adding yet more features to monolithic software that we forget the software's function in the first place—to help the discovery process.



Not only that, but the online software, freighted as it is with so many duties, cannot be lithe enough to quickly adapt to the revolution in networked software that is changing how people function on the Web and increasingly in their lives.

On sites such as Amazon and Flickr, users are not simply interacting anonymously for simple transactional functions; as they rate, tag, collect, review, save, bookmark, e-mail, comment on, subscribe to, and share content, they are creatively engaging with the software and its content—transforming it, adding to it, improving it, participating in it. Some catalog vendors try to keep up, but most additions or changes to catalog software force an expensive and time-consuming "ripple effect" of modifications through the ILS.



Where to?
The catalog of the future wouldn't be a catalog; it would be a series of standards-compliant Web services that could be mixed and matched. A small library could use Library Thing to catalog its inventory, tack on a nice search interface courtesy of some search vendor, and buy acquisition and serial modules from yet another vendor. Vendor modules could mingle with open-source Web services, empowering upstart library programmers to add new services now, not three years from now, while allowing vendors for other modules to focus on their core products.
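The "mixed and matched" idea can be sketched as modules that honor a small shared interface, so a commercial module and an open-source one are interchangeable. The class and function names below are hypothetical stand-ins, not real products:

```python
# Sketch: a dis-integrated catalog, with each piece behind one
# tiny interface. Any component that implements search(query)
# can be swapped in without touching the rest. All names are
# hypothetical.
from typing import List, Protocol

class DiscoveryService(Protocol):
    def search(self, query: str) -> List[str]: ...

class VendorSearch:
    """Stand-in for a commercial discovery module."""
    def search(self, query: str) -> List[str]:
        return [f"vendor hit for {query!r}"]

class OpenSourceSearch:
    """Stand-in for an open-source replacement."""
    def search(self, query: str) -> List[str]:
        return [f"open-source hit for {query!r}"]

def catalog_frontend(discovery: DiscoveryService, query: str) -> List[str]:
    # The front end depends only on the interface, not the vendor.
    return discovery.search(query)

print(catalog_frontend(VendorSearch(), "goodnight moon"))
print(catalog_frontend(OpenSourceSearch(), "goodnight moon"))
```

In real systems the "tiny interface" would be a standard protocol on the wire rather than a Python type, but the design point is the same: agree on the seam, and the modules behind it become replaceable.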



The local record would be obsolete. Libraries would use global records, locally modified as the spirit and the budget moved them. Discovery would be a rich, satisfying experience that would leverage the potentially powerful combination of library-generated metadata, user tagging and other user interactivity, and full-text Web discovery.
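That blended discovery could be pictured as a scoring function that weights hits from curated metadata, user tags, and full text. The weights, fields, and sample record below are all invented for illustration:

```python
# Toy relevance sketch: blend query matches from three signal
# sources -- librarian metadata, user tags, and full text.
# Weights and data are hypothetical.

def score(record, query_words, w_meta=3.0, w_tags=2.0, w_text=1.0):
    """Weighted count of query words found in each field group."""
    def hits(text):
        words = set(text.lower().split())
        return sum(1 for w in query_words if w in words)
    return (w_meta * hits(record["metadata"])
            + w_tags * hits(record["tags"])
            + w_text * hits(record["fulltext"]))

book = {
    "metadata": "Goodnight Moon Brown juvenile fiction",
    "tags": "bedtime classic picture-book soothing",
    "fulltext": "goodnight kittens and goodnight mittens",
}

q = "goodnight bedtime mittens".split()
print(score(book, q))  # 7.0
```

No single source has to carry the whole query: "goodnight" comes from the cataloger, "bedtime" from a user's tag, "mittens" from the text itself.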



Perhaps the library application would continue to be a visibly separate database; but without the complicating anachronism of local records and a search experience far removed from where our users are—the Web—the online catalog might not only be dis-integrated into separate services but re-integrated seamlessly into the Web—and among other library services, such as journal databases. Web services such as Open Worldcat technically make library holdings native to the Web; perhaps search-engine widgets could further enhance discovery. A simple button built into every major search engine could lead users to our services.



This is a good time to atomize the catalog and start over. For years we've been feeling depth charges rumbling throughout this profession: MARC, some say, should die, and traditional cataloging ain't lookin' too good, either. It's time to dis-integrate the catalog, weave it into the Web, and push forward to the future.

Technorati tags: book, books, catalog, citation, discovery, index, librarians, libraries, library catalog, oclc, online catalogs, OPAC, opacs, open source, software, Web

Comments (25)

Al, I really like PPL's implementation of Endeca and think it blows the others out of the water. I can talk more off-blog if you're interested.

Karen, I have been following your articles for some time in regards to this topic, and I agree with most of your comments regarding the lack of usability of OPACs. Phoenix Public Library has launched its implementation of Endeca, which shows how this search engine can really adapt to the philosophy of any library. In this case, Phoenix decided to follow a retail based approach (similar to Amazon, Barnes & Noble or Indigo). I would love to hear any comments. Thanks for your input. (www.phoenixpubliclibrary.org)

Marie, it would be great if these records cross-reffed themselves, but in honesty, when I need to look them up I google 'how opacs suck.'

So, after having read all three posts on this topic, I find myself wondering: what should the metadata look like, then? I am a cataloger, and the concept of going from the needs of the user, backward, is immensely appealing. So much of the time I feel like I am spinning my wheels, frankly, when I create catalog records according to the current rules. (Though, in case anyone is wondering, I still always follow the rules, efficient or not! Until things change, that will remain the culture, and I'm fine with that.)

I am an inveterate and frequent user of both Google and Amazon. I am continually impressed by them: not by their perfection (which they lack, and which I think some people expect), but by their really-goodness. Most of the time I find what I am looking for when I use them, and when I don't, I use my expertise to dig for information in additional ways. I don't think that being a librarian necessarily makes me better than others at searching for deep information, by the way. Anyone who knows how to call on an expert (such as a librarian) is as well off as I, I'd say.

To get back to my original point: what should the metadata look like? Karen has said that superimposing a good search engine on the current catalog is a good solution, and it's a good start. But we are surely adding lots of data to the catalog record that are superfluous in this new environment. Has anyone surmised/analyzed/etc. which fields are most important? I am thinking they would be: title and alternate titles, author/s, date/s, language/s, class and/or call #. Beyond that, there should be some kind of description of the item, such as pagination and/or size. Finally, there could be additional data points such as table-of-contents listings (for additional relevant keywords), a summary, and possibly LCSHs. What do you think, Karen?

Stacy, I wrote about the Endeca implementation at NCSU in part 1 of this post (and elsewhere). Note that the catalog remains the same; it has a search engine layered on top of it. Still, a great solution!

Beth, I agree with you! I think many catalogers feel that way. Barbara, Part 1 and 2 of this topic are in the March and April archives:

http://www.techsource.ala.org/blog/2006/04/

http://www.techsource.ala.org/blog/2006/03/

Take a look, browse around! Next time I'll be a little more link-friendly and put the links in the post!

Where can I find part 1 and part 2 of this topic?

I feel like Alice in Wonderland. I have been in the library 'game' since 1962. Every time I pick up or log onto a library source, the same arguments are there. This is a classic. First the old book cats weren't up to date then the 'new' computer operated cats weren't detailed enough, and now they aren't up to date.
We librarians seem to live just behind 'the curve' where all light is blocked and only what washes over the 'wave' of 'the curve' is available (or affordable) to us.

Karen is absolutely right about the levels of description available in OPACs. There's no (good) reason why the Library of Congress should not expand subject headings into the scores or hundreds, with hierarchies and relational links as appropriate.

I attended a local conference recently and saw a demo of the new catalog for North Carolina State University. The library worked with Endeca to create a new kind of catalog, and everyone watching the demo was really impressed. You can check it out for yourself at
http://www.lib.ncsu.edu/catalog/

I generally agree with you on most points in this series. I just went to the Innovative Users Group meeting and they are releasing a new opac interface which will include many of the things on your checklist -- such as relevance ranking and spell check. For a long time I have wished that catalogers (and I am a cataloger) could take the time needed to create quality controlled access points to materials JUST ONE TIME in one place and all share the data without editing. Having the ability to add things locally while sharing the main data would be great in addition. All the duplicate work and editing is the problem here which leaves catalogers less time to create high quality original cataloging data to share. No matter what the search interface -- having quality controlled data and access points as well as the full text to search is the key to high relevance and recall in searches.
My two cents
~Beth Geesey Holmes
Cataloging Services Librarian
Univ. of Georgia Law Library

Andrea, I would be willing to argue that a central catalog would have fewer mistakes, for two reasons: fewer records, and a redirection of labor toward those records.

'Patrons spoiled by Google'--that's an interesting angle. I guess I don't see patrons 'spoiled' so much as enlightened.

None of this is easy or within reach, but if we start from the assumption that our patrons are spoiled and the work's too darned hard anyway... well, we end up with what we have.

'make it easy to search 'good night mittens' in an OPAC'

well, maybe if it were easy to digitize all of the books in our collections. Ask the people running Google Print what a piece of cake that is. Cheap, too.

I agree that the unavailability of full text book search within an OPAC is a disappointment to patrons that are spoiled by google, but putting it down to a 'a larger failure of vision' is off the mark. It's not as if libraries and ILS vendors (for all of their faults) have been ignoring this vast treasure trove of full text book content just because they couldn't see that it would be useful.

The work done by google and amazon will undoubtedly make it easier to proceed in this direction. Then we will need to secure the necessary permissions and content from publishers in order to make it a reality. Where can we go today to buy, even at an inflated price suitable for some central catalog, the 'digital correlatives that should be essential to the foundation for retrieval in a modern library catalog' ? Making those correlatives from scratch is not going to happen, unless you have the budget & clout of google.

A central catalogue has its advantages, but it lacks the positive effects of redundancy: when I search for a book in meta-OPACs like the German KVK, it's evident how many mistakes there are in various catalogues. An educated guess or some counting then helps greatly to find the 'most probable' right data. In a single-catalogue system, a mistake could stick there pretty much forever unless the system has lots of wiki-style elements or other workarounds. Even worse, relevant data that is missing in a central catalogue will hardly ever emerge from obscurity.

Ed, the system would look much different if we used a central database. I'm not concerned about the storage requirements for local records as much as I am by the design of a system where an ILS--and the labor to create and maintain it--is duplicated for every library or small group of libraries. Yes, local records would be replaced by tagging.

Incidentally, the local ILS is, at best, an inventory system. It's terrible for discovery. We should start from the user and work backwards.

Replace all our local records with a central data store? Color me skeptical. You'd still have to maintain a local inventory system and do all the work of tagging local holdings. At the end of the day, you'd only be saving the storage cost of the bib records, and storage is cheap.

There are business functions that the monolithic OPAC provides that would still be needed, such as acquisitions and circulation. However, the creation of a suite of tools that leverages Open WorldCat data to create such local database/functionality is doable. [concept patent pending :-) ] We just need for a few forward thinking library administrators to get together and create a resource sharing network to develop such a suite using open source/open standards. The resulting cost recovery from not having to invest in and maintain commercial OPACs would be significant.

Oy... MARC, Dewey, OPACs... let's just admit their limitations and start fresh...

Oh, and Tom--time x energy = money :-)

Lynn, I've heard cost raised as an issue for most of my career. What I haven't seen is a really solid analysis of what it costs to do things the Olden Waye. How much are we really saving--or spending?

Nice post. I agree about the card catalog legacy. I think many of the design decisions in the OPAC made sense back when people were migrating from cards to online. I think we are at the point now, though, where the legacy prevents it from moving forward. It may be time to lose the catalog motif.

Interesting post, but one of the biggest problems the library world will have in addressing your second literalism and creating global records is the issue of cost. I have worked fairly closely with 18 public libraries of various sizes, and only two of them (not including the largest) were part of OCLC (or anything similar). Why? Cost. Who is going to host this global database, and how much will they charge for access? Given LC's recent grumbling about costs and their loss of people, I somehow doubt it will be them.

Another great post! I think many librarians are terrified of centralizing bibliographic records for fear that it's the first step toward farming out all library services to web conglomerates. However, given the time and energy local librarians spend entering and maintaining local data that is virtually identical to that maintained by other libraries, centralization could free these same librarians up to focus on fine-tuning the OPAC interface in a way that makes the data much more accessible.