ALA TechSource
How OPACs Suck, Part 2: The Checklist of Shame

Submitted by Karen G. Schneider on April 3, 2006 - 2:02pm

In my first article in this series, I wrassled with the biggest bear in the forest: how most online catalogs lack relevance ranking. That's one big hairy bear, but as some readers pointed out, it's a little forced to pick on relevance ranking out of the context of all the other important features most online catalogs don't offer—or implement so badly that librarians disable them rather than further confuse the poor user, who just wants to find a book or DVD, for crying out loud.

So rather than plunge into another specific feature, I'm backtracking just enough to give you the Checklist of Shame—key features common to most search engines (even the least expensive), features often missing in online catalogs. Even this is an abbreviated list; the search-engine test instrument I've developed for My Place Of Work (MPOW) is seven pages long.

I agree with Eric Lease Morgan's comment on my last piece that librarians tend to ask for esoteric features at the expense of core functionality. I continue to be surprised at the people who tell me how a catalog "should" work but haven't done a lick of user analysis, forensic, heuristic, academic, or otherwise, to back their theories.

But here's a rule of thumb: in general, if the 800-pound gorillas, such as Google and Ask.com, offer a feature (like default setting), you should mimic the gorillas and offer the same feature—and give that feature priority in your considerations. Furthermore, it's common-sense usability practice that you should offer that big-gorilla search-engine feature the way the gorillas offer it—because users will come to your catalog with user behavior learned from such search engines as Google and Ask.com. (Don't ever rely on help files to "teach" people. In last year's usability testing at MPOW, the only person who read our help files, out of a group of techies, librarians, and academics, was the 25-year-old soccer mom.)

I also list features used primarily by aficionados. This group—ranging from in-house librarians to information super-users—can be influential, and they are often engaged with your catalog at a level that can prove hugely informative. So many search engines support aficionado features that it's easy enough to support their preferences. Furthermore, in my experience, aficionados will also tell you when an esoteric feature is completely pointless, even for them. Just don't let the aficionado input drown out common-sense decisions.

Features Your OPAC Wishes It Had

  • Relevance ranking—As I explained earlier, relevance ranking—typically built on TF/IDF (term frequency/inverse document frequency)—is the essential building block for ensuring the most likely search results rise to the top. Every search engine on the planet relies on relevance ranking. Many online catalogs don't offer it ("system sorted," anyone?) or implement it bizarrely. (I agree with comments that relevance ranking can be hard to do well in online catalogs, but I disagree that it cannot be done at all; the NCSU catalog makes that clear.)
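
    For the curious, the core of TF/IDF fits in a few lines. A toy Python sketch (naive whitespace tokenization, no length normalization, and a function name of my own invention; real engines precompute an inverted index rather than rescanning documents):

```python
import math
from collections import Counter

def tf_idf_scores(query, docs):
    """Score each document against a query with raw TF/IDF:
    tf  = term count in the document,
    idf = log(N / number of documents containing the term)."""
    n = len(docs)
    tokenized = [d.lower().split() for d in docs]
    results = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            # document frequency: how many docs contain the term at all
            df = sum(1 for t in tokenized if term in t)
            if df:
                score += counts[term] * math.log(n / df)
        results.append(score)
    return results
```

    Sorting documents by these scores, descending, is the "relevance ranking" the column is asking for.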


  • Stemming—To steal from a couple of good Web definitions, stemming is "a method by which Search Engines associate words with prefixes and suffixes to [a] word stem to make the search broader," such as returning the same results for "applies, applying, and applied."

    After relevance ranking, stemming is arguably one of the most important search features for an online catalog, where search success hinges precipitously on searching the relatively scanty metadata of MARC records. Yet even huge search engines, such as Google, with the luxury of massive amounts of full text to improve matching, use stemming. (I've watched Google turn stemming off and on and tinker with it—clearly they think about stemming a lot.)
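
    A deliberately crude suffix-stripper shows the idea (this is nowhere near a real Porter stemmer, and the suffix list is invented for the example):

```python
def crude_stem(word):
    """Map 'applies', 'applying', and 'applied' to one stem by
    stripping common English suffixes. A toy; real stemmers have
    many more rules and exceptions."""
    for suffix in ("ies", "ying", "ing", "ied", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[:-len(suffix)]
            if suffix in ("ies", "ied", "ying"):
                stem += "i"  # applies/applied/applying -> appli
            return stem
    return word
```

    Indexing stems instead of surface forms is what lets one query match all three variants.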


  • Field weighting—First runner-up for second most important feature in a search engine. You can tweak field weighting to give more or less prominence to fields. For example, titles are often given more importance, allowing the first few hits for the search term million to retrieve books with million in the title.
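
    A toy sketch of fielded scoring, assuming hypothetical "title" and "notes" fields and made-up weights (nothing here comes from a real catalog schema):

```python
def field_weighted_score(query, record, weights=None):
    """Count query-term matches per field, multiplied by that
    field's weight, so a title hit outranks a notes hit."""
    weights = weights or {"title": 3.0, "notes": 1.0}  # illustrative values
    terms = query.lower().split()
    score = 0.0
    for field, weight in weights.items():
        tokens = record.get(field, "").lower().split()
        score += weight * sum(tokens.count(term) for term in terms)
    return score
```

    With a title weight of 3, a record with million in the title beats one with million buried in the notes.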


  • Spell-checking—Essential, not because people are dumb, but because people make mistakes. If anyone gets snobby with you when you bring up spell-check, just tell them Jane Austen was a notoriously bad speller; she misspelled one of her teenage works as “Love and Freindship.” (Thank goodness for editors.)
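
    A did-you-mean sketch using Python's stdlib difflib (real engines typically use edit distance over the indexed vocabulary plus query-log frequencies; the 0.7 cutoff here is arbitrary):

```python
import difflib

def suggest(term, vocabulary):
    """Return the closest vocabulary word to a misspelled term,
    or None if nothing is similar enough."""
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=0.7)
    return matches[0] if matches else None
```

    So a search for Austen's "freindship" can still land on friendship.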


  • Refining original queries—If you type in a term such as butterfly, after viewing the results, you may want to tweak that search to add a term such as conservation. A good search engine will present the search terms in the search box or otherwise make it very easy to view and modify the original search.


  • Support for popular query operators—For example, support for + and - for "required" and "not." It's also okay to offer older query operators, such as and, for backward compatibility for people who have been searching your catalog since Melvil Dewey was a circ clerk, but those older operators are not substitutes for what people use today. For that matter, things change over time, so the ability to add a new query-operator synonym is valuable.
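
    A minimal sketch of +/- handling (the function names are mine; a real engine would fold this into its query parser and apply it against the index):

```python
def parse_query(q):
    """Split a query into required (+term), excluded (-term),
    and plain terms."""
    required, excluded, plain = [], [], []
    for token in q.split():
        if len(token) > 1 and token[0] == "+":
            required.append(token[1:].lower())
        elif len(token) > 1 and token[0] == "-":
            excluded.append(token[1:].lower())
        else:
            plain.append(token.lower())
    return required, excluded, plain

def matches(doc, q):
    """True if doc satisfies the +/- constraints; plain terms
    would influence ranking rather than filtering."""
    tokens = set(doc.lower().split())
    required, excluded, _ = parse_query(q)
    return all(t in tokens for t in required) and not any(t in tokens for t in excluded)
```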


  • The Boolean bag o' goods—Can the search engine support quoted searching ("declaration of independence"), wildcard searching (appl*), proximity searching (cheese near cheddar), or give preference to case (AIDS versus aids)? Most people don't use these features, but your aficionado users will look for them, and nearly all search engines, even the entry-level products, offer these features. Any vendor who moans these are difficult and expensive to offer is blowing smoke in your ear.


  • Flexible default query processing—Basically, can you decide whether search results will be "anded" (meaning that all terms must match) or "orred" (meaning that any term may match)? Google changes its features over time, but Google's settings might not be the best choice for your catalog (something to keep in mind if you evaluate the Google Appliance). You'll only know through usability testing, and the search engine shouldn't make that decision for you.
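
    The and/or default reduces to a one-line choice of combinator. A sketch, with the mode as an explicit parameter rather than a hidden engine default:

```python
def search(docs, query, mode="and"):
    """Return docs matching the query under 'and' (every term
    must appear) or 'or' (any term suffices) processing."""
    terms = query.lower().split()
    combine = all if mode == "and" else any
    return [d for d in docs if combine(t in d.lower().split() for t in terms)]
```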


  • In-line query limiters—The ability to search in-line by a field, the way in Google you can limit your searches, for example, with site:edu. This is a capability that will be used by a tiny fraction of your users. I wouldn't trade it for relevance ranking and field weighting, but then, every search engine I've evaluated this spring offers this feature. Extra credit for being able to select and label the limiters any way you want.


  • Duplicate detection—This is an interesting search-engine feature to discuss for online catalogs. It raises the issue of FRBR (pronounced FER-ber)—Functional Requirements for Bibliographic Records—which is, to be grossly reductive, duplicate management for online catalogs, so that a user isn't stumped by five records for what is essentially the same item. But in a search engine, duplicate detection simply flags multiple records for the same item and ideally gives you control over how to handle search results when duplicates are detected.
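
    A crude sketch of duplicate grouping by normalized title and author (real duplicate detection, let alone FRBR work clustering, uses far fuzzier matching; the field names here are invented):

```python
def group_duplicates(records):
    """Cluster records sharing a normalized (title, author) key,
    a crude stand-in for flagging 'five records for the same item'."""
    groups = {}
    for rec in records:
        key = (rec["title"].lower().strip(), rec["author"].lower().strip())
        groups.setdefault(key, []).append(rec)
    return groups
```

    The search engine's job is then the display policy: collapse each group to one result, or flag the duplicates.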


  • Sort flexibility—You don't want to overwhelm users with options for sorting search results, but can you at least offer them the capability to switch between relevance and date? Also, can you offer other sorting that might be a nice local option (the way some store Web sites offer sorting by price or user rating)? Even more crucially, can you control where the search engine pulls its date information—ensuring that the indexed "date" comes from a locally controlled field, rather than simply the HTTP header?


  • Character sets—Although most search engines offer flexible support for other languages, many online catalogs can barely handle one character set. I recently observed ALA Council debating a resolution on non-Roman characters in online catalogs that was ultimately shot down because it didn't come from ALCTS—a classic example of NIH (Not Invented Here). Forget ALA subcommittees: the pressure needs to come from you, gentle reader.


  • Faceting—This is a "21st-century search engine" feature that some search engines grew up around and that older search engines are scrambling to add. Faceting manipulates search results to make it easy to browse by category. Search the NCSU catalog for the phrase civil war, and browse by LCSH or publisher; search Landsend.com with the term pants, and see choices arranged by size, cost, and other metadata.

    Avi Rappoport, search guru extraordinaire, explains faceting thoroughly at www.searchtools.com/info/faceted-metadata.html. Online catalogs offer such rich metadata that it's a shame not to offer faceting.
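
    Facet counting itself is simple once the metadata is there. A sketch assuming records carry a hypothetical list-valued field such as "subjects":

```python
from collections import Counter

def facet_counts(records, field):
    """Tally facet values (e.g. subject headings or publishers)
    across a result set, most frequent first."""
    tally = Counter()
    for rec in records:
        for value in rec.get(field, []):
            tally[value] += 1
    return tally.most_common()
```

    Run over the current result set after each search, this yields the browse-by-category sidebar NCSU and Landsend.com show.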

  • Advanced search—My favorite chimera! In most search engines, most notably Google (www.google.com/advanced_search?hl=en), the "advanced search" page is largely a "junior" search page that walks the user through fielded and Boolean searches. At MPOW, we shamelessly stole their page for our own (http://lii.org/pub/htdocs/adv_search_home.htm). There's nothing wrong with that, and the "advanced search" page can be a convenient place to offer popular date-searching options or other nice tweaks. But users should be able to perform most truly advanced searches through inline operators in the search engine's basic search box, so that the handful of hopeless nerds like me who think it's bang-up fun to do a search such as wine -cheese site:edu won't have to plod through a fielded page to do so.


  • Easily customized search-result pages—The word easily should be understood to refer to people with respectable HTML skills, not to people who pay people to do that kind of work (for about the same reason I don't give myself root canals). Still, good search engines provide strong templating systems for developing search-results pages that integrate well with your overall design. Extra credit for default templates that validate to published HTML standards and meet Priority 2 accessibility requirements.


  • Human suggestions (also called "best bets," etc.)—Can you force an item to the top of search results? (Can you then charge publishers for premium results? Just kidding, just kidding…) This smart discussion of best bets (www.steptwo.com.au/papers/cmb_bestbets/) has a great screen capture of this feature in action. Best bets are particularly nice when you have good search analysis to indicate what people are searching for most frequently, which brings up…


  • Search logging and reports—You need to know what's working for your users and what isn't. Your basic transaction logs (how many hits to the server and where the hits come from) aren't adequate for this. A good search engine will, at minimum, log top queries by frequency and top queries with no hits. Also look for trend reports you can use to tweak the search engine, for example, by adding terms to records to make them more findable (the way I saw librarians add Brokeback Mountain to the notes field of records for Annie Proulx's short story collection, Close Range: www.worldcatlibraries.org/wcpa/ow/e4d1df37de10d114a19afeb4da09e526.html).
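
    The two minimum reports reduce to a few lines over a query log. A sketch assuming the engine logs (query, hit count) pairs; the log format is invented for the example:

```python
from collections import Counter

def query_report(log, top_n=10):
    """Given (query, hit_count) pairs, return the most frequent
    queries and the distinct queries that returned nothing."""
    freq = Counter(q.lower() for q, _ in log)
    zero_hits = sorted({q.lower() for q, hits in log if hits == 0})
    return freq.most_common(top_n), zero_hits
```

    The zero-hit list is exactly where you'd spot users searching for Brokeback Mountain and finding nothing.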


  • Well-rounded administrative interface—Does every tweak to the search engine require begging some techie to tweak a feature, observing the results, and then begging some more until it's right? Are the search engine's features hidden in largely undocumented mystery meat? Is it impossible to determine the settings at a glance, or at least through intelligent perusal of the administrative section? (Yes, this is a roman à clef…one of several drivers in our search for a better search engine at MPOW.)

These are just the high notes of search functionality, and this list doesn't cover how well, or badly, vendors provide these features (or how well or badly customers implement them)—topics I'll tackle in future sections of this series. After all that, this checklist doesn't address the much more difficult problem to solve: the sparse, hard-to-search nature of citation indexes. People are now accustomed to full-text searching. Can we make them like an OPAC, no matter how much we fix its search functions?

But think about your own catalog: are these features available? It may well be, as some users wrote me privately, that the OPAC (as separate software purchased by local libraries) is near death's door. I think that's very likely. But if so, anything else we use for a catalog—who's betting on Open WorldCat?—will need good search functionality as well, or it too will suck, only more consistently and on a much larger scale. In the end, as uber-librarian and user champion Marvin Scilken told me many times, the bottom line is public service.

Technorati tags: library, library catalog, library catalogs, Online catalogs, OPAC

Comments (50)

Thanks for this article. It is very useful for me.


Going back to online catalogs... this is a great discussion, and I am eager to jump into the specifics on some of the features I discussed above.

I think that a search-tips abstract, or at least a more visible link to search tips, might have been useful. Even people who have been on the web for five or six years are not using the advanced features. If you take a bunch of them, they may be quite new to features like site: and define:. To get more out of Google, we want them listed there.

Thanks for the great work and the informative article! I agree with most of your arguments - unfortunately, my English is not good enough to get the point of every statement.

Nice one, I'll use this for my upcoming essay for Information Retrieval class in university.

As a non-librarian who has stumbled across this entry six months after the original post, I'd like to add one more way that OPACs suck: their web interfaces don't allow you to bookmark things reliably. With Google, I can bookmark a search to come back to it later. With Amazon.com, I can bookmark searches and drill-down navigations and product detail pages. But the King County Library System's current OPAC (as an example; http://catalog.kcls.org/) expresses its search detail pages as "item N, page M in query results for Q", which means your bookmarked item will change out from underneath you as items are added or removed. And their new OPAC (AquaBrowser; http://explorer.kcls.org) hides all page navigation from you beyond the initial top-level search. Did you apply restrictions? Sorry, have to enter them again. Interested in re-finding a particular book you saw? Hope you wrote down the restrictions you followed originally.

My apologies. Ms. Picknally Camden's email address is:

bethpc@pobox.upenn.edu

Thank you very much, Karen, for this thought provoking discussion about the shortcomings of our present-day online catalogs. I would like to provide a little additional information that may help move the discussion forward as it relates to multilingual cataloging information.

A membership resolution on equal access to resources in non-Roman alphabets in libraries was brought to ALA Council at the 2005 ALA Annual meeting. The thrust of the resolution was to ensure that non-Roman cataloging information would be created and communicated to library patrons on an exactly equitable basis as cataloging information for Roman alphabet language materials.

The resolution implied a number of things that the mover continues to insist are true:

1. That it is not possible to record non-Roman headings in their vernacular script. In fact, both AACR2 and the MARC 21 format permit this. There are a number of different approaches to accomplishing this laudable goal. The current solution as embodied in the Library of Congress Rule Interpretations and the Descriptive Cataloging Manual Z1 section does not, unfortunately, match the model that the resolution's mover prefers. There are other possible implementations, and as technology advances, better approaches will no doubt be identified.

2. The resolution implied that the MARC format does not currently support the Unicode character set. This statement has since been repeated in this blog. Unfortunately it simply is not true. The MARC 21 format does currently support the Unicode character set. It is a system implementer decision as to whether MARC 8 or UTF-8 character sets are used, and how exactly they will be used. The general trend among system developers and data providers is in fact to build new systems and retrofit existing systems around Unicode encoding. If there weren’t hundreds of millions of MARC 21 records using MARC 8 character encoding that were being maintained and accessed on MARC 8 character-dependent systems, this would be largely a non-issue. The format is not the problem. The problem is what character set(s) can be supported by local systems and bibliographic utilities.

3. The resolution implied that nothing was being done to address access to non-Roman alphabet information, and that there is only one possible approach to addressing standards and implementation issues. This point is incorrect on both counts. The American Library Association, through ALCTS, LITA, and RUSA, has been diligently working on these issues for many years. The problems associated with solving them are complex and require extended discussion to achieve consensus on implementable and pragmatic solutions. These discussions have included our colleagues in Canada, the United Kingdom, and Australia, whose outlooks on these issues have sometimes diverged from the American position. The net result is that some of ALA's attempts to resolve non-Roman alphabet access have taken longer than might have been wished, and that the solutions often look significantly different from the original approach recommended by ALA. ALA would be making a major mistake to insist that our Canadian, British, and Australian colleagues must accept standards revisions that reflect only the American outlook.

The resolution failed in Council not because, as some have suggested, it was "NIH (Not Invented Here)." It failed because Council recognized that the issues were far more complex than could reasonably be discussed by 180+ non-experts in three-minute time segments.

Underlying this discussion is a consideration of the cost of cataloging, as well as costs associated with supporting multi-lingual access to library catalogs. Particularly given some of the discussion currently taking place elsewhere about what cataloging should look like and how much it should cost, it is likely that decisions about non-Roman alphabet cataloging may be made by library administrators, rather than library standards developers.

Given the events noted above, the Association for Library Collections and Technical Services (ALCTS) formed a task force after the 2005 Annual meeting to explore access to non-English information. It was felt that, particularly noting the large and growing Spanish-speaking community in the U.S., limiting our exploration of the issues to non-Roman alphabet information would miss a very large segment of our user community. This task force is chaired by Beth Picknally Camden of the University of Pennsylvania and has broadly solicited input from the various stakeholder communities. The task force welcomes comments from the field. These comments should be sent to Ms. Picknally Camden at .

Bruce Johnson
ALCTS President-Elect

I tried to formulate a reasonable approximation of a 'real-world' search in a resource whose purpose should be 'to connect readers [customers] with books [merchandise], period.'

I just tried the same search in an OPAC, and the results came back on four screens - with a link to each of the four. Even if there were 40 screens, I could at least take a stab, based on the sort, at finding it the minute I had another clue. Even in Google, I've saved myself time more than once by making an intuitive leap into the middle of the results.

I'm absolutely in favor of enhancing our ability to locate information by appropriating ideas from anyone and everyone. I just feel the process is more productive if the expectations of user input are consistent, no matter what result I'm trying to demonstrate.

that search is huge in Amazon, but it does bring up works by the author. How is that worse than a catalog, given how little information we have? Someone gets killed in every Christie mystery... so the question would need to be narrowed.

Jim, you make a good point about how the Council debate stirred interest, which led to action. That's interesting. I'd like to see that TF do more work transparently--for example, to report out on a blog or some other less hidden medium so all could read and discuss.

Mark, you say 'Digression one: open source ILS products are a recreation of the boutique library vendors, circa 1985 to 1995. I am not sure this is a step forward.' I wouldn't call this a digression; I had wondered if in fact the open source ILS was anything other than a closed-source ILS without a maintenance plan. Interesting!

Sunday, April 16, 2006
I was asked what good an ALA resolution would do to further my goals--improved/equitable access to nonroman materials via original script headings. Well, even in defeat my resolution resulted in formation of the ALCTS Task Force on Non-English Access. As I said its report should appear at ALA in New Orleans this June. I cannot predict what it will contain but more exposure can't hurt. Other suggestions or actions are welcome.

Jim, what will an ALA Council resolution do to further your goals? The ALA Council can't get a pothole filled in front of ALA headquarters - what on earth could it do to further non-English access?

'If the set is large I narrow by category (which is Amazon's faceting), and most of the time what I'm looking for is on the first results page, usually the top result.'

Please, walk me through this. I want a book by Agatha Christie. I can't remember the title, but somebody gets killed. A search for Agatha Christie in Amazon (Books - that's a 'category') nets 1,623 results. How do I make what I want come out on top?

In 'How OPACs Suck, Part 2' I was glad to see your comment on 'character sets' as a feature OPACs wish they had. I authored the Membership resolution ALA Council shot down last summer in Chicago. While NIH (Not Invented Here) was a factor, so was inaccurate testimony by resolution opponents. Among other things they said:
1. AACR2 allows nonroman access points/headings. It still does not, but some catalogers make them anyway.
2. MARC allows all Unicode characters. It still does not, but some systems allow them anyway.
3. ALA was working hard on these issues. It wasn't.
But this is history. ALCTS now does have a 'Task Force on Non-English Access' whose report is due at ALA in New Orleans. One can only hope it will urge an end to exclusive reliance on romanized access and instead give those seeking nonroman materials in all scripts vernacular access equity via authoritative headings in their original scripts.

You are lucky to have only such esoteric OPAC complaints. My library has Dynix - the biggest thing I would like to see is the inclusion of call numbers in the information. What good is it to know what the library has if you cannot find it?
The system in general is very hard to navigate and is not user friendly.
I like to say that if users had the psychic powers needed to use the catalog, they would not need the catalog - their psychic powers would tell them where the books are. Heck, one wouldn't need the books either.

Not very much, but it's hard to say. A quick Google search shows:

'library automation' 'open source' 'java enabled' 2007 - 14 hits
'library automation' 'open source' 'java enabled' 2006 - 22
'library automation' 'open source' 'java enabled' 2005 - 87
'library automation' 'open source' 'java enabled' 2004 - 44
'library automation' 'open source' 'java enabled' 2003 - 45
'library automation' 'open source' 'java enabled' 2002 - 56
'library automation' 'open source' 'java enabled' 2001 - 48
'library automation' 'open source' 'java enabled' 2000 - 102
'library automation' 'open source' 'java enabled' 1999 - 49
'library automation' 'open source' 'java enabled' 1998 - 43
'library automation' 'open source' 'java enabled' 1997 - 39
'library automation' 'open source' 'java enabled' 1996 - 39
'library automation' 'open source' 'java enabled' 1995 - 76
'library automation' 'open source' 'java enabled' 1994 - 37
'library automation' 'open source' 'java enabled' 1993 - 29
'library automation' 'open source' 'java enabled' 1992 - 30
'library automation' 'open source' 'java enabled' 1991 - 30
'library automation' 'open source' 'java enabled' 1990 - 30

You might expect fewer hits as you go back in time, but the postings say more about what's indexed than anything.

My reasoning is simple: vendors go where the money is. Asia is a growth market compared to North America, which is flat. The market penetration of 3G and 4G devices in Asia is deep, so you'd better support those devices.

Digression one: open source ILS products are a recreation of the boutique library vendors, circa 1985 to 1995. I am not sure this is a step forward.

Digression two: the perpetuation of primitive wireless networks in North America leads me to say 'we wuz robbed.' The Big Men are busy churning public bandwidth. When it's clear all the money's been made, well, then we'll get some table scraps.

Digression three: It's hard to imagine a vendor getting excited about winning 'east doo-da community college,' but if you peruse the press releases, the flacks are making the most of slim pickins. Most vendor funnels are filled w/overseas business and whoever they can pick off of the competition. It's a buyer's market for an ILS, but once you buy, well, like they say in 'Tank Girl': 'It's been swell, but the swelling's gone down.'

Rank speculation: I bet there'll be more mergers. Endeavor (sic), Ex Libris, what's left of Geac, and VTLS might be likely candidates.

III is the market leader world-wide - that's not going to change, unless the runner stumbles. SirsiDynix has great depth all-around - LOTS of customers, LOTS of experienced staff, several mature products.

If open source ILS products succeed, you'll see what happened to MySQL, RedHat, and SuSE - good open source products that went commercial. I suppose you could see an ILS equivalent to, say, PostgreSQL, but that will take much longer than going commercial; developing an ILS is a ton of work - what's the incentive for doing it for free? The library automation business is a heart-breaker, to say nothing of how long Rob McGee's necrology is. McGee's list of former CEOs of ILS companies is certainly a merry band of pirates.

I admire the gall of the Open-ILS folks, but it remains to be seen what the taxpayers of Georgia will get for their investment.

Re the value of metadata, however it is generated, here's a great quote from the BSTF report: 'undifferentiated keyword indexing of our enormous information space can result in chaos and noise without the categorization and summarization that can be enabled through quality metadata.'

I don't know that it has to be--or affordably, CAN be--LCSH for the long haul. But that's not to say structure isn't good.

Mark, what are the open-source ILS people doing in this area?

I've been looking around for info on the development of 3G/4G wireless broadband networks. Have a look at this almost-4G network that Motorola is preparing for (see http://www.motorola.com/content/0,,5918,00.html). The moniker Motorola is using is MOTOwi4.

My point is the 'pac of the Very Near Future is going to do some of what Karen wants, look something like NCSU's OPAC, have content aggregation and transmission features something like MyLibrary & uPortal - and will need to run on broadband-enabled, portable devices, in addition to our existing range of toys. Who in libraryland is thinking about this?

In response to Jon, no, people would NOT do 'better' searches in Google if they knew all the features. They don't care. They don't want to. Nor do I, for that matter, unless I have to. When you get to where you want to go 90%+ of the time on the basis of a couple of keywords, why would you not start with the quick and dirty approach? It's when THAT doesn't work, that you have to have the capacity to refine searches -- which is where we come in. There is value in the precision you can get in a traditional library catalog, particularly for known item searches. There is value in subject cataloging and classification systems. Difficult searches are going to need different tools, many of which have been developed by the library profession. We can't, however, dismiss the majority of user behavior and the advantages that Google has.

Mark, you and I are on the same page. (Ouch, did I *have* to use a bookish analogy?) I was reading Lorcan Dempsey's blog today and noticed him writing about the OPAC not being the 'sun' of library services.

But we have fetishized the ILS--I was very surprised when I learned that 'automating' a library meant putting its book records in a database--and devoted much money and attention to this one tool at the expense of other services, or even just more books and comfy chairs. My point in this series is to underscore that as designed, OPACs are too doggone expensive for what we get from them--that, in part due to priorities we have pushed on vendors, and in part due to our own disinterest in the real world of findability, we spend lavishly on a system that is lame at best, even when implemented correctly.

To the readers who have objected to the word 'suck': Andrew Pace is actually responsible for using the term in his own discussions about OPACs, but I'll gladly take the heat.

However, I must correct the implication that the origin of 'suck' derives from the lingo of 'gangsta rap.' According to the Oxford English Dictionary, 'sucks,' in this usage, can be traced back at least to 1971, and in 1978, esteemed writer Mary Gordon used the term in her well-regarded book, Final Payments: ' All the hotels have the same pictures. The last one, the food sucked.' This usage predates 'gangsta rap' by more than a decade. It is certainly unfair to the genre of 'gangsta rap' to assume it is the origin of a term you find coarse or inappropriate.

But yo, after this series, you want me to use another term? I'm fly with that!

Re wildcard searching (right truncation, not internal): hmmm, so far, I've seen this supported by Endeca, FAST, i411, Dieselpoint, Verity Ultraseek, Thunderstone, and several others I'm forgetting. So Google may not offer it, but that doesn't mean it's not widely supported. Widely supported, but not widely used: that's more like it!

Regarding this statement: 'While I agree with most of the criticisms of the opac made here, we need to recognize that the purpose of the library catalog is not only to retrieve, but to retrieve based on organizing principles such as assigned subject headings and classification that cannot be replicated by keyword searching no matter how powerful. OPACs need to do a better job of implementing these features, especially searching by classification within the catalog.'

I disagree. The purpose of the OPAC is to connect readers with books, period. That we have created an empire around our technology, rather than our outcomes, is a major obstacle to change.

On the spell check issue: Check out Lucien from Jaunter.com, a product developed by a librarian that uses Google's spell checker for catalog searches. We have implemented it at Skokie Public Library--very inexpensive, and it works!

You say search engines all have wildcard searching. This is NOT true; almost NO internet search engines have wildcard searching. Take a look at Google's advanced searching cheat sheet:
http://www.google.com/help/cheatsheet.html

No truncation here! But almost every database and online catalog has it. I agree with almost everything else you say, except that some catalogs do have some of these features.

Take a look at NCSU's catalog; they are doing most of what you're talking about. They are using the Endeca interface as a public interface with their regular library system.

Only librarians would have a conversation like this. OPACs are a baby only a mother could love. Our users certainly don't get dewy-eyed (no pun intended) about them--why should we?

We've got a bucketful of mis-matched citation and full text databases in libraries the world over; our beloved catalogs are only one tool among many. They have their place and that place is small.

ILS systems do about the same thing the same way they always have. It's not like there's some fundamental advance in circulation control that will set the market on its ear. After 30 years of nearly constant abuse at the hands of librarians, the few remaining vendors have tolerable systems--systems that do a reasonable job of automating obscure, back-room operations most of humanity doesn't give a fig about. The 'net is still wreaking havoc with vendor business plans and has all of us librarians wringing our hands, save for those with the gall to actually build something useful (thanks Andrew, Eric, and Mack).

Here's a secret I'll go public with: I've been a librarian for 20 years. That includes 5 years of public service work and IT project management for a public library, 9 years with the vendors (INLEX, McGraw-Hill School Systems and DRA), and having used or tried every ILS and OPAC I could get my mitts on. And 5 years at a mental health agency (no jokes please).

I am a big library booster, but not much of a library user. Why? Why would I go to a dirty box full of old stuff when I could go to a clean, well-lit store with current material of interest? In all formats, too. And, better yet, do it from the comfort of my home via the 'web?

'pac use studies are useful to the extent they help us analyze and improve tools that library customers ACTUALLY USE. Shoot, let's use Google and maybe the ISI databases to see who's using the classic IR studies from the 60s and 70s, by folks like Salton and Saracevic. Google has clearly been paying attention here.

When I search for fun & profit, I start searches in Google. I use Amazon for bibliographic verification and to maintain a reading and selection list. What I can't buy I borrow from libraries, and that doesn't happen very often. I use 'pacs for known item searches. Since tools like ILLiad and WorldCat (for ILL) are not available to the unwashed, I search catalogs in my area (sometimes Z39.50-based union catalogs, not that anyone cares much about those anymore) and send citations on to libraries to get obscure stuff via ILL.

I'll give Karen props for this much: relevance ranking is important--in the tools I actually use. Google has it, so other tools don't necessarily need it too.

As for the vendors, they do what companies MUST do to survive. They follow growth markets, period. If librarians are going to crow about the lack of features in 'pacs, it's time to put up or shut up. You want relevance ranking in the 'pac? Pay to have it added--or use the market, and buy a system with the features you want. The vendors--what's left of them--may get the message. However, I doubt that's where their focus is.

I am more concerned about turning libraries into adaptive organizations that respond to changes in information seeking behavior. I don't need a 'pac as much as I need a JSR168 portal that will push relevant content to kids with MySpace accounts and 3G/4G phones. This is where the savvy vendor's attention is placed. Perhaps libraries and librarians ought to take a page from their book?

While I agree with most of the criticisms of the opac made here, we need to recognize that the purpose of the library catalog is not only to retrieve, but to retrieve based on organizing principles such as assigned subject headings and classification that cannot be replicated by keyword searching no matter how powerful. OPACs need to do a better job of implementing these features, especially searching by classification within the catalog.

As an editorial comment, permit me to express my displeasure at the author's use of the slang word 'suck' in this sense in what should be a professional communication. Using the language of 'gangsta rap' does not make one look cool or hip. It makes one look vulgar and stupid.

Another powerful Google feature missing completely from most OPACS is the IMAGE search. A user can not only find both text and image sources using the same search term(s), but can actually see the images displayed. Many OPACS, even if they include image records, only display the words about them, not the images themselves.

You know, if Amazon has help files, I wouldn't know where they are, because I have never waded page by page through its results sets. If the set is large I narrow by category (which is Amazon's faceting), and most of the time what I'm looking for is on the first results page, usually the top result. Statements such as 'I rarely use Amazon' should be weighed against the evidence, which is that Amazon is not going broke and that many of us use it as a de facto book catalog.

OPACs have the potential not to suck, but I would be hard pressed to point to a feature in current widespread use that other databases should emulate.

In the interest of fairness, how about a look at the other side: 'Features common to OPACs that would help other sites suck less?'

I rarely use Amazon, for example, because I can find no way* to wade through a thousand (or a hundred or fifty) results except page by page, sequentially through a sort.

*Or should I just try reading the help files? :)

One feature that I've been wishing for since the last century (!) is the ability to sort results by call number. Often I want to send a patron to a section to browse, and it always takes too long to go through a list of results and figure out where most of the books are on the shelf.

Wouldn't it be wonderful to have these and the other features Karen mentioned??
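Sorting by call number is mostly a matter of building the right sort key, since a naive string sort puts '81.2' after '813.54'. A rough sketch, with invented titles and Dewey numbers, that splits the class number from the cutter so records cluster by shelf area:

```python
# Sort a result set by Dewey call number so a patron can be sent to one
# shelf section. The records below are invented for the demo.
def dewey_key(call_number):
    """Split '813.54 SAL' into a numeric class plus the cutter text."""
    parts = call_number.split(None, 1)
    number = float(parts[0])               # numeric compare: 81.2 < 813.54
    cutter = parts[1] if len(parts) > 1 else ""
    return (number, cutter)

results = [
    ("The Catcher in the Rye", "813.54 SAL"),
    ("A Brief History of Time", "523.1 HAW"),
    ("Guns, Germs, and Steel", "303.4 DIA"),
]
results.sort(key=lambda rec: dewey_key(rec[1]))
for title, call_number in results:
    print(call_number, title)
```

A real implementation would also need rules for LC call numbers and for prefixes like 'REF' or 'J', which don't reduce to a single float; this sketch only covers the plain Dewey case.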

Hey, have fun! Challenge them, expand them, contradict them... they're really just a start.

We (myself and a member of the student chapter of ASIST at Simmons) are thinking of taking the ten points and making them the focus of a series of student pseudo-hacks looking at each of these in turn, considering feasibility and technical barriers, finding what has been done, and what it would take to do it. It's a very exciting prospect - thanks!

'But I think there is significant middle ground between Google's 'search uber alles' approach and the methods we now use in librarianship.'

As a cataloger, I would actually agree with this statement (LCSH, LC Class., etc. are not exactly perfect), but what I find so frustrating is that those management types who are in love with Google/keyword searching 1) don't seem to want to recognize its faults and 2) don't have any ideas of what to use in place of subject headings and classification that isn't Google. Regardless of the limitations of cataloging as we have it, until there is something truly adequate to replace it, it's the only game in town. And having checked out what the Endeca/NCSU partnership has done with it, I think it is way too soon to say that it serves no purpose.

Which is a long way to say that I agree with your response. I'll try to make a point of reading the BSTF report. I've certainly heard of it, but not had any time to give to reading it.

The NCSU catalog uses its Endeca search engine to leverage all available metadata. In the case of library records, traditional cataloging provides the bulk of the metadata. In that sense, the NCSU catalog gets a lot of bang out of the metadata buck.

However, you are asking a different question--an important question. How do we maximize the findability of library items? Should we continue to rely on traditional cataloging?

The managers who say that they don't want any cataloging need to explain just what the catalog should be searching in the first place. But I think there is significant middle ground between Google's 'search uber alles' approach and the methods we now use in librarianship. Without denigrating the excellent work of catalogers, I believe we all need to recognize the limitations of our approach to date for item findability and find more affordable and user-oriented methods for organizing information. The University of California BSTF report I wrote about earlier talks about this as well.

Yes, item status should be obvious in any catalog--and should be a facet to display or hide as the user desires. And 'check shelf' doesn't cut it for me ;-)
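Item status as a facet works the same way as any other facet: tally the distinct values across the result set, show the counts, and let the user narrow to one value. A minimal sketch, with invented records, of the two operations involved:

```python
# Treat item status as a facet the user can display or narrow on.
# The records below are invented for the demo.
from collections import Counter

records = [
    {"title": "Final Payments", "status": "available"},
    {"title": "A Separate Peace", "status": "checked out"},
    {"title": "Ambient Findability", "status": "available"},
]

def facet_counts(records, field):
    """Tally how many records fall under each value of a facet field."""
    return Counter(rec[field] for rec in records)

def narrow(records, field, value):
    """Restrict the result set to one facet value."""
    return [rec for rec in records if rec[field] == value]

print(facet_counts(records, "status"))  # available: 2, checked out: 1
print([r["title"] for r in narrow(records, "status", "available")])
```

The design point is that status is computed from the result set rather than baked into the query, so the user can show it, hide it, or drill into it without re-searching.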

I do have a question about this series -- almost all of your comments have to do with the search engine capability and interface for the user. My confusion comes from the fact that the enthusiasm I hear from library management has to do with cutting out cataloging and using simple keyword searching from minimal records. Have you any thoughts there? My feeling, as a cataloger, is that the work of a cataloger is even more important when we have this kind of searching capability. The NCSU interface is a wonderful example of how the cataloging is being used to great advantage for the user to find things more quickly. And since a catalog is, by definition, a collection of metadata, not full text documents, cataloging is vital to make these searches even more efficient.

One other desirable capability that is peculiar to library catalogs (I picked this up from some of the CIL2006 presentations) is the ability to note what is immediately available vs. what may take time to get your hands on.

When we implemented a spellchecker at My Place Of Work in 2002, we found it significantly improved retrieval.

I second the need for spell checking -- we recently started logging the usage of some of our OPAC tweaks and the 'did you mean?' suggestions got by far the highest usage, followed by the 'people who borrowed this, also borrowed' suggestions.
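A 'did you mean?' feature doesn't require Google's infrastructure: when a query returns zero hits, suggest the closest term from the catalog's own index. A minimal sketch using the standard library's fuzzy matcher, with a made-up word list standing in for harvested index terms:

```python
# Suggest the nearest catalog term for a zero-hit query.
# `catalog_terms` is a stand-in for terms harvested from the real index.
import difflib

catalog_terms = ["separate", "peace", "baked", "goods", "library"]

def did_you_mean(word, cutoff=0.6):
    """Return the closest indexed term, or None if nothing is close enough."""
    matches = difflib.get_close_matches(word.lower(), catalog_terms,
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(did_you_mean("seperate"))  # → 'separate'
```

Drawing suggestions from the catalog's own vocabulary (rather than a general dictionary) also guarantees that every suggestion actually retrieves something.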

Mona, I think those are useful indicators.

On the citation analysis front, I would like to see cited-by numbers and OCLC holdings in the search results. I realise these are not definitive factors of quality, but it does help in the decision-making process.

Amazon does a great job wooing the reader. Compare the non-starter review function in Open Worldcat with Amazon's review function. Amazon knows that nobody does anything for free; they want something in return--recognition, the chance to interact, etc.

What do people think of AquaBrowser?

@interactivity: I'd even put it like this: the biggest resource that any library can rely on, and could harvest, is the immense knowledge of its thousands of, often expert, users. Users who actually read books, i.e., full texts. Users who actually know about obscure historical figures, about which place is meant by a non-unique placename in a title, etc. (An example from the bookselling world: Amazon does use its customers' information, given away more or less for free, and became a winner by that. In Germany, the large cataloguing project VLB, by the national organization of publishers and bookstores, is still very much like any old OPAC, and these online shops are just not as good as Amazon's. I know it; I own one. And I still use Amazon's search functions, for their typo-friendly search and users' reviews.)

Whew, nice thinking and threads! A coupla thoughts:

When I write, 'Don't ever rely on help files to 'teach' people,' I mean exactly that: never assume someone will learn a feature with a help file. Should you provide help files? Yes. Among other qualities, help files document how your system works, leading to an informal agreement with the public at large and with other developers, and give aficionados guidance about maximizing use of the system. But your system should be fundamentally usable without resorting to help files, because *people don't read them.* Bruce's point is that a smart system is simple to use.

Andreas, you bring up a wonderful point: the lack of interactivity of most OPACs. I hadn't even gotten into 'and can ya do tagging or comments or recommendation big boy, huh, can ya can ya?'

Wade, I think we're on the same page in many ways: OPACs won't change unless we make them change. It's up to us. Maybe it's time for me to write about what a good ILS would look like...

Couldn't agree more on spell-checking. There's nothing more annoying than getting 0 results for a simple typo (e.g., 'a seperate peace' or 'baekgoods').

Another feature that is lacking nearly everywhere: offer users a way to report errors anywhere, quickly, without the hassle of finding e-mail addresses, FAQs, etc. If you think users won't find errors anyway, and furthermore won't bother to correct them, you'd better remember the success story that wikis are. I search OPACs a lot, and more often than not I detect smaller or larger errors.
This could and should be driven even further, especially where not the 'raw data' but categories and indexes are concerned, i.e., let your users help with the 'tagging'. (It does not have to be wiki-style; even tagging by users 'on moderation' would be a huge advance.)

Bruce, there's a problem with calling Google 'intuitive'. I've seen people search in Google, and many aren't aware of features like phrase searching and the like. I've seen many, many people just type words in the box when quotation marks would make a better search. Just because they can do a simple search doesn't mean much. Perhaps if Google had easier-to-find help pages, more people would be doing things like phrase search. Help pages are almost always needed.

I see no issue with complexity in OPACs if both a 'quick' or 'easy' search is built in and also a more advanced interface. Heck, I'd probably go one step farther and let them enter a text string in a query language.