Should we subscribe to the OCLC QuestionPoint Global KnowledgeBase?
For over a year now, we've been offering librarians and patrons access to the OCLC QuestionPoint Global KnowledgeBase, a collection of thousands of questions answered by librarians all over the world. Many of the answers come from the Library of Congress or the New York Public Library. Not that any librarian's answer isn't top notch, but the answers from these libraries I've seen are always thorough. It sounds like a great tool, right?
Even though we are no longer QuestionPoint subscribers, we still have the opportunity to subscribe to only the KnowledgeBase.
Since September 2007, visitors to our website have performed 2,085 searches on the KnowledgeBase. But did they find what they were looking for?
We recorded the search terms and the page the patron searched from (patrons can search from our front page, about page and find page, and I want to see where the service is being used). We also recorded the time they searched. We didn't record the patron's IP address or cookie or anything like that.
I used my favorite sample size calculator and figured out that if I pulled a random sample of 92 searches, I could get a pretty accurate picture of the knowledgebase's effectiveness, within a 10% margin of error.
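For anyone who wants to check that number, here is a minimal sketch of the usual sample size formula with a finite population correction. The 95% confidence level (z = 1.96) and the worst-case proportion of 0.5 are assumptions; the calculator's actual settings aren't recorded here, but these are common defaults and they do land on 92.

```python
import math


def sample_size(population, margin, z=1.96, p=0.5):
    """Estimate the sample size needed for a given margin of error.

    z=1.96 (95% confidence) and p=0.5 (worst case) are assumed defaults,
    not settings taken from the original calculator.
    """
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction, since we only have 2,085 searches.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


print(sample_size(2085, 0.10))  # 92
```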
I randomized the search terms and picked the first 92, then searched each one against the QuestionPoint Global KnowledgeBase. I recorded how many results in the knowledgebase there were for each search, and if there were results, objectively decided if the patron was likely to have found a useful answer on a scale of 1 to 4.
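The sampling step can be sketched like this. It is only an illustration: `draw_sample` is a made-up helper name, and the list of placeholder strings stands in for the real search log.

```python
import random


def draw_sample(terms, k, seed=None):
    """Shuffle a copy of the logged search terms and keep the first k."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    shuffled = list(terms)     # copy, so the original log order is untouched
    rng.shuffle(shuffled)
    return shuffled[:k]


# Placeholder data standing in for the 2,085 logged searches.
logged_terms = [f"search {i}" for i in range(2085)]
sample = draw_sample(logged_terms, 92, seed=2007)
print(len(sample))  # 92
```

Fixing the seed means someone else could reproduce exactly the same 92 terms from the same log.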
I rated the knowledgebase a 1 if the patron would not have found an answer. I assumed a patron would only browse the first couple of pages of results (search results are ten to the page). "Library card", for example, had 798 results and I rated the KB a 1 for this search.
I rated the knowledgebase a 2 if the patron might have found an answer. Usually, this was because the query was not clear to me, the objective observer. The KB scored a 2 on "ancient rome" because there was a lot of info to browse (40 results) and the patron's query was very general.
I rated the knowledgebase a 3 if the patron probably found an answer. "who is the president" scored a 3. I am fairly confident that the patron would have found the answer if she looked hard enough, but assuming a U.S. context, the correct answer was the 11th result, on the second page, and I'm not sure she would have made it that far.
I rated the knowledgebase a 4 if the patron most definitely found an answer. "columbus" (109 results), "age of the earth" (40) and "lightning" (26) were the only three searches that the knowledgebase scored a 4 for.
Of the 92 searches, 38 (41%) had at least one result.
Of the 38 with at least one result, I rated the knowledgebase:
1 - 19 times (21% of total)
2 - 8 times (9%)
3 - 8 times (9%)
4 - 3 times (3%)
The average rating for the knowledgebase when search terms had results was 1.9.
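The percentages and the average can be re-derived from the counts above. A short sketch, using only the numbers reported in this post:

```python
# Rating -> number of sampled searches that received it (from the list above).
counts = {1: 19, 2: 8, 3: 8, 4: 3}
total_searches = 92

with_results = sum(counts.values())

# Each rating's share of all 92 sampled searches, rounded to whole percents.
share_of_total = {
    rating: round(100 * n / total_searches) for rating, n in counts.items()
}

# Average rating among searches that returned at least one result.
average = sum(rating * n for rating, n in counts.items()) / with_results

print(with_results)        # 38
print(share_of_total)      # {1: 21, 2: 9, 3: 9, 4: 3}
print(round(average, 1))   # 1.9
```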
In general, the more hits there were for a search, the harder it was to connect the question to an answer.
Overall, the number of hits ranged from 1 to 798, with a mean of 73, median of 17 and mode of 1.
The mean number of hits for rating 1 was 92, for 2 it was 58, for 3 it was 49 and for 4, 58.
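As a sanity check, the per-rating means should roughly add back up to the overall mean of 73 hits. A small sketch, with the caveat that the per-rating means are already rounded, so the result is only approximate:

```python
# Rating -> (number of searches, mean hits for that rating), as reported above.
rated = {1: (19, 92), 2: (8, 58), 3: (8, 49), 4: (3, 58)}

# Weight each group's mean by how many searches fell into that group.
total_hits = sum(n * mean for n, mean in rated.values())
searches = sum(n for n, _ in rated.values())
overall_mean = total_hits / searches

print(round(overall_mean))  # 73
```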
Anyone who wants a look at our data or wants to do a study for themselves is welcome to it. I am attaching a file of the search terms. I managed to botch my ratings by sorting the wrong columns (sorry!), but the first 92 in the list are the ones I examined, and the first 55 are the ones I got no hits for (I did the study this afternoon, so the results may change as the database grows).
Now for some opinions:
The idea that only 3% of patrons searching the knowledgebase would find what they are looking for is appalling.
So it is a good thing we have a margin of error. The knowledgebase actually rated 3 or higher for about 12% of the terms, plus or minus 10%. 1 in 8? Take any other random collection of questions and answers and you could do a lot worse. Ooh - who needs tenure?
Certainly, some of the non-hits are the result of the patron not understanding what they were searching or how to search. I don't think there is a tool we could create that wouldn't have this problem, however.
At the same time, if even 4 out of 5 people are not finding what they are looking for, what are we offering them? Does it erode the library brand to promise answers (or "knowledge") and fail to deliver at least 78% of the time? At the very least, most people have an unpleasant experience with the library, and I don't like that.
I believe that libraries should offer self-service reference tools, and I believe those tools can be built out of librarians' work and knowledge, but I don't believe the QuestionPoint Global KnowledgeBase does the job. Other people still might change my mind, though.
QuestionPoint staff have been talking up the knowledgebase lately on their blog, so if they are thinking about how to improve it, here are some suggestions:
A knowledgebase is going to be most helpful when records are part of a larger corpus of documents, not as a tool to be searched itself.
To that end, leak the records into the open web and attribute the answers to the Ask-A-Librarian services. When people find the answers on Google or Answers.com, they'll also find a library.
Some library folks still might like to search just the knowledgebase sometimes, maybe as part of a bigger collection or federated search. If we were going to do that, I'd want to get results back from QuestionPoint and format them on our own library website. It would be great to have an API to search, and REST or OpenSearch would do nicely.
If the search were on our own site, I would also add a message to the effect that the librarian was online to help even if these answers were no good.
"Local library" questions are helpful to have answers to, but not in a "global" context. OCLC should encourage libraries not to submit records to the Global KnowledgeBase that only answer a question for a specific library.
Last, when I am logged into QuestionPoint as a librarian, I have a lot of "advanced" search options, but the search we can offer patrons only "ands" searches together. I think a different search method as default for patrons would reduce the number of times that no hits were retrieved, and even give the right answer sometimes.
For example, "How many people live in Oregon?" gets no results, but there is one result when I remove the question mark from the phrase. There are 6 for "oregon population", including one I would rate a 3. The knowledgebase contains the answer to the question "How many people live in Oregon?", or at least a good link for where to start, but you won't find it searching those terms.
It's apparent that the search algorithm is searching for knowledgebase records that match 'all the words' and that some basic data normalization isn't happening correctly. A better way to search would be to rank results based on what documents are "most like" the search terms.
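To make the idea concrete, here is a rough sketch of what normalizing the query and ranking by "most like" could look like. This is not how QuestionPoint works or would work; the function names and the three sample records are invented for illustration, and cosine similarity over word counts is just one simple way to score likeness.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep only letter/digit runs, so 'Oregon?' becomes 'oregon'."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a_tokens, b_tokens):
    """Cosine similarity between two bags of words (0.0 when nothing overlaps)."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank(query, records):
    """Return records sharing at least one word with the query, best match first."""
    q = tokenize(query)
    scored = [(cosine(q, tokenize(r)), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for score, r in scored if score > 0]


# Invented sample records, not real knowledgebase entries.
records = [
    "What is the population of Oregon",
    "How do I get a library card",
    "How many people live in Portland Oregon",
]

for record in rank("How many people live in Oregon?", records):
    print(record)
```

With this approach the question mark stops mattering, and a record that shares only some of the words still shows up, just lower in the list. A real system would probably also weight rare words like "oregon" more heavily than common ones like "how".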
With these improvements, I would definitely want to subscribe to the product, and would even encourage librarians to contribute to it. But for now, I'll be taking the search box off of our site next week.
| Attachment | Size |
|---|---|
| qpkb-searchterms.txt | 64.79 KB |

Comments
Paula, I definitely didn't know about the advanced search options for patrons, that's great. Keep in touch!
As a member of the QuestionPoint product team and a strong advocate for the knowledge base feature, I appreciate your study, Caleb, of the QP Global Knowledge Base (GKB) and sharing the results here. They are, if somewhat disappointing, enlightening nevertheless.
The problem with something like the knowledge base—some might call this its strength, as well—is that it becomes better as it is used. Librarians must submit their work to it in order to begin to reap the benefits from such a community built resource. That is perhaps a slower process than we might like, for the workflow must be natural and easy and each submittal reviewed and “scrubbed” by an editor and tags and keywords added—work that is all done, I might add, by volunteer librarians. Only over time and with the conviction of more and more librarians of the value of their work as reusable research can the GKB grow into a database that might answer the questions posed to it at the Oregon L-Net site.
Improvements can indeed be made, and your suggestions are appreciated; they are good ones. Surfacing the Global Knowledge Base on the web in such a way that its contributing libraries are the beneficiaries of the increased visibility is a goal well within our sights. It is encouraging to have someone like you give credence to that model. And offering the records through an API for a more localized, federated search is another great idea! I hope we can begin to have discussions for development in that direction in the near future.
I do want to point out that the current search engine does allow for advanced searching from another page; a link to that page should appear for the user who wishes to do a “New search.” Direct access to that page is also available as a link that libraries can place on their web sites. Advanced searching allows for OR’ing terms together, and for NOT’ing them out, as well as applying a few limitations to results. Not that there still isn’t room for improvement!
Being the knowledge base fan that I am, I must add that a KB can be used for much more than factual answers. And whether these uses become apparent or not is probably at least partly subject to how the KB is presented—both at the point of access and at the system level (again, room for improvement here). It could be a resource of topical information, a bibliography (or pathfinder) of subject material, a source of further reading, ideas for exploration, strategies for research.
So . . . having said my piece, I look forward to welcoming you back for a second look at the QuestionPoint Global Knowledge Base when the time seems right. In the meantime, we’ll miss your insightful input to the QP service.
--Paula Rumbaugh, Product Manager, QuestionPoint