
The Wisdom of Crowds of Librarians Is on the Way—In Time: Reference Extract
Posted On November 24, 2008
Google can sleep easier, for a while at least. A flurry of press coverage suggesting that "radical, militant librarians" (as the FBI refers to members of this profession) were heading its way turns out to be a little previous. While experts from three top library and information science institutions have begun a process that they promise will lead to a new search engine with a new infrastructure designed to emphasize authoritative content, the process is still at a very early stage. According to R. David Lankes, Ph.D., director of the Information Institute of Syracuse and associate professor at Syracuse University’s School of Information Studies, a product roll-out for Reference Extract is not expected to take place until sometime in 2010. The other two institutions involved are the University of Washington’s Information School and OCLC. The MacArthur Foundation has provided a $100,000 planning grant, which should lead to a full proposal in 2009.

The press release announcing the new project triggered a spate of articles, from Ars Technica, as referenced from the Huffington Post ("Using Crowdsourced Librarians to Outsmart Google"), to the digital news services of the Library Journal ("Too Late or Just Right? OCLC, I-Schools Announce Reference Extract Web Search Project") and The Chronicle of Higher Education ("Librarians Want to Out-Google Google With a Better Search Engine").

Some coverage made it seem as though the arrival of the new search engine was imminent. Some coverage—or at least the titles of articles covering the topic—made it look like the project was targeting Google. Not so, however. According to Lankes, some of the "let’s get Google" language may have come from "leftover adversarial language," stemming from initial conceptual work that challenged contributors to come up with ideas of what Google would look like if librarians had built it. However, Lankes did admit that the press release was "also a Rorschach test on what people think. We wanted to see reactions—either ‘Go get ’em’ or ‘You’ll never catch them.’"

Actually, Lankes says they would love to build services that work with Google search results. "We would love to partner with Google directly. Our ultimate goals are one, to improve the finding of credible information and two, make librarians the keystone to building credibility in the community. We want librarian-built, -run, and -owned tools integrated widely so that people don’t have to come to a library to use them. This requires a technical and physical infrastructure. Reference Extract will only be one of many network level services we can build and support. Increasing the knowledge of our user communities is our great goal. We are talking now on how to capitalize on the power of librarianship, which increasingly is not centralized in borders and waiting for people to come and get it. Reference Extract is just one of many different ideas on how to look at credibility issues and how librarians can be recognized for doing that. The fact that the press coverage shows the idea has gotten picked up outside of library literature is encouraging; we’re not just being dismissed."

Though planning is still underway, one clear difference has emerged between this and past attempts to transform the expertise of librarians into useful web source answer services. Most such services (e.g., the excellent Intute, formerly the Resource Discovery Network, a consortium of seven U.K. universities) rely on overall, general evaluations of web-based sources contributed by librarians. As Jeff Penka, director of QuestionPoint services at OCLC, points out, such a contributory approach requires librarians to stop what they’re doing and take on a separate assignment. But even relying on existing lists of good sources from the library community does not always precisely answer the questions "Good for what? Good for whom?"

In developing Reference Extract and assorted services, the new project will tap into the expertise of working reference librarians answering "real" questions on specific topics. It will use collected data on reference transactions from library "AskA" services, in particular OCLC’s QuestionPoint. Penka points out that QuestionPoint has "both thrived and grown. It is now used in 32 countries with the help of around 13,000 librarians in over 2,000 libraries. It has an annual traffic of over 1 million reference transactions." Transaction records for QuestionPoint represent reference interviews as well as source guides. Penka explained, "These are actual artifacts of experience. If we have a hypothesis, we can test it against actual data representing years of QuestionPoint experience. We can make sure a source is utilized effectively and as appropriately as can be."

However, it is not just QuestionPoint that the new project will tap. Both Lankes and Penka stressed the openness of the system, not only to ideas from the library community but to new sources and new bodies of content. Penka stated, "We are talking with others. This is going to be broader than digital reference operations. We are reaching out to the community, people with any library reference services and their constituencies. We want people to contribute on appropriate issues for building the infrastructure. We are starting with the locus of an idea and some heft, but it’s not just preexisting research plus data on practical experience. We’re feeding it all into a central point."

Over the next 2 months, the team will conduct a variety of meetings and solicit comments with a blog on the website. They will release news and notes, hold webinars, appear at a national conference, and even stream a video blog. All this is aimed at creating a proposal which, according to Lankes, they "hope to implement next year, building it and running it by people and then rolling out real services sometime in 2010." The broad proposal will initially be aimed at the MacArthur Foundation, but Penka pointed out that finding "other partners is also a purpose of the planning grant. MacArthur has no claim to exclusivity."

According to Lankes, the current planning effort needs to figure out a technical architecture, a partnership model, a library community project, and a business model that can be sustained over time. It needs to build on the taxonomy of question-and-answer. "We want to be careful in thinking how to tie things together to make the most relevant set. We do not want to produce one source and then say, ‘Here. It’s all done.’" He intends to use semantic data processing of reference transactions to build a sophisticated system. Penka refers to what they hope to create as a "credibility engine."

For a view of what the Reference Extract team (minus OCLC) has already done, go to the Credibility Commons and check out the software and other tools it now offers to assist in handling credibility issues.

The new hakia semantic search engine has issued a clarion call to "librarians and information professionals to participate in a new program to unlock credible and free web resources to web searchers." Initially it will concentrate on health and medical sources but intends to expand coverage to all topics. It even offers contributing librarians some dollar incentives.

However, just as I was writing this NewsBreak, the news came out that Google is launching a SearchWiki that enables searchers to customize and adjust their search results. It will even let users share their customization with other searchers. That was quick. Looks like those "radical, militant librarians" may have made the Big G nervous.

Barbara Quint was senior editor of Online Searcher, co-editor of The Information Advisor’s Guide to Internet Research, and a columnist for Information Today.

Comments
Posted By MIKE CASSETTARI 12/1/2008 2:07:02 PM

This is an interesting article, and similar to what we are seeing in the social libraries space. When searching for documents in the organization’s shared hard drive or information repository, the quality of the results is a direct function of the quality of the repository. If the repository largely contains accurate, relevant data, a search will largely yield accurate, relevant results.

At the heart of managing this is the librarian. He or she creates a credible repository by weeding out outdated data, and adding vetted information. Like the Credibility Commons project, this is a community-supported model. Domain experts use social technologies to provide their knowledge on a subject and enhance the vetted information.

But a credibility engine, as Penka refers to it, can only thrive in a secure environment where content and social media can be controlled, as well as a place where the community can both enhance and develop content.

Today’s librarian knows that search and content development are a team effort because knowledge is more than vetted white papers, books, or image repositories. And it’s more than social communications and networking. It’s the combination of the two, and only when they are integrated and properly managed does an organization achieve a true social library.

Mike Cassettari
