With all the press coverage lately about the Google Print project, as well as our two NewsBreaks this week, which cover the Open Content Alliance and Microsoft MSN Search book digitization projects, it looks like books may have achieved the status of "the next big thing," as Barbara Quint suggested. But, as we talked about these recent developments, she and I agreed that many other worthy book search and access projects seem to be lost from view. So, here's a brief roundup of others that deserve recognition, including homegrown and commercial efforts.
The volunteer effort Project Gutenberg (http://www.gutenberg.org) has been around since 1971; it's the oldest producer of free e-books on the Internet. On its site, it now reports 17,405 public domain e-books and is averaging 250 new books per month this year. Books may be freely downloaded. Project Gutenberg is a participant in Yahoo!'s Content Acquisition Program, which provides a search of book metadata (author, title, brief description, keywords). Google provides a search of approximately the first 100 KB of the full text. Michael Hart, founder of Project Gutenberg, has estimated that "there are already well over 100,000 eBooks already available free for the taking on various Internet sites, perhaps 50,000 of them from the various Project Gutenberg sites."
Other book projects include the California eScholarship Initiative (http://www.cdlib.org/programs/escholarship.html), the Electronic Text Center at the University of Virginia Library (http://etext.lib.virginia.edu), and the Humanities Text Initiative at the University of Michigan (http://www.hti.umich.edu), to name a few. The Online Books Page, edited by John Mark Ockerbloom and hosted by the University of Pennsylvania Library (http://onlinebooks.library.upenn.edu), is an index of some 20,000-plus free Web texts.
There are also several European library and publisher initiatives. Reuters recently reported that the German association of book publishers plans to build a network by next year that will allow the full texts of the publishers' books to be searched online by search engines, but it will not provide the texts to Google and the other engines.
The European Commission adopted an initiative in June titled "i2010: European Information Society 2010" in which digital libraries are a flagship goal. On Sept. 30, 2005, at a meeting in Brussels, Belgium, the commission unveiled a strategy for making "Europe's written and audiovisual heritage available on the Internet." It presented a first set of actions at the European level, intended to feed into a proposal on digitization and preservation that is to be presented in June 2006. At present, several initiatives exist in the member states, but they are fragmented. To avoid creating systems that are mutually incompatible and that duplicate work, the commission proposes that member states and major cultural institutions join the EU effort.
Other companies serving up access to digital books include NetLibrary, ebrary, and Knovel, as well as major publishers like Elsevier, McGraw-Hill, Oxford University Press, and others. Services aimed at the library market tend to focus on providing value-added features and tools for users. And don't forget that these are all available free to library users with their library card.
OCLC's NetLibrary recently chose Autonomy as its technology partner to provide academic, public, corporate, and special libraries with improved search and retrieval functionality. Autonomy's technology allows NetLibrary to index e-books, e-journals, and other content types regardless of format and/or location and make them available through a single search interface. Additionally, NetLibrary is using several other Autonomy features, such as cross-linking of files, content summarization, content suggestions, and spell-checking. These and other features will be part of a major site enhancement planned for this fall, called NetLibrary 4.0. NetLibrary currently provides customers with access to more than 95,000 full texts of reference, scholarly, and professional e-books, journals, and audio files.
ebrary has a growing selection of more than 60,000 full-text titles from more than 200 leading academic, STM, and professional publishers. More than 40,000 of these full-text titles are books. ebrary also offers users tools like highlighting, notes, bookmarks, copying, and printing. The ebrary Reader delivers documents to a patron's desktop one page at a time, eliminating cumbersome downloads. InfoTools adds word-level interaction to every document, letting users link from the text to additional information.
The bottom line is that all of these projects and products are complementary. Users benefit by having book contents searchable and available, no matter what the source. In fact, content that's not digital could be in danger of extinction. We're clearly moving to a digital information world.
Here's what James Hilton, University of Michigan associate provost and interim librarian, said in a statement about the Google Print project (http://www.umich.edu/news/index.html?Releases/2005/Sep05/r092105): "In the future, most research and learning is going to take place in a digital world. Material that does not exist in digital form will effectively disappear. We need to decide whether we are going to allow the development of new technology to be used as a tool to restrict the public's access to knowledge, or if we are going to ensure that people can find these works and that they will be preserved for future generations."