Google Book Search Has a Busy Week
by Barbara Quint
Posted On June 18, 2007
Well, if it's not one thing it's another. The first week of June saw Google Book Search (http://books.google.com) add a 12-university consortium covering all the Big 10 schools and The University of Chicago, representing a potential 10 million-volume expansion that will include in-copyright material. Meanwhile, at a book publishing industry conference, the chief executive of a leading international publisher stole Google's laptop computers in a kind of anti-GBS protest. And, if that wasn't enough, a university in Georgia has decided to out-Google Google by starting its own mass digitization project accompanied by a revenue-producing, print-on-demand service administered by Amazon.com. What next?

The Committee on Institutional Cooperation (CIC; www.cic.uiuc.edu) is a consortium of 12 universities: The University of Chicago, the University of Illinois, Indiana University, The University of Iowa, the University of Michigan (an original Google Book Search Library Partner and the primary source of in-copyright books for GBS), Michigan State University, the University of Minnesota, Northwestern University, Ohio State University, Pennsylvania State University, Purdue University, and the University of Wisconsin-Madison (already participating since October 2006). In toto, the CIC libraries hold more than 75 million volumes with some highly prized special collections, such as Northwestern's Africana, The University of Chicago's South Asia holdings, the University of Minnesota's Scandinavian and forestry collections, Michigan State's extensive holdings in agriculture, Indiana University's folklore collection, and the history and culture of Chicago collection from the University of Illinois-Chicago.

Mark Sandler, director of the CIC's Center for Library Initiatives and formerly a participant in the launch of the GBS Library Project at the University of Michigan, saluted the new initiative: "We value the legacy collections built over the long histories of our libraries and want to ensure they remain accessible and discoverable in a digital age. We have a remarkable opportunity not only to preserve what easily could be lost, but to make the entirety of our print collections more accessible than ever through a simple computer search."

An estimated 10 million volumes will be made available for Google digitization, though that figure does not include the volumes already contracted for with the University of Michigan and the University of Wisconsin-Madison. The contract between Google and CIC extends for 6 years with an option to renew, but this does not mean that CIC expects Google to digitize all 10 million volumes in 6 years. According to Sandler, that number constitutes a theoretical upper limit. No one knows how long the work will take, but both parties are open to renewing the contract if needed.

At present, the CIC is just starting its operational discussions. Currently, only five of the 15 GBS library partners allow access to their in-copyright as well as public domain books: the University of California, the University of Michigan, Stanford University, The University of Texas, and the University of Virginia. That number now jumps to 16, because the CIC intends to also allow digitization of in-copyright materials. It will also—again following standard Google Book Search Library Project practices—make public domain materials available for viewing online, searching, and downloading in their entirety from the Google site.

In a unique arrangement, a shared digital repository will collectively archive and manage the full content of public domain works held across the CIC libraries. The repository will be operated at the University of Michigan, which already provides full-text books in its MBooks series. Under the Google Book Search Library project, participating libraries usually receive PDF files for individual pages, along with the OCR (optical character recognition) files needed to enable searching. Sandler indicated that the CIC repository would have whole documents. For those interested in downloading public domain books, the best general source will still be Google Book Search itself, although, according to Sandler, the MBooks service does offer a backdoor route to downloadable books now. Initially, Sandler expects CIC will follow the familiar MBooks model of its host service, but he expects there may be some differences over time. The shared repository, available only to CIC members, will allow faculty and students across the consortium to access material housed in separate locations and currently connected only by online catalogs with interlibrary loan policies and reciprocal borrowing arrangements. It will contain only out-of-copyright, public domain material.

Sandler hailed this shared repository as accomplishing "the long-held dream of bringing our collections together for users. While this was never a compelling vision for faculty in the print world, it's much more realistic in the virtual world." Sandler also mused on the impact for all research libraries: "What does this suggest for the future of these top-tier ARL libraries—for cataloging, for services, for administration even? We could be moving toward much higher levels of aggregation as the only way to meet the ever increasing expectations of users."

As for how Google and its CIC partners will handle in-copyright material, a new development has arisen and has already begun to build into a controversy among library bloggers. Instead of returning the digital copies of in-copyright books to the libraries as in the past, Google will maintain them in a "hosted solution" escrow service until the material's copyright status clears. That could happen if the passage of time moves content into the public domain, or if a library can prove licensed access to a file, or if Google wins its lawsuits with publishers and authors, or if Congress changes the copyright laws, or … whatever. Until then, the content will remain at Google.

Sandler pointed out that all the different contracts Google has with its Book Search library partners have some differences. "The good thing is that they reflect the needs or perceptions of each campus. I give Google some credit for trying to be responsive. No one wants to infringe on rights holders. Though the escrow language is new to the agreements, we saw it as a very good thing to stipulate their managing and holding the content on our behalf. We saw it as a benefit."

This new project represents a major undertaking for the 50-year-old consortium. Barbara McFadden Allen, director of the CIC, stated: "The initiative is not entirely without controversy—no great undertaking ever is. But our universities believe strongly in the power of information to change the world, and in preserving, protecting, and extending access to information. We have carefully weighed and considered the intellectual property issues and believe that our effort is firmly within the guidelines of current copyright law, while providing some flexibility as those laws are tested in the new digital environment in the coming years. Here in the CIC, we don't just talk about collaboration. It's part of the way our universities do business together. And this project is just one more example of the ways our universities work effectively to share expertise, leverage campus resources, and collaborate on innovative programs."

‘Gimme That!’

Google Book Search seems to have introduced the company into a whole new community: the book crowd. And not all the new folks they're meeting seem friendly, even ignoring the ongoing lawsuit by the Association of American Publishers and The Authors Guild. For example, at this year's well-attended BookExpo America meeting, held from May 31 to June 3 at the Javits Convention Center in New York City, the CEO of one of the world's leading publishers took time out of his busy schedule to steal laptop computers right out of the Google booth. Richard Charkin of Macmillan, Ltd. not only did the deed, he bragged of it and recounted "The Heist" in detail, with pictures, in a June 2 posting on his blog (http://charkinblog.macmillan.com). The "reliable contrarian," as his blog describes him, characterized the theft as a protest against Google Book Search's Library Project, which includes digitization of in-copyright material. He said, apparently referring to Google's policy of letting publishers stop books from being digitized only by filing a formal "opt-out" notice with Google:

Our justification for this appalling piece of criminal behaviour? The owner of the computer had not specifically told us not to steal it. If s/he had, we would not have done so. When s/he asked for its return, we did so. It is exactly what Google expects publishers to expect and accept in respect to intellectual property.

‘If you don't tell us we may not digitise something, we shall do so. But we do no evil. So if you tell us to desist we shall.’

Of course, Charkin waited until absent Google staff members returned to the booth and gave them back their computers. If he hadn't, he might have needed the services of a book publishing attorney advertising on his blog. By the way, the ads are supplied by Google.

On the other hand, not all the natives were hostile. One panel at the conference, sponsored by Google, had four publishing executives from the Publisher Partner side of Google Book Search, all singing the praises of the service. Simon & Schuster's Kate Tentler pointed out that usage statistics show that the program has markedly increased "discoverability" for their books and brought users to the Simon & Schuster Web site who browse longer and buy books directly from them. Evan Schnittman of Oxford University Press enthused over the 37 million page views for OUP books in 2 years with 321,000 clicks on the "buy the book" button. Not all those clicks meant sales, but they still mean a lot of found money. Paul Manning from Springer said that they log from 600,000 to 1 million book clicks a month, and 75 percent of the Springer titles in the program have had "buy the book" clicks. Patrick Durando from McGraw-Hill cited more than 60 million views. Schnittman expected that publishers would some day get royalties from Google for their books "under an ASCAP-like model." If that ever came to pass, it could mean that Web users could pay a subscription fee that covered all their book needs.

Imitation: The Sincerest Form of Flattery

When a "Long Tail" starts swishing, it can attract a lot of attention. It can even inspire some other tails to get in motion. Who knows how much influence the availability of Google Book Search downloads of public domain books may have had on the recent re-re-emergence of electronic books? But one can definitely connect Google Book Search to a revival and expansion among library-based or -targeted digitization programs.

Last week, Emory University in Atlanta announced a program of digitizing out-of-print books in its Woodruff Library that would make them available to scholars for online browsing as well as supplying print-on-demand bound copies. The project relies on digital scanning technology supplied by Kirtas Technologies, Inc. (www.kirtastech.com). The Kirtas robotic book scanner can digitize as many as 50 books per day. According to Patrick O'Grady, digital content librarian, policies for handling the book scans have not yet been fully set. Whether Emory will make the scans available on the open Web or through Emory connections only is still unknown.

Initially, Amazon.com will handle the sale of the print-on-demand books through its BookSurge subsidiary (www.booksurge.com). Emory hopes revenue from the sale of digitized copies will help the university recoup its costs but plans to make the POD books very affordable. In total, the university now has more than 200,000 out-of-print volumes. Some of the rarest, along with those focused on the university or on studies of the South, are already being digitized. By fall, items from that pilot project should be available for print-on-demand. Current material could also enter the program.

Rick Luce, vice provost for libraries at Emory, said, "The Emory libraries plan to use the program to support an array of scholarly publishing needs of our campus. We will be providing new opportunities for our faculty and students to disseminate their work, if they choose to do so, under the Emory banner." O'Grady remarked, "What distinguishes our program is that Emory retains ownership of the scans and doesn't give them away to our corporate partners."

If you're interested in tracking what library partners already involved in the Google Book Search project are doing with their digital copies, check out the article by Jill Grogg and Beth Ashmore in the April issue of Searcher, entitled "Google Book Search Libraries and Their Digital Copies: What Now?" and available online at www.infotoday.com/searcher/apr07/Grogg_Ashmore.shtml.


Barbara Quint was senior editor of Online Searcher, co-editor of The Information Advisor’s Guide to Internet Research, and a columnist for Information Today.

