CORRECTIONS: Google Print Not All I Said It Was
by Barbara Quint
Posted On August 29, 2005
Google does not supply participants in the Google Print program with digital, "e-book" versions of the print books digitized from publisher inventories or library collections. Publishers receive no digital copies of any kind. The five giant research libraries working with Google Print for Libraries do receive digital page images and text files produced by OCR (optical character recognition) of the text. However, these are nothing like e-books (e.g., downloadable PDF files).

At this point, I have written a half dozen NewsBreaks on the Google Print project, not counting this one. My coverage started back in December 2003 and extended to a NewsBreak published Aug. 15 of this year.

"Google Beta Tests Book Search Service,"
http://newsbreaks.infotoday.com/nbreader.asp?ArticleID=16555

"Google Print Expands Access to Books with Digitization Offer to All Publishers,"
http://newsbreaks.infotoday.com/nbreader.asp?ArticleID=16357

"Google and Research Libraries Launch Massive Digitization Project,"
http://newsbreaks.infotoday.com/nbreader.asp?ArticleID=16307

"Google's Library Project: Questions, Questions, Questions,"
http://newsbreaks.infotoday.com/nbreader.asp?ArticleID=16302

"Google Library Project Hit by Copyright Challenge from University Presses,"
http://newsbreaks.infotoday.com/nbreader.asp?ArticleID=16195

"Google Slows Library Project to Accommodate Publishers,"
http://newsbreaks.infotoday.com/nbreader.asp?ArticleID=16141

In the first NewsBreak, I wrote, "Clearly Google will retain ownership of the scanned version of the texts. A representative indicated that the company has no plans at present to share the digitized copies with publishers or any Google partners or affiliates." But somewhere along the line, I came to believe that Google did supply electronic copies to publishers, e-books that the publishers could sell. In the last NewsBreak, covering Google's offer to connect Google Print publisher accounts to books digitized through library connections, I described how the offer would let publishers "acquire saleable digital copies of their backlists via library-held copies."

Not so, as Google representatives were kind enough to inform me. I herewith apologize to all my NewsBreak readers for this serious mistake. Readers of Searcher and Information Today will soon see my apologies for columns that discussed the impact of the Google Print project on libraries and the information industry.

Alarmed by the shortfall in my journalistic accuracy, I decided to double-check all my coverage for any other discrepancies that needed clarifying or coverage that needed updating. There were a few points:

"Now, whenever a book contains content that matches search terms, Google will display a special box linking to book results."

Google does not use a box. A "Book Results" listing may appear at the top of a general search results page letting users click through to a book page.

"While Google does not publish a complete list of publishers participating in the program. ..."

Google representatives informed me that "the vast majority of major publishers in the U.S. and U.K. are currently participating" in the Google Print for Publishers program. Publishers in the program come in all sizes, even some with a single print. The program is open to all copyright holders, including authors.

"Ads only appear on the initial Web search page from which users can click through to book results; a royalty percentage of money from those ads goes to Google Print publishers."

According to Adam M. Smith, manager of the Google Print program, publisher ad revenue depends upon the publisher's decision to allow ads on their content. As of now, Google has decided not to attach ads to the public domain books, to which Google offers full-text access. Nor do the books entering the system from the research library connections carry any ads. The ad revenue comes from book result displays with ads appended, not the initial Web search page.

"Google will temporarily stop digitizing in-copyright books from its library partners and will concentrate, instead, on accelerating its public domain book digitization. ... The moratorium will last until November."

The moratorium ends at the beginning of November, but Google will continue to honor requests from publishers for a while, probably until early next year.

"... one librarian working with Google who said that the library received its electronic digitized copies regularly every month ..."

The University of Michigan continues to supply detailed information on its participation in the Google Print for Libraries project. It released a copy of its Google Print contract as a public document (http://www.lib.umich.edu/mdp/um-google-cooperative-agreement.pdf).

Its latest FAQ, "UM Library/Google Digitization Partnership FAQ, August 2005" (http://www.lib.umich.edu/staff/google/public/faq.pdf), describes the format of the material Google returns to the library:

  • Most pages (i.e., those that consist of print without illustrations) are delivered to Michigan as 600 dpi TIFF images using ITU G4 compression.
  • Occasionally, pages include significant illustrations; these are provided to Michigan as 300 dpi JPEG2000 images.
  • OCR (performed by Google) is provided with each page.

Well, now you have the facts. And, if you'll forgive a final, self-centered, rhetorical flourish, the truth-telling experience has put me in mind of two quotations. One, "There may be some things more painful than the truth, but I can't think of any." Two, "Truth can be costly, but in the end it never falls short of value for the price paid." It's been a learning experience and I hope I have learned from it. Accuracy, Accuracy, Accuracy.


Barbara Quint is senior editor of Online Searcher, co-editor of The Information Advisor’s Guide to Internet Research, and a columnist for Information Today.
