"‘Bombshell’ is an apt term," said Stephen Crocco, head librarian at the Princeton Theological Seminary, in describing reactions to the news that awaited him on Friday, May 23, after an urgent summons back to the office interrupted his lunch. A blog announcement followed by an email message informed Crocco and his staff that the Live Search Books project scheduled to go into full production June 1 would no longer be supported by Microsoft. As of May 30, Microsoft (www.microsoft.com, www.live.com) ended the Live Search Books and Live Search Academic projects initiated in 2006, in obvious competition with Google Book Search (originally Google Print) launched in 2003 and Google Scholar begun in 2004. Both library and publisher partners of Microsoft began the scramble for alternatives. "We’ve got to scurry," said Brewster Kahle, head of the Internet Archive (IA; www.archive.org) and leader of the Open Content Alliance (OCA).
The content from both services (described in the Microsoft announcement as some 750,000 full-text books and an index of 80 million scholarly journal articles and reports) will remain in Microsoft’s Live Search engine, but the material will no longer have special segregated interfaces and handling. The large-scale digitization and scanning programs Microsoft had supported with leading libraries in the U.S. and abroad will "wind down," along with the digitization and ebook services it offered publishers for in-copyright books.
The reason for the cancellation was money or the lack of it. Microsoft could not foresee a sustainable business model for the service, though neither Live Search Books nor Live Search Academic had carried any ads beyond links to online booksellers. The announcement from Satya Nadella, senior vice president for search, portal, and advertising at Microsoft, stated the following:
Based on our experience, we foresee that the best way for a search engine to make book content available will be by crawling content repositories created by book publishers and libraries. With our investments, the technology to create these repositories is now available at lower costs for those with the commercial interest or public mandate to digitize book content.
Under the direct Microsoft arrangement with libraries, access to scanned books was available through Live Search. In ending the program, Microsoft announced it was "removing our contractual restrictions placed on the digitized library content and making the scanning equipment available to our digitization partners and libraries to continue digitization programs." The Internet Archive had served as Microsoft’s agent for the large-scale digitization and will take over ownership of the scanning equipment in place at former Microsoft library partners. The IA also has its own Scribe scanners in place at Microsoft libraries and OCA member libraries.
Although immediate reactions to the announcement seemed skeptical that library budgets could absorb digitization on the scale Microsoft had promised, Kahle saluted the $10 million effort by Microsoft over the last several years. Reacting to the Microsoft announcement, Kahle posted a notice on the IA forum, speaking also for the OCA, that stated, in part, the following:
The Internet Archive operates 13 scanning centers in great libraries, digitizing 1000 books a day … Today, Microsoft has announced that it will ramp down their investment in this area. We very much appreciate their efforts and funding in book scanning over the last 3 years. As a result, over 300,000 books are publicly available on the archive.org site that would not otherwise be. …
Funding for the time being is secure, but going forward we will need to replace the Microsoft funding. Microsoft has always encouraged the Open Content Alliance to work in parallel in case this day arrived. Let’s work together, quickly, to build on the existing momentum. All ideas welcome.
Kahle was both surprised and unsurprised by the announcement. "I always knew this would happen. It’s what corporations do. I just didn’t know when. I didn’t think it would be this year. I hoped for another year or so, but I’m thrilled that they worked so long and hard and brought us to another level."
Kahle feels that the announcement was a "wake up call. The idea of a couple of corporations owning the history of intellectual discourse is a bad idea. That should be the job of libraries and publishers, not one corporation." To Kahle, navigation and hosting of content should be distinct functions in order to guarantee the widest distribution of content. One of the reasons Microsoft pulled out of the OCA a year after it started was to guarantee that Live Search would host the masses of books they committed to digitizing. The OCA has a firm policy of opening all its content to all search engines, even Google. Kahle’s primary quarrel with Google Book Search lies in its confining access to Google searchers.
Live Search Academic
All the furor that erupted after the Microsoft announcement seemed to focus on Live Search Books, but Live Search Academic reportedly drew much more usage. The internet rating service comScore had begun to track Live Search Academic in its analyses of Live Search usage. Though comScore’s procedure of rounding to the nearest million zeroed out the Live Search Academic statistics, a comScore representative estimated that the service was probably running a couple hundred thousand uses a month.
The Live Search Academic service took content from 52 publishers, including ACM Press; American Institute of Physics; American Physical Society; Blackwell Publishing; Elsevier, Inc.; IEEE Press; Institute of Physics Publishing; John Wiley & Sons; Nature Publishing Group; and Taylor & Francis Group. It focused on computer science, physics, and electrical engineering. Microsoft also partnered with CrossRef (www.crossref.org) to supply metadata and DOIs, which, among other things, helped link resolvers connect citations to library holdings more quickly.
However, Ed Pentz, executive director of CrossRef, says publishers had told him that Google Scholar produced far more referrals. In fact, Pentz added, the main Google service supplied even more referrals than Google Scholar.
When I asked Anurag Acharya, the engineer behind Google Scholar, whether he wanted any content that appeared on Live Search Academic, he said, "I’m not sure, but as far as I know, we have everything they had." When I asked him if there were any features to Live Search Academic he would like to have, he answered, "No. If I liked it, I would have had it." Such is the language of digital Armani-wearers.
For publishers, the range of alternatives seems broad and growing. Tom Turvey, director of Google Book Search partnerships, reports that Google has more than 20,000 publisher partners supplying well over a million books to the Google Book Search collection. Turvey said that Google’s partners, to whom it is fully committed, are "more engaged than they have ever been. We don’t set a restrictive or exclusive policy. It makes sense for publishers to have as many avenues to their market as possible. People look in multiple places, so it’s good for publishers to support multiple channels."
Libraries that seek low-cost mass digitization clearly face a harder challenge. Google remains in a growth mode, but that can hardly last forever. The OCA, with scanning operations already working at 70 libraries and the participation of several large commercial concerns even after Microsoft’s withdrawal, would like to take on as much as it can, but it does charge 10 cents a page.
One reality remains: According to analyses of the search market, 60% of all searches occur on Google. Yahoo! has about 20% and Microsoft’s Live Search has about 9%. If you want your data seen, you have to get it to Google. Microsoft apparently tried to use book and scholarly journal content to draw customers to Live Search; when that didn’t work, no eyeballs meant no chance of advertising or sponsorship revenue. Google doesn’t attach ads to Google Book Search or Google Scholar, as such, except for links to online booksellers. However, it could start doing so at any time.
The issues of funding library digitization and guaranteeing the broadest access remain in flux.