Google Books (http://books.google.com and http://books.google.co.uk) continues its march through the national libraries of Europe with the announcement of a deal with the British Library. Simon Bell, head of Strategic Partnerships and Licensing at the British Library, reports that the United Kingdom is the fifth European nation to open its national library collection to Google Books over the last couple of years. The digitization project between Google and the British Library will focus on 250,000 books and documents published between 1700 and 1870. The project will encompass up to 40 million digitized pages. Clearly out-of-copyright, the content will be available to all through both Google Books and the British Library’s own website.
The United Kingdom’s national library contains one of the largest collections in the world. With a history extending over 250 years, the British Library has more than 150 million separate objects, including books, journals, manuscripts, stamps, music, patents, photographs, newspapers, and sound recordings. When Google began its Google Books (née Google Print) library partnerships, which now include more than 40 libraries worldwide, Oxford University’s Bodleian Library was its first and largest U.K. contributor. Bell listed the recent national libraries Google has added: Italy with two national libraries (Florence and Rome), Austria, the Netherlands, and the Czech Republic.
Although Bell described the scope of the new digitization project as “a real eyeful,” he added that it is only “a drop in the ocean as far as our entire collection.” The British Library website already offers around 4 million digitized items. Recently, the Library announced a partnership with brightsolid to digitize up to 40 million pages of its newspaper collections. In the past, the Library partnered with Microsoft to digitize 65,000 19th-century books, some of which are now available as an app on Apple’s iPad.
Both the Library and Google are committed to broad access. The Library’s 2020 Vision document commits the Library to a policy of opening access to anyone who wants to do research. However, there are still some access issues. Though the content is available for full-text searching, downloading, reading, sharing, and manipulating, the Google/British Library arrangement stipulates that it may be used only for “non-commercial purposes.”
Even on the British Library’s own website, not all the digitized items appear in one central location. Bell described the situation, “We have made a number of deals with others and each is different with different contractual stipulations. The Google deal is great. It’s free and users will be able to look at the content on the British Library web site anywhere in the world. Some of our other contracts only allow viewing the content on the premises. This deal gives all users the right to use it for free. By comparison, other deals put content behind paywalls.”
He added, “The underlying material is clearly in public domain. There’s no question about that, but Google is creating other assets, like metadata and OCR [optical character recognition]. So, we can’t license or do anything with their digital copies commercially or through commercial third parties. It’s still a fairly good payoff for us. We’re not going to get that kind of money from the government. As to noncommercial uses, the time will come when the contract has run its course. When the contract is done, then we are open to new uses.”
No copy of the contract was available to the public, and Bell wouldn’t say how or when the contract might run its course. He did say, however, that “if another organization wants to digitize even the same content, they are free to do so. But, they can’t take Google’s digitized content. That’s Google’s digital asset and we can’t give it to another entity.”
However, Bell indicated that they can share by “exposing” content to other collections. For example, they plan to make the works available through the Europeana Digital Library. Bell also indicated that the Library is in regular communication with the HathiTrust, the non-profit organization based around merged contributions of some Google Books library partners. He described joining the HathiTrust as “a possibility, but not at the moment. We are, however, in close contact with their libraries, particularly the ones that were part of the Microsoft book digitization, like the hundreds of thousands of titles digitized for the Library of Congress.”
As usual, Google will assume all the digitization costs for the project. It will supply copies of digitized content back to the British Library, which will preserve archived copies in its Digital Library System. In the course of the project, which Bell guessed might take 3 years, the Library will select the books, pamphlets, and journals to be digitized. It will focus on material not currently available digitally. Bell described the process, “We will draw up a list of content not yet digitized, doing a rough search, then Google will do a better search. It only makes good sense. They already have some 15 million books and a substantial number of them in English.”
The first works to be digitized will range from feminist pamphlets about Queen Marie-Antoinette (1791) to an account of the invention of the first combustion engine-driven submarine (1858) and an account of a stuffed hippopotamus owned by the Prince of Orange (1775).
The British Library already has a number of digitized book collections available. According to Bell, a thousand titles in its Historical Books Collection are free through an iPad app, and further plans are under way. “In August of this year, we will launch a new platform with a bunch of other stuff, free for higher education within the U.K., plus an iPad project. And we have content available in print-on-demand with Amazon. Our content is largely free though.”