With all of the mass book digitization projects in the news, people may have begun to wonder why the largest library in the U.S., the Library of Congress (LC), has been silent. Wonder no longer. The emerging "megalibrary" is getting a virtual "rare books" room. LC (http://www.loc.gov) has launched an initiative to build a World Digital Library (WDL). Its first partner is Google, which has made a $3 million contribution. In contrast to the Google Book Search and Open Content Alliance strategies of digitizing all books, the WDL project will focus on digitizing "rare and unique cultural materials held in U.S. and Western repositories with those of other great cultures." Along with books, the WDL will cover documents, video, audio, manuscripts, etc. Funding and content will come from public and private partnerships, with both U.S. and international participation. The WDL expands on LC's Global Gateway, begun in 2000, by broadening the geographic scope to non-Western nations and cultures and focusing on the cultures and histories of those nations. The Global Gateway, by contrast, focuses on materials reflecting the historical intersections between contributing countries and the U.S. Google's contribution will help fund the initial planning stages for the project.
The World Digital Library initiative stems from a speech given by the Librarian of Congress, James H. Billington, to the U.S. National Commission for UNESCO in June (http://www.loc.gov/about/welcome/speeches). The Library of Congress has long pioneered digitization, in part as a function of its preservation efforts. It has already built major digital libraries involving extensive digitization, especially of fragile material (e.g., the American Memory project, begun in 1990, which became a Web site program in 1994 as part of the National Digital Library Program). The American Memory Web site (http://memory.loc.gov/ammem) now includes more than 10 million rare and unique materials supplied by LC and its partners. The U.S. Congress has also mandated that LC take a leadership role in forming the National Digital Information Infrastructure and Preservation Program (NDIIPP), a network of institutional partners working to build a digital preservation architecture for collecting, preserving, and making accessible material available only in digital form (http://www.digitalpreservation.gov).
In announcing Google's status as the first partner in this public-private initiative, Google co-founder and president of technology Sergey Brin stated: "Google supports the World Digital Library because we share a common mission of making the world's information universally accessible and useful. To create a global digital library is a historic opportunity." Google has already worked with LC on pilot digitization projects; one recently completed project involved approximately 5,000 fragile public domain books. The company continues to digitize historical material in LC's Law Library. None of the arrangements involves payments to Google. LC representatives also indicated that the Library is considering working with the Open Content Alliance as well as Google.
LC representatives said that no specifics had yet been set for the formats in which the data would appear or for its accessibility through full-text searching or translations. Finding known items using bibliographic metadata was the only promise they could make to future users at this early stage of the project. In developing the initial plan for the WDL project, LC will focus on identifying key technology issues for digitization and organization, including presentation, maintenance, standards, metadata schemas, and resources such as equipment, staffing, and funding. The effort will focus on both access and preservation. Once the plan is developed, LC will make it available to libraries, content owners, and their supporters.
As for LC's experience with international coverage, more than half of its existing book collection consists of non-English-language books. The Global Gateway Web site (http://international.loc.gov/intldl/find/digital_collaborations.html) contains multilingual content and multimedia presentations that include contributions from repositories in Russia, Spain, Brazil, the Netherlands, and France. LC representatives expect WDL to build on the Global Gateway partnerships and hope that someday all national libraries will participate.
As for handling non-Roman alphabets, Kevin Novak, director of Web services and educational outreach at LC, said that the Library of Congress Integrated Library System had become Unicode-compliant for eight popular scripts just this month. Unicode is the universal coded character set standard. Unicode compliance would make searches possible via non-QWERTY, non-Roman, or augmented Roman alphabet keyboards. However, Novak also indicated that LC would probably work out multiple access routes to accommodate different languages. For example, in the case of handwritten or poorly typewritten documents in the American Memory digital collection, LC provides clear digital text supplements.
Library of Congress inclusion policies for its digital collections such as American Memory require all items to be free of copyright, either because they fall into the public domain or because they have become available through special permission. Novak said that, in the case of the American Memory collection, many outlets have not only linked to its content but have taken it and, in some cases, produced CD-ROMs for sale. It has yet to be decided how policies will change when LC deals with the world's national libraries, as it hopes to do with the WDL project.
However, Billington's goals aim at the broadest possible access. "The World Digital Library would make these collections available free of charge to anyone accessing the Internet, and it could well have the salutary effect of bringing people together by celebrating the depth and uniqueness of different cultures in a single global undertaking," said Billington. "We are grateful for Google's contribution to this important initiative, and we will seek contributions from other private sector companies with an equally enlightened self-interest."
Regardless of any risks the emerging "megalibrary" might pose to the future of support for traditional brick-and-mortar libraries, Novak said LC is "striving to get the information out there for its educational and historical value. The end goal is to demonstrate to people the importance of libraries, of the Library of Congress, and of the value of our collection."