Reveal Digital Looks to Digitize Special Collections
Posted On February 7, 2013
A new company, Reveal Digital, offers libraries a digitization service under a new cost-recovery revenue model aimed at permanent archiving and open access. Special collections, curated by expert librarians, exist all over the country in libraries large and small, public and academic. But too often, these collections have not gone online because of a lack of resources or funding. Now Jeff Moyer, an experienced manager of major vendor digitization projects, has formed Reveal Digital to create a network of library partners and specialty digitizing services.
The first collection Reveal Digital is aggregating and digitizing is Independent Voices, a multisource compilation of alternative press and advocacy publications. This project is expected to take several years. Under the new cost-recovery model, libraries interested in subscribing to special collection packages indicate what they would pay toward each project. Once costs are recovered, the collections go to archiving and open access. Reveal Digital uses outside services to provide key functions.
Moyer stated his goals for the new company: “We believe this platform will enable new and unimagined collaborations among libraries and content holders as they work together to create collections of all kinds—from highly specialized to broad and inclusive.”
Probably the most intriguing aspect of Reveal Digital’s plan is its “cost recovery=open access” business model. Here is how it works: After identifying candidate special collections, the company and its team of contributing vendors assess all costs associated with producing a digital collection. That total becomes the project’s sales threshold. They then market the prospective product to libraries, which make nonbinding commitments to purchase at an indicated price. The greater the number of prospective purchasers, the lower the price and/or the larger the coverage. If there is insufficient interest to meet the sales/cost threshold, Reveal Digital may go ahead anyway, scale back coverage, or cancel the project. Two years after the sales/cost recovery threshold is reached, the collection moves to open access.
Costs cover conversion (sourcing, scanning, metadata creation/tagging, and loading), editorial (title selection, copyright clearance), royalties (when required), sales and marketing, systems (development, hosting, archiving), overhead (legal, finance, and general business expenses), and publishing and management (overall project guidance). During an open enrollment period (usually around 6 months), libraries will be asked to enroll in a nonbinding agreement to help forecast sales and, from that forecast, set pricing for each collection. The purchase price is based on the number of libraries expected to purchase the collection in its first 4 years. Prices are tiered by library type and population served.
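The pricing arithmetic can be made concrete with a minimal sketch. The article says only that the price depends on total project costs, the expected number of purchasing libraries, and tiers by library type and population served. The function name, tier names, tier weights, cost total, and buyer counts below are all hypothetical illustrations, not Reveal Digital's figures or method.

```python
def tiered_prices(total_cost, expected_buyers, tier_weights):
    """Return a price per tier such that expected sales recover total_cost.

    total_cost      -- the sales threshold (sum of all project costs), in dollars
    expected_buyers -- dict: tier name -> expected number of purchasing libraries
    tier_weights    -- dict: tier name -> relative price weight (hypothetical)
    """
    # Total number of "weighted units" the expected buyers represent.
    weighted_units = sum(expected_buyers[t] * tier_weights[t] for t in expected_buyers)
    if weighted_units <= 0:
        raise ValueError("need at least one expected buyer with a positive weight")
    # Price of one weighted unit; each tier pays this times its weight.
    base = total_cost / weighted_units
    return {t: round(base * tier_weights[t], 2) for t in expected_buyers}


# Hypothetical inputs: a $1,000,000 threshold and two tiers weighted 6:25.
prices = tiered_prices(
    total_cost=1_000_000,
    expected_buyers={"small_college": 100, "large_university": 176},
    tier_weights={"small_college": 6, "large_university": 25},
)
print(prices)
```

With a fixed cost threshold, doubling the expected number of buyers in every tier halves every price in this sketch, which is the "more purchasers, lower price" effect the model describes.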
Several outfits are working with Reveal Digital to operate the service. LYRASIS, a 1,700-member library consortium, will act as the primary sales channel for collections and handle invoicing. LYRASIS membership is not required for libraries to purchase collections. PubFactory, a property of Safari Books Online, will host the publishing platform. Image Data Conversion will do scanning, although libraries with in-house capabilities may prefer to do it themselves. As for permanent archiving for open access, Reveal Digital is still in discussion with such services as Internet Archive and the HathiTrust.
Documents digitized by Reveal Digital are delivered as page images rather than as direct text. Pages are scanned at 300 dpi in 24-bit color and saved as uncompressed TIFF and compressed JPEG files, with 72-dpi thumbnails for web access. Title-, issue-, and page-level metadata is created in METS/ALTO format. As for searching, Moyer affirmed that OCR treatment made all content full-text searchable, but retrieval would pull up issues, not articles. He explained that this accommodated the republication rights approved in the Greenberg v. National Geographic case. Reveal Digital’s retrieval will offer page image-based delivery with searchable text and metadata, hit-term highlighting, browsing by series and title, and both basic and advanced search.
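As a rough illustration of how page-image delivery and hit-term highlighting can work together, ALTO files record each OCR'd word along with its position on the page, so a search hit can be drawn as a box over the scanned image. The sketch below is an assumption about the general technique, not a description of Reveal Digital's or PubFactory's implementation. The sample XML is made up and simplified (real ALTO files declare a namespace and may use non-integer coordinates), and `find_hits` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Made-up, simplified ALTO fragment (namespace omitted for brevity).
ALTO_SAMPLE = """
<alto>
  <Layout>
    <Page ID="p1" WIDTH="2550" HEIGHT="3300">
      <PrintSpace>
        <TextBlock ID="b1">
          <TextLine>
            <String CONTENT="Underground" HPOS="300" VPOS="400" WIDTH="520" HEIGHT="60"/>
            <String CONTENT="press" HPOS="840" VPOS="400" WIDTH="230" HEIGHT="60"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def find_hits(alto_xml, term):
    """Return (page_id, x, y, width, height) boxes for words matching term."""
    root = ET.fromstring(alto_xml)
    hits = []
    for page in root.iter("Page"):
        for word in page.iter("String"):
            # Case-insensitive whole-word match on the OCR text.
            if word.get("CONTENT", "").lower() == term.lower():
                hits.append((
                    page.get("ID"),
                    int(word.get("HPOS")), int(word.get("VPOS")),
                    int(word.get("WIDTH")), int(word.get("HEIGHT")),
                ))
    return hits


hits = find_hits(ALTO_SAMPLE, "press")
print(hits)
```

A viewer would scale these boxes from scan resolution down to the resolution of the displayed image before drawing the highlight.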
Past collaborations between libraries and vendors have accomplished some massive digitizations. Moyer led much of ProQuest’s Early English Books Online (EEBO) Text Creation Project. Ironically, one of Reveal Digital’s Steering Group members, Mark Sandler, director of the Center for Library Initiatives, Committee on Institutional Cooperation (CIC), seemed to support Reveal Digital in part as a form of opposition to such efforts as EEBO. He disliked the idea of libraries collaborating with vendors simply to create licensed products that only some libraries could afford, basically “re-copyrighting” material that had long been in the public domain. The commitment to pull content into open access was Reveal Digital’s main appeal to Sandler. One should also remember that the most massive collaborative library collection digitization project, Google Books, has made public domain content generally downloadable through both Google and HathiTrust.
Independent Voices: The First Project
Independent Voices aggregates digital collections of alternative press materials, including full runs of newspapers, magazines, and journals drawn from widely scattered academic collections. The project will be developed over the next 4 years and require digitization of more than 1 million pages of content. The collections (which will be cross-searchable and include key metadata) include:
- Women’s Alternative Press
- GI Underground Press from the Vietnam War era
- Campus Underground Newspapers
- Minority Press (Black, Hispanic, Native American, Asian)
- Extreme Right Wing Press
- Anarchist Periodicals
- GLBT Press (gay, lesbian, bisexual, transgender)
- Literary (Little) Magazines
Full descriptions of the collections can be found on the company’s website. Content for this project (and all future collections) is curated by a group of scholars and librarians who provide editorial guidance.
A beta version of the collection has launched. A full release of selected content will take place in March 2013. An open enrollment period is under way now through June 30, 2013, wherein libraries can sign up to purchase this product (the agreement is nonbinding but accepted in good faith).
Moyer stated that the price was currently running from about $1,200 for small colleges up to $5,000 for large university libraries or public library systems. If more than the target number of libraries commit during the enrollment period, the price will be reduced proportionately. Libraries that purchase the collection gain immediate access to the information, have an option to load the collection locally, receive title-level MARC records and COUNTER-compliant usage reports, and gain priority in providing new content. Purchasing libraries will also have influence over future collection digitization projects. Moyer hoped that these privileges and a sense of the greater good would suffice to prevent libraries from just waiting for open access.
Contributing libraries also receive benefits: free copies of digital files (images and metadata) for all material sourced from the library, with no restrictions on use; for in-copyright material, permission from the rightsholders to display the material on the library’s website; free access to the collection if they contribute more than 20% of the content; and reimbursement from Reveal Digital for all costs of preparing the material for shipment, shipping, and re-shelving. Contributing libraries also receive full credit throughout Reveal Digital products, according to Moyer.
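One plausible reading of a "proportionate" reduction is that the price scales by the ratio of target to committed libraries, so that total revenue stays at the cost-recovery threshold. That reading is an assumption, as are the function name and the target and commitment counts below. Only the $1,200 and $5,000 price points come from the article.

```python
def adjusted_price(list_price, target_libraries, committed_libraries):
    """Scale a price down when commitments exceed the target (assumed rule)."""
    if target_libraries <= 0:
        raise ValueError("target_libraries must be positive")
    if committed_libraries <= target_libraries:
        # At or below target: no reduction in this sketch.
        return float(list_price)
    # Above target: the price shrinks in proportion to the overshoot.
    return list_price * target_libraries / committed_libraries


# Hypothetical: a target of 200 libraries, with 250 actually committing.
large = adjusted_price(5000, 200, 250)
small = adjusted_price(1200, 200, 250)
print(large, small)
```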
Most of the content Reveal Digital expects to deal with—and is dealing with in the case of Independent Voices—is not public domain. Clearing copyright permissions is one of Reveal Digital’s main contributions to the process. However, according to Moyer, they have been fairly lucky with their first project: “In the case of the alternative press, the people involved were activists when they did it in the Sixties and Seventies. When we approached them, no publisher demanded royalties. They were just pleased the material would be more widely accessible and gave us permission without royalties. If we did have royalty claims, we would still focus on open access content. The promise of open access in the future may limit our content somewhat, but we’re going to keep trying to work it out.”
So what comes next after Independent Voices? It’s too early to say, according to Moyer, but he is “hopeful the ideas will come from librarians. We want them to see us as enabling rather than the one defining collections. We have created the framework. Now we look for dozens of project proposals by librarians.”