NewspaperARCHIVE.com provides a massive searchable database of local newspapers (more than 5,000 titles) with content dating from 1607. It runs more than 130 million pages, with new ones added at the rate of one per second. A consumer version of the complete file has the usual attractive prices of a Net product, ranging from free but inconvenient to full-featured, unlimited-usage versions with prices starting at $19.95 a month. Library sales have grown over the years; some 250 libraries now contract for the database. Now the company has signed an exclusive worldwide distribution agreement with ProQuest. ProQuest will not add any of its own newspaper coverage, but it does plan to re-package the NewspaperARCHIVE.com content and re-price it. While I was gathering information on the new arrangement, ProQuest changed product plans dramatically in just a few days. However, one thing was clear: Prices for libraries will really change.
For example, the lowest price available, one covering small public libraries, was $1,100 for the state of Illinois alone. A check with a local public library having an existing contract with NewspaperARCHIVE.com revealed that its annual subscription payment for the entire database was $900. The librarian who told me the price added a comment, “And we won’t pay a cent more.”
NewspaperARCHIVE.com grew out of Heritage Microfilms in the late 1990s and changed its company name. It continues to provide libraries with microfilm products and continues to expand its microfilming efforts. The online file is built by digitizing the microfilmed pages and then running OCR on the text to produce a full-text index. The database has serious quality problems. A university news archivist described the file as having “tons of errors.” However, he was still a fan.
Look at the layers of problems involved. The source is cheap paper stock, sometimes more than 100 years old, from local newspaper presses that often used worn-out type fonts, and all of it was microfilmed before being digitized. That makes successful optical character recognition a daunting task. However, since the file is built on images, it offers browse capabilities that let smart humans work with the content. For example, if the indexing won’t give you every entry for a subject, it might get you to the right dates and then you can page through the newspaper. In any case, the database is unique in its breadth and depth for any kind of historical research and especially for genealogical research. Or is it?
When we interviewed Chris Cowan, vice president of product management at ProQuest Information Solutions, he made it clear that ProQuest planned to carve NewspaperARCHIVE.com’s content into subset packages. The details of the packaging plans changed dramatically in just 3 days, but the resolve not to license the data as a whole remained firm.
The unavailability of the entire file to the library market as a searchable whole could affect the usability of the content and thus purchase decisions. In its own consumer version, NewspaperARCHIVE.com strongly emphasizes the value of the file for genealogical research. This is an area in which ProQuest itself has provided products, usually through affiliate deals similar to this one with NewspaperARCHIVE.com, e.g., Ancestry Library Edition and Heritage Quest. A recent study of a network of public libraries showed that genealogical databases were the only area showing usage growth. Confining the content in NewspaperARCHIVE.com to specific geographic areas, as in ProQuest’s current plans, would seriously undermine the value of the content for genealogical research.
What happened in the U.S. in the 19th century? Lots of things of course, but the most permanent change involved population movement. In 1800, the U.S. was a small Eastern seaboard country; in 1899, it was a continental nation reaching as far west as Hawaii and Alaska. And we weren’t exactly stay-at-homes in the 20th century. Twentieth-century demographic trends, in particular urbanization and frostbelt-to-sunbelt migration, require data unbound by any state-only limits. The massive population redistributions require a national-level database for successful genealogical research.
The handling of regional subsets illustrates the volatility of ProQuest’s plans. The first day I spoke with ProQuest officials, they planned to offer 11 subsets for individual titles (out of more than 5,000), 19 for states (out of 50 plus D.C.), and two for countries (Canada and the U.K.). No regional groupings were planned at that point.
The next day, four regionals were authorized, and the following day it emerged that the regionals would encompass all the “leftover” states not among the 19 state subsets. So that gives us the entire NewspaperARCHIVE.com content, but what library could afford to buy all the subsets? And what library patron or even librarian would endure having to search the subsets one after another instead of running one search on the full file? Whatever ProQuest plans, it has to rely on existing NewspaperARCHIVE.com platform features and technology, which, by the way, do offer user-initiated subset searching as well as full-file searching.
So what are the current offerings? By the way, there is one important new feature in the ProQuest library editions of NewspaperARCHIVE.com—perpetual archive licenses. As described by Cowan, “Perpetual Archive License. Library makes a one-time purchase of the product and ‘owns’ the data. Funding usually comes from the collection development budget. Typically, there is a smaller annual access fee for access. If desired by the library, the data could be delivered to the library for local hosting. The PAL will be more expensive than an annual subscription. A subscription is like leasing the product.”
What prices? Here is an example for the state of Illinois:
[Pricing table for Illinois; columns include Type of Library, Pop Served/FTE, and PAL Continuing Service Fee ($). The figures were not preserved.]
Cowan did indicate some flexibility in pricing. “ProQuest sales representatives would work with librarians to determine the library’s customized price. There will be generous discounts for subscribing to or purchasing multiple products.”
Here are the products offered now. Note that the regionals exclude any coverage of the single-state subsets. So if you buy the “West States,” they come without California and Utah; “Northeast States” don’t include New York or Massachusetts; “Central States” lack Illinois and Kansas, among others. You get the idea, but will your patrons understand how to use a regional file?
NewspaperARCHIVE Library Edition
Years of Coverage Chart
Title / Collection: Years of Coverage / PAL Ownership Years
Abilene Reporter News 1784–1977 / 1784–1977
Daily Herald (Chicago) 1901–2007 / 1901–2003
Cedar Rapids Gazette 1948–2012 / 1948–2003
Galveston Daily News 1865–2012 / 1865–2003
Kingston Gleaner 1834–2013 / 1834–2003
Oakland Tribune 1874–1977 / 1874–1977
San Antonio Light 1883–1977 / 1883–1977
Santa Ana Orange County Register 1869–2012 / 1869–2003
Syracuse Post Standard 1875–2012 / 1875–2003
Winnipeg Free Press 1874–2013 / 1874–2003
Wisconsin State Journal 1852–2012 / 1852–2003
Arizona 1860–2012 / 1860–2003
California 1846–2013 / 1846–2003
Illinois 1830–2012 / 1830–2003
Indiana 1753–2012 / 1753–2003
Iowa 1800–2013 / 1800–2003
Kansas 1868–2012 / 1868–2003
Maryland 1799–2012 / 1799–2003
Massachusetts 1784–2011 / 1784–2003
Michigan 1753–2012 / 1753–2003
New Mexico 1847–2012 / 1847–2003
New York 1753–2012 / 1753–2003
North Carolina 1799–2012 / 1799–2003
Ohio 1753–2012 / 1753–2003
Pennsylvania 1769–2013 / 1769–2003
Texas 1784–2011 / 1784–2003
Utah 1871–1977 / 1871–1977
Virginia 1822–2012 / 1822–2003
West Virginia 1896–2012 / 1896–2003
Wisconsin 1813–2012 / 1813–2003
Canada 1872–2011 / 1872–2003
UK 1607–2013 / 1607–2003
Regional Collections with page counts
Central States 3,839,719
North Dakota 129,127
South Dakota 227,183
Northeast States 2,140,652
District of Columbia 162,581
New Hampshire 464,458
New Jersey 138,270
Rhode Island 308,137
Southeast States 5,525,848
South Carolina 739,129
West States 5,049,106
There will be some overlap between ProQuest’s own newspaper archives and the NewspaperARCHIVE.com content, but NewspaperARCHIVE.com’s version will prevail. ProQuest Historical Newspapers, whose 36-paper coverage reaches back to issue one of leading national newspapers as well as black and Jewish newspapers, will not supply an improved version of the Washington Post, either in images or indexing. ProQuest Newsstand covers close to 1,400 titles, but with no content earlier than 1977. Whether ProQuest will, in time, add content now unique to NewspaperARCHIVE.com is unknown.
Speaking of NewspaperARCHIVE.com and its content, you may be asking about the consumer version. This version will not be available to libraries under the new arrangement. People calling the company about library access are immediately forwarded to ProQuest staff. The prices on the consumer version are:
Introductory/quarterly (100 views, with print and save): $19.95
Unlimited monthly: $29.95
Unlimited semiannual: $99.95
Free service/membership: full file, no print, no save, one view a day
As to existing contracts between NewspaperARCHIVE.com and libraries, Cowan responded, “Library institutional subscriptions will go through ProQuest, with public libraries having the option to keep their subscriptions as is for a year. After that, public libraries will select among the different NewspaperARCHIVE library edition products.” Cowan made no comment on how this applies to academic libraries.
Squeak, Wheel, Squeak
This is a rough one. I can only urge librarians, both those interested in this useful database and those interested only in the vendor relations aspects, to make their concerns known to the two vendors involved, to other vendors, and to their colleagues. Comments sent to this site will be greatly appreciated.
On a personal note, I’ve gotten something good out of this—an editorial for the May/June issue of Online Searcher. Right now, my working title is “What Were You Thinking?!!” The Searcher’s Voice editorials are all available full-text on www.infotoday.com/onlinesearcher. This one should be a doozy. I may have an idea on how librarians might be able to tap into consumer data source pricing without actually violating terms and conditions. Stay tuned.