In the Olympian struggle among purveyors of online news sources, there are two measurements that people use to score performance: the number of sources included in a database and the completeness of the file. Dialog Corp. has just upped the ante in the number-of-sources competition by announcing it will put 6,500 news and business titles into one database that will be titled NewsRoom. The database, which was scheduled to launch March 1, will be exactly the same on Dialog, DataStar, and Profound.
Using Profound's NewsLine (with its 5,000 titles) as a benchmark, Dialog closely examined the sources to which it had access. Paul Colucci, Dialog's senior vice president for product development, said: "We wanted to leverage the historical strengths of Dialog, DataStar, and Profound to create one collection of information. We're creating a collection that's more adaptable for intranet access for intermediate and novice users, but that will also be valuable to experienced information professionals." News and business, he added, were the areas most applicable to all customers.
More Sources from More Publishers
Now that Dialog is a Thomson company, its access to sources is greatly expanded. Gale Group, for example, is a natural to contribute to NewsRoom. More surprising is the addition of sources from non-Thomson publishing companies, such as McGraw-Hill. Even more amazing is ProQuest's willingness to contribute its ABI/INFORM Global, Accounting, Periodical Abstracts PlusText, and Newspaper Abstracts databases.
Why would a publisher abandon its own brand and accept NewsRoom? This is a classic question with a classic answer. NewsRoom is aggregated information. If you know you want a specific title, you're likely to go to the database bearing that brand. If it's a topical search—and searches are becoming more topical all the time—you want an answer, regardless of the source. This boosts one timeless professional searcher technique: If you don't know what you're looking for, hit the largest collection first and narrow results from there. Dialog hopes that searchers will regard NewsRoom as their first stop on the information quest. This benefits both Dialog and individual publishers.
NewsRoom also solves a technical problem, particularly on the Dialog and DataStar platforms. Unlike rivals Factiva and LexisNexis, Dialog and DataStar are divided into individual files. You can combine files to increase retrieval recall, but there's a limit of 60 files on the Dialog OneSearch system and a similar limitation with DataStar CROS. By combining so many sources into NewsRoom, Dialog has neutralized that limitation.
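To see what a per-search file cap means in practice, here is a minimal sketch, assuming a searcher who wants to cover more files than one combined search allows. Only the 60-file figure comes from the text above; the function name and the file identifiers are hypothetical, and nothing here reflects actual Dialog command syntax.

```python
from typing import List

# The 60-file ceiling is the figure cited for Dialog OneSearch.
ONESEARCH_FILE_LIMIT = 60


def batch_files(file_ids: List[str], limit: int = ONESEARCH_FILE_LIMIT) -> List[List[str]]:
    """Split file identifiers into consecutive batches of at most `limit`."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return [file_ids[i:i + limit] for i in range(0, len(file_ids), limit)]


# Hypothetical identifiers; real file numbers are not given in the article.
files = [f"file-{n}" for n in range(1, 151)]
batches = batch_files(files)
print(len(batches), [len(b) for b in batches])  # prints: 3 [60, 60, 30]
```

Under these assumptions, 150 separate files would take three combined searches, with results merged by hand. A single aggregated file removes that step.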
Dialog also has its eye on how NewsRoom will play inside the enterprise. Dialog's Intranet Toolkit, DataStar Private Star, and other customization tools are expected to become much more usable at the desktop with the addition of NewsRoom. Organizations will be able to order up slices of NewsRoom for enterprisewide applications. One slice will probably be industry, another geography.
Time and Distance
In the numbers game, currency, lag time, and archives also earn points toward a medal. Dialog will take advantage of its recent acquisition of NewsEdge to push real-time information into NewsRoom. That's real real-time information rather than the 15-minute delayed real-time that prevails in its news wire files. However, this apparently won't be operational at the initial NewsRoom launch. More easily implemented is an Archive tab on NewsEdge's Insight 6.0 platform that will take subscribers directly to the Dialog content collection, including NewsRoom. On other platforms, a 2-year archive backfile is promised. The NewsLine database on Profound will retain its 5-year backfile, although from 2001 on it will draw on the expanded NewsRoom sources.
NewsRoom is firmly aimed at an international audience. Although the bulk of the articles are in English, it also has sources in the major European languages—French, German, Italian, Spanish, and Russian—as well as in some not-so-major languages (Danish and Lithuanian). If NewsRoom has Danish sources, can Swedish and Norwegian be far behind? Will we see Latvian and Estonian? What about Hungarian, Finnish, Polish, and Czech? Dutch would be a fairly obvious addition. Sources in Asian languages, with their unique character sets, pose the usual problems and won't be added soon, if ever. Before you rush out to buy a multilingual dictionary, however, note that only 5 percent of the file is non-English-language sources.
The number of titles is staggering. When you first consider 6,500 titles, particularly since the figure first hit the rumor mill as newspaper titles only, you wonder if there really are that many available. However, once the definition is broadened to include broadcast transcripts, periodicals, news wires, the trade press, and scholarly journals, that 6,500 doesn't look so overwhelming. In fact, it begins to look familiar, rather like a Factiva Publications Library or a LexisNexis News Library. Or, as Searcher editor Barbara Quint so colorfully opined, "It could also look like the town dump."
Comprehensive vs. Selective
Comprehensiveness is the second scoring mechanism for online news databases. If a database claims to cover a journal, newspaper, newsletter, or wire service title, how many of the articles in that publication actually make it into the electronic version? The question is essential for information professionals: What guarantee does the searcher have that a publication will be represented in its entirety when it moves to the electronic environment?
David Brown, Dialog's senior vice president of content development, says he shies away from the phrase "cover to cover." "There are many reasons why a publication may not be cover to cover: technical, legal, and editorial. I prefer the term comprehensive, and that's what we looked for when we examined our content. If the same title was in multiple databases, we opted for the source that was the most comprehensive. We also considered currency in our decision. We want as robust a database as possible for global coverage." As a practical example, if the same title is available both through Knight Ridder/Tribune Business News (KRTBN) and directly from the publisher, the publisher's version will get the nod for NewsRoom since KRTBN is, by definition, a selective file.
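The selection rule Brown describes can be sketched in a few lines. This is an illustration under stated assumptions, not Dialog's actual process: the `Feed` record, the `selective` flag, the `lag_days` currency measure, and the sample titles are all hypothetical, and the ordering (comprehensiveness first, currency as the tiebreaker) is one reading of his remarks.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Feed:
    title: str       # publication title
    supplier: str    # where the copy comes from (hypothetical labels)
    selective: bool  # True if the supplier carries only part of the title
    lag_days: int    # hypothetical currency measure: days until articles load


def pick_feeds(feeds: Iterable[Feed]) -> Dict[str, Feed]:
    """Keep one feed per title: non-selective first, then the smaller lag."""
    chosen: Dict[str, Feed] = {}
    for feed in feeds:
        best = chosen.get(feed.title)
        # False sorts before True, so a comprehensive feed beats a selective one.
        if best is None or (feed.selective, feed.lag_days) < (best.selective, best.lag_days):
            chosen[feed.title] = feed
    return chosen


# Made-up titles and lag figures, for illustration only.
feeds = [
    Feed("Example Daily", "KRTBN", selective=True, lag_days=1),
    Feed("Example Daily", "publisher", selective=False, lag_days=2),
    Feed("Sample Weekly", "aggregator-a", selective=False, lag_days=5),
    Feed("Sample Weekly", "aggregator-b", selective=False, lag_days=3),
]
for title, feed in pick_feeds(feeds).items():
    print(title, "->", feed.supplier)
```

In this sketch the publisher's copy of "Example Daily" wins even though the selective copy loads a day sooner, which mirrors the KRTBN example. Between two equally comprehensive copies of "Sample Weekly," the more current one is kept.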
Although full text is frequently assumed these days, NewsRoom will mix abstracts with full text. Neither Colucci nor Brown could give an exact number, but their educated guess is that approximately 80 percent of NewsRoom will be full text.
Sounding a bit like a publicity agent for Disneyland, Brown emphasizes, "We want to enhance the user experience." By that he means Dialog wants to make searching easy for non-information professionals while maintaining the navigational sophistication so important to the precision searching required by information professionals.
Admitting that this was the most complex project they had ever worked on, Colucci and Brown firmly believe that, with NewsRoom, Dialog has tapped into the depth and breadth of Dialog-controlled content while preserving the best of the three Dialog platforms (four, if you count NewsEdge). Several previous managements and owners of the company have tried to identify the best elements of the platforms and their content, but this is the first tangible implementation of such a study.
Questions remain as to the viability of maintaining multiple platforms. If Dialog's intent is to create more megafiles of aggregated data, how long can the company sustain individual platforms? Although NewsLine was considered the benchmark for NewsRoom, it's not the main reason why people subscribe to Profound. That remains the market research report collection. Should the market research databases that are scattered over Profound, Dialog, and DataStar ever be combined in a comprehensive fashion, à la NewsRoom, the business model of multiple platforms would appear shakier than ever.