





CrossRef Search Uses Google to Provide Full-Text Access
by Barbara Quint
Posted On May 3, 2004
CrossRef (http://www.crossref.org), a 300-member publisher trade association, has announced a pilot project called CrossRef Search that will enable users to search the full text of scholarly journal articles, conference proceedings, and other sources from nine leading publishers. Google will supply the search technologies and CrossRef the reference links to publisher Web sites. While Google will also incorporate CrossRef content connections into its general Web search engine, users who go to publisher Web sites and click on the CrossRef Search icon will reach just the scholarly subset. However, searching through the icon will access content from all participating publishers. Ed Pentz, executive director of CrossRef, foresees the number of participating publishers rising to 20 or 30 within a few months.

"CrossRef is very excited to work with Google on this pilot program. Researchers, scientists, and librarians should find CrossRef Search a valuable search tool," said Pentz. "Now, researchers and students interested in mining published scholarship have immediate access to targeted, interdisciplinary, and cross-publisher search on full text using the powerful and familiar Google technology. CrossRef Search, like CrossRef itself, breaks down barriers between publishers on behalf of research and library communities."

CrossRef Search will be available to all Web users at no charge. Content will include current journal issues as well as back files. The system uses CrossRef's DOIs (Digital Object Identifiers) or standard URLs to identify and link to content.

At present, publishers participating in CrossRef Search are:

These initial publishers produce some 1,100 journals, according to Pentz. Participants are investigating how to use DOIs to improve indexing and metadata for better retrieval and to enable persistent links from search results to the full text of content at publisher sites.

The initial pilot will last throughout 2004. CrossRef plans to gather feedback from scientists, scholars, and librarians through e-mail forms and formal evaluations using external consultants, according to Pentz. CrossRef is also hoping to discuss similar programs with other search engines, Pentz said.

There are only two rules for joining the pilot program, according to Pentz. "The publisher has to have all their content indexed through the way Google indexes and make the search box available to everyone at no charge." As for content, Google is indexing PDF files, so some results may differ from publisher to publisher. Pentz said that if there were no metadata, e.g., just scanned ASCII or OCR text with no indexing, it might not meet Google's standard. However, he doubted this would occur.

CrossRef has put no requirements on access issues. Each publisher can apply its own economic model, even if it does not include a pay-per-view option. Nor, during the initial phase, are there any mechanisms for guiding users to library holdings to which they might have "appropriate copy" access. End-users, therefore, might find themselves reading abstracts for material they can find no way to access. Pentz believes that most of the nine initial participants offer some form of pay-per-view, as do two-thirds of CrossRef's members.

Operated by the Publishers International Linking Association, Inc. (PILA), CrossRef builds citation linking systems across publisher sites. Currently it has over 11 million DOI links in place covering nearly 10,000 journals, several hundred thousand conference proceedings, and tens of thousands of books. The DOI supports persistent linking, while URLs may change. All types of content available from participating CrossRef publishers will join the program, not just journal articles, e.g., Molecule Pages produced by Nature Publishing Group and the Alliance for Cellular Signaling.
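
As a rough illustration of the persistent-linking idea described above, the sketch below turns a DOI into a resolver link that stays the same even if the publisher later moves the content. It assumes the public DOI resolver at https://doi.org/ (the article itself names no resolver address). The function name doi_to_url and the sample DOI are hypothetical, chosen only for illustration.

```python
from urllib.parse import quote

# Assumption: the public DOI resolver; not named in the article.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Build a resolver link from a DOI string (hypothetical helper)."""
    doi = doi.strip()
    # Accept the common "doi:" label in front of the identifier.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # DOIs have the form "10.<registrant>/<suffix>".
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return DOI_RESOLVER + quote(doi, safe="/")


if __name__ == "__main__":
    # Made-up DOI for illustration only.
    print(doi_to_url("doi:10.1234/example.5678"))
```

The link points at the resolver rather than at the publisher's server, so a publisher can change its own URLs and only has to update the address registered for the DOI.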

Since it dropped DOI retrieval fees in January 2004, CrossRef's growth rate has nearly doubled, with over 50 more publishers joining in the first quarter of 2004. CrossRef also has over 200 library participants and close to 40 vendor associates.

When PILA/CrossRef began in 2000, some publishers and vendors expressed concern that it could become a huge super-database that would threaten all competitors. Publishers and the information industry were assured that would not happen. However, by September 2002, the PILA board was considering a detailed proposal to launch full-text searching. The "Fast Facts" page at CrossRef's Web site still declares that "CrossRef is not an article database: CrossRef does not aggregate full-text content." When I questioned Pentz about the accuracy of that statement following the announcement of CrossRef Search, he chuckled and said: "It's still true. We don't handle articles, but our members do. We're just facilitators." He added that the original 2002 prototype worked with Fast Search, but asserted that CrossRef Search "is part of that progression. Rather than try to build our own search engine or host, we see this as an effective way to move forward. We are now in a position where we can work with multiple partners and help move forward innovation. We also see the benefits in going forward to non-exclusivity."

Some publishers may still be worried. I spoke with one director of electronic publishing at a scholarly society publisher. His institution has already had an arrangement in place with Google for over a year. They have worked extensively with Google to improve retrieval. Looking at CrossRef Search, he has real concerns over the effect that merging with the volume of content coming from other, larger participating commercial publishers might have on his content's placement. However, he felt his institution "had to be in from the git-go. By being early participants, we hope to influence placement." In any case, the experience they have already had with Google has led to a "much higher profile and lots of extra traffic." At this point, he is waiting to see how CrossRef Search plays out.


Barbara Quint was senior editor of Online Searcher, co-editor of The Information Advisor’s Guide to Internet Research, and a columnist for Information Today.

