Sci-Tech Societies Unite to Create Scitopia.org Search Portal
Posted On April 16, 2007
A major new sci-tech search portal called Scitopia.org (www.scitopia.org) is scheduled to launch at the SLA annual meeting in the first week of June. A test version may actually launch a few weeks earlier. Thirteen scholarly society publishers are working together to create a free, federated vertical search portal capable of accessing some 3 million articles spanning as far back as 150 years, as well as some patents. A search on Scitopia.org will initiate simultaneous searches on all participating publishers' Web sites, retrieve and merge the results, and present users with a relevance-ranked list of bibliographic citations and abstracts from which they can choose the full-text articles they need. Acquiring the full-text articles will involve authentication to licensed subscriptions or pay-per-view payments. Deep Web Technologies (www.deepwebtech.com) is supplying the technical expertise needed for building Scitopia.org. The service's leaders hope more societies will join them soon in this effort to bring a top-quality, finished scientific product to the fore in a world coming to be dominated by Google Scholar.
The publication arms of scholarly societies perform a dual function for their parent institutions, a "do-good-and-do-well" function. They fulfill the societies' mission to encourage, support, and disseminate high-quality research results, and they often bring in needed revenue to support the societies' general activities. Despite the direct relationship to authors and readers found in their memberships, scholarly society publishers have found competing in the Web world something of a struggle. The "big-deal" offers from commercial scholarly publishers eat up library serials budgets, pushing smaller publishers off the table and sometimes even luring societies to let commercial publishers take over their publications. At the same time, freebie services such as Google Scholar, Elsevier's Scirus, and Windows Live Academic Search lure away end-user eyeballs. Many scholarly society publishers open some of their content to Google Scholar and other services, but this experience hasn't satisfied the societies' interests or, in the opinion of some society publishing executives, even the users they serve.
The new Scitopia.org federated-search service is meant to provide a way for users to retrieve all of the content that participating sci-tech society publishers have to offer, a way that reaches the most recent published articles as well as digitized content stretching back across decades. Scitopia.org currently accesses the electronic libraries of these thirteen participating sci-tech societies:
A 30-day freeze on adding new members is in force to allow the technical work needed for a smooth launch to take place without disruption. However, several other major societies initially invited to join have indicated an interest in a "watch-and-wait" mode, and Scitopia.org has four specific societies on the "short list." Expect to see a vigorous outreach to follow the launch.
Although most of the publishers already open their content to Google Scholar and some to Windows Live Academic Search as well, Scitopia.org will provide uniquely complete data. Barbara Lange, director of product line management and publishing business development at IEEE, and Tim Ingoldsby, director of strategic initiatives and business development at the AIP, both pointed out that Google Scholar has a spidering schedule that could leave current articles stuck in a multiweek pipeline. Scitopia.org will offer real-time updating. Ingoldsby stated, "We publish articles now every minute of the day and night, and the moment after we publish, our [Scitopia.org] search will find them. We are a much better choice for that reason."
The content will also reach as far back as the societies carry content. For example, users will be able to reach 1.5 million documents dating from 1884 from IEEE; 410,000 dating from 1893 from APS; 390,200 dating from 1930 from AIP; 245,000 dating from 1874 from IOP; 235,000 dating from 1990 from SPIE; 99,700 dating from 1902 from the Electrochemical Society; and 51,806 dating from 1990 from SAE. Ingoldsby pointed out that low usage was often used to justify open access embargoes, for example, a 6-month delay. "That may be true in medicine and the life sciences," said Ingoldsby, "but it's less true in physics, more 'less true' in chemistry, and most 'less true' in mathematics."
While tapping the entire content of all the participating societies' libraries, a Scitopia.org search will allow field searching, e.g., by title word or author name. Users will also be able to narrow searches to a specific publisher. Federated searching continues to have ingrained problems dating back to the days when it was called cross-file searching. One problem area involves alternative formats, e.g., author names with or without initials, with or without periods or spaces between the initials, etc. A search statement using one format could automatically block retrieval from any database not using that format without even warning the searcher of the omissions. Lange said that they recognize these kinds of problems and are already working on normalizing author entries. Ingoldsby said, "Deep Web is pretty good at this. Its founder, Abe Lederman, a founder of Verity, has decades of experience. The initial search will not be too much beyond basic federated searching, but we're already working to do XML. If we can deliver XML, then we'll have a pretty sophisticated way to meet the author challenge."
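The author-format problem described above can be illustrated with a minimal sketch. This is not Deep Web Technologies' method, which the article does not describe; the function name `normalize_author` and the "surname plus initials" key are assumptions chosen for illustration. It handles only "Surname, Given" and "Given Surname" inputs.

```python
import re


def normalize_author(name: str) -> str:
    """Reduce an author name to a lowercase 'surname initials' key.

    Hypothetical helper for illustration only. Supports two input
    shapes: 'Surname, Given names' and 'Given names Surname'.
    """
    name = name.strip()
    if not name:
        return ""
    if "," in name:
        # "Smith, J. R." -> surname before the comma, given names after it
        surname, given = (part.strip() for part in name.split(",", 1))
    else:
        # "J.R. Smith" or "Abe Lederman" -> assume the last token is the surname
        tokens = name.split()
        surname, given = tokens[-1], " ".join(tokens[:-1])
    # Split given names on periods, whitespace, and hyphens; keep first letters.
    pieces = [p for p in re.split(r"[.\s\-]+", given) if p]
    initials = "".join(p[0] for p in pieces)
    return f"{surname} {initials}".strip().lower()
```

With a key like this, "J.R. Smith", "J. R. Smith", and "Smith, J. R." all map to the same string, so a query in one format need not silently miss records stored in another. Formats the sketch does not cover, such as "Smith JR" with no comma, would still need their own rules.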
Once users find citations to articles they want to read, the system transfers them back to the society publishers' sites for retrieval. At this point, according to Lange, all of Scitopia.org's participating society publishers offer a pay-per-view option, though this is not a requirement for joining Scitopia.org. Prices vary from site to site. In addition, the sites can authenticate users through a number of access routes: licensed library patron, member, associated member, etc. Ingoldsby judged that OpenURL resolvers should make it possible to authenticate even users whose licenses come through third-party data aggregators carrying selected content. Again, both Lange and Ingoldsby expect some glitches to arise that Scitopia.org's staff and partners will have to work out.
Considering the age of some of the articles, I asked Lange and Ingoldsby how they planned to deal with the public-domain status of much of the content. In the case of material still in copyright, any payments might be owed to authors rather than the societies, if the society publisher had not had the prescience to get author agreements to concede electronic rights before electronic-information services existed. Lange stated that all those problems belong to the individual society publishers and their clients. Ingoldsby agreed, but he also stated that his organization, AIP, had always insisted on full-copyright transfer from their authors.
The second problem with federated searching has been speed. If Scitopia.org grows and adds new content sources as rapidly as its leaders would seem to wish, this could result in lengthening the time between placing a search and getting back search results. In the past, the solution to maintaining attractive turnaround speeds usually depended on the ability of techies to restructure and tweak the system containing the databases. However, Scitopia.org doesn't have any control over the databases. It just transmits the search statements. Ingoldsby admitted, "It could get slower. We'll have to see how it goes. We have one of the really top service providers. There are no bandwidth or CPU cycle problems." However, at this point, there is one hidden advantage. The AIP has a platform called Scitation that supports 27 sci-tech publishers, seven of which are Scitopia.org participants (seven and a half if you count the IEEE Computer Society). Since AIP's Scitation service can compartmentalize its databases, it should offer a quicker turnaround of merged files, according to Ingoldsby.
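The speed concern can be made concrete with a minimal sketch of a federated fan-out, assuming each source is a plain function that takes a query and returns hits. It does not describe Scitopia.org's actual architecture. The names `federated_search`, `fast_source`, and `slow_source` are hypothetical, and the sketch assumes every hit carries a comparable numeric `score`, which real sources rarely provide without extra normalization. It requires Python 3.9 or later.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def federated_search(query, sources, timeout=2.0):
    """Send one query to every source in parallel and merge the answers.

    sources: dict mapping a source name to a callable(query) -> list of
    hits, where each hit is a dict with "title" and "score" keys.
    Returns (merged_hits, skipped_source_names).
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(sources)))
    futures = {pool.submit(fn, query): name for name, fn in sources.items()}
    # Wait until every source has answered or the deadline passes.
    done, not_done = wait(futures, timeout=timeout)
    skipped = [futures[f] for f in not_done]
    merged = []
    for fut in done:
        try:
            merged.extend(fut.result())
        except Exception:
            # A source that raised an error is reported, not fatal.
            skipped.append(futures[fut])
    # Do not block on slow sources; they finish in the background.
    pool.shutdown(wait=False, cancel_futures=True)
    merged.sort(key=lambda hit: hit["score"], reverse=True)
    return merged, sorted(skipped)


# Hypothetical stand-ins for publisher search endpoints.
def fast_source(query):
    return [{"title": "A", "score": 0.4}, {"title": "B", "score": 0.9}]


def slow_source(query):
    time.sleep(0.5)  # simulates a publisher site that responds late
    return [{"title": "C", "score": 0.7}]
```

Because the portal only transmits search statements, a deadline like this is one of the few levers it controls: total turnaround is bounded by the timeout rather than by the slowest publisher. The cost is that late sources drop out of the merged list, so reporting the skipped names lets the searcher know what was omitted.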
Funding and Marketing
How is Scitopia.org funded? Currently, the participating society publishers are supporting it. However, Lange and Ingoldsby both indicated that they hoped to attract advertising revenue with contextual search ads. At this point, Lange said, "Our goal is to have Scitopia sustain itself with maybe a little left over. Our goal is not to become a revenue machine. We want to help drive traffic to the societies and to promote the value societies' content brings with a one-stop, one-click site. But we do have a tremendous audience in just the membership base, one that advertisers should want to reach."
As to how they planned to promote awareness of Scitopia.org, Lange said that they would try using "grass-roots, viral marketing. All our partners will promote it in their own spaces—technical conferences, print journals, library conferences, etc." According to Ingoldsby, "We are really after the corporate market. So much of that market has given up subscriptions to our journals. We are trying to make it easy for them to find large swaths of content in a single search."
The area of sci-tech information portals seems to be expanding. Besides Google Scholar, Elsevier's Scirus, and Windows Live Academic Search, the U.S. Department of Energy's Science.gov site is being used as a model for Science.world, which is being developed through an alliance with The British Library. (See the NewsLink Spotlight on Science.world at http://newsbreaks.infotoday.com/nbReader.asp?ArticleId=19230.)