Scitopia Moves Into Next Phase With Full Launch
Posted On October 22, 2007
Back in April, a new sci-tech search portal called Scitopia (www.scitopia.org) was announced. (See the NewsBreak, "Sci-Tech Societies Unite to Create Scitopia.org Search Portal," at http://newsbreaks.infotoday.com/nbReader.asp?ArticleId=35885.) At the time, this joint project of 13 sci-tech scholarly society publishers promised to launch the federated searching service at the SLA annual meeting in early June. The partners met that deadline with a product still in beta. Now they have released the final (for now) official version of the service with alterations built around user feedback and input from participating society staff. With the refined version in place, Scitopia, already expanded to 15 societies, will turn its efforts to promoting the service and expanding the number of participating societies, according to Scitopia’s project manager, Barbara Lange, also director of product line management and publishing business development at IEEE.
Improvements made since June focused on increasing the precision and consistency of keyword and author searches, increasing the system’s speed in returning results, and tweaking the language and layout. In the language and layout category, changes were being made right up to the last minute. A generally favorable review of Scitopia by Yale University science librarian Joseph Murphy, dated Aug. 16 and published in the October issue of The Charleston Advisor (www.charlestonco.com), had some complaints that have already been corrected. Dana Roth, chemistry librarian at the California Institute of Technology, was kind enough to do some tire-kicking of the latest Scitopia. Noting Murphy’s concerns, he found, "The Search button is now easy to find. They have changed the Limit To pull-down menu to a complete listing of the societies, which allows selecting as many or few publishers as you want to search."
The primary improvement made during the beta phase was a standardized XML gateway for all partners. The gateway works through the search syntax problems that arise from the different metadata structures and procedures at the partner databases, so that each one can handle search queries effectively. Eric Pepper, SPIE director of publications, explained how the XML standard "adapts the query structure to the source structure before transmitting the request. The database recognizes the syntax of the query and can return it in the syntax the database expects."
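For readers who want a concrete picture of what adapting "the query structure to the source structure" can mean, here is a minimal Python sketch. It is not Scitopia's gateway: the source names, field codes, and query templates are all invented for illustration, and a real gateway would exchange XML rather than plain strings.

```python
# Hypothetical per-source field codes and query templates. None of these
# come from Scitopia or from any partner database.
SOURCE_SYNTAX = {
    "source_a": {
        "fields": {"author": "au", "keyword": "kw"},
        "template": "{field}:({term})",
    },
    "source_b": {
        "fields": {"author": "AUTHOR", "keyword": "ANY"},
        "template": '{field}="{term}"',
    },
}


def adapt_query(source, field, term):
    """Rewrite one generic (field, term) query into a source's own syntax."""
    syntax = SOURCE_SYNTAX[source]
    native_field = syntax["fields"][field]
    return syntax["template"].format(field=native_field, term=term)


def fan_out(field, term):
    """Build the adapted query string for every configured source."""
    return {source: adapt_query(source, field, term) for source in SOURCE_SYNTAX}


if __name__ == "__main__":
    for source, query in fan_out("author", "Smith, J").items():
        print(source, "->", query)
```

The point of the design is that the searcher types one query and the translation table, not the user, absorbs the differences between databases.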
One standard test of a federated search system’s effectiveness is how it handles author names. Though finding an author’s work by name can seem a simple task to end users, it actually involves many variations: initials or full name, full name with only middle initial, initials with periods and no spaces, initials with periods and spaces, initials with no periods and spaces, initials with no periods and no spaces, etc. Roth detected and Pepper confirmed that the standard used at Scitopia is now lastname-comma-space-firstinitial. This approach ensures the retrieval of everything by an author, but in many cases it also ensures myriad false drops. When asked why they didn’t offer more sophisticated filtering, Pepper said they were still working on it, but he did point out that authors often publish under different variations, using different citation styles in the same publications, e.g., when cited as a co-author. "The name thing is most challenging, but at least we know, if you’re searching now, all the papers will appear in the list."
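The lastname-comma-space-firstinitial convention, and the false drops it invites, can be illustrated with a short Python sketch. This is only an approximation written for this article, not Scitopia's code. It assumes that a name without a comma ends with the surname, which fails for inputs such as "Smith JQ" and for multiword surnames.

```python
def normalize_author(name):
    """Reduce an author name to 'Lastname, F' (last name, comma, space, first initial).

    Assumption: if there is no comma, the final token is the surname.
    """
    name = " ".join(name.split())  # collapse runs of whitespace
    if "," in name:
        last, _, given = name.partition(",")
    else:
        parts = name.split(" ")
        last, given = parts[-1], " ".join(parts[:-1])
    last = last.strip()
    given = given.strip().lstrip(".")
    initial = given[0].upper() if given else ""
    return f"{last}, {initial}" if initial else last


if __name__ == "__main__":
    for raw in ["Jane Q. Smith", "Smith, J.Q.", "J. Q. Smith", "Smith, Jane", "John Smith"]:
        print(f"{raw!r} -> {normalize_author(raw)!r}")
```

Every variant of Jane Q. Smith collapses to the same key, so nothing by her is missed. John Smith collapses to that key as well, which is exactly the false-drop problem described above.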
The rollout at the SLA meeting was quite successful and quite lively. Pepper estimated that around 100 librarians attended, and they had no difficulty providing feedback. Three items were mentioned at that session: clustering, aggregated RSS feeds (instead of having to place RSS alerts on each individual partner society site), and results readied for downloading into standard bibliographic management software. None of these requested improvements appear in the current version; however, according to Pepper, all are on the agenda. Clustering "is definitely in development" at Deep Web Technologies (www.deepwebtech.com), the vendor supplying Scitopia’s federated search software, and should be out this year. They have some new ideas on how to handle merged RSS feeds, and the bibliographic DBMS feed shouldn’t be hard, according to Pepper.
Speed in returning results from a network of different databases on different machines in different locations is always a challenge. According to Pepper, Scitopia has set an absolute time limit of 30 seconds for returning complete results, but the normal return time with the new improvements is a "ten second delay from when you hit the search button until you see the first results and those results are usually 80–90 percent of the total results." The current system also allows better navigation of search results.
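The general pattern behind a hard time limit on a federated search can be sketched in Python as follows. This is an illustration under stated assumptions, not a description of Deep Web Technologies' software: the `sources` mapping and its callables are hypothetical stand-ins for the partner databases, and only the 30-second default comes from Pepper's description.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError, as_completed


def federated_search(sources, query, deadline_s=30.0):
    """Query every source in parallel and keep whatever arrives before the deadline.

    `sources` maps a source name to a callable that takes the query string and
    returns a list of results (hypothetical stand-ins for real databases).
    A source that raises an exception will propagate it to the caller.
    """
    results = {}
    pool = ThreadPoolExecutor(max_workers=max(1, len(sources)))
    futures = {pool.submit(fn, query): name for name, fn in sources.items()}
    try:
        # Results are collected in completion order, fastest source first.
        for future in as_completed(futures, timeout=deadline_s):
            results[futures[future]] = future.result()
    except TimeoutError:
        pass  # deadline reached; slower sources are reported as timed out
    finally:
        pool.shutdown(wait=False)  # do not block on sources still running
    timed_out = [name for name in futures.values() if name not in results]
    return results, timed_out


if __name__ == "__main__":
    def fast_source(q):
        return [f"{q}: fast hit"]

    def slow_source(q):
        time.sleep(0.5)
        return [f"{q}: late hit"]

    found, late = federated_search(
        {"fast": fast_source, "slow": slow_source}, "laser", deadline_s=0.2
    )
    print(found, late)
```

Because results are gathered as each source finishes, a display layer could show the early arrivals while the stragglers are still running, which is consistent with seeing most results well before the limit.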
Scitopia now contains some 3 million documents, mostly peer-reviewed journals and some conference proceedings. It also includes some 50 million patents from the U.S. Patent and Trademark Office, the Japan Patent Office, and the European Patent Office. It also carries government reports through a connection to the U.S. Department of Energy’s Information Bridge (www.osti.gov/bridge). Some problems have emerged in the patent and government documents sections of the service. Murphy reported in his review (and Roth confirmed in his tests) that "Combining terms with OR in the full record field returned fewer results from patents and government documents than when the terms were combined with AND, although more results were returned from society documents when using OR than when using AND." Both Pepper and Lange recognized the problem. The government data sources, including patents, are not official "partners" in the project and do not have the same standard XML gateway improvements in place.
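The anomaly Murphy and Roth describe is easy to state as a check: every record that matches A AND B also matches A OR B, so the OR count for a source should never be smaller than the AND count. The Python sketch below flags sources that break that rule. The function name and the sample counts are hypothetical and are not taken from either reviewer's tests.

```python
def flag_boolean_anomalies(counts):
    """Return the sources whose OR query matched fewer records than the AND query.

    `counts` maps a source name to an (and_count, or_count) pair.
    """
    return sorted(name for name, (and_n, or_n) in counts.items() if or_n < and_n)


if __name__ == "__main__":
    # Made-up result counts, for illustration only.
    sample = {"society documents": (10, 40), "patents": (25, 8)}
    print(flag_boolean_anomalies(sample))
```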
Scitopia advertises itself as one of the most timely of the Web’s sci-tech search engines, "ahead of other search tools such as Google Scholar." Lange told of asking Anurag Acharya, Google Scholar’s creator, how frequently it updated and being told every 1 to 2 weeks. Since Scitopia’s federated searching runs searches in "real time," it is, according to Lange, much more current. During our discussion, Roth tested coverage of a journal’s current issue and found Scitopia beat one expensive licensed database and matched another. However, Google Scholar might have a different sort of edge on currency due to the type of content and sources it taps. For example, it will cover preprints, technical project reports, conference presentations (not final proceedings), etc. As Roth said, "Even with good services like Scitopia, a scientist would be a fool not to check Google Scholar too." (In fact, Roth and I began to muse on how services like Scitopia might integrate Google Scholar into their searches, but that’s another story for another issue of Searcher magazine.)
Lange and Acharya are in complete agreement, however, on the role services like Scitopia can play in moving content outside the U.S. Lange said, "We’re very interested in putting our information out into parts of the world that can’t afford it. It would be great to get people from China and India using it as a core service." As for supporting such a service, they are working with a vertically focused Web advertising agency called IndustryBrains (www.industrybrains.com) on the possibility of adding selective, contextual, technology-related advertising. Speaking of revenue, although Scitopia provides links to partner digital libraries that can automatically connect authorized searchers to the full text, depending on the association’s membership policies, it also requires all its members to provide pay-per-view options for users without subscriptions. The patents and government documents are free.
"It’s exciting to emerge from beta, but we don’t ever expect scitopia.org to be ‘finished,’" said Karen Hawkins, IEEE director of publication and information marketing. "This is a living project that will continually evolve through enhancements and new technology." Even since April, the organization has added two more partner societies—the Acoustical Society of America and AVS. All but one of the 15 partners have full connection to Scitopia. The American Institute of Aeronautics and Astronautics should have its connection ready by the end of the year. The delay occurred, according to Lange, because the AIAA was "in the process of transitioning to a new technology platform and didn’t want to deal with the technology issues [involved in connecting to Scitopia] at the same time."
Now that Scitopia has emerged from its beta trial phase, what’s next? Lange spoke about two main goals: promoting the service and adding new partners. Though she couldn’t provide any specifics, she did indicate that there was a significant warming toward Scitopia, even from one major holdout, and she expects to see 3–5 new partner societies added by the end of the year. When asked whether she had any concerns that expansion of the number of partners could lead to a slowdown in system performance, Lange indicated that Deep Web Technologies had assured her it would not. In fact, Scitopia chose Deep Web, in part, for its scalability. She did say that they had told her 60 or 70 sites might be a problem under the current structure. That may be a problem, but it is also one of Lange’s happy dreams.
Lange has high goals for the service. She hopes that by the end of next year, Scitopia will be driving a significant amount of search traffic and inbound referrals to each partner site. They are actively looking to have links to Scitopia placed on library Web sites. The Library of Congress, Stanford University, Rensselaer Polytechnic Institute, and libraries in Australia, Ireland, and Italy already have links to Scitopia.org in place on their sites. Scitopia has designed a widget to ease placement of its icon. Initially, they plan to get the linking widget onto partner Web sites.
Well, they must be doing something right. Both Roth ("It’s going to be a great product. Just needs a little more fine tuning.") and Murphy ("A great start, but much improvement is still needed....However, the people with Scitopia seem to be aware of and responsive to these problems.") offer measured but sincere compliments. And the same issue of The Charleston Advisor that carried Murphy’s review also announced that Scitopia had won a "Best Effort" award in Charleston’s 7th Annual Readers’ Choice Awards.