Institutional repositories (IRs) form a key component in the open access movement to bring scholarly research onto the open Web. Librarians and their clients regularly search digital IRs in pursuit of scholarship, but now more and more research librarians have begun to see institutional repositories as their own responsibility, involving themselves in the creation, maintenance, promotion, and advocacy of IRs. The Association of Research Libraries (http://www.arl.org) has surveyed its members to collect baseline data on this potentially transforming technological realignment of scholarly communication. Scholarly publishers also have their eye on the phenomenon. Elsevier has introduced a new "Selected Sources" feature for its Scopus users (http://www.scopus.com); the feature allows librarians to customize interfaces to direct their users to specific institutional repositories and special digital subject collections. It appears that the new Scopus feature will simply allow librarians to preset in Scopus the content preferences that users of Scirus (http://www.scirus.com), Elsevier's free scholarly search engine, can set for themselves on a search-by-search basis using the Advanced Search features. Scopus announcements also encourage Scopus clients to open their institutional repositories to the expanding Scirus Repository Search Program.
The 123 members of the Association of Research Libraries (ARL) represent the largest academic research libraries in the United States and Canada. The 38-question survey in early 2006 received responses from 87 member libraries (a 71 percent response rate), with 37 respondents having implemented an operational IR, 31 planning an IR by 2007, and only 19 having no immediate IR plans. Results indicate rapid growth. If the statistics hold true across all ARL members, 30 percent have an IR today, but that figure could jump to 55 percent by the end of 2007. Actual IR work by member institutions may extend beyond the scope of the survey. For example, the survey excluded digital archives that store content created by departments or other units, as well as contributions to general, multi-institutional digital archives, e.g., arXiv.org. The survey also stipulated that the IRs covered must make their metadata available for harvesting.
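As a rough check on the arithmetic above, the following Python sketch recomputes the percentages from the counts reported in the survey. The helper name `pct` is purely illustrative, and the choice of all 123 members as the denominator for the 30 and 55 percent figures is an assumption, made because it reproduces the numbers quoted.

```python
# Counts taken from the survey figures quoted in the article.
ARL_MEMBERS = 123
RESPONDENTS = 87
OPERATIONAL = 37
PLANNED_BY_2007 = 31
NO_PLANS = 19


def pct(part, whole):
    """Return part/whole as a whole-number percentage (illustrative helper)."""
    return round(100 * part / whole)


# The three response categories should account for every respondent.
assert OPERATIONAL + PLANNED_BY_2007 + NO_PLANS == RESPONDENTS

response_rate = pct(RESPONDENTS, ARL_MEMBERS)
# Assumption: the quoted shares are measured against the full membership.
have_ir_today = pct(OPERATIONAL, ARL_MEMBERS)
have_ir_by_2007 = pct(OPERATIONAL + PLANNED_BY_2007, ARL_MEMBERS)

print(response_rate, have_ir_today, have_ir_by_2007)
```

Under that assumption, the three values line up with the 71, 30, and 55 percent figures cited.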
The survey, conducted by the University of Houston Libraries Institutional Repositories Task Force, covered a wide range of issues for IR practices and plans:
- Background and current status
- Planning, implementation, and assessment
- Hardware and software used
- Policies and procedures
- Content recruitment and assessment
- Benefits and challenges
Appended to the main survey results are sets of model documents illustrating what some respondents are doing in the following areas:
- Home pages
- Usage statistics
- Deposit policies
- Deposit agreements
- Metadata policies
- Digital preservation policies
- IR proposals
- IR promotion
With most IRs only two years old or less, it is not surprising that the mean number of digital objects carried in the IRs identified by the survey was only 3,844. Content usually encompasses theses and dissertations, article preprints and postprints, conference presentations and proceedings, technical reports, working papers, and multimedia material. Most IRs support OAI-PMH for harvesting; a little over half support OpenURL. Start-up costs generally run around $182,500, with annual operations budgets of $113,500 and staff (typically around 28 FTE) taking the lion's share, though respondents to this part of the survey seemed to think the budget and staffing figures might be higher than the experience of other IR projects has shown. Open access was not the top reason for establishing IRs, but at 89 percent it ran a close third behind increasing global visibility of an institution's scholarship (97 percent) and preservation (95 percent).
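For readers unfamiliar with what OAI-PMH harvesting involves, here is a minimal Python sketch, not drawn from the survey. It builds a `ListRecords` request URL and pulls identifiers and Dublin Core titles out of a response. The base URL `https://repository.example.edu/oai`, the function names, and the sample response are all hypothetical; only the `verb` and `metadataPrefix` parameters and the XML namespaces come from the OAI-PMH 2.0 protocol.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for a repository's OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def extract_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(f"{{{OAI_NS}}}record"):
        identifier = rec.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = rec.findtext(f".//{{{DC_NS}}}title")
        records.append((identifier, title))
    return records


# Hypothetical response, trimmed to the elements used above.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.edu:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical working paper</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(build_list_records_url("https://repository.example.edu/oai"))
print(extract_records(SAMPLE_RESPONSE))
```

A real harvester would also need to fetch the URL over HTTP and follow `resumptionToken` values to page through large result sets; both are left out here to keep the sketch self-contained.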
The survey results document is entitled ARL SPEC Kit 292, Institutional Repositories (176 pages, July 2006, ISBN 1-59407-708-8, $45, or $35 for ARL members, plus $10 per publication for shipping and handling; http://www.arl.org/pubscat/order). However, you can download a table of contents and a thorough executive summary at no charge at http://www.arl.org/spec/SPEC292web.pdf. By the way, at the urging of the authors, the executive summary is much longer than those for other ARL SPEC Kits—it's 5,000-plus words as compared to the usual 1,500. If you have any interest in IRs, this is a good start for examining what deep-pocket players are doing.
DSpace continues to lead the pack in IR software selection, according to the ARL report. It is also one of the platforms accessed by Elsevier's Scirus, and it's now available for librarians customizing Elsevier's Scopus interface using the new "Selected Sources" feature. The Scopus service already has a Web search tab that takes users from the abstracts and citations that Scopus provides for print publications to the 250 million quality Web pages accessed by Scirus, Elsevier's free scholarly search engine. But now, librarians can set a separate tab, Selected Sources, to designate selected institutional repositories and digital archives for searching by their user communities. Last year, Scirus established a Repository Search Program to encourage universities and research institutions to make full content available to its spider software.
The announcement of the new feature indicated that librarians could choose from more than 19 institutional repositories. According to Niels Weertman, head of product development for Scopus, some of the 19 sources could cover more than one institution's digital archives. He estimated that the number of archiving institutions was 60 or more. Among the 19 IRs or groups of IRs, the Scopus/Scirus connection lets librarians target the following:
- 6,000 documents via Caltech
- 4,400 documents via the University of Toronto's T-space
- 54,000 courseware items from MIT OpenCourseWare
- 237,000 full-text theses and dissertations via NDLTD
- 363,500 e-prints on arXiv.org
- 2,600 e-prints through Cogprints
- 12,000 NASA technical reports
- 180,000 documents via RePEc
- 11,000 documents via DiVa
- 2,200 documents via Hong Kong University of Science and Technology
- 5,200 Organic e-prints
- 600-plus documents via PsyDok of Saarland University
At present, the new Scopus feature would just reach the overall repositories, according to Weertman. It would not let librarians create their own arrangements of quality Web sites indexed by Scirus. "We don't handle bookmarks in Scopus now. Nor do we have any plans to. But we're always looking for how to improve the process to find literature or peer-reviewed information on the Web and make it easier for researchers in the future," said Weertman.
Many institutional archives serving the fields of scholarship carry preprints and/or postprints of published articles. Experts in the field have long worried about the growing problem of "versioning," particularly for end users who may not be aware of how different an unreviewed draft submission may be from a final, peer-reviewed article. And, of course, pressure from the open access movement has led many publishers to release their final articles for postprint posting on institutional repositories, usually after a prescribed waiting period. One of the reasons users and their librarians turn to institutional repositories is to find free, open access versions of scholarly documents. I asked Weertman whether Scopus had any way to reconcile its abstracting and indexing records, which identify one version of a document and may or may not carry an "appropriate copy" licensed access link for users, with versions of the same document found on the open Web through Scirus. He responded that such integration is "not supported by the Selected Sources feature now. We don't match alternative records to Scopus records. If we see sufficient need in the user/librarian community, we will look to develop it, but we have no concrete, immediate plans." Elsevier will not charge more for the new Selected Sources feature on Scopus, according to Weertman.