In the last two years, momentum has been building for a new genre of discovery services based on centralized indexes. Beginning with Summon from Serials Solutions, launched in July 2009, and followed by EBSCO Discovery Service, Primo Central from Ex Libris, and OCLC’s WorldCat Local, these services depend on massive indexes populated with content representing each component of a library’s collection. These indexes include data and metadata from the local library, such as the MARC records extracted from its integrated library system or Dublin Core records from digital collection management systems. They also aim to represent the materials in the library’s subscriptions to electronic content products. This approach to library collection search offers potential advantages over federated search, which relies on real-time connections to multiple resource targets.
The Concept of Index-based Search
The basic model of these index-based discovery products involves three major types of participants: those that develop and support the discovery products, the publishers and providers of content products, and the libraries that purchase and implement them. Of course, library patrons represent another set of stakeholders as the ultimate end users of these discovery services.
As these index-based search products become strategic tools in which libraries make major investments, it is important to identify best practices and develop appropriate standards. The Open Discovery Initiative addresses several areas of interest in the arena of index-based discovery tools, including transparency of the content of the indexes, consistent terms and vocabularies, and standard mechanisms for transferring content from publishers to discovery service providers.
For example, one area of interest would be providing consistent vocabulary and terms to help libraries evaluate the quality and quantity of the indexes that underlie each discovery service. The effectiveness of these discovery products depends on how fully they index the materials represented in the library’s subscriptions to content products. It is also important to have consistent ways to express whether the indexing of any given resource is based on citation metadata alone or also covers the full text of the materials, and how frequently the index is updated. Clarity and transparency in this area should put libraries in a stronger position to make valid comparisons among the discovery products relative to their potential search performance for their collections. Some of the vendors have already begun publishing detailed reports disclosing the materials represented in their indexes (see, for example, www.serialssolutions.com/discovery/summon/content-and-coverage/ or www.oclc.org/worldcatlocal/overview/content/).
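As a rough illustration of what a consistent coverage vocabulary could make possible, the sketch below records, for each indexed resource, whether it is indexed at the full-text or citation level and how often it is updated, then summarizes how much of a library's subscribed titles a given index covers. The field names, the `CoverageEntry` type, and the `coverage_summary` function are hypothetical and are not drawn from any vendor's report format or from the initiative itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoverageEntry:
    """Hypothetical description of how one resource is indexed."""
    title: str
    full_text_indexed: bool   # False means citation metadata only
    update_frequency: str     # e.g. "daily", "weekly", "monthly"


def coverage_summary(subscribed, entries):
    """Summarize how much of a library's subscribed titles an index covers."""
    by_title = {e.title: e for e in entries}
    # Keep only the index entries that match titles the library subscribes to.
    covered = [by_title[t] for t in subscribed if t in by_title]
    full_text = [e for e in covered if e.full_text_indexed]
    total = len(subscribed)
    return {
        "subscribed": total,
        "covered": len(covered),
        "full_text": len(full_text),
        "covered_share": len(covered) / total if total else 0.0,
    }
```

With descriptions of this kind published in the same shape by each provider, a library could run the same comparison against every discovery product it is evaluating.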
This model of index-based discovery depends on a relationship of cooperation between the publishers of the content products to which libraries subscribe and the creators of the discovery products. The basic arrangement involves publishers submitting content to discovery providers solely for the purpose of indexing. As end users of these products discover and select items of interest, they will be directed to the publisher’s servers for viewing and downloading. These products aim for more powerful and efficient search models through more direct indexing; they are not publishing platforms. If they live up to their promise, they should improve the efficiency of end-user searching and increase the number of documents accessed in the various content packages, resulting in a mutually beneficial arrangement.
The materials included in these indexes come from many different sources, each with specific provisions or stipulations. Depending on the arrangement with the publisher, the discovery service will need to enforce different indexing and display rules. Some content may be open to public searching, while other content may be searchable only by authenticated users; display of snippets may or may not be allowed. The initiative will address consistent ways to articulate the business rules that apply to how the content will be handled within the indexes and through the discovery product.
The fundamental premise of these products involves a transfer of content from publishers to discovery service producers. Significant efficiencies could be gained by replacing the many different ad hoc arrangements with standards or best practices for how these data are transferred. For example, such a standard would save publishers from having to create different mechanisms for each discovery provider. Such standards would be especially beneficial to discovery providers since they may work with hundreds of content providers.
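To make the idea of articulated business rules concrete, the following minimal sketch models the two kinds of rules mentioned above: who may search a publisher's content, and whether snippets may be displayed. All names here (`SearchAccess`, `ContentRule`, `can_search`, `render_result`) are assumptions made for illustration; they do not represent a format proposed by the initiative or used by any discovery product.

```python
from dataclasses import dataclass
from enum import Enum


class SearchAccess(Enum):
    PUBLIC = "public"                # anyone may search this content
    AUTHENTICATED = "authenticated"  # only authenticated users may search


@dataclass(frozen=True)
class ContentRule:
    """Hypothetical business rule attached to one publisher's content."""
    publisher: str
    search_access: SearchAccess
    snippets_allowed: bool


def can_search(rule: ContentRule, user_is_authenticated: bool) -> bool:
    """Return True if this user may search content governed by the rule."""
    if rule.search_access is SearchAccess.PUBLIC:
        return True
    return user_is_authenticated


def render_result(rule: ContentRule, title: str, snippet: str) -> str:
    """Build a result line, including the snippet only if the rule permits it."""
    if rule.snippets_allowed:
        return f"{title}: {snippet}"
    return title
```

Expressing rules as data in this way would let a discovery service apply one enforcement routine to every publisher's content, rather than encoding each agreement separately.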
Background of the Initiative
The Open Discovery Initiative began as a discussion among a group of individuals with an interest in this genre of products. Oren Beit-Arie, chief strategy officer for Ex Libris, Marshall Breeding, director for innovative technologies and research at Vanderbilt University, and Jenny Walker, a consultant for Ex Libris, convened a meeting of a broad group of stakeholders with potential interest in the arena of discovery systems at the American Library Association Annual Conference in New Orleans on June 26, 2011. Invitees to the session included representatives from each of the organizations involved in developing index-based discovery systems, libraries involved in using these products, publishers, NFAIS (National Federation of Advanced Information Services), and NISO (National Information Standards Organization). Following brief presentations from each of the major stakeholder groups, the general discussion revealed that there was significant interest in advancing the initiative, and that NISO seemed to be the best organization to shepherd the process. The conveners drafted a proposal that was submitted to NISO, requesting that it form a work group to address this topic.
The membership of NISO has recently voted to accept the Open Discovery Initiative as a work group within its structure. Following this positive vote, NISO will constitute the membership and establish the leadership of the work group. Individuals interested in participating in the initiative should contact Nettie Lagace (email@example.com). A mailing list has been established at www.niso.org/lists/opendscovery/.