Open Library Upgrading Its Service
by Barbara Quint
Posted On May 10, 2010

The Internet Archive (www.archive.org) has completed the system changes and upgrades to the giant catalog of its Open Library (http://openlibrary.org). The Open Library gathers cataloging data from donors around the web, ranging from individuals to libraries to publishers, and creates web pages from that data. Currently it carries around 23 million catalog entries, including records for "works" (or "titles" in librarian parlance) and for various editions. In March it introduced a "soft" relaunch with changes to the user interface and underlying software. That relaunch is now complete. (For an enthusiastic user's reactions to the completed "hard" launch, read Gary Price's coverage on Resource Shelf ("The Open Library Relaunch Is Complete," www.resourceshelf.com/2010/05/05/the-open-library-relaunch-underway/).) But more changes are on the way, according to George Oates ("No one's called me Georgina for years"), the project lead and designer for Open Library.

The grand goal of the Open Library is "to provide a page on the web for every book ever published," according to its FAQ. Oates has headed up the project for about a year. Previously she worked as a software designer for Flickr, the photo sharing service. Not being a librarian herself, she had to adjust to the "top-down" perspective of librarians in setting rules for catalog entries. However, she seeks easier ways to retrieve and describe information, ways that let people exchange their own tags, along with new techniques to help people "find books they didn't know they were looking for intuitively."

The difference between the soft launch version of March and the hard launch version of May is not major. "We used the soft launch just to sort of soften the landing for people used to Open Library. It gave them a chance to see what we were up to. We really didn't do too much feature development in the interim. We mainly just worked out the migration into the new dot-org." In fact, some features in place in the original Open Library, such as full-text searching, had to be removed. But they will come back. Oates said, "We would like to bring back full-text search as soon as we can. We have rewritten the search engine from the ground up. We're excited about the better job we have done on full-text search. Before we were wrong in indexing books at a page level with no links between pages. Now we index the whole book as one document. For the last six months we have been focusing on ebooks."
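To illustrate why the unit of indexing matters, here is a minimal sketch in Python. It is not Open Library's search engine; the tiny inverted index, the function names, and the sample text are all assumptions made up for this example. It shows how a two-word query can miss a book when each page is indexed as a separate document, and find it when the whole book is indexed as one document.

```python
from collections import defaultdict

def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical two-page book; the id and text are invented for illustration.
pages = {
    ("book-1", 1): "the whale surfaced near the ship",
    ("book-1", 2): "the captain raised his harpoon",
}

# Page-level indexing: every page is its own document.
page_index = build_index(pages)

# Book-level indexing: all pages of a book are joined into one document.
books = defaultdict(list)
for (book_id, page_no), text in sorted(pages.items()):
    books[book_id].append(text)
book_index = build_index(
    {book_id: " ".join(texts) for book_id, texts in books.items()}
)

# "whale" and "harpoon" sit on different pages of the same book.
page_hits = search(page_index, "whale harpoon")
book_hits = search(book_index, "whale harpoon")
print(page_hits, book_hits)
```

In this toy case the page-level search returns nothing, because no single page holds both words, while the book-level search returns the book.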

In the course of the restructuring, the team also completely rebuilt the user interface to make the site easier to use, according to Oates. They found that "fifty percent of the requests we got were to correct an error. This meant users had a crucial misunderstanding of Open Library. The interface didn't make it clear enough that this is a wiki, a do-it-yourself site. So we made the history of editing feature and the edit button more prominent."

The book pages for the editions carry links to help users reach actual copies of the books. Oates points out that they have three levels of access information: read, buy, and borrow. Although they tap into Open Content Alliance for ebooks, Oates warned that they do not "inhale" all that content. "For example, Internet Archive has scanned [material] which the Open Library cannot represent yet, like serials. Users can now check a box to only show ebooks in response to a search. One work may have six editions scanned. So, though there are around 1.2 million ebooks available, all the ebooks at the 'work' [or title] level might be more like half a million." Open Library also includes other "links to all sorts of ad hoc PDFs around the web." Since the Open Library is basically a wiki open to contributions and editing from any user, as Oates points out, anyone with knowledge of a downloadable book can add that information to the edition record. This response was a somewhat oblique reply to a question on whether they had plans to add links to Google Book Search items, particularly the 2 million downloadable public domain items.
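The gap Oates describes between ebooks counted by edition and ebooks counted by work comes down to grouping. The sketch below is only an illustration under assumed data: the record layout, the ids, and the function name are hypothetical and do not reflect Open Library's actual schema.

```python
# Hypothetical edition records: (edition id, work id, has a scanned ebook).
editions = [
    ("ed-1", "work-A", True),
    ("ed-2", "work-A", True),
    ("ed-3", "work-A", False),
    ("ed-4", "work-B", True),
    ("ed-5", "work-C", False),
]

def count_ebooks(editions):
    """Return (number of scanned editions, number of works with any scan)."""
    scanned_editions = [e for e in editions if e[2]]
    works_with_ebook = {work_id for _, work_id, has_ebook in editions if has_ebook}
    return len(scanned_editions), len(works_with_ebook)

edition_count, work_count = count_ebooks(editions)
print(edition_count, work_count)
```

Here three scanned editions collapse to two works, the same effect that would shrink 1.2 million ebooks to a much smaller number of distinct titles.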

The Internet Archive and Google have clashed at times over the years, specifically over the Google Books project. In fact, the Open Content Alliance was launched in 2005 to offer an alternative to Google Books (http://newsbreaks.infotoday.com/NewsBreaks/Open-Content-Alliance-Rises-to-the-Challenge-of-Google-Print-16110.asp). When a group of opponents of the proposed Google Book Search settlement formed the Open Book Alliance last year, an Internet Archive executive headed up the group (http://newsbreaks.infotoday.com/NewsBreaks/Anti-Google-Book-Settlement-Organizations-Band-Together-in-Open-Book-Alliance-55861.asp).

Within the three options (read/buy/borrow) in edition (not "work" or title) records, the link to buy can take people to four pre-canned links reaching the Alibris, Amazon, Biblio.com, and Powell's online bookstores. It works quickly for book records that carry an ISBN-10. In time, according to Oates, "we would like to make the whole module as dynamic as we can. We would like to ping and show the price right on the page with booksellers and prices alongside."

They have also implemented a service for visually impaired or "print disabled" users by publishing books in the DAISY format (www.daisy.org). (For a thorough report on this development, read Tara Calishain's Research Buzz report "More Books for the Visually Impaired" at www.researchbuzz.org/r/.) A list of the titles now available in the DAISY format appears at http://openlibrary.org/subjects/protected_daisy.

Although Open Library has plenty of plans, it also has a limited staff and needs to set strict priorities. Some services discussed in earlier coverage have had to take a back seat. For example, though Oates indicated an earnest desire to handle non-book material, such as serials, video, and software, the immediate chances are slim. "It's a puzzle we do want to solve for different catalog records for different types of sources, but we don't now." Open Library also shelved plans for a scan-on-demand project, following the failure of a small pilot project at Boston Public Library last year. Any plans for a print-on-demand feature are also on the back burner, though Oates indicated that the main Internet Archive service had a pilot underway. In that regard, Oates thought individuals could solve the problem of printing out PDF files in other ways.

Small organizations with large goals usually find the most prudent strategy is to work with third parties. Oates said, "Now that we've gotten our head above water, we are looking for partnerships and affiliates. In general we want to be as open as we can, though maybe not with formal partnerships." She pointed to a recent alliance with Otis Chandler's Goodreads.com, a social book site. "Goodreads generated a dump for mapping ISBNs and Goodreads identifiers. We designed a bot to look for the ISBNs and find matches. We ended up matching 3 million records in a day. It was really cool. Now maybe we can put Goodreads reviews in Open Library and vice versa." This bulk processing service is something Oates would like to try with publishers and publisher services such as the Koha open source service and LibraryThing. "Using an API, we could see other sites spring up around Open Library. We want to cultivate the capacity for external developers to build bots. We may be a fairly long way away from that, but we are aspiring. Our developers page focuses more on extracting data from Open Library than writing into it, but we are actively thinking about that." They definitely want to work more with publishers, especially in concert with the Internet Archive's open source Book Server project. They may also release features limited to registered users.
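The kind of ISBN matching Oates describes can be pictured with a short Python sketch. This is not the bot the Open Library team wrote; the function names, record keys, partner identifiers, and sample data are all hypothetical, and the normalization step is an assumption about what such a matcher would need.

```python
def normalize_isbn(raw):
    """Strip hyphens and spaces and uppercase a trailing check character."""
    return raw.replace("-", "").replace(" ", "").upper()

def match_by_isbn(partner_dump, catalog):
    """Pair catalog keys with partner ids wherever an ISBN is shared."""
    # Build a lookup from every normalized ISBN to its catalog key.
    isbn_to_key = {}
    for key, isbns in catalog.items():
        for isbn in isbns:
            isbn_to_key[normalize_isbn(isbn)] = key
    # Walk the partner dump once and keep the rows that hit the lookup.
    matches = {}
    for isbn, partner_id in partner_dump:
        key = isbn_to_key.get(normalize_isbn(isbn))
        if key is not None:
            matches[key] = partner_id
    return matches

# Invented catalog records: hypothetical key -> list of ISBN strings.
catalog = {
    "edition/1": ["0-306-40615-2"],
    "edition/2": ["080442957X", "9780804429573"],
}

# Invented partner dump: (ISBN, hypothetical partner identifier).
partner_dump = [
    ("0306406152", "partner-101"),
    ("0-8044-2957-x", "partner-202"),
    ("1234567890", "partner-303"),
]

matches = match_by_isbn(partner_dump, catalog)
print(matches)
```

Because the lookup is a dictionary, each row of the dump is checked in roughly constant time, which is one plausible reason a bulk match over millions of records could finish in a day.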

Another feature Open Library wants to build soon is the ability to create lists of books and export them in a number of ways, e.g., as bibliographies, for printing, for blogging, and for linking to a website. Oates pointed out that "the first step for sharing is not to be proscriptive on who uses what. We want diversity. If we had a million lists in 2 years, that data on the content would also help us make the system richer. It would tell us why some people think some books are related."

All in all, dreams of glorious accomplishment seem to lead the way. One can only wish them all the luck in the world.


Barbara Quint is senior editor of Online Searcher, co-editor of The Information Advisor’s Guide to Internet Research, and a columnist for Information Today.

Related Articles

7/23/2007 Open Library Launches with Library as Wiki Service
8/2/2010 Digital Lending Goes into OverDrive
3/7/2011 New Lending Model for Ebooks in Libraries from Internet Archive’s OpenLibrary
10/17/2011 Kansas Leading the Fight for Fair Ebook Access in Libraries
8/23/2012 Internet Archive Turns Up the Speed With BitTorrent
4/11/2013 Amazon Buys Goodreads to Bolster Its Advice Portfolio

