Searching Scholarly Tables, Figures, Graphs, and Illustrations with CSA Illustrata
Posted On January 22, 2007
CSA Illustrata is a new resource from CSA (www.csa.com) that provides deep indexing of the tabular and other graphic information published within scholarly articles. Running on the CSA Illumina platform, CSA Illustrata allows researchers to explicitly search for information presented in tables, charts, graphs, maps, photographs, and other figures. Users can view the full object (including all caption and label text), save marked results, and import the illustrations into presentations, lectures, or research. The first database available in the Illustrata product line is CSA Illustrata: Natural Sciences. Journals from Blackwell Publishing, CSA's development partner, contribute the bulk of the scholarly articles, although other publishers are also represented.
The product has been in development for at least 2 years. It was tested extensively in a "proof of concept" research project headed by Carol Tenopir, Robert J. Sandusky, and Margaret M. Casado (The University of Tennessee-Knoxville). The team showed a prototype Tables & Figures index of approximately 300,000 objects to researchers at seven universities and two research institutes in the U.S. and Europe. The executive summary is online, and the 90-page "CSA Illustrata White Paper" can be requested (http://info.csa.com/csaillustrata).
CSA Illustrata: Natural Sciences contains more than 1 million illustrations. Searches within the caption field take advantage of the Boolean precision elements inherent in Illumina. In addition to the fields one expects from a bibliographic database (author, title, journal title, descriptors), Illustrata fields include caption, category, DOI (digital object identifier), and geographic terms. Fields that specifically index the graphic object rather than the article in its entirety include geography, statistics, subject, and taxonomy.
Search boxes allow for limitation by date range, predictive model, and category. The six check boxes for category (graph, illustration, map, photograph, table, and transmission/emission image) are not the only possibilities for this field. Using the search boxes and selecting "category" at the far right of the screen, further refinements by contour plot, pie chart, line graph, gel, time-series plot, and others are possible. Thus, if a professor wants to amplify a lecture on expressed sequence tags with a pie chart, CSA Illustrata can supply the needed graphic.
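The kind of narrowing described above (caption terms combined with a category and a finer-grained refinement) can be pictured with a small sketch. This is not CSA's implementation or the Illumina query syntax. The record layout, the field names (caption, category, subcategory, year), the search function, and the three sample records are all hypothetical, invented only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class IllustrationRecord:
    """Hypothetical stand-in for one indexed object (not CSA's schema)."""
    caption: str
    category: str          # broad type, e.g. "graph", "map", "table"
    subcategory: str = ""  # finer refinement, e.g. "pie chart"
    year: int = 0


def search(records, caption_terms, category=None, subcategory=None, year_range=None):
    """Return records whose caption contains every term (a Boolean AND),
    optionally limited by category, subcategory, and inclusive year range."""
    results = []
    for rec in records:
        text = rec.caption.lower()
        if not all(term.lower() in text for term in caption_terms):
            continue
        if category and rec.category != category:
            continue
        if subcategory and rec.subcategory != subcategory:
            continue
        if year_range and not (year_range[0] <= rec.year <= year_range[1]):
            continue
        results.append(rec)
    return results


# Invented sample records, for illustration only.
records = [
    IllustrationRecord("Distribution of expressed sequence tags by functional class",
                       "graph", "pie chart", 2005),
    IllustrationRecord("Expressed sequence tags per library over time",
                       "graph", "line graph", 2004),
    IllustrationRecord("Sampling sites in the study area", "map", "", 2006),
]

# The professor's request: a pie chart about expressed sequence tags.
hits = search(records, ["expressed sequence tags"],
              category="graph", subcategory="pie chart")
for hit in hits:
    print(hit.year, hit.subcategory, "-", hit.caption)
```

Dropping the subcategory argument would return both graph records, which mirrors how the six check boxes give a coarse cut and the "category" search box a finer one.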
Search results show a thumbnail of the illustration, the beginning of its caption text, a citation to the article, and hyperlinked object descriptors. Results can be sorted either in reverse chronological order or by relevance. From the results list, searchers can click through to either the full article or an abstract. Clicking on the author will retrieve a list of all articles in the database by that author. Items in the results list can be saved, printed, or emailed. Clicking on the thumbnail brings up a full-screen view of the illustration, plus the full caption, category descriptors, citation, object descriptors, publisher name and address, DOI, object DOI, ISSN, and accession number.
According to Matt Dunie, CSA president, deep indexing "will be the evolution of secondary publishing." Diane Hoffman, CSA's senior director of life sciences, firmly believes that deep indexing "is the only new thing in indexing since citation analysis." Librarians who participated in a CSA Illustrata focus group during the fall Charleston Conference were highly impressed and presented the development team with several suggestions that were adopted and shown at the Online Information 2006 show in London at the end of November. The sticking point, most agree, is price. As one science librarian from a large public university said, "It's seductive, but I wonder if I'll be able to afford it."
CSA firmly believes that Illustrata offers a unique approach to scholarly literature. That's only partially true. TableBase attempts a similar deep indexing for tabular information in the business literature. It provides descriptors for the table, plus searchable fields for the table titles and text. Begun by Responsive Database Services 10 years ago, TableBase is now a Thomson Gale database that has, unfortunately, seen very little innovation or product development since its introduction. In a Web 2.0 world, its demeanor is sadly antiquarian.
CSA Illustrata is a breakthrough product, opening up the wealth of data previously hidden in article graphics. With a toehold in the natural sciences, the future vision is to expand beyond that subject area. With the recent acquisition of ProQuest Information and Learning (see the NewsBreak at http://newsbreaks.infotoday.com/nbReader.asp?ArticleId=18853), CSA has the opportunity to expand not only to other scientific disciplines but also to open up hidden information in ProQuest's business, historical, and scholarly databases.