Verity Announces New Content Classifier
Richard W. Wiggins
Posted On June 21, 2004
Verity, Inc. (http://www.verity.com) has announced its new Collaborative Classifier, software designed to enable large enterprises to build taxonomies and organize institutional knowledge. The company says the Verity Collaborative Classifier, or VCC, is the first role-based, distributed classification software.
Susan Feldman, research vice president for content technologies at IDC, said that VCC builds on technologies Verity obtained when it purchased Inktomi's enterprise search business in 2002. Inktomi had previously acquired software vendor Quiver, whose automated classification products featured work-flow management. Verity first integrated Quiver functions into its K2 knowledge management product.
Scott Whitney, director of product management for Verity, said that although VCC is built in part on previous Verity technologies, "It's a new product. It's a real, live Release 1.0." Whitney draws an analogy between VCC and a well-designed GPS system in a car, one that allows a driver not only to navigate unfamiliar territory but also to search by category for restaurants, gas stations, airports, theme parks, and so on. "We build maps of an enterprise's information," he said. He claims VCC brings "a major change in the way organizations can categorize an ever-growing pile of unstructured data."
VCC is based on roles and rules. Roles include taxonomy experts, subject matter experts within the organization (chemists, engineers, or human resources staff, for example), editors, and publishers. The company says the work-flow feature allows taxonomy and classification management to be distributed to subject matter experts who know the content, as well as to knowledge engineers who understand taxonomy development. People in different roles are assigned different permissions to alter categories. The company claims that VCC is the only software that enables such real-time collaboration between knowledge workers and subject experts.
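To make the idea of role-based permissions concrete, here is a minimal Python sketch. It is not Verity's implementation: the four roles come from the description above, but the action names, the permission sets assigned to each role, and the `can` helper are all hypothetical, chosen only to illustrate how different roles might be allowed different changes to a taxonomy.

```python
from enum import Enum


class Role(Enum):
    # Role names taken from the article; everything else is illustrative.
    TAXONOMY_EXPERT = "taxonomy_expert"
    SUBJECT_MATTER_EXPERT = "subject_matter_expert"
    EDITOR = "editor"
    PUBLISHER = "publisher"


# Hypothetical mapping of roles to permitted actions (an assumption,
# not VCC's actual permission model).
PERMISSIONS = {
    Role.TAXONOMY_EXPERT: {"create_category", "rename_category", "delete_category"},
    Role.SUBJECT_MATTER_EXPERT: {"propose_change", "review_document"},
    Role.EDITOR: {"rename_category", "review_document"},
    Role.PUBLISHER: {"publish_category"},
}


def can(role: Role, action: str) -> bool:
    """Return True if the given role is permitted to perform the action."""
    return action in PERMISSIONS.get(role, set())


if __name__ == "__main__":
    print(can(Role.TAXONOMY_EXPERT, "create_category"))
    print(can(Role.PUBLISHER, "delete_category"))
```

In a scheme like this, a subject matter expert could propose a change to a category without being able to delete it, which is one plausible way to distribute the work while keeping the taxonomy's structure under expert control.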
VCC uses rules to define how documents should be classified. Once taxonomies have been set up, VCC automatically classifies new documents as spiders discover them. The customer can control automatic classification by setting a threshold on how confident VCC must be that a document matches a category. For instance, a rule might say that if VCC's confidence level for a candidate category exceeds 70 percent, the document is automatically published into that category; if not, the decision is routed to an assigned knowledge worker. Verity calls this "automated classification with manual oversight."
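The threshold rule described above can be sketched in a few lines of Python. This is an illustration only, not VCC's rule language or API: the `route` function, the `Decision` type, the category names, and the confidence scores are all assumptions. Only the 70 percent figure and the publish-or-route-to-a-person behavior come from the example above.

```python
from dataclasses import dataclass
from typing import Dict, List

# The 70 percent figure is the example given in the article.
PUBLISH_THRESHOLD = 0.70


@dataclass
class Decision:
    category: str
    confidence: float
    action: str  # "publish" or "review"


def route(scores: Dict[str, float],
          threshold: float = PUBLISH_THRESHOLD) -> List[Decision]:
    """Decide, per candidate category, whether to publish or ask a person.

    `scores` maps a candidate category to a classifier confidence in [0, 1].
    A confidence strictly above the threshold is published automatically;
    anything else is routed to a knowledge worker for review.
    """
    decisions = []
    for category, confidence in scores.items():
        action = "publish" if confidence > threshold else "review"
        decisions.append(Decision(category, confidence, action))
    return decisions


if __name__ == "__main__":
    # Hypothetical scores for one newly discovered document.
    for decision in route({"polymers": 0.82, "workplace safety": 0.55}):
        print(decision.category, decision.action)
```

Raising the threshold sends more documents to human reviewers, while lowering it publishes more documents without review, so the setting is the customer's trade-off between manual effort and classification accuracy.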
Verity worked with DuPont to develop VCC. Whitney said DuPont uses VCC to manage a 25,000-node taxonomy. Internal users include "everyone from a bench chemist to a knowledge engineer, whoever." He said VCC provides DuPont with "frictionless review between highly specialized knowledge workers, many with Ph.D.s, and the knowledge engineering staff."
Usually we think of classification schemes as a way to help people discover information. Whitney cited Raytheon as an example of a customer with a different need: content classification to help identify information that can't be shared with the public for security reasons. He said Raytheon is using VCC to profile all of the content on its extranets to ensure that documents that must be restricted under federal security regulations don't accidentally leak out to the public Web.
Feldman noted that many organizations do not have a large staff of librarians or other information professionals trained in taxonomy management and document classification. She said VCC could save enough manual labor that a relatively small professional knowledge management staff could serve the needs of a large organization. She noted that even an organization such as her own (IDC) could benefit. "We have very large taxonomies; each of our 700 analysts is responsible for our own subject areas." She envisions VCC's automated but human-assisted classification and work-flow management allowing "much less burdensome" knowledge management for such an organization.
Feldman praised VCC's user interface, calling it "friendly and accessible." She predicts both knowledge workers and subject experts will find VCC easier to work with than previous products.
Whitney said VCC can interact with content management systems, such as Interwoven, Lotus Notes, FileNet, and Documentum.
Other companies offering automated classification products for the enterprise include IBM with its Discovery Server product, based on IBM Almaden research, and Inmagic, whose Classifier product is powered by TopicalNet.