Inmagic, Inc. has expanded its catalog of software tools designed to harvest and classify textual content. The new Inmagic Gatherer crawls designated Web spaces (either internal or external) and extracts content so that metatags can be added and the content loaded into Inmagic's DB/TextWorks database software. The new Classifier can take such extracted content and automatically classify it according to a given taxonomy. The company introduced the products at the SLA 2002 conference in Los Angeles.
The announcement follows Inmagic's May release of DB/Text WebPublisher 6.0, which allows customers to publish DB/Text-resident content and Web-based applications to end-users via XML. The company says that the combination of the three products will allow customers to more easily harvest, organize, and publish their textual content on the Web. The WebPublisher features a drag-and-drop, WYSIWYG editing environment. Inmagic offers the suite of products with a single interface under the name IntelliMagic.
The new products represent a broadening of focus for Inmagic, which says it serves over 15,000 special libraries and information centers in 50 countries. Gatherer and Classifier are based on software from TopicalNet that incorporates patented technology.
Susan Stearns, Inmagic's vice president of marketing, told me she sees three features that distinguish the company's new offerings from those of competitors:
- Flexibility: Stearns says that DB/Text lets customers store both textual and structured data in a single knowledge base. The new products allow existing and new customers to manage, organize, and retrieve content in a variety of formats.
- RAD: According to Stearns, Inmagic customers can rapidly develop and enhance applications using the product suite. She says some customers have built, in a matter of days, applications that they previously assumed would take a year to develop.
- Cost: Stearns says the cost of these products is far lower than what competitors offer. As an example, she cites a company whose enterprise portal development was halted after an investment of over $2 million. A Gatherer/Classifier-based solution would cost a fraction of that amount. She said: "A typical configuration of Inmagic Gatherer/Classifier for an existing Inmagic customer can be purchased for less than $20,000, excluding any consultative services. A typical workgroup IntelliMagic configuration that includes DB/TextWorks, DB/Text WebPublisher, and Inmagic Gatherer/Classifier is typically priced at under $50,000."
Classifier ships with a default taxonomy provided by TopicalNet. A customer can use that taxonomy or can incorporate its own taxonomy into the classification scheme. Many automated classification systems "learn" how to process a given enterprise's content through training sets. The administrator feeds a small subset of documents into the system and guides it to classify those documents. In contrast, the Inmagic product can apply the built-in taxonomy out of the box. The customer can accept the resulting classification or choose to tweak it as necessary. Classifier also allows remote control, though it does not use a browser-based interface.
Stearns demonstrated Classifier by taking a set of PR Newswire announcements and classifying them. She used as a "local" taxonomy a subset of categories from the Open Directory Project. The product merged the two classification schemes, blending overlapping categories as appropriate. Stearns uses a metaphor to contrast the process with other automated classification systems: "With other products, you have a black box. You feed the thing your training documents, and if you don't like what it does, you retrain it with other documents. That's a black-box approach. Ours is a glass-box approach. You can see what's going on inside the system, and if you don't like it you change categories to the way you want them to look using a drag-and-drop interface."
Gatherer runs under Windows NT, 2000, or XP Professional and requires a minimum of 256 MB of RAM. It will have a Web-based interface for scheduling and similar administrative tasks. Classifier runs under Windows 2000 and XP, as well as Red Hat Linux. Pricing is per server. Inmagic expects that most customers will not require more than one server each for Gatherer and Classifier.
The products support a variety of formats, including text, Microsoft Office, Adobe PDF, and HTML. Through XML support, Gatherer and Classifier can feed into any customer format.
Inmagic claims that the new products are highly scalable. Classifier is said to handle a throughput of 1.5 million pages per day per server. Classifier can also produce PDF reports that graphically depict major topics and subtopics in proximity.
Stearns says the toolset can be used for workgroup or enterprise alerting functions, such as informing a lawyer in a large firm when information pertaining to a certain practice area enters the textbase. She also points out that security features allow a customer to blend licensed content from third-party database providers, restricting access to those individuals in the enterprise with access privileges.
With Gatherer and Classifier, Inmagic continues to broaden its target market. At the same SLA meeting where Inmagic announced the new products, Inmagic founder Betty Eddison was inducted into the SLA Hall of Fame. Inmagic CEO Phil Green observed: "Betty Eddison always had the broadest interpretation of the information professional. These new Inmagic products carry forward that vision, expanding Inmagic's offerings further beyond those associated with more traditional library and information center needs into the knowledge management applications of any organization."
Inmagic says that as it moves beyond its traditional base of special library and information center customers into new markets, its new customers include law firms and pharmaceutical companies.