ClearForest Corp. (http://www.clearforest.com) has announced the availability of a new version of its text mining solution designed for use in enterprise business intelligence. New developments in Text Analytics release 6.0 include system management software, an administrative dashboard, upgrades to development tools, and integrated extraction-categorization capabilities. The company said its new platform fills the void between content management and business intelligence solutions and bridges the two worlds of unstructured textual information and structured enterprise data. The new release is said to provide easier system configuration and control, plus greater accuracy and relevance in information extraction and relationship discovery.
ClearForest technology provides structure for unstructured content by tagging key concepts, such as persons, organizations, and events, which are hidden within text. Text sources can include various document and text formats, e-mail, PDF files, Excel spreadsheets, and PowerPoint. Unlike search and categorization technologies that organize text for better access, ClearForest's technology automatically identifies and tags entities contained inside text and categorizes documents. Randy Clark, vice president of marketing, said, "We have an analytics vision, not a retrieval vision." According to Clark, release 6.0 provides a much tighter integration of the extraction and categorization capabilities, thereby providing greater precision.
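To make the idea of tagging entities in free text concrete, here is a minimal Python sketch of dictionary-based tagging. It is an illustration only and does not represent ClearForest's product, API, or NLP language; the `GAZETTEER` table, the `Tag` record, and the `tag_entities` function are all hypothetical names invented for this example, and a simple lookup list stands in for the statistical and semantic techniques a commercial system would use.

```python
import re
from dataclasses import dataclass

# Hypothetical gazetteer: entity type -> known surface forms.
# Illustrative values only; not ClearForest data or configuration.
GAZETTEER = {
    "Person": ["Randy Clark", "Jay Henderson"],
    "Organization": ["ClearForest", "Greylock"],
}


@dataclass(frozen=True)
class Tag:
    """One structured record extracted from unstructured text."""
    entity_type: str
    text: str
    start: int  # character offset where the match begins
    end: int    # character offset just past the match


def tag_entities(document: str) -> list[Tag]:
    """Return one Tag per gazetteer match, ordered by position in the text."""
    tags = []
    for entity_type, names in GAZETTEER.items():
        for name in names:
            # Word boundaries keep a name from matching inside a longer word.
            pattern = r"\b" + re.escape(name) + r"\b"
            for match in re.finditer(pattern, document):
                tags.append(
                    Tag(entity_type, match.group(), match.start(), match.end())
                )
    return sorted(tags, key=lambda t: t.start)


sample = "Randy Clark of ClearForest said the round was led by Greylock."
for tag in tag_entities(sample):
    print(tag.entity_type, tag.text, tag.start, tag.end)
```

Each resulting record has a type, a value, and a position, so it could be loaded into a table and queried alongside ordinary enterprise data, which is the bridge between unstructured and structured information that the announcement describes.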
"We've spoken to a growing number of companies who are struggling with an overwhelming amount of free-format text," said Jay Henderson, director of product marketing. "We believe we provide the best possible solution—a comprehensive platform that enables enterprises to systematically structure valuable unstructured content so that it can be processed along with enterprise data for business intelligence applications."
The content can also be drawn from a range of previously untapped corporate text sources. "In the last 2 years, it has become clear that text is the missing piece in understanding the business of an organization," said Susan Feldman, IDC's vice president for content technologies. "Customers communicate their frustrations, as well as their suggestions for product or service improvements in e-mail and on the phone. This text is lost if enterprises confine their analysis only to the data they collect. By analyzing the text of customer e-mail and CRM system text, businesses can detect problems before they become headaches, forestall litigation, or plan new products that meet customer needs. Joining that analysis to the business analytics they already use creates a more complete picture of what actually is happening. Corporations ignore the text they have stored in e-mail, CRM systems, or voice mail at their peril."
The ClearForest product suite comprises:
- ClearForest Tags: technology for accurate tagging and categorization of text
- ClearForest Industry Modules: industry-specific tagging and analytics capabilities
- ClearForest Analytics: an analytic system designed specifically for text analysis that adds value to organizations' existing business intelligence tools
A major change with the 5.0 release was the uncoupling of ClearForest's platform from the application components (the tagging and analytics). With 6.0, the Industry Modules have also been uncoupled from the platform. ClearForest Industry Modules build on the tagging and extraction functionalities and are capable of identifying entities unique to a particular industry, along with their relationships to one another. ClearForest currently offers industry modules for federal intelligence, patent analysis (intellectual property), and people and corporate profiles (for content providers). Clark said that a new module would be introduced in September. (For more information and company background, see the NewsBreak on release 5.0 at http://newsbreaks.infotoday.com/nbreader.asp?ArticleID=16591.)
ClearForest Tags' open and flexible platform supports statistical, structural, and semantic tagging as well as custom taggers, industry and custom taxonomies, and information agents. Its administrative dashboard allows for control, configuration, and monitoring of the extraction process. The open architecture means easy integration within existing business intelligence solutions. Release 6.0 may be deployed as a stand-alone analytics application or fully integrated within an existing business intelligence system, such as those offered by Cognos or Business Objects.
Clark commented that ClearForest is differentiated from competitors by offering programmability in its developer environment. "We are the only one with an open object-oriented NLP programming language." And, he added, "We have a platform, not just a collection of tools."
ClearForest's customers include Thomson Financial, Elsevier Science, Dow Chemical, and the FBI. This spring the company completed a $10 million round of funding led by venture capital firm Greylock. The company recently moved its headquarters to the Boston area to tap the high-tech workforce there and to be closer to potential partners, and it has offices in New York and Israel.