LexisNexis (http://www.lexisnexis.com) has announced that it is adopting technology that automatically extracts targeted Web content to streamline updates to its Directory of Corporate Affiliations (DCA; http://www.corporateaffiliations.com) and its Advertising Red Books (http://www.redbooks.com). WhizBang! Labs, Inc. (http://www.whizbang.com) will provide its Extraction Framework software as part of a strategic alliance with LexisNexis. The software will crawl relevant Web sites to find contact information for LexisNexis' directory products, then populate the appropriate fields in LexisNexis databases with the extracted information.
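To make the extract-then-populate idea concrete, here is a minimal sketch in Python. It is not WhizBang!'s Extraction Framework, whose internals are not described in the announcement. It assumes a page has already been fetched and swaps in simple regular expressions to pull a phone number and an e-mail address into named fields. The function name, field names, and sample values are all hypothetical.

```python
import re
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the visible text nodes of an HTML page."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


# Deliberately simple patterns (an assumption for illustration only):
# a US-style phone number and a basic e-mail address.
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_contact_fields(html: str) -> dict:
    """Return a hypothetical directory record built from one page's text."""
    collector = _TextCollector()
    collector.feed(html)
    text = " ".join(collector.chunks)

    phone = PHONE_RE.search(text)
    email = EMAIL_RE.search(text)
    return {
        "phone": phone.group(0) if phone else None,
        "email": email.group(0) if email else None,
    }


if __name__ == "__main__":
    page = "<p>Phone: (801) 555-0100</p><p>Email: info@example.com</p>"
    print(extract_contact_fields(page))
```

A production system would need far more than pattern matching, since contact details on real corporate sites appear in many layouts, but the sketch shows the basic shape: unstructured page text in, structured database fields out.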
Tom Derry is the general manager in charge of the DCA and Red Books products. He says the adoption of WhizBang! technology "is a significant component of re-engineering our processes for data acquisition. In the past, we used paper-based questionnaires that were annually sent to each contact person in the directory. In the age of the Web, our customers expect directory information to be current, fresh, and accurate."
Asked if the machine-based process might be prone to error, he chuckled and said: "You know, when a human makes a mistake with directory information, it tends to be subtle—a wrong digit in a telephone number, for instance. When a machine makes a mistake, it tends to be quite obvious at a glance." Human editors will review the automated content postings, but Derry declined to give specific details of the editorial work flow for maintaining the DCA and Red Books products. He did reveal this much: "We have a staff of 30 editors, each of whom is assigned an industry group to become expert in. They have a schedule of monthly, quarterly, and annual tasks to perform in order to keep our directories current. WhizBang! helps with some of the more mundane kinds of information, which frees our editors to focus on higher value-added information, such as understanding the interrelationships among corporations."
The new LexisNexis partnership comes close on the heels of a recently announced partnership with iPhrase to provide a natural language interface to the same DCA and Red Books products. (See the October 29, 2001, NewsBreak at http://newsbreaks.infotoday.com/nbreader.asp?ArticleID=17472.) Taken together, the two announcements mean that LexisNexis has moved from a manual process to an automated one for updating these directories, and from a Boolean interface to a natural language one for searching them.
The new automated updating of contact information will allow a DCA or Red Books entry to be as current as the contact information provided on the relevant firm's Web site. Without the new process, a user of the directory might find contact information that is outdated in the directory yet current elsewhere on the Web. In effect, this allows the LexisNexis directories to be "synchronized" with unstructured information in far-flung Web sites. "Our customers are very savvy," says Derry. "They know that much of the directory information they need is out there on the Web somewhere. They expect our online directories to be equally current, and now [they] can be."
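The "synchronization" described here can be pictured as a comparison between what the directory currently stores and what was most recently extracted from the Web. The Python sketch below is an illustration under stated assumptions, not a description of the LexisNexis workflow: the function name, the record layout, and the sample values are hypothetical. It reports fields whose Web value differs from the stored one so that a person could review them.

```python
def find_stale_fields(directory_entry: dict, web_values: dict) -> dict:
    """Map each out-of-date field to a (stored_value, web_value) pair.

    Fields for which nothing was extracted (None) are skipped, so a
    failed extraction never overwrites existing directory data.
    """
    changes = {}
    for field, web_value in web_values.items():
        if web_value is None:
            continue  # nothing found on the Web; keep the stored value
        stored_value = directory_entry.get(field)
        if stored_value != web_value:
            changes[field] = (stored_value, web_value)
    return changes


if __name__ == "__main__":
    stored = {"phone": "(801) 555-0100", "email": "info@example.com"}
    extracted = {"phone": "(801) 555-0199", "email": "info@example.com", "fax": None}
    # Only the phone number differs, so only it is flagged for review.
    print(find_stale_fields(stored, extracted))
```

Returning the old and new values side by side, rather than overwriting the record directly, leaves room for the kind of editorial review Derry mentions.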
LexisNexis appears to be pursuing a strategy of partnering with search- and Web-technology companies to add targeted functionality to individual products. For instance, the Lexis side of the house recently partnered with DolphinSearch to give law firms the ability to classify and search millions of documents in evidence (see the December 10, 2001, NewsBreak at http://newsbreaks.infotoday.com/nbreader.asp?ArticleID=17446). These adoptions of proprietary tools appear to be tailored to the particular needs of particular products, rather than to be part of an overarching new architecture for enhancing all of LexisNexis. Indeed, WhizBang! technology overlaps DolphinSearch in some ways: WhizBang! says that its software "finds, classifies, and extracts information from a wide variety of unstructured sources, including intranets, extranets, Web pages, and document databases."
WhizBang! Labs is based in Provo, Utah, and was founded in 1999.