The newly announced WebTop.com (http://www.webtop.com), a Dialog Corporation service, provides a Web search engine with some exciting new features, particularly k-check, the "knowledge-checker." The search engine behind the service uses "linguistic inference" technologies to identify and match key concepts. As its spiders gather data on Web sites, they feed the data into a database that automatically categorizes sites into Company and News zones using Dialog's InfoSort technology. Business searchers can tap into pre-defined business categories.

Advocates of linguistic or NLP (natural language processing) search engines emphasize that the longer the search string given to a linguistic search engine, the more it has to work with to produce accurate, "disambiguated" results. Unfortunately, the tiny windows that most Web search engine interfaces offer hardly promote lengthy entries. Many Web searchers may feel uncomfortable typing in long entries as well. The WebTop.com system attempts to ease some of the burden of entering extended search statements with three options:
- Type and Search lets users enter full sentences as well as multiple key words.
- Copy and Paste lets users highlight a substantial amount of text in a document and paste it straight into the search query box.
- Drag and Drop offers users the k-check option.
The last option, k-check, offers particular advantages. Susan Feldman, a leading author and lecturer on sophisticated search engines, suggests that k-check could become one of those "small killer apps that change the way people work." Users can download the k-check software and put it on their desktop. Then when they work with a document (whether one they have written or an imported one), scan e-mail messages, or even look at Web pages, they can highlight any text that describes the information they need, "drag" it to the k-check icon, and "drop" it there. The system will then analyze the content of the text, extract the relevant concepts, and post them to WebTop.com for a search and display of relevant Web sites.
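To make the drag-and-drop flow concrete, here is a minimal sketch in Python of the general idea: take a block of text, reduce it to a few key terms, and turn those terms into a search request. It is not WebTop.com's method. The vendor describes "linguistic inference," whereas this sketch substitutes simple word-frequency counting, and the stopword list, the function names, and the example.com search URL with its "q" parameter are all assumptions made for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlencode

# Tiny illustrative stopword list (an assumption); a real linguistic
# engine would rely on far richer language analysis than this.
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "on", "for", "is",
    "are", "it", "that", "this", "with", "as", "by", "be", "was", "at",
}


def extract_terms(text, max_terms=5):
    """Return the most frequent non-stopword terms, most frequent first."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:max_terms]]


def build_query_url(text, base_url="http://example.com/search"):
    """Build a search URL from dropped text.

    The base URL and the 'q' parameter are hypothetical placeholders,
    not the interface of any real search service.
    """
    terms = extract_terms(text)
    return base_url + "?" + urlencode({"q": " ".join(terms)})


dropped = "The merger of the two banks. Banks and merger talk; banks again."
print(extract_terms(dropped, max_terms=2))
print(build_query_url("banks banks merger"))
```

Frequency counting keeps the sketch short and predictable. It cannot disambiguate word senses, which is the advantage the vendor claims for longer, sentence-length input.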
The importance of the innovation lies less in the technology than in changing the pattern of user behavior by making "knowledge-checking" as simple as spell-checking. With the rise of broadband Net connections, now extending to homes as well as offices through low-cost DSL and cable-modem connections, the "always-on" Internet becomes a reality. In an environment of effortless transitions into and out of the Net, this kind of feature could become revolutionary, according to Feldman.
John Snyder, CEO of WebTop.com, said, "Business professionals, particularly those new to the Web, want results—fast. The combination of our concept-driven search engine, which gets to the meaning of the user's interests, and our indexing of business information on the Web means that users finally have a serious solution to Web-based information retrieval."
After retrieving the results, WebTop.com divides them into three zones of information: the Web Zone for general Web site matches, the Company Zone targeting a collection of company Web sites, and the News Zone carrying items from selected news and magazine sites. The short search-results list sorts by most-relevant page on the site or the home page. If users want to dig down a few layers into sites, they click on the "links" symbol to access more information from a particular site. If they want to go directly to the site's home page, they click on the "home" symbol. In the Company and News zones, users can further refine searches using the InfoSort business categories, e.g., by business sector, context, and country or location. At this point, Snyder says, the Company Zone targets only some 6,000 company names, but the company is expanding that area.
The k-check feature works only when searching the WebTop.com database of Web sites, which hundreds of spider software robots build by crawling the Web. Currently that database covers over 50 million pages indexed from over a million Web sites, with an index growing at the rate of 8 million documents a day. It should reach 200 million pages by the New Year, according to the vendor.
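As a rough check on those figures, the short Python sketch below estimates how long the index would take to grow from 50 million to 200 million pages at the quoted rate. It assumes the rate stays constant and that "documents" and "pages" mean the same thing. The vendor states neither, and the function name is our own.

```python
import math


def days_to_reach(current_pages, target_pages, pages_per_day):
    """Whole days needed to grow from current to target at a constant rate."""
    if pages_per_day <= 0:
        raise ValueError("pages_per_day must be positive")
    remaining = max(0, target_pages - current_pages)
    return math.ceil(remaining / pages_per_day)


# Figures quoted by the vendor: 50 million pages now, 8 million added per day,
# 200 million as the target.
days = days_to_reach(50_000_000, 200_000_000, 8_000_000)
print(days)  # 19
```

At the quoted rate, the remaining 150 million pages would take just under 19 days.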
WebTop.com builds its engine around an existing product called EuroFerret, which had accrued some 36 million documents over the last 3 years, according to Snyder. The company has now aimed its spiders at U.S. sites. The expansion has gone very quickly, Snyder says: the jump from 36 million to 50 million took only 2 weeks. The EuroFerret heritage offers a particular advantage. The system can automatically identify other European languages written in the Roman alphabet, and searchers using the PowerSearch option can specify the language of the pages retrieved.
Searchers interested in trying out k-check can download it as part of a beta test now (http://www.webtop.com) or register at that site to receive a free copy when the company officially releases the software in 2000. According to Snyder, the final price had not been set, but he expected a decision by January. At present, WebTop carries banner ads, according to Snyder. He clearly hoped that a free version of k-check would continue to be available after the end of the beta test period, but other options were under consideration, including a premium version for a monthly subscription that might incorporate some Dialog data content.
The k-check software works with Microsoft Internet Explorer under Windows 95, 98, and NT. With some adjustments, it can also work with Netscape, but the company recommends Internet Explorer 4.0 or better.
We asked Snyder whether anyone else offered a similar drag-and-drop feature. He referred to GuruNet and FlySwat, but pointed out that those services limited the size of the entries they would accept to a few words, not whole sentences. He also believed that their databases were much smaller, making no attempt to reach out to the whole Web. Currently k-check usually gets 20 to 50 words, but users can set the limit to 1,000 words by using the right-click button on the mouse. He said that the sophistication behind the technology would make it difficult for future competitors to replicate effectively, since the spiders do some analysis on the fly using linguistic techniques.
K-working
The k-check and WebTop products could also connect, some day, to the intranet-oriented suite of software products that Dialog Corporation (http://www.dialog.com) introduced recently. The first k-working release offers three modules—InfoSort for automatic categorization, Discovery for natural language retrieval, and Alert for automatic searching using intelligent agents. (See the November 8, 1999 NewsBreak.) Institutions may license any or all of the modules. At full power, the k-working suite can link internal and external documents, including sites culled from the open Web.
Later releases should incorporate more features. Dan Wagner, CEO of the Dialog Corporation, said, "We fully intend to meet this growing market opportunity by extending the k-working range over the coming months to provide an even more comprehensive selection of k-working tools." This apparently represents a new direction for the company, a fact recognized in some of the documentation on Dialog's own Web site:
How is this [k-working] different from previous Dialog offerings?
K-working is Dialog's new move into markets which are not reliant on the provision of content. It is able to work with any internal or external information, thus it provides a solution for users not wishing to subscribe to Dialog's content services. However, it also broadens the capabilities of those organizations which wish to combine their Dialog content feeds with their internal data, and other external sources such as World Wide Web information to provide a broader knowledge-based strategy.
WebTop.com's Snyder indicated that they could deploy the k-check feature and spider technology to k-working. With those in place, organizations could spider their own databases and meld searching of the Web with internal documents. Snyder said that they could also customize spiders to move through barriers, e.g. by offering authenticated passwords to reach external, commercial data sources.
Currently k-working can work with databases created using Lotus Notes, Microsoft SQL Server, or Oracle, and external sources such as the Web and Dialog data. Organizations pay by module. For example, prices for a single server installation would start at $25,000 for InfoSort, $12,000 for Discovery, and $12,000 for Alert. For more information about the k-working suite, call 919/461-7348 or e-mail knowledgeworking@dialog.com.
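For budgeting purposes, the per-module starting prices quoted above can be totaled with a few lines of Python. This is only an arithmetic sketch. The figures are the single-server starting prices from the text, actual quotes may be higher, and the function name is our own.

```python
# Single-server starting prices in U.S. dollars, as quoted in the article.
STARTING_PRICES = {
    "InfoSort": 25_000,
    "Discovery": 12_000,
    "Alert": 12_000,
}


def starting_cost(modules):
    """Sum the starting prices for the chosen modules (each counted once)."""
    chosen = set(modules)
    unknown = chosen - STARTING_PRICES.keys()
    if unknown:
        raise KeyError(f"Unknown module(s): {sorted(unknown)}")
    return sum(STARTING_PRICES[name] for name in chosen)


print(starting_cost(["InfoSort", "Discovery", "Alert"]))  # 49000
print(starting_cost(["Discovery", "Alert"]))              # 24000
```

An organization licensing all three modules for one server would therefore start at $49,000.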