Language is inherently ambiguous, so keyword searching can be an exercise in frustration, especially on the Internet. When I search for information on Java the programming language, I don't want to see things about coffee, or an island, or whatever. Recently, two Internet startup companies have attacked this problem of word ambiguity, each developing its own proprietary search algorithms that are based on word meaning. One product, Oingo, has been in beta testing and is now available for general use. Another, SimpliFind, has just been announced and is currently in beta testing by a group of select users.
Both of these technologies are built upon an underlying lexicon called WordNet, which was developed over the last 15 years by researchers at Princeton University. The lead scientist on the project, Dr. George Miller, is now working closely with developers at Simpli.com, Inc. WordNet is a network of words and associations that emulates the natural structure of language. Each company reports that it has taken the WordNet lexicon and expanded and built upon it, using teams of linguistic experts.
Oingo, Inc., a privately held company based in Los Angeles, was started in November 1998 by two Caltech alumni, Adam Weissman and current company CEO Gil Elbaz. Oingo launched in beta at Fall 1999 Internet World, winning the "Best of Show" Award for Outstanding Internet Service. In early December, the company announced the official launch of the service and invited portals, content providers, and other e-businesses to incorporate the Oingo meaning-based search technology royalty-free. Oingo Free Search includes a set of tools that let Webmasters integrate Oingo's search technology into any Web site.
"During our beta phase, we were overwhelmed by the amount of positive feedback we received from our end users," said Eytan Elbaz, Oingo's director of business development. "With Oingo Free Search, we extend the benefits of our technology to Internet users on a much broader scale. This marks the first time that such a compelling search product has been offered at no cost."
He also indicated that this is an introductory product that they hope will grab market interest and generate support for meaning-based search technology. The company plans to make available additional search products during the first quarter of 2000 that will generate revenue for Oingo based on a per-search license fee, and offer additional concept-driven search customization capabilities.
Oingo currently indexes the entire Netscape Open Directory Project (ODP), which also serves as the basic directory for a number of other search engines. A target document for Oingo thus consists of a single subject page from the ODP. A search proceeds in steps. First, the user types a word or phrase. Oingo then presents the specific meanings it knows for that entry, and the user can refine the search by limiting it to particular meanings, as in the Java example above. An advantage of working from a lexicon with meanings is that users can do a conceptually fuzzy search. In other words, the system will also look for what the user meant but didn't say. Oingo's motto is "We know what you mean."
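To make the idea of searching by meaning concrete, here is a minimal Python sketch. It is not Oingo's algorithm, which is proprietary. The tiny lexicon, the sample directory pages, and the names Sense, list_senses, and search_by_sense are all invented for illustration. The sketch lists the meanings of an ambiguous word and then finds pages that mention terms tied to the chosen meaning, even when the pages never use the word itself.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    word: str
    gloss: str
    related: frozenset  # terms associated with this particular meaning


# Toy lexicon (hypothetical data standing in for a WordNet-derived one).
LEXICON = {
    "java": [
        Sense("java", "programming language", frozenset({"applet", "jvm"})),
        Sense("java", "coffee", frozenset({"coffee", "espresso", "mocha"})),
        Sense("java", "Indonesian island", frozenset({"indonesia", "jakarta"})),
    ],
}

# Toy directory pages (hypothetical), keyed by a category-style title.
PAGES = {
    "Computers/Programming": "Writing an applet for the JVM",
    "Recreation/Food/Drink": "Reviews of espresso and mocha blends",
    "Regional/Asia": "Travel guides to Jakarta, Indonesia",
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def list_senses(word):
    """Return the known meanings of a word, or an empty list."""
    return LEXICON.get(word.lower(), [])


def search_by_sense(sense, pages):
    """Return titles of pages mentioning any term tied to the chosen sense."""
    return [title for title, text in pages.items()
            if sense.related & tokenize(text)]


senses = list_senses("Java")
for number, sense in enumerate(senses, start=1):
    print(number, sense.gloss)

# The user picks the first meaning; only the programming page is returned.
print(search_by_sense(senses[0], PAGES))
```

None of the sample pages contains the word "java," yet each meaning still retrieves the right page, which is the "what the user meant but didn't say" effect in miniature.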
In the section on Search Tips, the user is advised that "Oingo is not a natural language search. This means that you will get better results when you search for phrases instead of entire sentences." The Oingo search engine includes the ability to require that all results be relevant to a specific entered meaning. This is analogous to a Boolean AND operation. By entering a plus sign (+) before a term, the user can indicate that a result must "hit" either the plain text of that term or a meaning that is semantically similar to the meaning implied by that term. According to the company, the addition of more complex Boolean operations is currently under development.
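The plus-sign behavior can be sketched in a few lines of Python. This is only an illustration of the rule as the company describes it, not Oingo's implementation: the SIMILAR table and the functions parse_query and satisfies are hypothetical, and how Oingo treats unmarked terms is not stated, so the sketch simply sets them aside as optional.

```python
import re

# Hypothetical table of semantically similar terms for illustration only.
SIMILAR = {
    "car": {"automobile", "auto"},
}


def parse_query(query):
    """Split a query into required terms (prefixed with +) and the rest."""
    required, optional = [], []
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:])
        else:
            optional.append(token)
    return required, optional


def satisfies(text, required):
    """True if every required term hits the text literally or via a similar term."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return all(
        term in words or bool(SIMILAR.get(term, set()) & words)
        for term in required
    )


required, optional = parse_query("+car rental")
print(satisfies("Automobile rental agencies", required))  # similar term counts as a hit
print(satisfies("Bicycle rental shops", required))        # no hit on the required term
```

Because every required term must be satisfied, adding more plus-marked terms narrows the results the way a Boolean AND would.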
Elbaz indicated that the company is talking with some of the major portals and has already arranged some licensing deals that will be announced soon. He said that GuruNet.com would be embedding Oingo into its Instant Expert service.
About that name, Oingo, here's what Elbaz said when I asked: "Oingo is actually an acronym, but we're not saying what it stands for just yet. It has to do a lot with our next big project, and we don't want to give anything away until we have to." Hmmm, I wonder if there's a connection somehow to that eclectic '80s music group in Southern California that could "never really be categorized" (which, of course, I found by searching "Oingo" in Oingo)? …
The other interesting company tackling word ambiguity, Simpli.com, Inc., has just announced SimpliFind, a patent-pending proprietary search technology for improving Internet search results. The technology is currently in beta testing and, according to a company representative, "the developers are working aggressively on making it available to the general public as soon as possible." The real target market, however, will probably prove to be portals and search engines that would license the technology as a front end for improving search queries. Oingo, at this time, is tied to a specific database (the ODP), while SimpliFind is designed to work on any database with any information system.
SimpliFind incorporates two main options: a simple search option with "interactive query disambiguation," and an advanced search option with multiple fields. When you enter a term into the search field in a simple search, SimpliFind retrieves a list of meanings from its database (built from WordNet) and generates a pull-down menu. Users then select the appropriate meaning or enter a new meaning. For terms not in the database, users are prompted to provide a meaning, which SimpliFind learns and uses for all subsequent visits. The advanced search provides two additional text fields for multiple query terms, designed to encourage users to enter more information. Words related to the search term and its indicated meaning are used to expand the query (such as mocha and espresso for the term java, with coffee as its meaning) and are then used to prioritize the results. Search results are then based on meanings of words and the context.
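The expansion-and-prioritization step can be illustrated with a short Python sketch. SimpliFind's actual method is proprietary and patent-pending, so everything here is an assumption: the RELATED table, the simple word-count score, and the function names expand_query, score, and prioritize are invented for the example, which uses the article's own java/coffee case.

```python
import re

# Hypothetical related-term table keyed by (term, chosen meaning).
RELATED = {
    ("java", "coffee"): ["mocha", "espresso"],
    ("java", "programming language"): ["applet", "jvm"],
}


def expand_query(term, meaning):
    """Return the original term plus words related to its chosen meaning."""
    return [term] + RELATED.get((term, meaning), [])


def score(text, expanded_terms):
    """Count how many times any expanded term occurs in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(words.count(term) for term in expanded_terms)


def prioritize(documents, term, meaning):
    """Drop documents with no hits and list the rest, best match first."""
    expanded = expand_query(term, meaning)
    scored = [(score(doc, expanded), doc) for doc in documents]
    hits = [pair for pair in scored if pair[0] > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in hits]


documents = [
    "Java applet tutorial",
    "Java, mocha and espresso tasting notes",
    "Island travel brochure",
]
for doc in prioritize(documents, "java", "coffee"):
    print(doc)
```

With coffee selected as the meaning, the tasting-notes page rises above the applet tutorial because it also matches the expansion words, while the page with no matching terms drops out.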
Simpli.com reports that it has expanded the original WordNet lexicon, creating millions of word associations and meanings while refining the process for greater ease of use. The company credits its management team's expertise in linguistics, cognitive science, and computer science for the enhancements that make the system highly relevant and scalable. According to the company, it's these added capabilities that make SimpliFind so innovative and something that will only get better with time and more usage. Simpli.com will be discussing relationships with search engines, portals, and companies that need highly relevant search capabilities.
The company was created several years ago by chairman and CEO Jeffrey Stibel, drawing on a combination of his undergraduate and Ph.D.-level work and real-world Internet/business experience. He incorporated it as Simpli.com in the late spring of 1999. Currently Simpli.com is privately held. A round of capital was raised from various angel investors, board members, advisors, and employees, and the company is in the process of seeking additional venture capital. The Providence, Rhode Island-based company now has 20 full-time and 18 part-time employees.
Raising the Bar
In the ongoing quest to design ever more useful and refined search engines, I believe we are only seeing the beginning of developments that attempt to deal with the nuances of language and bring human judgment into the search equation. The efforts of these two companies, and other innovative developments such as the k-check feature in WebTop.com (a Dialog Corporation service—see this week's NewsBreak "Web Searching from Dialog via Webtop.com ... "), will certainly advance the cause of Web searching and foster even more improvements. Carl Lehmann, vice president of the META Group, said, "We have followed the evolution of the search engine market for years, and I believe Simpli.com represents the beginning of the next generation of intelligent search and retrieval services."
Sue Feldman, author and information industry consultant, said: "These two products mark the beginning of wise use of natural language processing for everyday information finding. Poor questions and queries have always been the biggest stumbling block to searching. We rely on language to ask questions, but most words have many meanings. By helping searchers choose the right meaning of a term, and by adding additional pertinent terms, both these products should help a searcher find focused results that are not overly narrow."
For more information, visit http://www.oingo.com and http://www.simpli.com.