It has been several years since a new search engine launched with its own unique database at a size that comes close to that of the major search engines. Cuil (www.cuil.com) is just such a search engine, from creators with impressive backgrounds: Google alumna Anna Patterson, who also created the Recall search engine for the Internet Archive’s Wayback Machine; her husband, Tom Costello, who worked on IBM’s WebFountain; and AltaVista’s founder Louis Monier. With such search heavyweights behind it, Cuil launched last month to high expectations.
Interest on launch day was strong enough that the Cuil servers failed to handle the load, and the search engine was offline for several hours. The press and the blogs buzzed with fairly negative reactions: the site was offline, the relevancy ranking differed from Google’s, sites were missed, images were not related, and more. It seemed that everyone was looking for a "Google killer," and this was not it.
Those were high expectations indeed for a launch day. Cuil never claimed to be a "Google killer," according to Vince Sollitto, vice president of communications at Cuil and a former spokesman for California governor Arnold Schwarzenegger. Instead, Cuil is "trying to provide an alternative approach to search." Sollitto notes four ways in which Cuil differs from other search engines: database size, results presentation, content-based relevance, and search privacy.
Perhaps Cuil’s mistake was to dare to state the size of its database, especially when it titled its press release "Cuil Launches Biggest Search Engine on the Web." Claiming more than 120 billion pages indexed and "three times more than any other search engine," Cuil set itself up to be knocked down. Danny Sullivan waxed eloquent about the whole size issue at Search Engine Land: "You want to have a comprehensive collection of documents from across the web. But having a lot of documents doesn’t mean you are most relevant" (http://searchengineland.com/080728-000100.php).
Early comparisons of results from Cuil showed that it does not find as many pages as Google, Yahoo!, or Microsoft’s Live Search. Several people have complained that their sites are missing. Even so, no other search engine has ever launched with such a large database. And, since this is a unique database, Cuil could have pages that get missed at Google, Yahoo!, Live Search, Ask, Gigablast, or Exalead.
The Results Presentation
Rather than a straight listing of the top 10 most highly ranked results, Cuil uses a two- or three-column display (users choose the display with links in the lower right corner). Cuil calls this a "magazine style layout." It is certainly a departure from the results listing of most other search engines, although not as radical a departure as that of SearchMe.com.
The more controversial difference is the attempt to automatically add images to each search result. According to Sollitto, "We like to place an image … to help the user understand what the page is about." To do so, Cuil either uses an image from the page itself or tries to find one on another site that is "representative of the content of the page." Sollitto acknowledges that this "is very hard to do." At this early stage, there are far too many unexpected mismatches, such as images of a competitor’s product linked to a company’s page and pictures of the wrong person connected with name searches.
As opposed to the typical 10, Cuil displays 11 results. While it has space for a 12th, Cuil instead uses that space for the "Explore by Category" box. This section provides suggested related searches. Mouse over one of the suggestions to see more specific suggested searches within the broader heading. For example, search "taxonomy" to see several category suggestions, including "Zoological Nomenclature." Mouse over that to see the narrower suggestions such as subspecies, PhyloCode, and International Code of Zoological Nomenclature.
Search engines tend to be secretive about the details of their ranking algorithms, both to protect trade secrets and to combat gaming of results. Cuil’s aim of relying on different ranking factors than the others leads to some unusual results. Its intent is to "[rank] results by the content on each page, not its popularity." Here the specifics get murky.
While some have assumed that the ranking of results relies on tried (and, at least for web searching, failed) techniques such as term frequency and inverse document frequency, it seems that Cuil is trying to extract more semantic meaning in more sophisticated ways. Sollitto says Cuil uses the text on a page to "identify concepts and relationships." The "Explore by Category" box is one example. Another is the suggestions of narrower categories that sometimes appear in the gray bar near the top.
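For readers unfamiliar with term frequency and inverse document frequency, the following is a minimal sketch of that older style of content-based scoring. It is not Cuil's method, whose details are not public; the function name, the whitespace tokenization, and the sample documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf_scores(query, documents):
    """Score each document against the query with a basic TF-IDF sum.

    Hypothetical helper for illustration only; not any engine's real ranking.
    """
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: the number of documents containing each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if counts[term] == 0:
                continue  # term absent from this document
            tf = counts[term] / len(tokens)            # term frequency
            idf = math.log(n_docs / doc_freq[term])    # inverse document frequency
            score += tf * idf
        scores.append(score)
    return scores


# Made-up sample documents.
docs = [
    "taxonomy of zoological nomenclature",
    "search engine ranking and search relevance",
    "taxonomy and search",
]
print(tf_idf_scores("search ranking", docs))
```

In this toy run the second document scores highest because it repeats "search" and is the only one containing "ranking," a rare term that earns a larger weight. Scoring of this kind rewards word counts alone, which is one reason it is easy to game on the open web.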
A third type of suggestion shows up as the searcher types. The search suggestions (which can be turned off in the preferences) appear as a drop-down menu from the search box. In these early stages, the suggestions do not work all the time. Sollitto says that the suggestions are "built organically" as Cuil indexes the pages. The phrases come from extracted text, not from user queries.
Privacy and the Bottom Line
If judged by the aim of offering "an alternative approach to search," this early version of Cuil has achieved that goal. For the general user, it will not become a compelling alternative until the relevance, consistency, reliability, and image matching are greatly improved. For the information professional, Cuil is worth evaluating at this stage to see how it works and to consider its potential. While it has no advanced search and few advanced features at this point, Cuil does support phrase searching. The attempts to categorize results, the alternate display, and a different approach to ranking mean that Cuil may be able to help searchers find documents that might not appear at other search engines, especially since Cuil uses its own unique database.
Sollitto says Cuil is "focused on continuing to improve upon what we’re doing now," and if the relevance ranking, image matching, and server reliability are all improved, Cuil has the potential to be a useful resource in the searcher’s toolkit. Whether and how Cuil might capture a large enough audience to make it economically viable is likely to remain a significant challenge for the fledgling company for years to come.