In the 2 short years since Google was in beta, it has captured the fancy of searchers both novice and expert. Arguably, Google is the heavyweight champ of search engines. Now a new contender, Teoma, has strutted into the ring, seeking to provide the knockout blow. Teoma is currently in beta but has been garnering a lot of interest in the search community. In a telephone interview, Paul Gardi, president and CEO of Teoma, explained to me why he has such confidence.
Teoma sports the spare look and feel of early Google. When you run a search, however, the results set is more complicated than a single hit list. Gardi walked me through a search for the term "abuse." The results page displays the following three choices:
- Web pages grouped by topic
- Web pages (a more-or-less conventional hit list)
- Experts' Links
I asked Gardi who the experts are who select the "Experts' Links." "Everyone on the Web," he replied. "What we do is analyze the structure of the Web. We look at who links to whom and we are able to identify communities: clusters of Web pages and sites that are connected in some important way. From that analysis, we are able to identify sites that stand out as authorities in any given community."

Continuing the "abuse" search, Gardi pointed out that the first item in the Web Pages results set was the National Institute on Alcohol Abuse and Alcoholism. He had me follow "Related Topic & Experts' Links," which brought up a subset of related sites. The first "Experts' Links" site was the Alcohol and Drug Abuse Institute Library at the University of Washington. Gardi noted that this site is a perfect example of the concept of an authority: a site that itself offers a number of links to highly relevant, high-quality sites.
Gardi contrasts Teoma's approach with Google's: "Unlike Google, when we examine 100,000 candidate pages, we are analyzing the link structure of all 100,000 pages in real time. We know much more about the hubs and authorities that make up the Web. We want to deliver pages that are not just relevant, but that are authoritative. Global linking gives credit to every link equally. We find the links that count, and count them."
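For readers curious what "hubs and authorities" means in practice, here is a minimal sketch of the published hubs-and-authorities iteration (often called HITS) run over a small set of candidate pages. It is not Teoma's implementation, which Gardi did not describe in detail. The page names and the link graph are hypothetical, and the function name `hits` is chosen only for illustration.

```python
from math import sqrt

def hits(links, iterations=50):
    """Score pages as hubs and authorities.

    links maps each page to the list of pages it links to.
    Returns two dicts: hub scores and authority scores.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # A page's authority is the sum of the hub scores of pages linking to it.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                new_auth[t] += hub[src]

        # A page's hub score is the sum of the authority scores it links to.
        new_hub = {p: sum(new_auth[t] for t in links.get(p, [])) for p in pages}

        # Normalize so the scores stay bounded across iterations.
        a_norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        h_norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        auth = {p: v / a_norm for p, v in new_auth.items()}
        hub = {p: v / h_norm for p, v in new_hub.items()}

    return hub, auth

# Hypothetical candidate pages for one query (not real data).
links = {
    "directory_a": ["institute", "library", "clinic"],
    "directory_b": ["institute", "library"],
    "blog": ["institute"],
    "library": ["institute"],
    "institute": [],
}

hub, auth = hits(links)
best_authority = max(auth, key=auth.get)
best_hub = max(hub, key=hub.get)
print("top authority:", best_authority)
print("top hub:", best_hub)
```

In this toy graph the page that the most link-rich pages point to comes out as the top authority, and the page that links to the most well-regarded pages comes out as the top hub. The point of the sketch is that the scores are computed only over the candidate set for a query, so a link counts according to who made it rather than counting equally everywhere.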
Asked if the Teoma technology was similar to IBM's work on hubs and clusters, Gardi said: "Yes, we are aware of their research project called Clever, but our implementation is different. In fact I'm not sure whether IBM ever implemented a working product."
I noted that the topical breakdown was somewhat similar in appearance to the Custom Search Folders of Northern Light. Gardi replied: "Our approach is actually quite different. They use a taxonomy that has been developed by human editors. We are building the topic list on the fly, based on the link analysis and the text we are able to pull out of the sites themselves. Our topical structure adapts itself to new topics, to new modes of expression, or even to new languages, automatically, with no need for any human intervention."
Can Teoma scale to handle 20 million searches a day? "Absolutely. We know that we can scale to a large number of users. All we need is a fat pipe to the Internet. We also know that we can scale to a much larger index. We believe very much in quality, not quantity, so we are being careful about the size to which we will grow the database. But we know we can handle 1 billion URLs—or more."
"Teoma can find all sorts of communities—including bad ones. Teoma is resistant to index spamming—something that is a huge problem for Google."
Gardi said work on the technology began in 1998 and the company was formed a year ago. I asked what "Teoma" means. He struggled for a second to phrase a reply, then laughed: "It was just a code name for the project and it seems to have evolved into our company's name. It's a Gaelic word meaning 'cunning.'"
I asked Gardi what hardware drives Teoma. "We don't reveal any information about that." When pressed, he would say only that Sun computers drive the engine.
When will Teoma leave beta? "We aren't giving out dates. When we've grown the database to a size we're comfortable with, we'll call it official. One thing about our community-clustering technology is that it becomes more effective as the database gets larger."
The growing buzz over Teoma surprises and pleases Gardi. "It's just been amazing how fast word is getting around."
Gardi is not shy about praising his search engine. "We know our technology is far superior to all the others out there. Maybe 1 percent of the time other engines do better than we do. We're working on fixing that 1 percent."
Lou Rosenfeld, noted commentator on information architecture (http://www.louisrosenfeld.com), said: "A good idea, maybe. Although many Web-wide search engines have yet to figure it out, it's no secret that users benefit from having results presented in different ways. And maybe, just maybe, using different algorithms will help too. Teoma has clued in, and by providing results in three different ways, may be on to something that delights users."
Can search engine history repeat itself? Can a new contender with an improbable name and a better way to measure the value of Web sites knock out the current heavyweight champ? Teoma's Gardi is confident they can do just that. Serious Web searchers should take a look at Teoma and provide feedback to the fledgling company.