Web search engine company Google has announced a new line of server appliances intended for enterprises that wish to harness its crawling, searching, and page-ranking capabilities in order to index millions of documents hidden behind corporate firewalls. The announcement includes two models of a turnkey hardware/software bundle called the Google Appliance. The company says a customer site could have the appliance up and running in under an hour—and perhaps in as little as a few minutes.
Under the covers, the Google Appliance is a commodity computer that runs Linux on an Intel processor. It's rack-mountable and comes with an operating system and Google software pre-installed. The customer need only configure the Appliance using a Web-based interface, plug it into the corporate intranet, and set the spider loose to crawl internal corporate documents.
Prior to this announcement, Google offered to host specialized indexes on behalf of corporate clients. The Google Appliance will appeal to customers who don't wish to outsource this function for reasons of control, availability, or privacy and security of corporate information. By freeing customers from the task of installing and maintaining an operating system and search software, Google hopes to provide them with the in-house control they seek with a bare minimum of IT staff involvement.
In introducing the Appliance, Google enters a market space where other search engine vendors, such as AltaVista and Infoseek/Ultraseek/Inktomi, have sought to play. As the market for Web advertising has withered in recent years, these companies have increasingly turned to the corporate intranet market as a source of revenue. Although AltaVista offered hardware/software pricing bundles when it was a part of Digital Equipment Corp., neither it nor Inktomi offers a "black box" turnkey hardware/software solution for its corporate customers. (Some specialized search software companies offer similar turnkey solutions. For instance, DolphinSearch markets an appliance used for data-mining applications for law firms and other markets.)
The initial offering includes two models: the GB-1001, which supports up to 150,000 documents and 60 queries per minute, and the GB-8008, which handles millions of documents. The GB-8008 is essentially a single cabinet with a cluster of GB-1001 hardware inside. Both models are very similar in architecture to the servers Google uses for its global service, though one spokesperson joked, "Ours are not all painted yellow." The GB-1001 model sells for $20,000, including 2 years of support and software updates; the GB-8008 starts at $250,000. Remote hardware diagnostics can be performed over a 56K modem built into the device.
Google is tight-lipped about the specifics of the Appliance's internals. John Piscitello, product manager for the Google Appliance, emphasized that the box comprises tuned hardware, an operating system, and software, all optimized for fast crawling. He declined to say how the workload is shared in the 8008 cluster, instead noting that the cluster had been tuned for optimal performance.
Google says that its Appliance brings to the corporate customer all of the following advantages that millions of Google fans find appealing in the global search service:
- Google's link analysis (trademarked as PageRank), which brings popular and official pages to the top of a hit list, plus other inputs to the relevancy scoring algorithm
- Results grouping, in which multiple hits from one subdirectory are collapsed into a single group, making hit lists much more manageable
- Support for indexing over 200 document file types beyond basic HTML, including Adobe Acrobat PDF, MS Office products, etc.
- Translation on the fly of content to HTML, so users can view a document even if they lack the corresponding product on their desktop computer
- Caching of all crawled content, so a user can retrieve a facsimile of a page even if a given enterprise server is down
- Highlighting of search terms in hits when viewing cached pages
- Support for searching content in 28 languages
- Filtering by date, with a document's date determined from HTTP headers or from internal cues in the HTML
- Spell checking, which adapts automatically to the vocabulary used in the customer's documents
Google also says that the product can crawl secure content or content behind proxy servers, as well as content served over SSL. A URL tracker feature helps administrators diagnose why problem areas are not being crawled. Administrators can customize the look and feel of the interface using XSLT style sheets. They can also run reports to analyze which queries their users make most frequently.
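To illustrate the results grouping item in the feature list, here is a minimal Python sketch of collapsing a ranked hit list by subdirectory. Google has not described how the Appliance does this, so everything here is an assumption made for illustration: the function name `group_by_subdirectory`, the choice of host plus directory path as the grouping key, and the example URLs are all hypothetical.

```python
from collections import OrderedDict
from posixpath import dirname
from urllib.parse import urlsplit


def group_by_subdirectory(hits):
    """Collapse a ranked list of hit URLs into groups.

    Hypothetical helper: hits that share a host and directory are
    placed in one group, and groups appear in the order of their
    best-ranked member.
    """
    groups = OrderedDict()
    for url in hits:
        parts = urlsplit(url)
        # Grouping key (an assumption): host plus the directory part of the path.
        key = (parts.netloc, dirname(parts.path) or "/")
        groups.setdefault(key, []).append(url)
    return list(groups.values())


# Made-up intranet URLs, already in ranked order.
hits = [
    "http://intranet.example.com/hr/policies/leave.html",
    "http://intranet.example.com/eng/specs/search.html",
    "http://intranet.example.com/hr/policies/travel.html",
    "http://intranet.example.com/index.html",
]
grouped = group_by_subdirectory(hits)
for group in grouped:
    print(group[0], f"(+{len(group) - 1} more from this directory)")
```

In this sketch the two `/hr/policies/` pages share one entry, so four raw hits become three groups. A real engine would presumably also cap the number of hits shown per group.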
Search engine analyst Sue Feldman of IDC says the Google Appliance will appeal especially to certain organizations. "Many organizations just want to get search up and not fuss with it. Search engines have always been a vague concept. If they show up with that shiny bright yellow box under their arms, and sell it for what is less than the average going price for a small system, with hardly any implementation time, they will make inroads in the market for search that is straightforward to implement."
Avi Rappoport, a search industry analyst and editor of searchtools.com, says, "This is a good, solid offering that gives other high-end search engines a run for the money, especially at larger scales."
Laura Ramos, industry analyst at Giga Information Group, says that while this is a new product and therefore "there are not a lot of indicators," she feels it will enjoy success "especially in departments that are rich in content that is all Web-accessible." Yet to be seen is how well the product will work enterprisewide, especially while handling database-driven content.
At least one competitor is dubious about the Google Appliance's prospects for large enterprise applications. Scott Holder, director of corporate marketing for the AltaVista search product, says the Appliance is "late to market, and an immature offering for the Global 2000 marketplace."
Indeed, the $250,000 question is whether large enterprises will want to entrust an enterprisewide search function to a server appliance. Many analysts, and millions of users, point to Google's superior relevancy ranking. But enterprise IT managers and CIOs may feel that competing software solutions provide more flexibility as well as more features, such as controlled indexing of corporate databases, federated searching, and traversing of enterprise security systems (e.g., single-sign-on products).
Another interesting question will be whether the Google Appliance appeals to the university market. Google offers university indexes at no charge to participating universities, even allowing them to co-brand the search pages to make the search appear local. Hundreds of universities have taken up Google's free offer, which includes a sweep of campus documents every 30 days. A potential university customer would gain faster sweeps of its domain and the ability to index material that is password-protected or behind a firewall. However, many institutions might balk at the cost when Google itself provides a free alternative.
These questions aside, it's clear that Google's launch of the Appliance is part of an aggressive strategy to find new revenue streams. As Google CEO Eric Schmidt recently noted in a Boston Globe interview: "The mission of the company is [retrieving] all of the world's information. It's not all the world's information currently available on the Web, it's all of the world's information. So what I do is I sit down every day and I think about, 'What information do I need to get through the day and why isn't it on Google?'"