Microsoft didn't (quite) buy Yahoo!. The companies announced a 10-year partnership in which Yahoo! will use Microsoft's search technology, and they will combine advertising networks. From their press releases and collateral materials, it's clear that the goal is a web advertising network that's big enough to compete with Google. Previously, neither company's share of search traffic was enough to lure advertisers to add a second system in addition to Google AdSense. Together they will have almost 30% of search traffic, and that's apparently enough for many advertisers, providing a balance, a bargaining chip, and an escape for those unhappy with Google. In addition, Yahoo!'s ongoing sales relationships with premium advertisers are valuable to Microsoft, which has never been particularly successful with MSN or other web portals. And, as Gord Hotchkiss said in a Search Engine Land blog post, there is evidence that "the better the ads, the happier the user," and he says that this finding may even apply to nonshopping searches.
Microsoft has bought a 10-year relationship with Yahoo!, wherein Microsoft Bing will become the search engine software for the main Yahoo! web search. The price will be 88% of search revenue on the main site for 5 years and on other sites for 18 months. The actual crawling, indexing, retrieval, and relevance ranking of webpages will switch from Yahoo!'s own back-end search engine to the Microsoft code that runs the Bing search engine. The combination of the best of both companies should improve the search results listings and the contextual advertisements for common queries: Bigger is definitely better in this case. The deal has to go through regulatory review, and technical changes will probably be slow and careful. At the gigantic scale of these top web search engines, changing even a small element may provoke unexpected side effects. Searchers and information intermediaries should not see any changes in Yahoo! search for a year or two.
Yahoo! will maintain control of its search user interface and focus on maintaining its position as the leading portal, with "destination" areas such as Yahoo! Finance and tools such as Yahoo! instant messaging. "Yahoo's greatest growth came during its earlier technology partnership with Google, which allowed Yahoo to concentrate on user experiences and content partnerships more effectively," says analyst John Blossom of Shore Communications. Prabhakar Raghavan, head of Yahoo! Labs, says that much of the savings will come from back-end infrastructure technology now that Yahoo! doesn't have to use its resources for crawling, indexing, and searching the ever-growing content on the web. It will concentrate on other aspects of search.
Implications for the Yahoo! Developer Ecosystem
YUI, YQL, Pipes, SearchMonkey
Yahoo! says that some of its developer tools will not be affected by the change, including YUI (User Interface library), YQL (Query Language for searching web services), and Pipes (a mashup tool). The SearchMonkey platform connects website metadata and structure with search results to display more helpful context. With Yahoo! retaining control of the search results interfaces, this seems easy enough to continue.
We've also received questions about the future of Yahoo!'s other developer offerings, such as YUI, YQL, and Pipes. We wanted to let you know that today's news does not affect these products (http://developer.yahoo.net/blog/archives/2009/07/developer_update.html).
In addition, Hadoop, the distributed open source cloud computing platform (like a database with more flexibility and no fixed address), will be maintained. It's in wide use in many parts of the company that require internet-scale tools, for tasks such as near-real-time choices of the articles to put on the Yahoo! homepage. It is also the basic infrastructure for Facebook, which is nontrivial.
Don't Panic! We are as committed as ever to building a world class open source Cloud Computing infrastructure and Apache Hadoop remains our solution for batch computing (http://developer.yahoo.net/blogs/hadoop/2009/07/news_flash_hadoop_development.html).
However, BOSS (Build your Own Search Service) has no such assurance. BOSS can create custom searches on websites or for specific topics. The TechCrunch website uses BOSS and can search Delicious.com for specific fields such as tags and authors. In December 2008, the service supported 10 million queries per day, and in May 2009, there were about 1 billion queries coming through BOSS. Some busy sites passed the threshold of 10,000 queries per day for free search and subscribed to the BOSS paid search service.
Microsoft has its own search API (application programming interface), and no one will make any commitments to continuing development or even support of BOSS in future years. This is a bit sad.
What about developer tools and APIs like SearchMonkey and BOSS?
This is the beginning of a process and we'll be working with Microsoft to determine what makes the best sense for both us and developers. Regardless, we are certainly committed to continuing to innovate on the user experience of search all across Yahoo! and on continuing to engage with the developer community on several fronts, opening up leading audience experiences and data to third-party innovation. In that context, SearchMonkey can add a lot of value to how we help people get the most out of search and out of Yahoo!. Over the next several months we'll determine what makes sense with our developer offerings and provide information when available. -Ashim Chhabra of the BOSS Development Team
For SearchMonkey and BOSS, we currently do not have anything concrete to tell you. Clearly, we'll need to work with Microsoft to determine what makes the most sense for you and for us. -Chris Yeh, head of the Yahoo! Developer Network
In an interview, Microsoft senior vice president Yusuf Mehdi said Microsoft hasn't looked at the specific lines of code in that area, but it is open to trying to take Yahoo!'s best ideas and integrate them into Bing. "We like the approach that Yahoo has done," he said, referring to SearchMonkey and BOSS.
A number of companies, including hakia, OneRiot, Daylife, 123people, askBoss, 4hoursearch, BuildaSearch, Newsline, and Cluuz, and thousands of developers are using BOSS for many kinds of projects. It allows them to use the Yahoo! index with more-specific topics, interesting interfaces, and additional data (such as Twitter feeds). The ones who have paid for the service are the most concerned, but the BOSS developer Yahoo! group is full of worried messages from people who have depended on this API. As Google's AJAX Search interface is still a very beta version (it returns only 64 results per search), losing a robust and stable API to web search results can only reduce the options for creative use of web search data.
A Quick History of Web Search Engines
The web search industry has radically changed several times since it started crawling links and indexing the full text of webpages.
1994: WebCrawler, founded by a student from the University of Washington, was bought by AOL. Three years later, it was bought by Excite and then closed down.
1994: Excite was founded by students from Stanford University. After some success as a search engine using statistical analysis for concept matching and as a general web portal, it was bought by the networking company @Home and failed with the parent.
1994: Lycos, a project from Carnegie Mellon University, had great success as a search and portal site. It was sold to a networking company in 2000, and the search was discontinued in February 2004, although it still exists as a portal.
1995: AltaVista came from engineers at Digital Equipment Corp. who wanted to show the power of their servers. It was simpler and more straightforward than other search engines and offered Boolean operators. But it was mismanaged and eventually bought by Overture, which was itself bought by Yahoo! at about the same time, and closed.
1995: Infoseek was independently founded as a commercial search engine and directory. It was bought by ABC's Go Network in 1999 and later by Yahoo! in 2001 and closed.
Yahoo! itself, founded in 1994 by Stanford students, quickly became the most popular web directory, manually adding new sites to categories. It soon offered search powered by Inktomi (created at the University of California-Berkeley), which Yahoo! bought in 2002. In 1996, it started showing banner ads to directory and search users. Yahoo! used Google's back end for search from June 2000 through February 2004, when it started using its own search engine.
1998: Google started a beta search engine with uncluttered pages and some link analysis (like scholarly citation analysis) the company called PageRank. It also had "snippets" similar to the output of the KWIC (Key Word In Context) algorithm; the user experience, from homepage to results, was just better. Google pioneered AdWords in October 2000, AdWords Select in February 2002, and AdSense in April 2003. It is by far the dominant web advertising vendor; it makes nearly all its income from advertising.
2005: MSN deployed its own search engine and stopped using other back-end search engines.
2009: Microsoft launched its new search engine, Bing.
Where We Are
In mid-2009, the issue of search quality is fairly stable. Google, the current Yahoo! search engine, and Bing all provide reasonable results for the usual short keyword queries. Smarter crawlers and detailed site maps are making it easier to index "deep web" content: information in databases and other sources that usually just respond to form queries, which often are more structured and meaningful than webpages. Wolfram Alpha search, DeepDyve, Kosmix, and various projects such as Google's Wonder Wheel and Yahoo!'s WOO (Web Of Objects) are attempting to recognize the structure and semantics of source data and provide richer results.
When Microsoft bought the search engine company Powerset, it was not for its search but for semantic analysis of the pages while indexing, for example, relating the text of a page to an image displayed there. Microsoft has enriched its Bing search by including the concepts and related terms for a subset of search results. This is part of what the company has been calling the "Decision Engine" element of Bing.
As Satya Nadella, who runs Microsoft's search and ad technology division, sees it, the company is "trying to build effectively a mind-reader ... a computational model of intent that is truly useful, in the context of very ambiguous tasks that can't be expressed well." This would be a breakthrough: Anyone who's answered a few reference questions knows how difficult it is to understand the question-and how many times the person asking the question doesn't really know what they want. The other side of this is enriching understanding of searchable information-"How do we go from raw content to true knowledge representation and reasoning over the entire wide corpus of the web?" (from a video conversation with Kara Swisher)
All three big search engines have very active search research groups on the aforementioned topics and related issues such as semantic analysis, click tracking, and other statistics. Microsoft has research labs in Mountain View (Silicon Valley), as well as in Redmond, Wash., and Cambridge, Mass. We'll see if Microsoft is smart enough to hire many of the good Yahoo! people; it has hired ex-Yahoo! researchers quite recently.
Gord Hotchkiss says, "There's a whole bunch of PO'd, disenfranchised search engineers that have just had a frightening but potentially liberating view of the future: freed from the shackles of the rapidly sinking S.S. Yahoo!, they might actually have a chance to do something meaningful in search."
Rather than compete with Google and Microsoft/Yahoo! search, they might create software that solves a little bit of the problem, from divining user intent to analyzing source documents, and sell this technology to a larger company for a lot of money, as is fairly standard these days.
Overall, for end-user searchers and intermediaries, this search engine merging is a loss: Consolidation means common queries are likely to have more helpful results, but the variety and diversity of separate search engines provided more results for unusual questions, the long tail of search.