The Google blog (http://www.google.com/googleblog) was enthusiastic: “Just the facts, fast.” The launch of Google Q&A late last week immediately sent me to the Web site looking for a new search tab, but I was unable to find one. I also clicked “more” over the familiar Google search box to no avail. So, I finally typed a question, and I quickly got an answer right at the top of the page. Later, Peter Norvig, Google’s director of search quality, clarified: “A search tab may develop over time, but we’re thinking of it now as a service to just get a fast result and just one result, so it doesn’t make sense to have it as a separate search.” Google’s Q&A uses open Web resources, not proprietary information or subscription databases, to answer questions.
When posing queries, you first notice that the responses come from the usual suspects (e.g., Wikipedia, Who2, the CIA World Factbook). Other free “answer” services, such as Answers.com and Factbites.com, have already learned how to mine these obvious sources for topical as well as biographical and geopolitical queries. But you’ll also notice that Google Q&A returns some unexpected (though credible) sources. Google Q&A’s plan seems to be to mine more of the Web, going beyond the typical sources, to answer more questions. For example, other information sources that turned up for my sample queries included the Columbia School of Journalism and even some personal pages.
Regarding the service’s knowledge base, Norvig said: “We are not trying to limit the source to [a] single encyclopedia. We are trying to say this is an experiment, and understanding as much as we can of the evolving knowledge out on the Web so you’ll see answers from many different places and be able to get answers that you couldn’t find in just an encyclopedia.” Google Q&A will be especially adept at finding the kinds of things that appear in almanacs and other reference resources, such as facts about countries and important dates. It does not appear to be limited to static information, however; it also covers ephemeral topics such as popular culture.
The answers appear at the top of the basic results list. A question such as, “When did William Shakespeare die?” will be answered by a terse, factual statement: “William Shakespeare. Date of death: 23 April 1616.” The wording of the following line will please librarians: It states, “According to” and then provides a link to the source. This raises a question: Will the sites where the information originates complain about diverted site traffic? The thinking is that they will not, because sites featured at the very top of Google’s retrieval, separated out from the rest of the list, should experience increased traffic.
As for how many permutations of a question you can pose, and how Google knows that your input should be routed to a factual answer via Google Q&A, Norvig responded: “We look through our logs, do research on question answering, look at question forms, and try to come up with as many variations as we can. So our users can say ‘Einstein date of birth’ or ‘birthday Einstein’ or say ‘Einstein birthday’ or ‘When was Albert Einstein born.’ We allow for a great deal of variation, and I think we will get better at that over time. Now that we’ve launched this, it’ll be the first time that users will explicitly be giving us questions and expecting answers, whereas before the questions we got were almost accidental. Now we will get more of a question stream and we can see which ones are working, which are not, and the service will continue to evolve.”
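To make the idea of “question forms” concrete, here is a minimal Python sketch of how several phrasings might be reduced to one (subject, attribute) lookup. It is an illustration only, not Google’s method: the phrase table, the stopword list, and the function name parse_fact_query are all assumptions invented for this example.

```python
import re

# Hypothetical phrase table: each canonical attribute maps to surface
# phrases that might signal it. Invented for illustration only.
ATTRIBUTE_PHRASES = {
    "date of birth": ["date of birth", "birthday", "born"],
    "date of death": ["date of death", "died", "die"],
}

# Question scaffolding words that carry no subject information (assumed list).
STOPWORDS = {"when", "was", "did", "is", "the", "of", "what"}


def parse_fact_query(query):
    """Return (subject, attribute) if the query matches a known form, else None."""
    # Lowercase, replace punctuation with spaces, and collapse whitespace.
    text = re.sub(r"[^\w\s]", " ", query.lower())
    text = " ".join(text.split())
    for attribute, phrases in ATTRIBUTE_PHRASES.items():
        # Try longer phrases first so "date of birth" wins over "born".
        for phrase in sorted(phrases, key=len, reverse=True):
            pattern = r"\b" + re.escape(phrase) + r"\b"
            if re.search(pattern, text):
                # Whatever remains after removing the phrase and the
                # scaffolding words is treated as the subject.
                remainder = re.sub(pattern, " ", text, count=1)
                subject = [w for w in remainder.split() if w not in STOPWORDS]
                if subject:
                    return " ".join(subject), attribute
    return None


if __name__ == "__main__":
    for q in ["Einstein date of birth", "birthday Einstein",
              "Einstein birthday", "When was Albert Einstein born"]:
        print(q, "->", parse_fact_query(q))
```

In this sketch the four Einstein phrasings Norvig mentions all resolve to the same “date of birth” attribute, while a query with no recognized form returns None and would fall through to ordinary keyword search. A production system would presumably learn such forms from query logs, as Norvig describes, rather than from a hand-written table.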
How does Google Q&A know it’s providing a factual answer? I searched another “answer” site called Brainboost.com and asked, “Who is Jimmy Carter?” Among the first few entries returned was, “Jimmy Carter is a saint.” Norvig maintains that Google’s answers won’t be so subjective: “This is a function of where Google gets the information from; true, it’s taking the answers from the open Web, it’s trying to be broad—but not too broad. We use judgment in deciding what are the quality sites and what are the factually oriented sites.”
Although its objective appears to be the same as that of Ask Jeeves and Answers.com, Norvig states that Google may take the concept further: “It’s a service for our users because we noticed from our logs that many people ask these types of questions, and we wanted to make it a little easier for them. They get an answer right away, and, if they want, they can explore the link for background material. But we also want to understand how to work with facts better. Traditionally we’ve been working with keywords and with link analysis, and now we want to get deeper into semantic analysis and start working with collections of facts as well as with collections of words. The idea is that we will better understand the structure of information and how facts relate to each other. It’s inevitable that we’ll be doing more with that.”
Google often rolls out new features believing that they are ready for prime time, but it admits that there are still some bugs in Google Q&A. Right now many questions are not answered in the new format. Quick factual queries such as, “Who holds major league baseball’s home run record?” and “What does Saturn look like?” don’t connect with separate Q&A responses. Occasionally, one may even encounter an incorrect answer. When I asked Google, “Who was the president of the USA in 1996?” the “answer” was Pat Choate, and the link was to an entry in Wikipedia where Choate is mentioned as the Reform Party’s vice-presidential candidate that year. Norvig responded: “That’s probably an error in parsing the page. Perhaps if you look at the entire entry the correct answer will be there. Or possibly the correct entry was just before or just after it. We’ll be making improvements.”
Will Google Q&A draw upon materials from the Google/Library Project, which aims to scan and digitize books? Copyright issues will probably prevent this. However, Norvig says that it is conceivable that a query such as, “Where can I get a copy of Tom Wolfe’s I Am Charlotte Simmons?” may eventually directly connect with results from the OCLC WorldCat “find in a library” feature. Two other questions only users will be able to answer: “Is the answer correct?” and “Do I trust the source?”