NetBase Launches Free Semantic Search Demo Site: healthBase
Posted On September 17, 2009
Health issue searching is one of the dominant areas of search, possibly only exceeded by the unmentionable one. So when NetBase Solutions (http://netbase.com) decided to create a showcase for demonstrating the utility and performance of its Content Intelligence semantic searching technology, it chose health as an area of interest to a broad public. The healthBase (http://healthbase.netbase.com) service that launched earlier this month is open to all users, providing structured searches built around four tabs covering treatments of conditions, causes of conditions, complications of conditions, and pros and cons of treatments. The service taps into masses of documents from a limited number of sources, extracting relevant sentences and sorting them into categories for display. That's the good news. The unfortunate news is that some observers have approached healthBase as a full-service health information source and found it wanting, which is not unexpected if one considers that its primary purpose is demonstrating technology.
NetBase's Content Intelligence technology, according to the NetBase website, "uses deep linguistic parsing to literally read every sentence inside billions of documents and understand what is being said. That sentence level understanding is then used to power 'semantic lenses' to help searchers explore problems & solutions, pros & cons, causes & effects, experts, applications, technologies and more. ... NetBase Content Intelligence platform reads more than a hundred billion sentences like these on an ongoing basis and stores the information into structured semantic indexes. ... The user can then explore any of the answers further by simply clicking on one of the results and running another search on that result."
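To make the idea of sorting sentences into "lenses" concrete, here is a minimal Python sketch. It is not NetBase's method: the company describes deep linguistic parsing, whereas this toy stands in simple keyword cues. The cue patterns, the function names (split_sentences, sort_into_tabs), and the sample sentences are all hypothetical, chosen only to mirror the four healthBase tabs.

```python
import re
from collections import defaultdict

# Hypothetical cue patterns, one per healthBase tab.
# Assumption: keyword cues stand in for real linguistic parsing.
CUES = {
    "treatments": re.compile(r"\b(treats?|treated with|therapy for)\b", re.I),
    "causes": re.compile(r"\b(causes?|caused by|leads? to)\b", re.I),
    "complications": re.compile(r"\b(complications?|can result in)\b", re.I),
    "pros_and_cons": re.compile(r"\b(side effects?|benefits?|risks?)\b", re.I),
}


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sort_into_tabs(documents):
    """Return a dict mapping each tab name to the sentences that match its cue."""
    tabs = defaultdict(list)
    for doc in documents:
        for sentence in split_sentences(doc):
            for tab, pattern in CUES.items():
                if pattern.search(sentence):
                    tabs[tab].append(sentence)
    return dict(tabs)


# Made-up sample text, for illustration only.
docs = [
    "Hypertension is often treated with ACE inhibitors. "
    "Untreated hypertension can result in stroke.",
    "Smoking causes lung damage. Statins carry a risk of muscle pain.",
]
tabs = sort_into_tabs(docs)
for tab, sentences in tabs.items():
    print(tab, sentences)
```

A matcher this shallow files a sentence under a tab whenever a cue word appears, whatever the sentence is actually about, which hints at why precision is hard for any automated sentence-level approach.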
At launch, healthBase tapped 20 medical information websites: ClinicalTrials.gov; Discovery Health; eMedicine; FamilyDoctor.org; Health Central; Health Finder; Health.com; HealthDay; HealthDay- Physicians Briefing; Healthline; Mayo Clinic; Medical News Today; MedicineNet; Medline Plus; Modern Medicine; Natural News Network; NetWellness; PubMed; WebMD; and Yahoo! Health. According to Jens Tellefsen, vice president of marketing and product strategy at NetBase, "We initially picked some of the more well-known and reliable health portals or content sites, primarily to just get things going and see how our system performed."
Welcome to the Wild and Wooly Web
Actually, there were 21 sites at launch. But Tellefsen said they removed the somewhat controversial Wikipedia, despite its widespread use even among physicians. The Wikipedia inclusion was only one controversy that hit NetBase, sending it into a particularly stressful period, especially for a product for which it was not being paid. NetBase follows an enterprise business model. "We are much more keen on helping other publishers and portals and websites to enrich their services," said Tellefsen. "We're not a WebMD or a Healthline, but we would love to work with those people." (For discussion of an example of a NetBase commercial application, in this case working with publisher Elsevier, read "Elsevier and NetBase Launch illumin8," posted Feb. 28, 2008, http://newsbreaks.infotoday.com/nbReader.asp?ArticleId=41084.)
The launch was very successful in terms of the number of searches. Volume was so high in the first week that Tellefsen said they worried that their servers might not be able to handle the load, although that did not turn out to be a problem. However, said Tellefsen, "We learned a painful lesson with incorporating Wikipedia. We hadn't predicted people could be so creative in using health spaces for nonhealth subjects, including politics and bad language. We hadn't thought about putting in harnesses or constraints. Since then we've added more filters and dropped Wikipedia."
The immediate press coverage of healthBase also proved somewhat alarming. It was too complimentary. Leena Rao's article in TechCrunch on Sept. 2 (www.techcrunch.com/2009/09/02/healthbase-is-the-ultimate-medical-content-search-engine) bore the title "HealthBase Is the Ultimate Medical Content Search Engine." Rao judged healthBase to be "an aggregator of medical content [that] will surely help those looking for a comprehensive research tool." Her only hesitation about the service was worry that its "all machine" approach to medical content was a little impersonal. The comments that flowed into the article from readers who checked out the service were not always as complimentary.
I asked Stephanie Ardito of Ardito Research & Information, Inc., the new Medical Digital columnist for Searcher magazine, to take a look at healthBase as a realistic source of health information now. She responded:
I've run some searches on healthBase. Some topics seem okay, for example, swine flu. But swine flu is such a recent phenomenon that I think it's harder to screw up the accuracy of the results. It's the older topics that seem to cause trouble. I searched on stress and got results for oxidative stress. For hypertension treatments, surgery is an option, but the surgery articles refer to obesity or bariatric surgery. Treatments I would consider for high blood pressure are drugs (not to mention eating healthy and exercising)! Admittedly, if high blood pressure can't be controlled, one may have a heart attack and have bypass surgery, but I don't believe surgery is a first-line treatment for hypertension! I then searched on the pros of Lipitor and got a bunch of articles on lawsuits and counterfeit versions of the drug. By comparison, if you look at Medline Plus, information is clear-cut and focused.
Admittedly, I have a bias against machine-generated indexing and text mining, mainly because I haven't seen any program that gives us a high degree of accuracy. I've reviewed a lot of programs over the past few years--one of the things we regularly do is evaluate media coverage--and my clients still want us to do the analyses manually (i.e., counting the number of articles; organizing articles by what's driving coverage, for example, financials, clinical trial results, FDA approval, etc.; and tone: negative, positive, or neutral). The number of articles isn't a problem, but coverage drivers and tone are factors that don't seem to do well with most text mining.
In my own test searching, I ran across an oddity. The interface language for presenting results seems somewhat rigid. I entered the term "electrolytes" in the Treatment tab and the first of a dozen categorized responses was "Death." Not to worry, though: the system then offered me a "Related Searches" click with a sub-click to "Treatments for Death." If healthBase has found a "treatment for death," it probably is the "ultimate medical content search engine."
Overall, Tellefsen considered, "Some of the press got their facts wrong with healthBase. It's not a commercial product, just a showcase so people could play with it. We never intended it to be a consumer, mainstream, supported application. Read our disclaimer page."
The differences between healthBase, the free service, and commercial applications are significant. For example, healthBase currently schedules monthly crawls of its sites, while schedules for commercial crawls are set by customers. Tellefsen said the company was now moving to a continuous crawl for its internet index. He added, "In the professional environment, we have complete control of the content set, for example, publishers. It's also targeted and configured for content enhancement and specific queries."
So What Will They Do Now?
Despite the maelstrom following the launch, NetBase intends to push forward by expanding the content and continuing to tweak the system. Tellefsen expects to expand to a few hundred websites very soon. "It was always our plan to take something out to the public and get feedback and then move forward. We are now looking at a bunch of changes, including dramatically increasing our content set. Our 2.0 version of the beta or demo will be significantly better with a lot more nifty things we'll leverage from the platform. Some things are easy to fix. We'll tighten up the precision in our new version coming out in a few weeks," said Tellefsen. He also expects significant improvements in gathering the feedback needed to improve healthBase. "We're going to have a big section up front and center. Right now the community link is not very prevalent. It's down at the bottom of the page. Next we're thinking about ways to have people tag things, with some controls, and directly engage in feedback."
Expect to see the upgrade of healthBase within a few weeks, according to Tellefsen. As for the future, NetBase is still considering other vertical search engines. Even though, as Tellefsen says, "It's not where we're making money and [it] requires a small investment, I hope we can do more showcases. Just from healthBase, we've already gotten a number of pretty serious leads. We're still a small company without any big marketing or sales program. We need publicity to reach serious companies with real needs."