Underlying top-quality databases one usually finds top-quality taxonomies. The fields of health sciences and healthcare continue to draw attention and funding for taxonomy development from both government and private sector sources, as shown by two recent announcements.
The National Library of Medicine (NLM) has announced an agreement with the College of American Pathologists (CAP; http://www.cap.org) to add CAP's SNOMED CT (Systematized Nomenclature of Medicine — Clinical Terms) to NLM's Unified Medical Language System (UMLS). With 344,000 concepts, SNOMED CT is the world's most complete clinical taxonomy. Federal health agencies have combined funds to acquire a permanent license that makes the taxonomy available free to all users. With the federal commitment, experts predict a surge in the development of new information services in health care.
On the commercial side, Factiva now offers a pharmaceuticals and healthcare taxonomy with over 800 industry-focused terms to help companies build business intelligence from both internal and external sources. The Factiva taxonomy generally comes packaged with support and guidance from Factiva's consulting service staff.
The addition of the SNOMED CT taxonomy to the UMLS Metathesaurus expands coverage to some 875,000 concepts and around 2 million terms derived from over 100 different vocabularies. Betsy Humphreys, associate director of library operations at NLM, said that the UMLS has around 1,500 to 2,000 licensed users at present, most of them developers of medical information systems. The UMLS collection focuses on patient data, digital libraries, Web and bibliographic retrieval, natural language processing, and decision support.
Three years of negotiation preceded the NLM's issuance of a 5-year, $32.4 million contract to CAP for a perpetual license for the core SNOMED CT in Spanish and English plus updates. While NLM pays the annual updating fees, the one-time payment for a perpetual license is jointly funded by the members of the Consolidated Health Informatics (CHI) initiative, an effort to adopt federal-wide standards for clinical health data. Member agencies include major components of the Department of Health and Human Services (HHS), as well as the Department of Defense and the Department of Veterans Affairs.
NLM expects to integrate the SNOMED files from CAP by the first quarter of 2004, after which the file will update more rapidly. The UMLS incorporates, links, and distributes different biomedical and health vocabularies and classifications in a common format.
The revised NLM license agreement for UMLS now includes SNOMED. U.S. users can use SNOMED and the UMLS Metathesaurus without charge and without signing a separate license agreement with CAP. Non-U.S. UMLS users must sign a separate license agreement with CAP for production uses. NLM also supplies free Java software to assist users in producing subsets of the Metathesaurus, one of which will soon accommodate SNOMED data. [For more information on UMLS, go to http://www.nlm.nih.gov/research/umls/.]
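As a rough illustration of what linking vocabularies in a common format and then producing a subset can look like, the sketch below groups terms from several source vocabularies under shared concept identifiers and filters them by source. It is only a sketch under stated assumptions: the identifiers, vocabulary names, terms, and function names are invented for illustration and do not reflect the actual UMLS file layout or NLM's Java software.

```python
from collections import defaultdict

# Hypothetical rows: (concept_id, source_vocabulary, term).
# Every value here is made up for illustration.
ROWS = [
    ("C001", "VOCAB_A", "Heart attack"),
    ("C001", "VOCAB_B", "Myocardial infarction"),
    ("C002", "VOCAB_A", "High blood pressure"),
    ("C002", "VOCAB_C", "Hypertension"),
]


def build_concept_index(rows):
    """Group terms by concept, then by the vocabulary they came from."""
    index = defaultdict(dict)
    for concept_id, source, term in rows:
        index[concept_id].setdefault(source, []).append(term)
    return {concept_id: dict(by_source) for concept_id, by_source in index.items()}


def subset_by_source(rows, wanted_sources):
    """Keep only the rows contributed by the chosen source vocabularies."""
    return [row for row in rows if row[1] in wanted_sources]


if __name__ == "__main__":
    index = build_concept_index(ROWS)
    for concept_id, by_source in index.items():
        print(concept_id, by_source)
    print(subset_by_source(ROWS, {"VOCAB_A"}))
```

Keying everything on a concept identifier lets terms from different vocabularies that mean the same thing sit side by side, while a subset can still be cut along source lines, for example to respect per-vocabulary license terms.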
Simultaneous with the announcement of the taxonomy improvements, HHS secretary Tommy G. Thompson announced that the Institute of Medicine would design a standardized model of an electronic health record, which the healthcare standards development organization, known as HL7, would evaluate. The model record should become available sometime in 2004 and, again, the U.S. government will offer it to the world at no cost. Both announcements reflect HHS' ongoing efforts to develop the National Health Information Infrastructure and a unified electronic medical records system.
Humphreys said that she "hadn't met a U.S. software developer that wasn't excited" about the development of UMLS. "Maintenance of taxonomies is expensive and requires a large scale commitment," stated Humphreys. "Developers don't want a huge overhead in background work. It's a no-win game for them. The real important piece here is the federal government's commitment. No one wanted to put a lot of effort into developing and using a terminology and then have the government pick another. Hospital and clinical information system software developers wanted the federal government to make up its mind and make the taxonomy readily available. There have been a number of years of inertia around this issue. We have now removed a couple of those barriers that made moving ahead risky, but there's still a lot of work to do and investments to make."
Jonathan Meyer, head of Clinical Focus, Inc., a Mount Vernon, N.Y., software house developing patient-centric information services for alerting physicians to changes in clinical practice, confirmed Humphreys' perspective. Meyer stated: "We are encouraged to see that the industry is moving closer to a standard for vocabularies. We don't want to have to do it alone. That's not our core business, but we need a taxonomy for our applications. There are competing vocabularies out there, but we don't want to reinvent the wheel."
Some problems remain with the new taxonomy's license. While the UMLS license has no barriers to commercial usage, it applies only to vendors serving U.S. customers. Those serving customers outside the U.S. may need to make separate arrangements, depending on which part of the UMLS taxonomy they use and who supplies that portion. NLM monitors usage, requiring a brief report each year on what licensees do with the taxonomy and identification of any people to whom the licensees may have distributed it.
Complicating the issue even more, Humphreys noted that all federal agencies have approval to use all the UMLS with no geographic restrictions and there are also some humanitarian exceptions written into the agreements with taxonomy suppliers. In a global economy with advanced information services available over the Web, this could leave medical systems developers with a lot of tedious negotiations to conduct and/or nationality-checking password systems to design. Still, Humphreys seems to be right that the federal government has made an excellent start.
If NLM decides to expand the drug codes section, one future licensee of UMLS may turn out to be Factiva, which announced a new taxonomy to serve pharmaceutical and healthcare companies. The Factiva codes currently contain over 800 Industry, Organization, and Subject terms designed to gather and construct business intelligence from internal and external sources. The codes include over 580 subject terms under Corporate News, Demographic Groups, and Medical Systems/Conditions Structure and Delivery. The XML-formatted taxonomy provides a hierarchical tree structure, labels, definitions, and multiple alternative names for each term.
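To make the description of an XML taxonomy with a tree structure, labels, definitions, and alternative names more concrete, here is a minimal sketch that reads such a file with Python's standard library. The element and attribute names (`taxonomy`, `term`, `code`, `label`, `definition`, `altName`) and the sample terms are assumptions made up for illustration; Factiva's actual schema is not described in this article.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the schema and the terms are invented for illustration.
SAMPLE_XML = """
<taxonomy>
  <term code="T1">
    <label>Healthcare</label>
    <definition>Placeholder definition.</definition>
    <altName>Health care</altName>
    <term code="T2">
      <label>Pharmaceuticals</label>
      <definition>Placeholder definition.</definition>
      <altName>Pharma</altName>
      <altName>Drug makers</altName>
    </term>
  </term>
</taxonomy>
"""


def walk(element, depth=0):
    """Yield (depth, code, label, alternative names) for each nested term."""
    for term in element.findall("term"):  # direct children only
        alt_names = [alt.text for alt in term.findall("altName")]
        yield depth, term.get("code"), term.findtext("label"), alt_names
        yield from walk(term, depth + 1)


def load_terms(xml_text):
    """Parse the XML text and flatten the term tree into a list."""
    root = ET.fromstring(xml_text.strip())
    return list(walk(root))


if __name__ == "__main__":
    for depth, code, label, alt_names in load_terms(SAMPLE_XML):
        print("  " * depth + f"{code} {label} (also: {', '.join(alt_names)})")
```

Carrying the alternative names alongside each term is what would let categorization or search software match a document that says "Pharma" to the same node as one that says "Pharmaceuticals".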
The new taxonomy represents a series of extensions to the existing Factiva Intelligent Indexing taxonomy and can be applied using Factiva Fusion or a range of other categorization and search software packages. The taxonomy can also be used as a stand-alone application or in combination with the Factiva Company Taxonomy, which currently covers over 5,000 pharmaceutical and healthcare companies worldwide. [For more details, go to http://www.factiva.com/pharmataxonomy.]
Simon Alterman, vice president of content at Factiva, said the company expects that most users of the Factiva taxonomy will work out a consulting arrangement with Factiva's experienced staff. Alterman explained that, in the consulting work on implementation: "We bring a framework, a starter kit. We customize, advise, and train on what will help clients move forward. Then we hand it over. Where possible, we like to get feedback. Our taxonomy itself was created by customer input."
The Factiva staff can deal with a number of alternative modes. "Application work will differ from company to company," Alterman said. "Some already have application tools; others are investigating the market for that. There will be a range of circumstances with some customers having more, some less preparation, some needing customization, some using as is. We bring to market not only the taxonomy but also advice on how to use it."
The two taxonomy developments offer different levels of service and different directions of focus, but both indicate movement in health information, a field that has historically generated change throughout the information technology and services area.