A Cauldron Bubbles: PubChem and the American Chemical Society
Posted On June 6, 2005
A freely accessible public database of chemical information, produced by a division of the U.S. National Institutes of Health (NIH), is at the center of a controversy over publicly subsidized data competing with commercial information providers. The American Chemical Society (ACS), said to be the largest scientific society in the world, has voiced strenuous objections to the creation and availability of PubChem. The PubChem database provides information on the biological activities of small molecules. It is designed for medical researchers and is a component of NIH’s Molecular Libraries Roadmap Initiative. Since it was unable to resolve the controversy directly with the NIH, ACS has met with members of Congress about the issues.

The NIH is the primary federal agency for conducting and supporting medical research to improve health, fight disease, and save lives. Last year NIH announced a road map for medical research in the 21st century. The road map is “an integrated vision to deepen our understanding of biology, stimulate interdisciplinary research teams, and reshape clinical research to accelerate medical discovery and improve people’s health.”

PubChem is one of several databases developed by the National Center for Biotechnology Information (NCBI) as part of the Molecular Libraries Roadmap Initiative. Created in 2004, it is a freely accessible database that provides information about small molecules. It is designed for use as a research tool for biomedical researchers and serves as a starting point in the development of new medications. According to the NCBI, the database contains approximately 850,000 chemical samples contributed from public sources. The database will grow when researchers add data from publicly funded sources. It will not include patents. Initially the database will “contain chemically diverse small molecules. Over time the collection will be modified by screening for biological assays and other biological information to provide a working set of molecules” for use in studying biology and disease.

The American Chemical Society, along with its Chemical Abstracts Service (CAS) division, has protested the government creation and support of PubChem and its availability. In a May 23, 2005, press release, ACS stated: “The ACS believes strongly that the Federal Government should not seek to become a taxpayer supported publisher. By collecting, organizing, and disseminating small molecule information whose creation it has not funded and which duplicates CAS services, NIH has started ominously, down the path to unfettered scientific publishing … We have asked NIH to refocus PubChem, not discontinue it, but refocus on the stated mission.”

In relation to the source of PubChem records, ACS claimed: “For now the main issue is not whether publishing the records violates copyright. Instead, the issue is whether an arm of the government should involve itself in general information publishing by replicating an existing private service.” The release also pointed out: “PubChem may be free to the user, but it is taxpayer subsidized. CAS has been in business for nearly one hundred years and has served world science during that time. It now seems strange to be told that there is a new public policy that our services should be provided free by the federal government.”

The ACS position does not recognize that the U.S. government is a large publisher and is mandated by law to publish a variety of materials as part of agency missions and for the public good. For example, the collection, aggregation, and dissemination of weather data are provided at the taxpayer’s expense to benefit agriculture, aviation, and the general interest of the public. If weather data were not freely available, warnings about hurricanes, tornadoes, sleet, snow, and other weather events would be available only to those able to pay a fee.

In an interview, Michael Dennis, vice president for planning and development for CAS, repeatedly stated that ACS does not want to put PubChem out of business. He said that PubChem is not focusing on its original mission and that it duplicates CAS. Dennis explained, “PubChem is aggregating any small molecules they can get their hands on.” He also emphasized that the federal government should not be using taxpayer dollars to fund services available in the commercial sector. Ultimately, he said that Congress will make the decision regarding dissemination of this information, not NIH.

Consultant and ACS member Stephen Heller, a chemist for 35 years, said in an interview that the ACS position is misleading. He believes that there is about a 3 percent overlap between PubChem and CAS and that the taxpayer-funded activity makes sense because the database and its growth will come from publicly funded resources.

Henry Rzepa, professor of chemistry at Imperial College, U.K., and Peter Murray-Rust, reader in molecular informatics at University of Cambridge, U.K., said in an open statement: “We wish to emphasize in the strongest terms the current and future value of the NCBI/NIH’s PubChem to the scientific and medical community … We have been using the molecules in PubChem and promoting their value in research … Until PubChem virtually no chemical information was freely available. It is generally not possible to look up freely the chemical formulae of common drugs, food additives, or materials in the environment. Yet much of this information was first published many decades or centuries ago. PubChem provides a reliable, instant resource for anyone.” Rzepa and Murray-Rust concluded: “Finally, we re-emphasize the global nature of scientific information. By sharing resources freely we detect and correct errors, and encourage innovation in the way we access information.”

There is substantial disagreement between ACS, NIH, and some scientists. The extent of the overlap between PubChem and CAS is one issue. NIH staff analysis shows relatively little overlap, while ACS claims the overlap is more than a small amount. The two sources differ widely in size, scope, and resources, and they are tailored to the needs of different segments of the scientific community. PubChem is a relatively small database aimed at fulfilling the needs of the biomedical community. CAS is a unique and valuable resource with a broad customer base. ACS spokespeople believe that PubChem will put CAS out of business. Others disagree and see PubChem and CAS as complementary, not competitive.

There also is disagreement about the growth of PubChem. NIH says that growth will come from publicly funded resources. CAS’ Dennis agrees and adds that the sources are international and should not be made available through NIH because NIH is tax-supported. Finally, there are differences about the role of the federal government in disseminating information that it and other governments have generated. ACS indicates that the commercial sector should be the sole distributor of scientific information regardless of source. Others see an important role for the government in serving the needs of biomedical researchers and people in health care.

The issue likely will be decided in Congress. ACS is working with the governor of Ohio, Robert Taft, and Rep. Ralph Regula, R-Ohio. (CAS is based in Columbus, Ohio.) The appeal does not relate to making biomedical information available. Instead, ACS is claiming that PubChem will put CAS out of business, resulting in the loss of 1,300 jobs in Columbus. It is doubtful that members of Congress will understand the nature of the issues; it is likely that they will pass some sort of legislation removing PubChem from public access.



NIH Molecular Libraries Roadmap Initiative

American Chemical Society

Chemical Abstracts Service

Statement from ACS

The Association of American Medical Colleges’ public statement in support of PubChem (May 30, 2005)

Contact information for members of the U.S. Congress

Miriam A. Drake is professor emerita at the Georgia Institute of Technology Library.


