If you aren't accustomed to dealing with publicity from Chemical Abstracts Service (CAS; http://www.cas.org), you may need some guidance. Although adding new features and improvements is a year-round activity at CAS, high-impact announcements tend to be concentrated near the biannual meetings of the American Chemical Society (ACS; http://www.acs.org), CAS's parent organization. It was no different last month, when three press releases announcing new features were issued in conjunction with the ACS National Meeting in Chicago.
Although these developments are of primary importance to chemists and chemical information specialists, I should point out that chemistry is indeed "the central science." Both the science and its unique information-handling challenges are relevant to applications and technologies affecting all of us, especially in the pharmaceutical industry. Two aspects of chemical information that set it apart from all other kinds of information are chemical structures and chemical reactions. Over the past several decades, both have posed myriad challenges for those seeking solutions in the realm of chemical information.
The first press release announced the enrichment of two CAS-produced databases with both calculated data and additional chemical-reaction data. The CAS Registry File is a massive database of more than 32 million chemical substances. File records for several million of those substances will be enhanced with eight data elements calculated by Advanced Chemistry Development, Inc. (ACD). The data elements are chemical/physical property data of primary relevance to drugs and pharmaceuticals. Future expansion of this enhancement to additional compound records is anticipated.
Pharmaceutical chemists have determined that a number of properties have characteristic values for a wide range of pharmaceuticals and other biologically active compounds. Among these properties are the eight now being used to enhance the CAS Registry File. Scientists or information specialists can now search for specific data or ranges of data, either directly or in conjunction with other molecular properties, including chemical structures or substructures. Formerly, such correlations had to be made by using extra steps like querying other databases or reference sources for the properties of compounds of interest.
The same press release also announced enhancements to CASREACT, the CAS-produced chemical reactions database that's loaded on the STN search service and other media. CASREACT currently covers data and information on chemical reactions gleaned from chemical publications in the CAS Chemical Abstracts (CA) database dating back to 1985. CASREACT will now be enhanced with details on more than 750,000 chemical reactions from InfoChem, a German software company. Much of this information also exists as a database (titled CHEMREACT on STN), which covers published literature from 1975-1988. Therefore, integrating this information into CASREACT will extend its coverage back an additional 10 years.
Several databases of chemical reactions exist, four of them on STN alone. In practice, it has been deemed best to search all available databases, even though all of them cover most of the same core group of about 120 journals dealing with chemical synthetic methods. Although overlap among the databases is extensive, each has unique features or records. Therefore, a comprehensive search has required searching all of the databases and either comparing the output or eliminating duplicates. Although the CHEMREACT database will continue to exist on STN, incorporating its data into CASREACT, using CASREACT standards, will facilitate chemical-reactions searching.
CAS's second press announcement covered enhancements to SciFinder, the server-based solution for making CAS databases easily accessible (by point-and-click) for scientists and other end-users. According to the announcement, the new features are planned for an October release. SciFinder has a large number of users at corporations and other organizations all over the world. A version for academic institutions, SciFinder Scholar, is used at hundreds of universities, both domestic and international.
Of the enhancements announced for SciFinder, those in current awareness, chemical-reaction searching, calculated properties, citation searching, and Spotfire match the capabilities that are currently available (or soon to be available) for files on STN. However, the BLAST DNA/protein-sequence-analysis program will be unique to SciFinder for now. According to a company representative, it may be added to other CA files in the future.
The third press release announced a cooperative effort between CAS and Spotfire, Inc. to combine the latter's DecisionSite eAnalytic applications and CAS's SciFinder. Having both tools on the computers of users who subscribe to both services will allow integration and analysis of data from a variety of sources, including CAS databases and internal or proprietary databases accessible at customer sites. Chemical structures, chemical/physical properties, and other data can be retrieved, integrated, analyzed, and visualized by using these software packages in combination.
Once again, pharmaceutical scientists and other users interested in structure-property relationships will find their work easier to perform and their creativity enhanced. Pharmaceutical companies in particular are under ever greater pressure to innovate faster and better. The initial burden falls on the research scientists, information specialists, and systems specialists who must handle the rapidly increasing amount of data and information being generated. As a former laboratory scientist commissioned to produce new biologically active compounds, I find that the increasingly useful tools available to current scientists both boggle my mind and make me extremely envious. Performing such research now looks to be even more fun than it did 30 years ago.