On Aug. 23, Thomson Scientific issued a press release announcing the details of the reload of Derwent World Patents Index on Dialog, the first complete reload of Derwent since 1998. The reload is complete on Dialog and near completion on Questel•Orbit and STN. The reloaded database has major changes in patent classifications, new content, and improved display options.
This reload was necessitated in part by the introduction on Jan. 1, 2006, of the eighth edition of the International Patent Classification (IPC) system produced by WIPO (the World Intellectual Property Organization).
First, a bit of history: Most major patenting authorities have classified patents with IPCs since their introduction in 1968. After the initial publication of the IPCs, WIPO issued new editions once every 5 years with expansions and corrections. New patents were classified with the current edition, but older patents were not reclassified as their IPCs changed.
The eighth edition, however, is something completely different. The new IPCs exist at two levels: Core (which are mostly at the group level, e.g., G01V-001) and Advanced (much more detailed classes expanded from the Core level). Countries have been permitted to choose at which level they will classify their patents. And the once-every-5-year revisions are a thing of the past—the Core level IPCs are now revised every 3 years; the Advanced level, every 3 months.
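To make the two-level structure concrete, here is a minimal Python sketch that splits a hyphenated IPC symbol into its parts and truncates a detailed symbol to its main group. The names `parse_ipc` and `to_group_level` and the accepted input format (e.g., G01V-001/28) are assumptions for illustration, not anything Thomson or WIPO publishes. Because Core classes are only mostly at the group level, the truncation approximates the Core level rather than giving an official mapping.

```python
import re
from typing import NamedTuple, Optional


class IpcSymbol(NamedTuple):
    section: str             # one letter, A-H
    ipc_class: str           # two digits
    subclass: str            # one letter
    main_group: int          # e.g. 1 for "001"
    subgroup: Optional[str]  # digits after the slash, if any


# Assumed input format: hyphenated symbols such as "G01V-001" or "G01V-001/28".
_IPC_RE = re.compile(r"^([A-H])(\d{2})([A-Z])-(\d{1,4})(?:/(\d{1,6}))?$")


def parse_ipc(symbol: str) -> IpcSymbol:
    """Split an IPC symbol into section, class, subclass, main group, subgroup."""
    match = _IPC_RE.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized IPC symbol: {symbol!r}")
    section, ipc_class, subclass, group, subgroup = match.groups()
    return IpcSymbol(section, ipc_class, subclass, int(group), subgroup)


def to_group_level(symbol: str) -> str:
    """Drop the subgroup, keeping a zero-padded main group (approximate Core level)."""
    ipc = parse_ipc(symbol)
    return f"{ipc.section}{ipc.ipc_class}{ipc.subclass}-{ipc.main_group:03d}"
```

A searcher's script could use `to_group_level` to bucket detailed Advanced-level symbols under their main groups when comparing records classified at different levels.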
Further, a massive project is underway to reclassify 50 million old patents with the new IPCs. The EPO (European Patent Office) has converted its ECLA codes (the detailed patent classifications used by EPO examiners) into the new IPCs for all the patents in its own search files; many other countries are reclassifying their own patents on varying timetables. The process is nearly finished at this point.
In other words, with the eighth edition of the IPCs, WIPO created a process similar to that of the U.S. Patent Office of updating classes regularly and—most importantly—reclassifying old patents. So, Derwent and other international patent databases are faced with the need to load patent-class changes.
Thomson has historically included only static patent classifications in the Derwent World Patents Index, i.e., the original IPCs, as listed on patent family members. But now that it is faced with loading historic reclassifications of these patents, it has decided to add other patent classifications, including those that require regular updates. It has started with U.S. patent classes (for records containing U.S. family members), but initially it is adding only original classifications, that is, the classes that appear on the patents' front pages, and not yet U.S. reclassifications.
Original U.S. classes are a start. But the use of XML format will facilitate changing data and adding new fields to records, so Thomson can be more flexible in coping with reclassifications, U.S. and other. It plans to eventually include the following in Derwent:
- Japanese F-terms (the Japanese Patent Office's internal classifications)
- Current, not just original, U.S. patent classes (revisions up to six times per year); plus regular loads of reclassifications of older U.S. patents
- ECLA classes
When Thomson adds all these classes to Derwent, the database will essentially equal PlusPat, Questel•Orbit's international patent database, in class coverage. It's not clear at this point how frequently Thomson will update these classes on Derwent.
Thomson has taken advantage of this reload to add quite a bit of information to Derwent's records—much of it from original patent family members (from major countries). The new Derwent records will exist essentially in two parts. They will still include what Thomson calls "Invention level" data, that is, the Derwent-produced information—enhanced titles and abstracts, indexing, and basic patent family data (patent numbers, publication dates, etc.). Thomson has added to the invention-level data a backfile of 750,000 additional documentation abstracts. These will be available only in the Derwent files that previously included the extension abstracts.
Derwent has also simplified searching of chemical compounds by adding Derwent Chemical Resource (DCR) numbers to the backfile, so that the two kinds of compound numbers, Derwent registry numbers and DCR numbers, can be searched interchangeably. In other words, searchers may use either DCR or Derwent registry numbers to retrieve compounds for the entire length of the database. Further, searchers may now link DCR numbers to the appropriate chemical fragmentation codes (for compound applications, for example), thus reducing the false hits they would retrieve by using fragmentation codes for the compounds themselves.
The second part of the Derwent records consists of "Member level" data, so named because Thomson has extracted the data from individual patent family members. These include:
- The patent family members' original titles, in multiple languages where necessary. This will facilitate, among other things, searching for a patent whose title is known.
- Full inventor names (not just last names and initials), along with inventor addresses. Most other patent databases already include full names, and this greatly facilitates searching for and distinguishing Josiah Reginald Smith from John Robert Smith—previously rather difficult in Derwent.
- Patent assignee original names and addresses.
- Patent agent information.
- U.S. patent main claims, for records with U.S. family members. In addition to providing additional text for searching, this will be particularly handy for searchers who formerly had to pull these into their search reports from a U.S. patent database.
The good news is that Thomson has added all this new information retrospectively, back to the beginning of the database. The different hosts vary in how much they have integrated the member-level data with the invention-level data.
The reloaded Derwent includes some enhanced display options:
- Records will include multiple drawings and chemical structures—images readily available from Internet-based patent databases but not usually combined with the more powerful search capabilities of the online databases. The images will appear in the appropriate context in the records.
- Thomson has standardized some patent number formats, including those of WO patent publications. No longer must you second-guess whether the year is two digits or four, or whether the document number is five digits or six (a distinct improvement).
- Some of the data in the patent records, including patent numbers, filing details, and chemical indexing, will appear in tabular format; it can be cut and pasted into spreadsheets and word processing documents.
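To show what standardizing WO publication numbers involves, the following Python sketch converts mixed two- and four-digit-year forms to one fixed-width form. The function name `normalize_wo_number` and the target layout (WO, four-digit year, six-digit serial) are assumptions for illustration; the actual format Thomson adopted may differ. The sketch also assumes the input has a slash between year and serial, and that two-digit years from 78 to 99 belong to the 1900s, since PCT publications began in 1978.

```python
import re

# Assumed input forms: "WO 99/12345", "WO2004/012345" (slash required).
_WO_RE = re.compile(r"^WO\s*(\d{2}|\d{4})\s*/\s*(\d{1,6})$")


def normalize_wo_number(raw: str) -> str:
    """Return a WO number as 'WO' + four-digit year + six-digit serial."""
    match = _WO_RE.match(raw.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized WO number: {raw!r}")
    year, serial = match.groups()
    if len(year) == 2:
        # Assumption: 78-99 map to 19xx (PCT publishing began in 1978),
        # and all other two-digit years map to 20xx.
        century = "19" if int(year) >= 78 else "20"
        year = century + year
    # Zero-pad the serial so five- and six-digit forms compare equal.
    return f"WO{year}{int(serial):06d}"
```

With this normalization, "WO 99/12345" and "WO 1999/012345" reduce to the same string, which is the kind of guesswork the standardized format is meant to remove.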
Additional changes in display options will vary with the online hosts.
Searchers will have to do some homework to take full advantage of the additional information and new display capabilities of the Derwent World Patents Index. But the results should be worth the effort.