On June 2, 2010, the United States Patent and Trademark Office (USPTO) announced an agreement to make more than 10 terabytes of bulk electronic patent and trademark data available through Google. The no-cost, 2-year agreement provides a wide variety of previously restricted and unavailable patent and trademark data files to developers and users. The bulk data is presently accessible through Google Books at http://www.google.com/googlebooks/uspto.html.
Patent and trademark information has been available through several free websites for a number of years. The USPTO has made final patent grants and trademark awards available on its website since the late 1990s. Google's Patent Search service has more than 7 million patents and an additional million patent applications, all freely available. However, this patent data represents just the tip of the iceberg for the in-depth patent and trademark researcher. The agreement between Google and the USPTO looks to expand that range of data and allow substantially more in-depth patent and trademark searching than is currently available.
Patent and trademark grants have informational value as the end product of the patent and trademark application processes. As intellectual property, the patent or trademark represents the physical or creative item that its owner is empowered to control. The publication of information about patents and trademarks serves several purposes. First, it can communicate to a potential user the existence and scope of the patented or trademarked item. Second, it can communicate to a new inventor or trademark applicant the existence of a previous invention or trademark. The inventor or applicant can then determine whether his or her invention or trademark may be infringing or, in the case of an invention, unpatentable because it would not be novel or original. And third, potential marketers or licensees can identify valuable intellectual resources.
More in-depth research requires more information, however. In the process of pursuing a patent application, several information searches may be required, including searching for unpublished background information from the patent process. These can include prior art or patentability searches, claims validity and infringement searches, and a marketability search to determine whether a previous patent may interfere with a current product or method. (This has been a common occurrence with digital and electronic inventions and processes.) Much of this information has limited online availability, although it may be available through regional Patent and Trademark Depository Libraries, or for a fee through the USPTO and private search firms.
The Google-USPTO arrangement is a step toward making this kind of in-depth information more widely available online. The bulk data includes both patent grants and published patent applications. While grants and some patent applications have previously been available, the bulk files are more up to date and contain more data. In a random test of one such patent, #5,308,196 for a "Yieldable Confined Core Mine Roof Support," the bulk patent data files identified a 2009 reexamination of the patent that was missing from the Google Patents search.
In addition, the bulk patent data includes separate full-text, bibliographic and image files for both granted patents and patent applications. There are also several categories of patent administrative tools and publications, including patent assignments, maintenance fee documents, patent classification tables and materials, and USPTO "Red Book" material, all of great value to the patent application and research process.
For trademarks, the document resources are just as rich, including trademark grants and applications as far back as 1870, recent applications that do not appear to be available on the USPTO website, recent assignments of trademarks, and decisions of the Trademark Trial and Appeal Board.
The depth of resources available also presents the greatest difficulty in their use. As presently configured, these database files are more of a developer's tool than a searchable user's resource. The files are arranged by category (patent grants, classification tables, trademark assignments, etc.) and by decade or year, but that seems to be the only organization. The files are also delivered as large zipped folders that must be unzipped before the contents can be accessed. The format of individual files also varies. Many are image files, but most appear to be XML documents that will need substantial processing in order to be usable. Some of the folders can be nearly a gigabyte in size and contain a thousand individual files.
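To give a sense of the processing involved, the following is a minimal Python sketch of a first pass over one downloaded zipped folder: it lists what is inside, separates XML documents from other files such as images, and reports which root element each XML document uses. It relies only on the standard library. The function name `summarize_archive` and the sample file names and element names in the demonstration are hypothetical, chosen for illustration; they are not taken from the USPTO files, whose actual layout would need to be inspected first.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def summarize_archive(archive):
    """Tally the contents of a zip archive (a path or a file-like object).

    Returns a dict with a count of XML documents per root tag, a count of
    non-XML members (for example images), and a count of .xml members that
    could not be parsed as a single well-formed document.
    """
    summary = {"xml_roots": {}, "other_files": 0, "unparseable": 0}
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # skip directory entries
            if not info.filename.lower().endswith(".xml"):
                summary["other_files"] += 1
                continue
            try:
                root = ET.fromstring(zf.read(info))
            except ET.ParseError:
                # Count the failure instead of aborting the whole pass.
                summary["unparseable"] += 1
                continue
            roots = summary["xml_roots"]
            roots[root.tag] = roots.get(root.tag, 0) + 1
    return summary


if __name__ == "__main__":
    # Build a tiny in-memory archive with made-up contents for demonstration.
    sample = io.BytesIO()
    with zipfile.ZipFile(sample, "w") as zf:
        zf.writestr("sample/doc1.xml", "<record><title>Example</title></record>")
        zf.writestr("sample/page1.tif", b"\x00\x01")
    sample.seek(0)
    print(summarize_archive(sample))
```

A survey like this is only a starting point. Turning the XML into a searchable resource would still require mapping each document type's elements to fields and loading them into an index or database.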
None of the data in the files appears to be accessible using Google or another search engine. The folder names are also often cryptic, following an internal USPTO system, although the two largest data collections (patent grants and trademarks) use the root of the patent or trademark number as part of the file name. For example, patent number 5,308,196 can be found in the Granted Patents collection under file name USP2010w01/05/308. Similarly, trademark number 1,288,000 (for "Gilley's," the Texas bar featured in the film Urban Cowboy) can be found in the Trademark Grants & Applications 1870-2008 collection, under the file name USM0076/1288.
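The two examples above suggest a simple pattern: the numeric part of the file name is the patent or trademark number with its last three digits dropped. The Python sketch below encodes that pattern. It is an inference from these two examples only, not a documented USPTO rule, and the function names are hypothetical. In particular, the zero-padding and two-plus-three split for patents are assumed from the single "05/308" example, and the collection prefixes (USP2010w01, USM0076) are not derived here because the article gives no rule for them.

```python
def number_root(number, drop_digits=3):
    """Return the leading digits of a patent or trademark number.

    Accepts an int or a string such as "5,308,196"; punctuation is ignored.
    Assumption: the "root" is the number minus its last three digits.
    """
    digits = "".join(ch for ch in str(number) if ch.isdigit())
    if len(digits) <= drop_digits:
        raise ValueError(f"number too short to have a root: {number!r}")
    return digits[:-drop_digits]


def patent_path_fragment(number):
    """Numeric part of a granted-patent file name, e.g. "05/308".

    Assumption: the root is left-padded with zeros to at least five digits
    and split before its last three digits.
    """
    root = number_root(number).zfill(5)
    return f"{root[:-3]}/{root[-3:]}"


def trademark_path_fragment(number):
    """Numeric part of a trademark file name, e.g. "1288" (assumed unpadded)."""
    return number_root(number)


if __name__ == "__main__":
    print(patent_path_fragment("5,308,196"))
    print(trademark_path_fragment("1,288,000"))
```

Before relying on this for other numbers, the pattern should be checked against the actual folder listings, since older or much newer numbers may be filed differently.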
Nonetheless, the partnership is a positive step toward making more government information available at no or low cost. As part of the Obama Administration's Open Government Initiative, the USPTO is "committed to providing increased transparency" of its information for analysis and research. However, the USPTO also acknowledges that it does not have the "technical capability" to offer the data on its own website. By partnering with Google, it has achieved the goal of providing access to its bulk data while also gaining time to develop a long-term strategy for providing the bulk data to the public in more accessible ways.
For its part, Google has agreed to host the data without modification for 2 years. Other than organizing some of the files into the zipped folders, Google provides the data as received for use by anyone at no cost. Google has not indicated whether it intends to take further action to develop the database as part of its Patent Search service. However, the data is freely available to other developers and commercial services to work with and publish, and the USPTO hopes to add more data to the Google agreement. Perhaps soon, patent researchers and search firms will be able to do for themselves what they previously had to pay for, or could not do at all.