Congress.gov Beta: An Early Look at a New THOMAS
Posted On September 27, 2012
The U.S. Congress and the Law Library of Congress launched the beta version of a new public legislative information system, Congress.gov (temporarily beta.congress.gov). Congress.gov eventually will replace the current official legislative information system, THOMAS. In a press release on Sept. 19, 2012, the Library of Congress noted that “the [current] system has been updated over the years, but [its] foundation can no longer support the capabilities that today’s Internet users have come to expect, including access on mobile devices.”
Along with adaptability to tablets and smartphones, Congress.gov offers a feature that researchers will be pleased to discover: permanent URLs. On THOMAS, links to bills are temporary and specific to a user's individual search session; many users have bookmarked or copied and pasted a THOMAS URL only to get an error upon trying that URL again. Congress.gov URLs are not only permanent but also consistently formatted, which should help surface official congressional information in Google and other general web search engines; for example, H.R. 3630 in the 112th Congress is: http://beta.congress.gov/bill/112th-congress/house-bill/3630. THOMAS has employed some work-arounds, such as the ability to share on social networking sites and construct legislative handles as permanent links, but the change on Congress.gov brings the system into more standard 21st-century practice.
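To make the consistent URL format concrete, here is a minimal Python sketch that rebuilds the H.R. 3630 link quoted above from its parts. Only the "house-bill" pattern comes from that example; the "senate-bill" slug, the function names, and the short bill-type codes are assumptions made for illustration and are not documented behavior of the beta site.

```python
def ordinal(n: int) -> str:
    """Return n with its English ordinal suffix, e.g. 112 -> '112th'."""
    if 10 <= n % 100 <= 13:
        suffix = "th"  # 11th, 12th, 13th (and 10th) all take "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


# "house-bill" appears in the article's example URL.
# "senate-bill" is an assumed parallel slug, not confirmed by the article.
BILL_TYPE_SLUGS = {
    "hr": "house-bill",
    "s": "senate-bill",
}


def bill_url(congress: int, bill_type: str, number: int,
             base: str = "http://beta.congress.gov") -> str:
    """Build a bill URL following the pattern of the article's example."""
    slug = BILL_TYPE_SLUGS[bill_type.lower()]
    return f"{base}/bill/{ordinal(congress)}-congress/{slug}/{number}"


if __name__ == "__main__":
    # Rebuilds the H.R. 3630 (112th Congress) link quoted in the article.
    print(bill_url(112, "hr", 3630))
```

Because the address is derived only from the Congress, the bill type, and the bill number, nothing in it depends on a search session, which is what lets it be bookmarked and shared.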
Only a portion of the information on THOMAS is available on the beta site, so early testers will get an incomplete experience. The popular THOMAS Bill Summary and Status (BS&S) and text of legislation databases, each covering the 107th Congress (2001-2002) to the present, provide the core legislative content at launch. The “Legislation” option on the Congress.gov search box searches these two databases for all years that are online so far. Similar information for the 93rd to 106th Congresses, currently on THOMAS, will be added later. The beta site also features new profiles of Members of Congress going back to the 93rd Congress (1973-1974) and a new video and text introduction to the legislative process. THOMAS databases in the queue to migrate to Congress.gov include the texts of the Congressional Record and congressional committee reports, and databases tracking Senate action on treaties and presidential nominations.
Two of the most valuable features of THOMAS are the detailed recordings of the steps a measure has gone through in the legislative process and the links from those steps to a relevant source document. As an example, when THOMAS notes that House or Senate floor debate has taken place, that step is linked to the pages of debate in the Congressional Record. Committee reports are linked in the same way. Congress.gov does not have this feature fully implemented because some of the content to be linked is not online yet. Status steps on the beta Congress.gov are also significantly less detailed than those available in the more current Congresses on THOMAS. This means a significant loss of utility for such tasks as tracking committee action and floor amendments until the full status content is brought over.
On the plus side, searchers will find a much cleaner and more easily navigable interface. Current information is easy to find but does not obscure the legislative background and helpful instructions needed to make sense of the complex legislative process. Information architect Peter Morville of Semantic Studios consulted on the redesign, which should accommodate additional THOMAS content without the clutter of the old site.
For search, the new Congress.gov uses Apache Solr. The popular open source software is used in other federal government applications, such as the NASA Planetary Data System catalog and FCC.gov’s website search. Solr’s faceted search capabilities are also on display at the much less obscure Zappos.com consumer shoe-shopping site. Congress.gov presents the same model used at these sites: Take a stab at a first search and then refine by selecting relevant facets, or attributes, of the information you seek. With Zappos, this means sandals or boots; on Congress.gov, it means House or Senate, among other choices.
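The search-then-refine model described above can be sketched in a few lines of Python. This is an in-memory illustration of faceting in general, not Solr's API and not how Congress.gov is implemented; the records, field names, and function names are invented placeholders.

```python
from collections import Counter

# Invented placeholder records; real bill metadata is far richer.
RECORDS = [
    {"title": "Example tax relief bill", "chamber": "House", "congress": 112},
    {"title": "Example tax credit bill", "chamber": "Senate", "congress": 112},
    {"title": "Example highway funding bill", "chamber": "House", "congress": 111},
    {"title": "Example tax reform bill", "chamber": "House", "congress": 111},
]


def search(records, term):
    """First pass: a naive case-insensitive keyword match on the title."""
    term = term.lower()
    return [r for r in records if term in r["title"].lower()]


def facet_counts(results, field):
    """Count how many results fall under each value of a facet field."""
    return Counter(r[field] for r in results)


def refine(results, field, value):
    """Narrow the current results to a single facet value."""
    return [r for r in results if r[field] == value]


if __name__ == "__main__":
    results = search(RECORDS, "tax")          # take a stab at a first search
    print(facet_counts(results, "chamber"))   # counts shown beside each facet
    house_only = refine(results, "chamber", "House")
    print([r["title"] for r in house_only])   # results after choosing "House"
```

The counts are what a faceted interface displays next to each choice, so the searcher can see how many results remain before clicking a facet.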
On THOMAS, the central bill search box performs a search of the current Congress. Searchers must choose Advanced Search to get to prior congresses, a design that tends to obscure the presence of past data. On Congress.gov, the legislation search box performs a search of all Congresses online and displays the option to limit that search to the current Congress, along with other facets for refining search results.
Opportunities to test full-text searching are limited on the beta site. At this early date, the only full-text database to search is the text of legislation, which is searched simultaneously with the more descriptive metadata of the Bill Summary and Status database. Search functionality is also restricted in this initial release. According to the Search Tips Overview page, “Search operators are not available at this time but will be added in the future,” and “the ability to limit your search by field is not available at this time but will be added in the future.”
Most of the full-text documents on THOMAS are also available on the Government Printing Office FDsys.gov site, the official website for congressional and other government documents. The value THOMAS adds is in tying the documents to the legislation they affect and placing them in the context of the legislative process. Plain language bill summaries and status steps unique to THOMAS help researchers decipher the sometimes arcane manner in which a bill becomes a law, or doesn't. Serious researchers will still want to use both THOMAS/Congress.gov and FDsys.gov, an approach that should become easier with both systems adopting the faceted search approach.
Progress on the federal legislative information system is welcome in many quarters, but advocates of open access remain disappointed that Congress has not used this opportunity to introduce a bulk data download capability. Free alternatives to THOMAS, such as GovTrack.us, rely on the inferior process of web scraping congressional databases to get their data. GovTrack’s scraped database is used by a number of civic information sites, including OpenCongress.org and MapLight.org. Advocates argue that the diverse needs of citizens can be better met if straightforward access to the data in bulk is opened to web developers. An example: THOMAS and the Congress.gov beta lack a key feature—email and RSS alerts—offered by GovTrack, OpenCongress.org, the Sunlight Foundation’s Scout site, and other alternatives.
The Congress.gov beta is still in the early stages of incorporating existing THOMAS content and implementing the improved search functions that THOMAS users have been waiting for. The Law Library of Congress, which is managing the transition, is eager to get your feedback and suggestions via its form at http://beta.congress.gov/survey.