Data.gov: Opening the Doors to Government Data
Posted On June 4, 2009
The Obama administration's push for open, transparent, and accountable government has left the realm of policy and entered the real world of data downloads, machine-readable formats, and widgets. On May 21, 2009, exactly 120 days after President Obama issued his first memorandum on "Transparency and Open Government" (www.whitehouse.gov/the_press_office/TransparencyandOpenGovernment), Vivek Kundra, chief information officer within the Office of Management and Budget, delivered the goods by launching Data.gov (www.data.gov). Executive agencies release a lot of data, but most of it is hard to find and hard to use because it is published in proprietary formats of "limited utility." Data.gov offers a solution to this problem: the government has created a "citizen-friendly" one-stop shop "that provides access to Federal datasets."
By creating a searchable and accessible data catalog, the government anticipates that innovation and entrepreneurship will flourish as citizens create "new web applications that help individuals, communities, and businesses access, sort, visualize, and understand public data in new ways." The government also hopes that data transparency will unleash "economic, scientific and educational innovation, as well as civic engagement by making it easier to build applications, conduct analysis, and perform research" (www.whitehouse.gov/open/innovations/Data).
So what can you find at Data.gov?
Data.gov features two searchable catalogs: a "raw" data catalog and a tools catalog. The raw data catalog consists of downloadable files in multiple formats: XML, Text/CSV, KML/KMZ, Feeds, XLS, or ESRI Shapefile. The tools catalog offers widgets, data mining and extraction tools, applications, and other services. The extraction tools enable users to create maps, tables, or charts of subsets of data (www.data.gov/faq). Each data set includes a link to a metadata page that offers a summary of the data set, user ratings, data set information, data set coverage, information from the contributing agency, a data set description, and additional technical documentation. Both catalogs are searchable by category, agency, keyword, or data format. Every data set can be rated, Digg-style, for its value and usefulness. A handy tutorial (www.data.gov/howtouse) walks you through the site.
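For readers who think in code, the four search criteria described above (category, agency, keyword, and data format) can be pictured as simple filters over a list of catalog records. The Python sketch below is only an illustration under stated assumptions: the CatalogEntry fields, the search function, and the sample records are all invented here and do not reflect how Data.gov is actually built or what it actually contains.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One hypothetical catalog record; the field names are illustrative only."""
    title: str
    agency: str
    category: str
    formats: set = field(default_factory=set)  # upper-case format labels


def search(entries, category=None, agency=None, keyword=None, data_format=None):
    """Return the entries that match every criterion that was supplied."""
    results = []
    for entry in entries:
        if category and entry.category.lower() != category.lower():
            continue
        if agency and entry.agency.lower() != agency.lower():
            continue
        if keyword and keyword.lower() not in entry.title.lower():
            continue
        if data_format and data_format.upper() not in entry.formats:
            continue
        results.append(entry)
    return results


# Invented placeholder records, not real Data.gov entries.
SAMPLE = [
    CatalogEntry("Sample air quality readings", "Agency A", "Environment", {"CSV", "XML"}),
    CatalogEntry("Sample energy survey", "Agency B", "Energy", {"XLS"}),
    CatalogEntry("Sample coastline map", "Agency A", "Geography", {"KML", "CSV"}),
]

if __name__ == "__main__":
    # Combine two criteria: everything from "Agency A" that is available as CSV.
    for hit in search(SAMPLE, agency="Agency A", data_format="csv"):
        print(hit.title)
```

Any criterion left out is simply ignored, which mirrors the idea of narrowing a catalog by whichever combination of filters a visitor chooses.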
Based on recommendations made by executive branch agencies, the site initially launched with 47 raw data feeds and 28 tool data feeds, but the goal is to add more data sets and tools on a regular basis. And they have already done just that: the number of raw data sets has risen to 80. Check back regularly for new additions. One complaint: there is no RSS feed for new additions, and this needs to be rectified soon. (Note: the Sunlight Foundation has created its own unofficial RSS feed for Data.gov at http://data.sunlightlabs.com/data.gov/datafeed.xml.) And since we are requesting upgrades to the site, a blog would be a welcome addition. The blog could report changes and improvements to the site, explain why some data sets are not available, announce new additions to both the raw and tools catalogs, and preview data sets yet to be released. Oh, one more thing: how about adding an "Innovations Gallery" similar to the White House Open Government Initiative Gallery (www.whitehouse.gov/open/innovations)? An Innovations Gallery fosters community activity and participation and provides a great opportunity to share and highlight the creativity and passion of web developers across the country.
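To show why an RSS feed of new additions matters to developers, here is a minimal Python sketch that lists the items in a feed using only the standard library. It assumes a plain RSS 2.0 layout (item elements with title and link children); the structure of the unofficial Sunlight feed is not described in this article, and the embedded sample feed and its example.org links are invented placeholders rather than real Data.gov entries.

```python
import xml.etree.ElementTree as ET

# Invented placeholder feed in basic RSS 2.0 form; not real Data.gov content.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Placeholder feed of new data sets</title>
    <item>
      <title>Placeholder data set one</title>
      <link>http://example.org/details/1</link>
    </item>
    <item>
      <title>Placeholder data set two</title>
      <link>http://example.org/details/2</link>
    </item>
  </channel>
</rss>"""


def list_new_items(feed_xml):
    """Return (title, link) pairs for every <item> found in an RSS document."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        items.append((title, link))
    return items


if __name__ == "__main__":
    for title, link in list_new_items(SAMPLE_FEED):
        print(title, link)
```

A feed reader or a small script like this one could poll such a feed and flag new data sets as they appear, which is the convenience the site currently lacks.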
So far, the executive agencies sharing their data with Data.gov for the raw catalog include the Department of Commerce (Bureau of Economic Analysis, Census Bureau, National Oceanic & Atmospheric Administration, National Weather Service, and U.S. Patent & Trademark Office); Energy Information Administration; Environmental Protection Agency; Department of the Interior (Fish & Wildlife Service, U.S. Bureau of Reclamation, U.S. Geological Survey); Small Business Administration; Treasury Department, IRS (Statistics of Income); Office of Management & Budget; and Social Security Administration. Contributors to the tools catalog include the Department of Agriculture (Economic Research Service); Department of Commerce (Census Bureau, National Oceanic & Atmospheric Administration); Department of Education (National Center for Education Statistics); Health and Human Services (Agency for Healthcare Research & Quality, Centers for Disease Control & Prevention, Food & Drug Administration, National Center for Health Statistics, National Cancer Institute); Department of Transportation (Bureau of Transportation Statistics); Homeland Security (Federal Emergency Management Agency); Department of Justice (FBI); Government Printing Office; General Services Administration; and the National Science Foundation.
So if your inner geek is raring to go, you can find data sets on Residential Energy Consumption (www.data.gov/details/10), Clean Air Status and Trends Network (CASTNET): Ozone (www.data.gov/details/8), the USA Spending Contracts and Purchases (www.data.gov/details/132), or you can embed the Emergency Preparedness and Response Widget (www.data.gov/details/110) on your website or blog.
If the data set you need or want is not in the catalog, go ahead and submit your request (www.data.gov/suggestdataset). Although there is no guarantee that your request will be filled, both Kundra and Beth Noveck, deputy CTO in the Office of Science and Technology Policy, strongly believe in citizen participation and engagement and built the site to reflect that belief. Data.gov is based on the assumption that "people are smart and they have things to share," and "it's an important step in creating opportunities for citizens to engage with the government and co-create policy" (The Washington Post, May 21, 2009, http://bit.ly/V1AY5).
The Sunlight Foundation, a passionate and active proponent of open and transparent government, is enthusiastically supporting Data.gov by launching a new contest for web developers: "Apps for America 2: the Data.gov Challenge" (http://sunlightlabs.com/contests/appsforamerica2). Partners in this endeavor include Craig Newmark (Craigslist), Google, O'Reilly Media, and TechWeb. The winners (first prize is $10,000, second prize is $5,000, third prize is $2,500, 10 Honorable Mentions receive $500 each, and a super bonus visualization prize is $2,500) will be announced at the Gov 2.0 Expo Showcase (www.gov2expo.com/gov2expo2009) at the end of the summer. The Sunlight Foundation believes that "when government makes data available it makes itself more accountable and creates more trust and opportunity in its actions." The contest will showcase the talents, ingenuity, and creativity of developers who use Data.gov data to create "compelling applications" that are both easy to use and inexpensive to design.
This is a brave new experiment in open, transparent government. Time will tell how useful and accessible the site will be, but it is most definitely a welcome step in the right direction.