A new web portal was announced at the recent IASSIST conference in Washington, D.C. Designed to make working with metadata easier, OpenMetadata.org (OM) is the product of Metadata Technology North America and Integrated Data Management Services, which created the site to “facilitat[e] access to standards based innovative technologies for the management of socio-economic, scientific, and other statistical data.” Though the site is still in its initial deployment, the goal is for it to become a go-to resource for discovering, accessing, and using statistical metadata and the tools that work with it.
The site currently focuses on two metadata standards: the Data Documentation Initiative (DDI) and the Statistical Data and Metadata Exchange (SDMX) standard. DDI is designed to support research data, in particular data from the social, behavioral, and economic sciences. DDI is encoded in XML and is very thorough in its coverage of research phases: concept, collection, processing, distribution, discovery, analysis, repurposing, and archiving. SDMX is designed to enable the exchange of statistical data, primarily in demographic and social statistics, economic statistics, and environmental and multidomain statistics. SDMX uses a variety of standardized formats for the exchange of metadata.
One valuable service is the Survey Catalog, which aggregates information on survey datasets from around the world. The tool is similar to TaxonomyWarehouse.com in that it provides information on datasets available, but not necessarily the data itself, as some datasets require permission or licensing for use from the owner. Some data is directly accessible but not all. Perhaps a purchase and download capability like that from ReportLinker.com could provide revenues to support the site and convenience for users. Thousands of studies are available at present, but relatively few catalogs are represented. Hopefully, many more will be added in the coming months.
Also available is a separate but equally important Agency Directory, which is intended to contain information on all of the major statistical agencies around the globe. Again, much needs to be added, as the publishers make clear. Agencies can be found by searching on services, standards, scope, or entity type, and links and locations are provided for each result. Mapping of results is also available, powered by Google. Each tool could become an indispensable source in reference work, and both help promote the value of being able to exchange data easily.
Tools and frameworks such as the Generic Statistical Business Process Model (GSBPM) are featured on OM to help promote the adoption of the standards and the development of software that makes use of them. For Java developers, the OpenMetadataFramework will be available in the second half of 2012. This collection of libraries and tools will be for building software around the DDI and SDMX standards. Early test projects are listed for reference. Also detailed in the “Tools” section are open source packages used in creating the products and services provided on the site.
One dynamic part of the site is the OM Labs—a place for playing with apps and tools to make using statistical data easier. As of this writing, one tool is available, in beta, for registered users of the site—the OM File Manager. This is an online utility that will enable the visualization, processing, conversion, and sharing of statistical data files. One phrase that caught this reviewer’s eye is a future feature that will produce linked data—a critical capability on the web today.
It is not clear what relationships or integrations OM has or hopes to develop with some of the other major players in the metadata space: W3C, OMG, NISO, ISO, OASIS, Schema.org, Facebook’s Open Graph protocol, or the Microformats community. While the DDI and SDMX standards are more formal than the last three of these, it certainly could not hurt for these datasets to be featured, for example, in Google’s new Knowledge Graph displays on search results pages.
How will OpenMetadata participate in the linked data or semantic web world? Publishing statistical information as linked data could allow OM to become a DBpedia- or Kasabi-like service in this niche. Data and metadata with structural integrity and authoritative sources would be a very welcome addition to the linked data web and an excellent example of why adopting the methods evangelized by the site is good for business. Find, license, use: simple, persistent, and trustworthy sources of high-quality data are what businesses want and need.
Will other metadata frameworks be included, or will the focus on statistical data provide a firm scope for the site? The current, possibly short-term vision is more limited than the name. There is talk in several forums about providing better education, easier-to-use tools, and concise directories of metadata tools and services.
These questions would be well suited to the community forums that are promised but not yet set up. It is good to see that participation will be encouraged. It is not yet possible to request a user account, and the site is still a little rough around the edges, but it is early days yet. The site’s maintainers had not responded to requests for comment by press time. Another review, and hopefully an interview, would be useful next quarter to see how the service progresses.