The U.S. Government Printing Office (GPO) hit the on switch in early February and brought up its new digital system, FDsys (http://fdsys.gpo.gov), for access to government information. The system will replace GPO Access (www.gpoaccess.gov). FDsys is an information management system that enables GPO to gather, process, preserve, archive, and make U.S. government information available to the public. The system currently provides access to more than 154,000 documents. Document migration from GPO Access is expected to be completed by mid-2009, when the system will provide access to more than 300,000 documents. The system can handle items born digital, material that requires scanning, and documents resident on federal agency computers.
Robert Tapella, Public Printer, says, "GPO is committed to preserving and providing permanent public access to the documents of our democracy. … FDsys focuses on four key elements for management of Federal publications: versioning, authentication, preservation, and permanent public access." Versioning lets the user know which version of a document is being presented and is important when viewing legislation and draft documents. Authentication is necessary to maintain a chain of custody from author to end user. These four elements are part of a complex system that deals with documents dating from the 18th and 19th centuries.
Mike Wash, chief information officer, GPO, discussed the various document forms that are part of the system. He explained that there are three categories of information. "The first is born digital and the easiest because we can start to get information into a usable form as it is created rather than having to transform it. The second category is converted content—paper publications that need to be converted dating from the beginning of the country. The conversion is difficult as it involves scanning and a lot of computing resources to create a structure and searchable text. It is likely to take more than 5 years to complete. The third form is what we call harvested content. Today in the Federal Government, many agencies are self-publishing documents on their own websites. These documents are in the scope of GPO’s responsibility to provide public access." These responsibilities are found in U.S. Code, Title 44.
Searching FDsys will be more flexible and enable greater precision. FDsys will be organized into about 50 collections. The eight collections in the system now include Congressional Bills, Congressional Hearings, the Congressional Record, the Code of Federal Regulations, and a weekly compilation of presidential documents. Wash indicated that the compilation of presidential documents was very popular. The compilation began on Jan. 20, 2009, and includes executive orders, speeches, press conferences, acts approved by the president, appointments, and other items. Searches can be done across all collections or in one collection. The results show the number of items retrieved in each collection and can be viewed by date published, government author, or organization. They can be sorted by relevance, by date, alphabetically, and in other ways. GPO indicated that more enhancements will be made to searching and access. Wash indicated that new navigators may be added to collections to make it easier to find information.
Wash added that there will be enhancements on the submission side as well as on the access side. "The electronic submission side, which is in development now, allows agencies to go to a GPO website, securely log in, and fill out an electronic order form for the printing and electronic services they want. Later, there will be a way for them to attach their content electronically and be able to track their order." Items now in development include electronic submission of the Congressional Record, calendars, bills, and hearings. They will be followed by agency electronic submission next year.
Preservation is essential to ensure access in perpetuity to our history. It also presents challenges and complexity. I asked Wash about GPO’s approach. "We are approaching it in ways that ensure that information is in a form that could be rendered in whatever format would be required in the future." Material in FDsys is being put into XML structures. Wash explained, "When we are migrating data from GPO Access, it is being parsed and being put into an XML structure with tags for advanced search capability. In preservation processing, you want three ways to preserve information. One is very simple and is called refresh and ensures that you can open the file in the original application, making sure that the data stays current and fresh. The second process is migration. Say you had an old WordStar document from the early 1980s and you try to open it today. From an information management perspective, you would continue to migrate the WordStar document into successive generations and eventually into XML. The third process is emulation where you emulate the environment consistent with the time the document was created." Plans are that preservation will be scheduled and done on an automated basis. I asked Wash if GPO was considering hard-copy preservation. He said that they are working with industry and universities around the country to determine the best way to use hard-copy preservation.
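To make the idea of "an XML structure with tags for advanced search" concrete, here is a minimal Python sketch. It is not GPO's schema or code: the element names (document, metadata, title, collection, dateIssued, body, p) and the helper functions are hypothetical, chosen only to show how plain text might be wrapped in tagged XML and then filtered by one of those tags.

```python
import xml.etree.ElementTree as ET


def to_xml_record(title, collection, date_issued, body_text):
    """Wrap plain text in a small XML structure with metadata tags.

    All element names here are illustrative assumptions, not FDsys's schema.
    """
    doc = ET.Element("document")
    meta = ET.SubElement(doc, "metadata")
    ET.SubElement(meta, "title").text = title
    ET.SubElement(meta, "collection").text = collection
    ET.SubElement(meta, "dateIssued").text = date_issued

    # Split the body on blank lines so each paragraph gets its own element.
    body = ET.SubElement(doc, "body")
    for para in body_text.split("\n\n"):
        if para.strip():
            ET.SubElement(body, "p").text = para.strip()
    return doc


def matches(doc, tag, value):
    """Return True if the named metadata tag holds exactly the given value."""
    node = doc.find(f"metadata/{tag}")
    return node is not None and node.text == value


record = to_xml_record(
    "Sample bill",
    "Congressional Bills",
    "2009-02-04",
    "First paragraph.\n\nSecond paragraph.",
)
xml_string = ET.tostring(record, encoding="unicode")
```

Because the metadata lives in named tags rather than in free text, a search can test a single field, for example matches(record, "collection", "Congressional Bills"), and the same record can later be rendered into whatever output format is needed.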
FDsys was 4 years in the making. When I interviewed Bruce James, former Public Printer, in 2003 (www.infotoday.com/searcher/sep03/drake.shtml), he pointed out that digital information raised many issues. He added that Americans have the right to know what their government is doing. The project that James discussed then has now come to fruition and is dedicated to maintaining permanent public access to government information.
Wash indicated, "Preservation and permanent public access go hand in hand. If you have usable content in your information management system, you should be able to render that content into a form needed for access." GPO has taken a significant step in providing an efficient and effective system for managing government information and making it easy for people to find the information they want.