Enabling Data Independence for Government Transparency
by Ralph Hodgson, CTO, TopQuadrant, Inc.
Open Government has become a popular theme, both in the U.S. and in other countries. With “Transparency” gaining momentum, more categories and larger amounts of government data are becoming available on the web. In the U.S., an impetus for this was Barack Obama’s memorandum to the heads of Executive Departments and Agencies, which included the following statement:
“… Government should be transparent. Transparency promotes accountability and provides information for citizens about what their Government is doing. Information maintained by the Federal Government is a national asset. My Administration will take appropriate action, consistent with law and policy, to disclose information rapidly in forms that the public can readily find and use. Executive departments and agencies should harness new technologies to put information about their operations and decisions online and readily available to the public. Executive departments and agencies should also solicit public feedback to identify information of greatest use to the public.”
Since that memorandum, a number of government and non-government initiatives have been launched in the U.S. The following are worth mentioning, both as illustrations of what is currently happening in the U.S. Open Government space and as motivation for the oeGOV initiative:
For some, transparency means making government data available online for browsing and searching. For others, it means using the web to make government “accountable.” Within this latter group, accountability can be as specific as how elected members of the Senate and Congress conduct their activities in the political process towards outcomes that can be correlated to policy and election promises.
Such goals typically require connecting and correlating data from different government sources, and doing so on a scale that can only really work with automation. For example, it may be important to look at politicians’ voting records on environmental issues in the context of the industries and factories present in their districts, the number of pollutant spill accidents and other relevant data. As more raw data becomes available, analysis of this sort is moving from the realm of a labor-intensive research project to an easily accessible query against linked data.
Placing increasing amounts of raw data on the Web is a good first step towards government transparency. But to be truly useful, that data needs to be connectable. Since data coming from different sources is idiosyncratic, connecting across data sets today requires heroic efforts from brigades of programmers. To truly support the transparency goals, government data needs to be Findable, Interpretable, Decidable and Actionable: in short, FIDA-friendly.
The challenges that confront us when we deal with data have been well reported, and can be summarized as:
“Freeing the data,” by publishing more and more diversely formatted data on the web, does not give us the “Data Independence” that is needed. To move beyond information overload, we have to “think data-based and not data-bases.” This we achieve by having data typed, linkable, composable and inferenceable through RDF and OWL.
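As a rough illustration of what "typed, linkable, composable and inferenceable" can mean in practice, the sketch below models RDF statements as plain Python tuples. It is not how oeGOV or any RDF library is implemented: the `ex:` names are hypothetical, and the only inference shown is the propagation of `rdf:type` along `rdfs:subClassOf`, assuming that a simple fixed-point loop is adequate for a graph this small.

```python
# Minimal sketch: RDF-style triples as (subject, predicate, object) tuples.
# All "ex:" names are hypothetical and are not taken from the oeGOV ontologies.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def infer_types(graph):
    """Return the graph plus the rdf:type triples entailed by rdfs:subClassOf."""
    inferred = set(graph)
    changed = True
    while changed:  # repeat until no new triple is produced
        changed = False
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for s2, p2, o2 in list(inferred):
                if p2 == SUBCLASS_OF and s2 == o:
                    new_triple = (s, RDF_TYPE, o2)
                    if new_triple not in inferred:
                        inferred.add(new_triple)
                        changed = True
    return inferred


# Typed: the schema says what kinds of things exist.
schema = {
    ("ex:Agency", SUBCLASS_OF, "ex:GovernmentBody"),
    ("ex:GovernmentBody", SUBCLASS_OF, "ex:Organization"),
}
# Linkable: instance data refers to schema terms by identifier.
data = {("ex:SomeAgency", RDF_TYPE, "ex:Agency")}

# Composable: two graphs merge by plain set union.
merged = schema | data
# Inferenceable: new facts follow from the merged graph.
closed = infer_types(merged)
```

After the call, `closed` also states that `ex:SomeAgency` is an `ex:GovernmentBody` and an `ex:Organization`, although neither fact was published explicitly.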
oeGOV (http://www.oegov.org) is an initiative started by TopQuadrant to establish foundation ontologies for data source navigation, data aggregation, data transformation and sense-making. Ontologies for eGovernment enable:
Since 2003, TopQuadrant has been building ontologies for eGovernment using the W3C standard languages RDF/S and OWL. The first eGovernment ontologies were the Federal Enterprise Architecture (FEA) Ontologies. At that time we needed an ontology of government bodies in order to build what we called a “Capability Manager.” This was a system, based on Semantic Web technologies, that could advise different stakeholders on the capabilities that were being provided and developed to support the FEA. We envisioned a system accessible through Web Services that would allow agencies, other governments, businesses, and citizens to make queries about the FEA model, to find capabilities that support agency services and to assess compliance of their agency business models and architectures with the FEA.
In 2003, there was no comprehensive and trusted source for the organizational structure of the U.S. Government. Today, as far as we know, this is still the case. USA.gov provides a directory of government bodies at http://www.usa.gov/Agencies/Federal/All_Agencies/index.shtml, but there is still no machine-processable version that defines the URIs of all Government bodies. Hence the motivation for oeGOV.
While the current focus of oeGOV is on ontologies of Government and datasets of U.S. Government branches, agencies, departments, offices and state governments, the intention is to go beyond this with ontologies that:
oeGOV has already published a number of ontologies. In the spirit of incremental releases, the first set was published at the oeGOV blog site www.oegov.us/blog on August 1, 2009, the date that celebrates “Swiss Independence” and that deserves to be called “Data Independence Day.”
At one level, the oeGOV ontologies can also be thought of as controlled vocabularies in RDF/OWL, establishing the URIs for every Government Body, such as usgov:DHS, usgov:DOC, usgov:DOJ, usgov:DOT, and usgov:EPA. Government Bodies are related to one another through a model of Government structure, and their reason for being can be correlated to Government Statutes. Building on this foundation, oeGOV ontologies are being used to provide an OWL schema for who is publishing what data and where that data can be found.
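To make the idea of a controlled vocabulary of URIs concrete, here is a minimal sketch that expands the prefixed names mentioned above into full URIs and emits one N-Triples statement per body. The namespace URIs are placeholders, since the article does not state the real ones, and typing each body directly as gov:Body is an assumption made for illustration; the published ontologies may well use more specific classes.

```python
# Minimal sketch: prefixed names (CURIEs) expanded to URIs and written as N-Triples.
# ASSUMPTION: both namespace URIs below are placeholders, not the real oeGOV namespaces.
PREFIXES = {
    "usgov": "http://example.org/usgov#",
    "gov": "http://example.org/gov#",
}
RDF_TYPE_URI = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Prefixed names taken from the article text.
BODIES = ["usgov:DHS", "usgov:DOC", "usgov:DOJ", "usgov:DOT", "usgov:EPA"]


def expand(curie):
    """Turn a prefixed name such as 'usgov:DHS' into a full URI."""
    prefix, local_name = curie.split(":", 1)
    return PREFIXES[prefix] + local_name


def to_ntriples(bodies, class_curie="gov:Body"):
    """Return one N-Triples line per body, typing it with the given class.

    ASSUMPTION: using gov:Body as the class of every body is illustrative only.
    """
    class_uri = expand(class_curie)
    return [f"<{expand(b)}> <{RDF_TYPE_URI}> <{class_uri}> ." for b in bodies]


lines = to_ntriples(BODIES)
```

Because every body has exactly one URI, any other dataset that uses the same identifier can be linked to it without further mapping work.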
The oeGOV ontologies are being built in TopQuadrant’s TopBraid Suite. Some of the foundation concepts centered around ‘gov:Body’ are shown in the figure below:
[Figure: TQ Gov Body]
The diagram illustrates how data in different formats is associated, through publication events, with a Government Body. Over 500 government bodies are in the current release of the ontologies, which are catalogued at http://www.oegov.us/blog/?page_id=13. For example, the N3 graph of usgov:DHS is at http://www.oegov.us/democracy/national/models/owl/us1gov_dhs.n3
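The relationship described here, datasets tied to a Government Body through publication events, can be sketched with a few plain Python classes. The class names, field names and example.org URLs are all hypothetical and are not the terms used in the oeGOV ontologies; the sketch only shows how such a model answers "who is publishing what data and where."

```python
from collections import defaultdict
from dataclasses import dataclass


# HYPOTHETICAL model: these class and field names are not oeGOV terms.
@dataclass(frozen=True)
class Dataset:
    url: str          # where the data can be found
    data_format: str  # e.g. "CSV", "XML", "RDF/XML"


@dataclass(frozen=True)
class PublicationEvent:
    publisher: str    # prefixed name of the publishing body, e.g. "usgov:EPA"
    dataset: Dataset


def datasets_by_body(events):
    """Group published datasets by the government body that published them."""
    index = defaultdict(list)
    for event in events:
        index[event.publisher].append(event.dataset)
    return dict(index)


# Illustrative records only; the URLs are placeholders, not real datasets.
events = [
    PublicationEvent("usgov:EPA", Dataset("http://example.org/data/a.csv", "CSV")),
    PublicationEvent("usgov:EPA", Dataset("http://example.org/data/b.xml", "XML")),
    PublicationEvent("usgov:DOT", Dataset("http://example.org/data/c.rdf", "RDF/XML")),
]
index = datasets_by_body(events)
```

Keeping the publication event as its own record, rather than linking a dataset straight to a body, leaves room for details such as a publication date to be added later without changing the other two classes.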
Building oeGOV is a huge effort, and we invite all interested parties to participate. We are particularly keen to have participation from U.S. Government Agencies, which we feel should own this work. To facilitate participation from different organizations and groups, the ontologies have been architected in a highly modular way.
If you are interested in participating in oeGOV, please send an email to [email protected].