September 1, 2008
GISWeekly examines select top news each week, picks out worthwhile reading from around the web, and highlights special interest items you might not find elsewhere. This issue will feature Industry News, Top News of the Week, Acquisitions/Agreements/Alliances, Announcements, People, Training, New Products, Around the Web and Events Calendar.
GISWeekly welcomes letters and feedback from readers, so let us know what you think. Send your comments to me at Email Contact
Susan Smith, Managing Editor
Update on the DHS Data Model
By Susan Smith
What is the Department of Homeland Security Data Model?
The Department of Homeland Security (DHS) Data Model has been around since July 2007 and is openly published on the Federal Geographic Data Committee’s (FGDC) website. According to Mark Eustis, Department of Homeland Security, OCIO and GMO, and Mike Lee, FGDC Homeland Security Working Group, it is based on “open government and international standards.”
A definition of the DHS Geospatial Data Model (GDM) found on the FGDC site reads as follows: “This DHS GDM is a standards based, logical data model to be used for collection, discovery, storage, and sharing of homeland security geospatial data. The model will support development of the Department’s services-based geospatial architecture, and will serve as an extract, transform, and load (ETL) template for content aggregation.”
According to a report published in July 2006, the DHS is mandated to comply with standards, and the standards listed at that time include the National Information Exchange Model (NIEM) and the FGDC framework standard.
According to Eustis, “the GDM is a harmonized collection of International, Federal, and certain community standards…as well as national plan recommendations, DHS sector-specific requirements, and various emergency-response and general standards models from pertinent resources. Here’s the abridged list:
- FGDC Framework Data Content Standard,
- USGS Project Bluebook data model (ESRI form),
- National Information Exchange Model (NIEM),
Secondary and/or associated sector-specific resources
- FEMA MultiHazard; Emergency Management & Infrastructure Protection,
- DHS Infrastructure Protection Taxonomy, v1.0
- Geographic Names Information System (GNIS) Feature IDs and types,
- Feature types for FGDC Emergency Management Symbology,
- National Incident Management System (NIMS) Resource types,
- National Hydrography Dataset (NHD) Feature Types,
- FGDC Cadastral Subcommittee, Revised Cadastral Model,
- National Response Plan Categories,
- Homeland Security Infrastructure Protection (HSIP) Feature Types,
- NASA Land Cover Classification types,
- American Planning Association (APA) Land Use Classifications,
- FGDC and ISO Geospatial Metadata
The GDM that is the source for the data schemas produced by the GDM-o-Matic is an entirely open, completely standards-based conceptual model.”
So far the DHS Data Model aggregates a lot of the domain experience from the emergency community including: FEMA, FGDC, National Response Frameworks, data products that are emanating from the NGA, and modeling done against the feature types that are within the Homeland Security Infrastructure Program (HSIP) dataset.
The bottom line is that the standards the DHS refers to are informed by the standards of federal homeland security agencies and other federal agencies, and by those agencies’ reliance on ESRI software and geodatabases. The DHS Data Model is not software- or platform-independent at this time.
“Think of the data model like the New York Public Library and it contains everything there is to know about almost everything that could happen in and around the emergency services world, which includes the potential impacts to environment, infrastructure, and a number of different domain areas,” explained Eustis. The “schema generation tool” or “GDM-o-Matic,” an implementation model translation tool, allows people to extract a direct implementation model and put that into their ESRI software. “It will also publish an open XML format product in our next iteration, but the schemas that are produced now are an XML product, formatted to fit into the standards that ESRI uses.”
Two questions come to mind in reviewing this information: 1) Does the DHS Data Model use Geography Markup Language (GML), the XML grammar defined by the Open Geospatial Consortium (OGC) to express geographical features? GML serves as a modeling language for geographic systems as well as an open interchange format for geographic transactions on the Internet (as defined by the OGC website). And 2) Where does extract, transform and load (ETL), the data translation technology that Safe Software provides, come in?
Eustis explained that “GML is an object based language and ESRI is not. ESRI is the tool that DHS uses internally and it’s the tool that, according to Daratech, has an 89% marketshare in the state and local communities.”
However, pure GML is available directly on the FGDC website with some Oracle Spatial implementations, and “as the tool evolves I guess there’s an expectation that we will be publishing pure open GML-NIEM-compliant XML and work towards the open formats,” Eustis added.
Because this is an evolutionary process, Eustis said that they have started with the tool with which the DHS and NASA have an enterprise license, and that is ESRI, so “ESRI geodatabases and formatting standards are the industrial standard that folks tend to be using. The hosts and nomenclature (of the DHS Data Model) all come from open standards and it’s our hope and plan, depending upon reaction from the community, to continue to support and move towards more and more formats and standards. The next one will be the NIEM XML. The XML statement can be parsed by any tool you want. There’s nothing hidden.”
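Eustis's point that the XML can be parsed by any tool can be illustrated with a minimal Python sketch. The XML below is a made-up sample, not the schema format that the GDM-o-Matic or ESRI actually produces; the element names (schema, featureType, attribute) and the function list_feature_types are assumptions chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema document. The element and attribute names are
# illustrative assumptions, not the real GDM or ESRI schema layout.
SAMPLE = """
<schema name="ExampleSchema">
  <featureType name="FireStation">
    <attribute name="name" type="string"/>
    <attribute name="capacity" type="integer"/>
  </featureType>
  <featureType name="Hydrant">
    <attribute name="flowRate" type="double"/>
  </featureType>
</schema>
"""


def list_feature_types(xml_text):
    """Map each feature type name to the list of its attribute names."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for feature_type in root.findall("featureType"):
        names = [a.get("name") for a in feature_type.findall("attribute")]
        result[feature_type.get("name")] = names
    return result


if __name__ == "__main__":
    for name, attributes in list_feature_types(SAMPLE).items():
        print(name, attributes)
```

Only the Python standard library is used here, which is the point: a plain XML schema needs no vendor software to read.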
NIEM is the National Information Exchange Model, a collaborative program among DHS, the Department of Justice and a number of other entities within the federal community. NIEM builds on the longstanding Global Justice XML Data Model (GJXDM).
The modeling team also builds Oracle implementations to serve other users, such as Autodesk, Intergraph and MapInfo users who work with Oracle Spatial. “The plan is to ensure that this technology remains as open, accessible, and useful as possible; thus the sponsorship and release through the FGDC,” said Lee.
Safe Software, a provider of extract, transform and load (ETL) technology, plans to adapt the technology that the DHS has produced in the schema generation tool (sometimes called the GDM-o-Matic) so that it can generate schemas and transfer files back and forth that conform to the GDM. Safe will build this capability into its software.
Eustis described the GDM as a bigger conceptual repository of information. The schema generation tool, GDM-o-Matic, allows people within particular homeland security sectors and particular geographic areas to reach into that repository, pull out the information that is specific to them, and share data on the basis of that common knowledge. People always have different ways of doing things in different places. The GDM conceptual model provides a common operational picture and reference point for users to find the features and attributes they need and, if necessary, to use the ETL process.
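The idea of reaching into a large repository and pulling out only what one sector needs can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how the GDM-o-Matic works internally: the catalog contents, the sector labels, and the function extract_schema are all hypothetical.

```python
# Hypothetical repository: each feature type lists the sectors that use
# it and its attributes. None of these entries come from the real GDM.
CATALOG = {
    "Hydrant": {"sectors": {"fire", "water"}, "attributes": ["location", "flow_rate"]},
    "Hospital": {"sectors": {"health"}, "attributes": ["location", "beds"]},
    "FireStation": {"sectors": {"fire"}, "attributes": ["location", "engines"]},
}


def extract_schema(catalog, sector):
    """Return only the feature types tagged with the given sector."""
    return {
        name: list(entry["attributes"])
        for name, entry in sorted(catalog.items())
        if sector in entry["sectors"]
    }


if __name__ == "__main__":
    print(extract_schema(CATALOG, "fire"))
```

Two groups that extract from the same catalog end up with the same names for the features they share, which is what makes later data exchange simpler.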
“Sharing data is not a trivial process,” Eustis pointed out. “One group called a hydrant a hydrant, another group might call a hydrant “hyd” or another group might call it a standpipe. You need to have some mechanism of standardizing the nomenclature and then providing the tools that people can map from one space to the next.”
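The hydrant example amounts to a nomenclature crosswalk, which can be sketched as a simple lookup in Python. The three local names come from the quote above; the shared term "Hydrant", the record layout, and the function names are assumptions for illustration and are not taken from the GDM.

```python
# Hypothetical crosswalk from local feature names to one shared term.
# "Hydrant" as the shared term is an assumption, not a GDM feature type.
CROSSWALK = {
    "hydrant": "Hydrant",
    "hyd": "Hydrant",
    "standpipe": "Hydrant",
}


def standardize(local_name):
    """Return the shared term for a local name, or None if unmapped."""
    return CROSSWALK.get(local_name.strip().lower())


def standardize_records(records):
    """Split records into (mapped, unmapped) based on their 'type' field."""
    mapped, unmapped = [], []
    for record in records:
        shared = standardize(record["type"])
        if shared is None:
            unmapped.append(record)  # left for a person to review
        else:
            mapped.append({**record, "type": shared})
    return mapped, unmapped


if __name__ == "__main__":
    sample = [{"id": 1, "type": "HYD"}, {"id": 2, "type": "valve"}]
    print(standardize_records(sample))
```

Keeping unmapped records in a separate list rather than dropping them reflects the point that sharing data is not trivial: names the crosswalk does not yet know still need a human decision.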
Incidentally, Paul Daisey of the Geography Division, U.S. Census Bureau, and the OGC was one of the authors of both UML and GML. Eustis said that UML, the conceptual language of GDM, could be “exported as a GML model if someone wants one.”