Global Edge-Matched Admin Data
The Global Admin project is a data processing pipeline that takes the best available administrative boundary information from multiple sources (OCHA, government, OSM, GADM, etc.) and aggregates it into a single edge-matched dataset with a common schema, using an automated methodology. The most detailed admin level available is used, ranging from 0 to 4 depending on the source. This dataset uses ISO-3 codes as the primary unit of edge-matching, with 249 currently used (not including disputed areas). Of these, 227 ISO-3 codes have subdivisions available: 114 sourced from OCHA, 106 from GADM, and 7 from public government data repositories. All of these sources are matched to an admin 0 dataset produced by the UN Geospatial Information Section and to coastlines produced by OpenStreetMap. Data can be downloaded from the following links (updated weekly), last updated Monday, 15 February, 2021:
|Layer (ADM 0-4)||GeoPackage (GPKG)||File Geodatabase (GDB)|
|Boundary Polygons||wld_polygons.gpkg.zip (4.60 GB)||wld_polygons.gdb.zip (2.02 GB)|
|Cartographic Lines||wld_lines.gpkg.zip (751 MB)||wld_lines.gdb.zip (343 MB)|
|Label Points||wld_points.gpkg.zip (19.9 MB)||wld_points.gdb.zip (11.5 MB)|
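Because a GeoPackage is an SQLite database, its contents can be inspected with nothing more than the Python standard library. The sketch below lists the vector layers registered in the `gpkg_contents` table that the GeoPackage standard defines. The function name `list_feature_layers` is hypothetical, and the layer names stored inside the downloaded files are not documented here, so the output is not shown.

```python
import sqlite3
from typing import List


def list_feature_layers(gpkg_path: str) -> List[str]:
    """Return the names of vector feature tables registered in a GeoPackage."""
    # gpkg_contents is the table of contents required by the GeoPackage
    # standard; rows with data_type = 'features' describe vector layers.
    conn = sqlite3.connect(gpkg_path)
    try:
        rows = conn.execute(
            "SELECT table_name FROM gpkg_contents "
            "WHERE data_type = 'features' ORDER BY table_name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Assuming the archive has been unzipped to a file such as `wld_polygons.gpkg` (the extracted file name is an assumption), calling `list_feature_layers("wld_polygons.gpkg")` would return whichever layer names the file declares.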
The following links point to data catalogues used for this project. All data was downloaded manually, and some transformations were applied. These are some of the modifications made to each source as a pre-processing step:
- OCHA: Boundary modifications are mostly limited to areas involving disputed boundaries, such as both Sudan and South Sudan claiming the Abyei Administrative Region. In this case, the area is removed from both countries when performing edge-matching, and represented as a disputed area in the admin 0 layer. In rare cases, topology cleaning is also required. Attribute tables that have not been formatted to the ITOS geodatabase standard, as defined by documents on GitHub, are manually edited to match it so that they can be read automatically.
- GADM: Lakes and water bodies are classified as administrative regions with no names in this dataset, and are therefore removed before performing edge-matching.
A common problem when merging spatial data from different sources is the existence of gaps and overlaps between sources. There are many ways to address this problem; the approach taken here is to generate borderless digital boundary files for each input. In this context, a digital boundary file is one that does not follow shorelines and international boundaries, but rather stretches out with simplified edges, intended for users to clip with their own shorelines and international boundaries. For example, Statistics Canada uses a digital boundary file when creating census blocks, which is later clipped with lakes and shorelines to derive a layer suitable for reference maps.
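The idea of stretching boundaries outward and then clipping them to a shared outline can be illustrated with a toy grid. This is only a raster analogue written for explanation: the actual pipeline works on vector geometries and its algorithm is not described here, and the names `edge_match`, `west`, `east`, and `admin0` are hypothetical. Each cell of the admin 0 outline is assigned to its nearest subdivision, so gaps are filled, overshoots are discarded, and no cell belongs to two regions.

```python
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]  # (x, y) position on a toy grid


def squared_distance_to_region(cell: Cell, region: Set[Cell]) -> int:
    """Squared distance from a cell to the closest cell of a region."""
    return min((cell[0] - x) ** 2 + (cell[1] - y) ** 2 for x, y in region)


def edge_match(
    subdivisions: Dict[str, Set[Cell]], admin0_mask: Set[Cell]
) -> Dict[str, Set[Cell]]:
    """Assign every cell of the admin 0 mask to its nearest subdivision."""
    matched: Dict[str, Set[Cell]] = {name: set() for name in subdivisions}
    names = sorted(subdivisions)  # fixed order so ties resolve deterministically
    for cell in admin0_mask:
        nearest = min(
            names,
            key=lambda n: squared_distance_to_region(cell, subdivisions[n]),
        )
        matched[nearest].add(cell)
    return matched


# Toy inputs: the admin 0 outline spans x = 0..5.
admin0 = {(x, y) for x in range(6) for y in range(2)}
west = {(x, y) for x in range(2) for y in range(2)}     # leaves a gap at x = 2..3
east = {(x, y) for x in range(4, 7) for y in range(2)}  # overshoots at x = 6

result = edge_match({"west": west, "east": east}, admin0)
```

After the call, the two regions in `result` tile `admin0` exactly: the gap between them is split at the midpoint and the cells of `east` that lay outside the outline are dropped.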
Just as with boundaries, attribute columns from sources with different schemas need to be conditioned so that they align with each other. The following schema is used.
Repeating in layer
The following columns repeat for the layer's own level and for each level above it. For example, an admin 2 layer will include attributes for adm2, adm1, and adm0. Replace the "X" with the indicated level.
|admX_id||Automatically generated ID used for internal pipeline management.|
|admX_ocha||P-Code taken from OCHA sources. If not an OCHA source, a P-Code-like ID is generated using the ISO-2 code.|
|admX_name1||Primary administrative region name. Uses the language defined by the "lang_name1" column.|
|admX_name2||Secondary administrative region name. Uses the language defined by the "lang_name2" column.|
|admX_name3||Tertiary administrative region name. Uses the language defined by the "lang_name3" column.|
|admX_namea||All other names listed for a region are combined together using the pipe ( | ) symbol.|
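The repeating pattern in the table above can be sketched in Python. This is a minimal illustration rather than the pipeline's own code: the helper names `repeating_columns` and `split_alternate_names` are hypothetical, and the column order (most detailed level first) is an assumption, since only the set of columns is described here.

```python
from typing import List, Optional

# Suffixes of the columns that repeat for every level in a layer.
REPEATING_FIELDS = ("id", "ocha", "name1", "name2", "name3", "namea")


def repeating_columns(level: int) -> List[str]:
    """Expand the admX_* pattern for a layer of the given admin level."""
    if not 0 <= level <= 4:
        raise ValueError("admin level must be between 0 and 4")
    # Assumed order: the layer's own level first, then each level above it.
    return [
        f"adm{lvl}_{field}"
        for lvl in range(level, -1, -1)
        for field in REPEATING_FIELDS
    ]


def split_alternate_names(namea: Optional[str]) -> List[str]:
    """Split a pipe-delimited admX_namea value into individual names."""
    if not namea:
        return []
    return [part.strip() for part in namea.split("|") if part.strip()]
```

For an admin 2 layer, `repeating_columns(2)` yields 18 names, starting with `adm2_id` and ending with `adm0_namea`.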
Once per layer
These columns only appear a single time per layer, providing layer metadata.
|lang_name1||Primary language used for "admX_name1".|
|lang_name2||Secondary language used for "admX_name2".|
|lang_name3||Tertiary language used for "admX_name3".|
|src_name||One of: OCHA, GOVT, GADM.|
|src_url||Link where the original data source can be downloaded.|
|src_date||Date original dataset was produced.|
|src_valid||Last date original dataset was reviewed.|
|adm_max||Most detailed administrative level available for a particular ISO-3 region.|
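A consumer of the data could sanity-check these metadata columns before use. The sketch below tests only the two constraints stated in this document: `src_name` must be one of the three listed values, and `adm_max` must fall in the 0 to 4 range of admin levels. The function name `check_layer_metadata` is hypothetical, and the assumption that `adm_max` is stored as an integer is not confirmed by the schema.

```python
from typing import Dict, List

ALLOWED_SOURCES = {"OCHA", "GOVT", "GADM"}


def check_layer_metadata(row: Dict[str, object]) -> List[str]:
    """Return a list of problems found in one feature's metadata columns."""
    problems = []
    if row.get("src_name") not in ALLOWED_SOURCES:
        problems.append("src_name must be one of OCHA, GOVT, GADM")
    adm_max = row.get("adm_max")
    # Assumption: adm_max is an integer; bool is excluded because it is a
    # subclass of int in Python.
    is_int = isinstance(adm_max, int) and not isinstance(adm_max, bool)
    if not is_int or not 0 <= adm_max <= 4:
        problems.append("adm_max must be an integer between 0 and 4")
    return problems
```

An empty list means both checks passed; each failed check adds one message.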
Only in admin 0
These columns only appear in the admin 0 layer.
|adm0_fid||Feature ID code used to differentiate states and self-governing territories sharing the same ISO-3 code.|
|adm0_name||Romanized name associated with the region defined by adm0_fid.|
|adm0_label||Map label to be used for a region defined by adm0_fid.|
|adm0_cont||Code of the continent a given ISO-3 belongs to.|
|adm0_color||When creating thematic maps, features sharing the same value for this column should be coloured together.|
|adm0_stsc||Sovereignty status code of the region, given as an integer.|
|adm0_stsn||Indicates the sovereignty status of the region (State, Territory, Special Region, etc).|