Data Science Journal (Apr 2020)
Developing an Open Data Portal for the ESA Climate Change Initiative
Abstract
We introduce the rationale for, and architecture of, the European Space Agency Climate Change Initiative (CCI) Open Data Portal (http://cci.esa.int/data/). The Open Data Portal hosts a set of richly diverse datasets – 13 “Essential Climate Variables” – from the CCI programme in a consistent and harmonised form, and provides a single point of access to the data (>100 TB) for broad dissemination to an international user community. These data have been produced by a range of different institutions and vary in both scientific and spatio-temporal characteristics. This heterogeneity, together with the range of services to be supported, presented significant technical challenges. An iterative development methodology was key to tackling these challenges: the resulting system uses a workflow that takes data conforming to the CCI data specification, ingests them into a managed archive, and uses both manually and automatically generated metadata to support data discovery, browse, and delivery services. It utilises both Earth System Grid Federation (ESGF) data nodes and the Open Geospatial Consortium Catalogue Service for the Web (OGC-CSW) interface, serving data into both the ESGF and the Global Earth Observation System of Systems (GEOSS). A key part of the system is a new vocabulary server, populated with CCI-specific terms and relationships, which integrates the OGC-CSW and ESGF search services and was developed as part of a dialogue between domain scientists and linked data specialists. These services have enabled the development of a unified user interface for graphical search and visualisation – the CCI Open Data Portal Web Presence.
Keywords