SiDroForest: a comprehensive forest inventory of Siberian boreal forest investigations including drone-based point clouds, individually labeled trees, synthetically generated tree crowns, and Sentinel-2 labeled image patches

F. van Geffen; B. Heim; F. Brieger; R. Geng; I. A. Shevtsova; L. Schulte; S. M. Stuenzi; N. Bernhardt; E. I. Troeva; L. A. Pestryakova; E. S. Zakharov; B. Pflug; U. Herzschuh; S. Kruse

doi:10.5194/essd-14-4967-2022

Earth System Science Data (Nov 2022)

Affiliations

F. van Geffen: Alfred Wegener Institute (AWI) Helmholtz Centre for Polar and Marine Research, Research Unit Potsdam, Germany
F. van Geffen: University of Potsdam, Institute of Biochemistry and Biology, Potsdam, Germany
B. Heim: Alfred Wegener Institute (AWI) Helmholtz Centre for Polar and Marine Research, Research Unit Potsdam, Germany
F. Brieger: Alfred Wegener Institute (AWI) Helmholtz Centre for Polar and Marine Research, Research Unit Potsdam, Germany
F. Brieger: Carleton University, Department of Geography and Environmental Studies Ottawa, Canada
R. Geng: Alfred Wegener Institute (AWI) Helmholtz Centre for Polar and Marine Research, Research Unit Potsdam, Germany
R. Geng: Key Laboratory of Land Surface Pattern and Simulation, Institute of Geographical Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China
R. Geng: University of the Chinese Academy of Sciences, Beijing, China
I. A. Shevtsova: Alfred Wegener Institute (AWI) Helmholtz Centre for Polar and Marine Research, Research Unit Potsdam, Germany
I. A. Shevtsova: University of Potsdam, Institute of Biochemistry and Biology, Potsdam, Germany
L. Schulte: Alfred Wegener Institute (AWI) Helmholtz Centre for Polar and Marine Research, Research Unit Potsdam, Germany
L. Schulte: University of Potsdam, Institute of Biochemistry and Biology, Potsdam, Germany
S. M. Stuenzi: Alfred Wegener Institute (AWI) Helmholtz Centre for Polar and Marine Research, Research Unit Potsdam, Germany
S. M. Stuenzi: Humboldt-Universität zu Berlin, Geography Department, Unter den Linden, Berlin, Germany
N. Bernhardt: Alfred Wegener Institute (AWI) Helmholtz Centre for Polar and Marine Research, Research Unit Potsdam, Germany
N. Bernhardt: Julius Kühn-Institut Bundesforschungsinstitut für Kulturpflanzen, Quedlinburg, Germany
E. I. Troeva: Institute for Biological Problems of the Cryolithozone, Russian Academy of Sciences, Siberian Branch, Yakutsk, Russia
L. A. Pestryakova: North-Eastern Federal University of Yakutsk, Institute of Natural Sciences (NEFU), Yakutsk, Russia
E. S. Zakharov: Institute for Biological Problems of the Cryolithozone, Russian Academy of Sciences, Siberian Branch, Yakutsk, Russia
E. S. Zakharov: North-Eastern Federal University of Yakutsk, Institute of Natural Sciences (NEFU), Yakutsk, Russia
B. Pflug: German Aerospace Center (DLR), Berlin, Germany
U. Herzschuh: Alfred Wegener Institute (AWI) Helmholtz Centre for Polar and Marine Research, Research Unit Potsdam, Germany
U. Herzschuh: University of Potsdam, Institute of Biochemistry and Biology, Potsdam, Germany
U. Herzschuh: University of Potsdam, Institute of Environmental Science and Geography, Potsdam, Germany
S. Kruse: Alfred Wegener Institute (AWI) Helmholtz Centre for Polar and Marine Research, Research Unit Potsdam, Germany

DOI: https://doi.org/10.5194/essd-14-4967-2022
Journal volume & issue: Vol. 14, pp. 4967–4994

Abstract

The SiDroForest (Siberian drone-mapped forest inventory) data collection is an attempt to remedy the scarcity of forest structure data in the circumboreal region by providing adjusted and labeled tree-level and vegetation plot-level data for machine learning and upscaling purposes. We present datasets of vegetation composition and of tree-level and plot-level forest structure for two important vegetation transition zones in Siberia, Russia: the summergreen–evergreen transition zone in Central Yakutia and the tundra–taiga transition zone in Chukotka (NE Siberia). The SiDroForest data collection consists of four datasets with complementary data types that together support in-depth analyses of Siberian forest plot data from different perspectives and for multi-purpose applications.

i. Dataset 1 provides unmanned aerial vehicle (UAV)-borne data products covering the vegetation plots surveyed during fieldwork (Kruse et al., 2021, https://doi.org/10.1594/PANGAEA.933263). The dataset includes structure-from-motion (SfM) point clouds and red–green–blue (RGB) and red–green–near-infrared (RGN) orthomosaics. From the orthomosaics, point-cloud products were created such as the digital elevation model (DEM), canopy height model (CHM), digital surface model (DSM), and digital terrain model (DTM). The point-cloud products provide information on the three-dimensional (3D) structure of the forest at each plot.

ii. Dataset 2 contains spatial data in the form of point and polygon shapefiles of 872 individually labeled trees and shrubs that were recorded during fieldwork at the same vegetation plots (van Geffen et al., 2021c, https://doi.org/10.1594/PANGAEA.932821). The dataset contains information on tree height, crown diameter, and species type. The point and polygon shapefiles of the individually labeled trees and shrubs were generated on top of the RGB UAV orthoimages. The individual tree information collected during the expedition, such as tree height, crown diameter, and vitality, is provided in table format. This dataset can be used to link information on individual trees to the location of the specific tree in the SfM point clouds, providing, for example, the opportunity to validate the tree height extracted from the first dataset. The dataset provides unique insights into the current state of individual trees and shrubs and allows the effects of climate change on these individuals to be monitored in the future.

iii. Dataset 3 contains a synthetic set of 10 000 generated images and masks, in the common objects in context (COCO) format, with tree crowns of two species of larch (Larix gmelinii and Larix cajanderi) automatically extracted from the RGB UAV images (van Geffen et al., 2021a, https://doi.org/10.1594/PANGAEA.932795). Because machine-learning algorithms need a large dataset to train on, the synthetic dataset was created specifically for training machine-learning algorithms to detect Siberian larch species.

iv. Dataset 4 contains Sentinel-2 (S-2) Level-2 bottom-of-atmosphere processed labeled image patches with seasonal information and annotated vegetation categories covering the vegetation plots (van Geffen et al., 2021b, https://doi.org/10.1594/PANGAEA.933268). The dataset was created with the aim of providing a small, ready-to-use validation and training dataset for various vegetation-related machine-learning tasks. It enhances the data collection because it allows a larger area to be classified with the provided vegetation classes.

The SiDroForest data collection serves a variety of user communities. The detailed vegetation cover and structure information in the first two datasets is of use for ecological applications, both for summergreen and evergreen needle-leaf forests and for tundra–taiga ecotones. Datasets 1 and 2 further support the generation and validation of land cover remote-sensing products in radar and optical remote sensing. In addition to providing information on forest structure and vegetation composition of the vegetation plots, the third and fourth datasets are prepared as training and validation data for machine-learning purposes. For example, the synthetic tree-crown dataset is generated from the raw UAV images and optimized for use in neural networks. Furthermore, the fourth SiDroForest dataset contains S-2 labeled image patches, processed to a high standard, that provide training data on vegetation class categories for machine-learning classification, with JavaScript Object Notation (JSON) labels provided. The SiDroForest data collection adds unique insights into remote, hard-to-reach circumboreal forest regions.
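
As an illustration of how annotations in COCO format, such as those described for Dataset 3, might be read, the following minimal Python sketch counts labeled tree crowns per species and groups bounding boxes by image. It assumes the standard COCO object-detection layout (top-level "images", "categories", and "annotations" keys); the file names, image sizes, box values, and the use of species names as category names are hypothetical stand-ins and are not taken from the published dataset.

```python
import json
from collections import Counter, defaultdict

# Tiny in-memory stand-in for a COCO annotation file.
# All values below are hypothetical and only mimic the standard COCO layout.
coco_text = json.dumps({
    "images": [
        {"id": 1, "file_name": "synthetic_0001.jpg", "width": 448, "height": 448},
        {"id": 2, "file_name": "synthetic_0002.jpg", "width": 448, "height": 448},
    ],
    "categories": [
        {"id": 1, "name": "Larix gmelinii"},
        {"id": 2, "name": "Larix cajanderi"},
    ],
    "annotations": [
        # bbox follows the COCO convention [x, y, width, height] in pixels.
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [12, 30, 40, 42], "iscrowd": 0},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [200, 90, 35, 38], "iscrowd": 0},
        {"id": 12, "image_id": 2, "category_id": 2, "bbox": [75, 150, 50, 47], "iscrowd": 0},
    ],
})


def crowns_per_species(coco):
    """Count annotated crowns per category name."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    return Counter(names[ann["category_id"]] for ann in coco["annotations"])


def boxes_by_image(coco):
    """Map each image file name to the list of its bounding boxes."""
    files = {img["id"]: img["file_name"] for img in coco["images"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        grouped[files[ann["image_id"]]].append(ann["bbox"])
    return dict(grouped)


coco = json.loads(coco_text)
species_counts = crowns_per_species(coco)
image_boxes = boxes_by_image(coco)
print(species_counts)
print(image_boxes)
```

With a real annotation file, `coco_text` would be replaced by the file's contents; the key names actually used in the published files should be checked against the dataset documentation before relying on this layout.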

Published in Earth System Science Data

ISSN: 1866-3508 (Print); 1866-3516 (Online)
Publisher: Copernicus Publications
Country of publisher: Germany
LCC subjects: Geography. Anthropology. Recreation: Environmental sciences; Science: Geology
Website: http://www.earth-system-science-data.net/
