Frontiers in Earth Science (Oct 2022)
On the documentation, independence, and stability of widely used seismological data products
Abstract
Earthquake scientists have traditionally relied on relatively small data sets recorded on small numbers of instruments. With advances in both instrumentation and computational resources, the big-data era, including an established norm of open data-sharing, allows seismologists to explore important issues using data volumes that would have been unimaginable in earlier decades. Alongside these developments, the community has moved towards routine production of interpreted data products, such as seismic moment tensor catalogs, that have provided an additional boon to earthquake science. As these products have become increasingly familiar and useful, it is important to bear in mind that they are not data, but rather interpreted data products. As such, they differ from data in ways that can be important but are not always appreciated. Important, and sometimes surprising, issues can arise if methodology is not fully described, data from multiple sources are included, or data products are not versioned (time-stamped). The line between data and data products is sometimes blurred, leading to an underappreciation of issues that affect data products. This note illustrates these issues with examples from two widely used data products: moment tensor catalogs and Did You Feel It? (DYFI) macroseismic intensity values. These examples show that increasing a data product’s documentation, independence, and stability can make it even more useful. To ensure the reproducibility of studies using data products, time-stamped products should be preserved, for example as electronic supplements to published papers or, ideally, in a more permanent repository.
Keywords