Engaging Science, Technology, and Society (Oct 2021)
Data Durabilities: Towards Conceptualizations of Scientific Long-Term Data Storage
Abstract
With the increased requirement for open data and data reuse in the sciences, the call for long-term data storage is growing stronger. However, long-term data storage is insufficiently theorized, and long-term data are often treated simply as short-term data that are stored longer. Interviews with scientists at a German university show that data are not in themselves durable; they are made durable. While Science & Technology Studies data research has emphasized the relational character of data, always embedded in local contexts and infrastructures, we propose to add the temporal dimension of data durability to this understanding. We replace notions of long-term and short-term stored data with notions of publication data and project data, because the latter terms point to the practices through which data durability is made in a variety of ways, contingent on the research phases in which the data are embedded and on their infrastructures and practices. With the notion of data durability devices, we inquire into the technologies and tools, techniques and skills, and organizational arrangements, cultural norms, and relations that contribute to making data durable. We define scientific data as durable as long as they can operate in a socio-technical apparatus and uphold their capacity to make claims about the world. The scientists’ data practices revealed what we term media data durability devices and scientific data durability devices. The former comprised media materiality, the care for this materiality, and the compatibility between data and the data apparatus, which all contributed to shaping data durability. Scientific data durability devices, on the other hand, included concealment and competition, through which data durability was prolonged but also distributed unevenly among researchers.
With these proposed concepts, we hope to initiate discussions on the making of long-term data storage; we also believe the concepts can help in making realistic and relevant decisions about what data to store and for how long.