Scientific Data (Jul 2024)
Unfolding the downloads of datasets: A multifaceted exploration of influencing factors
Abstract
Scientific data are essential to advancing scientific knowledge and are increasingly valued as scholarly output. Understanding what drives dataset downloads is crucial for their effective dissemination and reuse. Our study, analysing 55,473 datasets from 69 data repositories, identifies key factors driving dataset downloads, focusing on interpretability, reliability, and accessibility. We find that while lengthy descriptive texts can deter users due to complexity and time requirements, readability boosts a dataset’s appeal. Reliability, evidenced by factors like institutional reputation and citation counts of related papers, also significantly increases a dataset’s attractiveness and usage. Additionally, our research shows that open access to datasets increases their downloads and amplifies the importance of interpretability and reliability. This indicates that easy access enhances the overall attractiveness and usage of datasets in the scholarly community. By emphasizing interpretability, reliability, and accessibility, this study offers a comprehensive framework for future research and guides data management practices toward ensuring clarity, credibility, and open access to maximize the impact of scientific datasets.