Journal of eScience Librarianship (Aug 2024)
Identifying metadata commonalities across restricted health data sources: A mixed methods study exploring how to improve the discovery of and access to restricted datasets
Abstract
Background: While open datasets are adopting FAIR principles to improve their discovery and use, restricted data (those accessible only via request or application) have fallen behind. Metadata is not an inherent characteristic of restricted data, which limits how readily these data can be found and used. To better understand the discoverability and accessibility of restricted data, this study reviewed restricted health data sources to determine how they describe their datasets and access procedures, what descriptive commonalities exist across data sources, and to what extent those commonalities can be accommodated within existing metadata schemas.
Methods: This study extracted dataset and access information provided by a sample of 48 restricted data sources, identified commonalities across these data sources to develop possible metadata elements for restricted data, and mapped these metadata elements to existing metadata schemas (e.g., DataCite) to evaluate how well the schemas accommodate the information supplied by restricted data sources.
Results: Restricted data sources describe their datasets (35 commonalities) and access procedures (27 commonalities) in similar ways. Dataset descriptions aligned with existing metadata schemas, with the DDI-Lifecycle and DDI-Codebook schemas receiving exact matches for 91.4% and 85.7%, respectively, of the dataset elements we identified. Access procedures did not align with metadata available in existing schemas.
Discussion: While descriptive dataset metadata for restricted data sources will make their data more findable, the accessibility of these datasets could be significantly improved by structured metadata capturing data access information. Presently, metadata schemas do not accommodate the level of detail restricted data sources provide about access procedures and requirements.
Keywords