Lessons learned and recommendations for data coordination in collaborative research: The CSER consortium experience

Kathleen D. Muenzen; Laura M. Amendola; Tia L. Kauffman; Kathleen F. Mittendorf; Jeannette T. Bensen; Flavia Chen; Richard Green; Bradford C. Powell; Mark Kvale; Frank Angelo; Laura Farnan; Stephanie M. Fullerton; Jill O. Robinson; Tianran Li; Priyanka Murali; James M.J. Lawlor; Jeffrey Ou; Lucia A. Hindorff; Gail P. Jarvik; David R. Crosslin

HGG Advances (Jul 2022)

Affiliations

Kathleen D. Muenzen: Department of Biomedical Informatics and Medical Education, Division of Biomedical and Health Informatics, University of Washington Medical Center, Seattle, WA, USA; Corresponding author
Laura M. Amendola: Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA, USA
Tia L. Kauffman: Center for Health Research, Kaiser Permanente Northwest, Portland, OR, USA
Kathleen F. Mittendorf: Center for Health Research, Kaiser Permanente Northwest, Portland, OR, USA
Jeannette T. Bensen: Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Flavia Chen: Institute for Human Genetics, University of California at San Francisco, San Francisco, CA, USA
Richard Green: Department of Biomedical Informatics and Medical Education, Division of Biomedical and Health Informatics, University of Washington Medical Center, Seattle, WA, USA
Bradford C. Powell: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Mark Kvale: Institute for Human Genetics, University of California at San Francisco, San Francisco, CA, USA
Frank Angelo: Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA, USA
Laura Farnan: Lineberger Comprehensive Cancer Center, UNC School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Stephanie M. Fullerton: Department of Bioethics & Humanities, University of Washington School of Medicine, Seattle, WA, USA
Jill O. Robinson: Center for Medical Ethics and Health Policy, Baylor College of Medicine, Houston, TX, USA
Tianran Li: Department of Biomedical Informatics and Medical Education, Division of Biomedical and Health Informatics, University of Washington Medical Center, Seattle, WA, USA
Priyanka Murali: Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA, USA
James M.J. Lawlor: HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
Jeffrey Ou: Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA, USA
Lucia A. Hindorff: Division of Genomic Medicine, NHGRI, NIH, Bethesda, MD, USA
Gail P. Jarvik: Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA, USA
David R. Crosslin: Division of Biomedical Informatics and Genomics, John W. Deming Department of Medicine, Tulane University School of Medicine, New Orleans, LA, USA; Corresponding author

Journal volume & issue: Vol. 3, no. 3
p. 100120

Abstract

Summary: Integrating data across heterogeneous research environments is a key challenge in multi-site, collaborative research projects. While it is important to allow for natural variation in data collection protocols across research sites, it is also important to achieve interoperability between datasets to reap the full benefits of collaborative work. However, there are few standards to guide the data coordination process from project conception to completion. In this paper, we describe the experiences of the Clinical Sequence Evidence-Generating Research (CSER) consortium Data Coordinating Center (DCC), which coordinated harmonized survey and genomic sequencing data from seven clinical research sites from 2020 to 2022. Using input from multiple consortium working groups and from CSER leadership, we first identify 14 lessons learned from CSER in the categories of communication, harmonization, informatics, compliance, and analytics. We then distill these lessons into 11 recommendations for future research consortia in the areas of planning, communication, informatics, and analytics. We recommend that planning and budgeting for data coordination activities occur as early as possible during consortium conceptualization and development to minimize downstream complications. We also find that clear, reciprocal, and continuous communication between consortium stakeholders and the DCC is as important as maintaining a secure and centralized informatics ecosystem for pooling data. Finally, we discuss the importance of actively interrogating current approaches to data governance, particularly for research studies that straddle the research-clinical divide.

Published in HGG Advances

ISSN: 2666-2477 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Science: Biology (General): Genetics
Website: https://www.cell.com/hgg-advances/home
