HGG Advances (Jul 2022)

Lessons learned and recommendations for data coordination in collaborative research: The CSER consortium experience

  • Kathleen D. Muenzen,
  • Laura M. Amendola,
  • Tia L. Kauffman,
  • Kathleen F. Mittendorf,
  • Jeannette T. Bensen,
  • Flavia Chen,
  • Richard Green,
  • Bradford C. Powell,
  • Mark Kvale,
  • Frank Angelo,
  • Laura Farnan,
  • Stephanie M. Fullerton,
  • Jill O. Robinson,
  • Tianran Li,
  • Priyanka Murali,
  • James M.J. Lawlor,
  • Jeffrey Ou,
  • Lucia A. Hindorff,
  • Gail P. Jarvik,
  • David R. Crosslin

Journal volume & issue
Vol. 3, no. 3
p. 100120

Abstract

Summary: Integrating data across heterogeneous research environments is a key challenge in multi-site, collaborative research projects. While it is important to allow for natural variation in data collection protocols across research sites, it is also important to achieve interoperability between datasets to reap the full benefits of collaborative work. However, there are few standards to guide the data coordination process from project conception to completion. In this paper, we describe the experiences of the Clinical Sequencing Evidence-Generating Research (CSER) consortium Data Coordinating Center (DCC), which coordinated harmonized survey and genomic sequencing data from seven clinical research sites from 2020 to 2022. Using input from multiple consortium working groups and from CSER leadership, we first identify 14 lessons learned from CSER in the categories of communication, harmonization, informatics, compliance, and analytics. We then distill these lessons learned into 11 recommendations for future research consortia in the areas of planning, communication, informatics, and analytics. We recommend that planning and budgeting for data coordination activities occur as early as possible during consortium conceptualization and development to minimize downstream complications. We also find that clear, reciprocal, and continuous communication between consortium stakeholders and the DCC is as important as maintaining a secure and centralized informatics ecosystem for pooling data. Finally, we discuss the importance of actively interrogating current approaches to data governance, particularly for research studies that straddle the research-clinical divide.

Keywords