Journal of Preventive Medicine and Public Health (Sep 2022)

The Korea Cohort Consortium: The Future of Pooling Cohort Studies

  • Sangjun Lee,
  • Kwang-Pil Ko,
  • Jung Eun Lee,
  • Inah Kim,
  • Sun Ha Jee,
  • Aesun Shin,
  • Sun-Seog Kweon,
  • Min-Ho Shin,
  • Sangmin Park,
  • Seungho Ryu,
  • Sun Young Yang,
  • Seung Ho Choi,
  • Jeongseon Kim,
  • Sang-Wook Yi,
  • Daehee Kang,
  • Keun-Young Yoo,
  • Sue K. Park

DOI
https://doi.org/10.3961/jpmph.22.299
Journal volume & issue
Vol. 55, no. 5
pp. 464 – 474

Abstract

Objectives: We introduced the cohort studies included in the Korea Cohort Consortium (KCC), focusing on large-scale cohort studies established in Korea with prolonged follow-up periods. We also provided projections of the follow-up period and estimates of the sample size needed for big-data analyses based on pooling established cohort studies, including population-based genomic studies.

Methods: We mainly focused on the characteristics of the individual cohort studies in the KCC. We developed “PROFAN”, a Shiny application for projecting the follow-up period needed to reach a given number of cases when pooling established cohort studies. As examples, we projected the follow-up periods for 5000 cases of gastric cancer, 2500 cases of prostate and breast cancer, and 500 cases of non-Hodgkin lymphoma. The sample sizes for sequencing-based analyses based on a 1:1 case-control study were also calculated.

Results: The KCC consisted of 8 individual cohort studies, of which 3 were community-based and 5 were health screening-based cohorts. The population-based cohort studies were mainly organized by Korean government agencies and research institutes. The projected follow-up period was at least 10 years to achieve 5000 cases based on a cohort of 0.5 million participants. The mean sample size required for sequencing analyses ranged from a minimum of 5917 to a maximum of 72 102.

Conclusions: We propose an approach to establish a large-scale consortium based on the standardization and harmonization of existing cohort studies to obtain adequate statistical power with a sufficient sample size to analyze high-risk groups or rare cancer subtypes.
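The two calculations named in the Methods (projecting follow-up to a target case count, and sizing a 1:1 case-control study) can be illustrated with a minimal sketch. This is not the PROFAN implementation, which the abstract does not describe. The function names `years_to_target` and `cases_needed` are hypothetical, the projection assumes a constant incidence rate with no loss to follow-up or competing risks, and the sample-size function uses the standard two-proportion formula without continuity correction rather than any sequencing-specific method. The incidence rate and exposure frequency in the usage lines are made-up illustrative inputs, not Korean estimates.

```python
import math
from statistics import NormalDist


def years_to_target(cohort_size, rate_per_100k, target_cases):
    """Years of follow-up until the expected case count reaches target_cases.

    Assumption: constant incidence rate, no attrition or competing risks,
    so expected cases after t years = N * (1 - exp(-rate * t)).
    """
    if not 0 < target_cases < cohort_size:
        raise ValueError("target_cases must be between 0 and cohort_size")
    if rate_per_100k <= 0:
        raise ValueError("rate_per_100k must be positive")
    rate = rate_per_100k / 100_000  # events per person-year
    return -math.log(1 - target_cases / cohort_size) / rate


def cases_needed(control_freq, odds_ratio, alpha=0.05, power=0.8):
    """Cases required (with an equal number of controls) in a 1:1 design.

    Standard two-proportion formula, two-sided test, no continuity
    correction. control_freq is the exposure frequency among controls.
    """
    if not 0 < control_freq < 1:
        raise ValueError("control_freq must be in (0, 1)")
    if odds_ratio <= 0 or odds_ratio == 1:
        raise ValueError("odds_ratio must be positive and different from 1")
    p0 = control_freq
    # Exposure frequency among cases implied by the odds ratio.
    p1 = odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: 0.5 million participants, 100 cases per
    # 100,000 person-years, target of 5000 cases.
    print(round(years_to_target(500_000, 100, 5_000), 2))
    # Hypothetical inputs: 5% exposure frequency in controls, odds ratio 1.5.
    print(cases_needed(0.05, 1.5))
```

Under these simplifying assumptions, the required follow-up shortens as the pooled cohort grows or the incidence rises, and the required number of cases rises sharply as the exposure becomes rarer or the odds ratio approaches 1. Both effects are consistent with the abstract's argument for pooling cohorts to study rare subtypes.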

Keywords