BMC Genomic Data (May 2023)
HostSeq: a Canadian whole genome sequencing and clinical data resource
- S Yoo,
- E Garg,
- LT Elliott,
- RJ Hung,
- AR Halevy,
- JD Brooks,
- SB Bull,
- F Gagnon,
- CMT Greenwood,
- JF Lawless,
- AD Paterson,
- L Sun,
- MH Zawati,
- J Lerner-Ellis,
- RJS Abraham,
- I Birol,
- G Bourque,
- J-M Garant,
- C Gosselin,
- J Li,
- J Whitney,
- B Thiruvahindrapuram,
- J-A Herbrick,
- M Lorenti,
- MS Reuter,
- OO Adeoye,
- S Liu,
- U Allen,
- FP Bernier,
- CM Biggs,
- AM Cheung,
- J Cowan,
- M Herridge,
- DM Maslove,
- BP Modi,
- V Mooser,
- SK Morris,
- M Ostrowski,
- RS Parekh,
- G Pfeffer,
- O Suchowersky,
- J Taher,
- J Upton,
- RL Warren,
- RSM Yeung,
- N Aziz,
- SE Turvey,
- BM Knoppers,
- M Lathrop,
- SJM Jones,
- SW Scherer,
- LJ Strug
Affiliations
- S Yoo
- The Hospital for Sick Children
- E Garg
- Simon Fraser University
- LT Elliott
- Simon Fraser University
- RJ Hung
- University of Toronto
- AR Halevy
- The Hospital for Sick Children
- JD Brooks
- University of Toronto
- SB Bull
- University of Toronto
- F Gagnon
- University of Toronto
- CMT Greenwood
- McGill University
- JF Lawless
- University of Waterloo
- AD Paterson
- The Hospital for Sick Children
- L Sun
- University of Toronto
- MH Zawati
- McGill University
- J Lerner-Ellis
- University of Toronto
- RJS Abraham
- Canada’s Michael Smith Genome Sciences Centre
- I Birol
- Canada’s Michael Smith Genome Sciences Centre
- G Bourque
- McGill University
- J-M Garant
- Canada’s Michael Smith Genome Sciences Centre
- C Gosselin
- Canada’s Michael Smith Genome Sciences Centre
- J Li
- Canada’s Michael Smith Genome Sciences Centre
- J Whitney
- The Hospital for Sick Children
- B Thiruvahindrapuram
- The Hospital for Sick Children
- J-A Herbrick
- The Hospital for Sick Children
- M Lorenti
- The Hospital for Sick Children
- MS Reuter
- The Hospital for Sick Children
- OO Adeoye
- The Hospital for Sick Children
- S Liu
- The Hospital for Sick Children
- U Allen
- The Hospital for Sick Children
- FP Bernier
- University of Calgary
- CM Biggs
- University of British Columbia
- AM Cheung
- University Health Network
- J Cowan
- University of Ottawa
- M Herridge
- University Health Network
- DM Maslove
- Queen’s University
- BP Modi
- BC Children’s Hospital
- V Mooser
- McGill University
- SK Morris
- The Hospital for Sick Children
- M Ostrowski
- University of Toronto
- RS Parekh
- The Hospital for Sick Children
- G Pfeffer
- University of Calgary
- O Suchowersky
- University of Alberta
- J Taher
- University of Toronto
- J Upton
- The Hospital for Sick Children
- RL Warren
- Canada’s Michael Smith Genome Sciences Centre
- RSM Yeung
- The Hospital for Sick Children
- N Aziz
- The Hospital for Sick Children
- SE Turvey
- University of British Columbia
- BM Knoppers
- McGill University
- M Lathrop
- McGill University
- SJM Jones
- Canada’s Michael Smith Genome Sciences Centre
- SW Scherer
- The Hospital for Sick Children
- LJ Strug
- The Hospital for Sick Children
- DOI
- https://doi.org/10.1186/s12863-023-01128-3
- Journal volume & issue
- Vol. 24, no. 1, pp. 1 – 12
Abstract
HostSeq was launched in April 2020 as a national initiative to integrate whole genome sequencing data from 10,000 Canadians infected with SARS-CoV-2 with clinical information related to their disease experience. The mandate of HostSeq is to support the Canadian and international research communities in their efforts to understand the risk factors for disease and associated health outcomes, and to support the development of interventions such as vaccines and therapeutics. HostSeq is a collaboration among 13 independent epidemiological studies of SARS-CoV-2 across five provinces in Canada. Aggregated data collected by HostSeq are made available to the public through two data portals: a phenotype portal showing summaries of major variables and their distributions, and a variant search portal enabling queries within a genomic region. Individual-level data are available to the global research community for health research through a Data Access Agreement and Data Access Compliance Office approval. Here we provide an overview of the collective project design along with summary-level information for HostSeq. We highlight several statistical considerations for researchers using the HostSeq platform regarding data aggregation, sampling mechanism, covariate adjustment, and X chromosome analysis. In addition to serving as a rich data source, the diversity of study designs, sample sizes, and research objectives among the participating studies provides unique opportunities for the research community.
Keywords