BMC Genom Data. 2023 May 2;24(1):26.
doi: 10.1186/s12863-023-01128-3.

HostSeq: a Canadian whole genome sequencing and clinical data resource

S Yoo et al. BMC Genom Data. 2023.

Abstract

HostSeq was launched in April 2020 as a national initiative to integrate whole genome sequencing data from 10,000 Canadians infected with SARS-CoV-2 with clinical information related to their disease experience. The mandate of HostSeq is to support the Canadian and international research communities in their efforts to understand the risk factors for disease and associated health outcomes and to support the development of interventions such as vaccines and therapeutics. HostSeq is a collaboration among 13 independent epidemiological studies of SARS-CoV-2 across five provinces in Canada. Aggregated data collected by HostSeq are made available to the public through two data portals: a phenotype portal showing summaries of major variables and their distributions, and a variant search portal enabling queries in a genomic region. Individual-level data are available to the global research community for health research through a Data Access Agreement and Data Access Compliance Office approval. Here we provide an overview of the collective project design along with summary-level information for HostSeq. We highlight several statistical considerations for researchers using the HostSeq platform regarding data aggregation, sampling mechanism, covariate adjustment, and X chromosome analysis. HostSeq serves as a rich data source, and the diversity of study designs, sample sizes, and research objectives among the participating studies provides unique opportunities for the research community.

Keywords: COVID-19; Clinical databank; Host genetics; SARS-CoV-2; Whole genome sequencing.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Sample and data flow in HostSeq (Aspects of graphics acquired from Wikimedia Commons)
Fig. 2
PCA projection of HostSeq genomes against reference superpopulations. HostSeq genomes were merged with the 1000 Genomes reference set. The first two principal components of this merged data are shown here with HostSeq genomes in black and 1000 Genomes samples colored by their superpopulation: AFR = African, AMR = Admixed American, EAS = East Asian, SAS = South Asian, EUR = European
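
The merge-then-project idea in the Fig. 2 caption can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the genotype values are invented toy data (alternate-allele counts of 0, 1, or 2), the group labels "A" and "B" stand in for reference superpopulations, and a pure-Python power iteration is swapped in for a proper PCA routine. It is not the pipeline used by the authors.

```python
import math
import random


def center_columns(matrix):
    """Subtract each variant's mean so components reflect variation between samples."""
    n, m = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in matrix]


def project(matrix, axis):
    """Score of every sample (row) along one axis."""
    return [sum(x * w for x, w in zip(row, axis)) for row in matrix]


def top_axis(matrix, iterations=200, seed=0):
    """Leading principal axis via power iteration on X^T X."""
    rng = random.Random(seed)
    m = len(matrix[0])
    v = [rng.gauss(0, 1) for _ in range(m)]
    for _ in range(iterations):
        scores = project(matrix, v)  # X v
        v = [sum(row[j] * s for row, s in zip(matrix, scores)) for j in range(m)]  # X^T (X v)
        norm = math.sqrt(sum(w * w for w in v))
        if norm == 0:  # no variation left in the data
            break
        v = [w / norm for w in v]
    return v


def deflate(matrix, axis):
    """Remove the part of every sample that lies along an already-found axis."""
    scores = project(matrix, axis)
    return [[x - s * w for x, w in zip(row, axis)] for row, s in zip(matrix, scores)]


def first_two_pcs(genotypes):
    """Return (PC1, PC2) scores for every sample in the merged matrix."""
    centered = center_columns(genotypes)
    axis1 = top_axis(centered)
    axis2 = top_axis(deflate(centered, axis1), seed=1)
    return list(zip(project(centered, axis1), project(centered, axis2)))


# Hypothetical toy data: rows are samples, columns are variants.
reference = {
    "A": [[0, 0, 1, 2, 2, 1], [0, 1, 0, 2, 2, 2]],
    "B": [[2, 2, 1, 0, 0, 1], [2, 1, 2, 0, 0, 0]],
}
study = [[0, 0, 0, 2, 1, 2], [2, 2, 2, 0, 1, 0]]

# Merge reference and study samples first, then compute components on the merged set.
merged = reference["A"] + reference["B"] + study
pcs = first_two_pcs(merged)
```

In this toy set the first component separates the two reference groups, and each study sample falls on the side of the group it resembles. Plotting the `pcs` pairs, with reference samples colored by group and study samples in black, would give a miniature version of the figure's layout.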
