Front Genet. 2020 Sep 24:11:518644.
doi: 10.3389/fgene.2020.518644. eCollection 2020.

GenomeChronicler: The Personal Genome Project UK Genomic Report Generator Pipeline

José Afonso Guerra-Assunção et al. Front Genet.

Abstract

In recent years, there has been a significant increase in whole genome sequencing data of individual genomes produced by research projects as well as direct-to-consumer service providers. While many of these sources provide their users with an interpretation of the data, there is a lack of free, open tools for generating reports that explore the data in an easy-to-understand manner. GenomeChronicler was developed as part of the Personal Genome Project UK (PGP-UK) to address this need. PGP-UK provides genomic, transcriptomic, epigenomic and self-reported phenotypic data under an open-access model with full ethical approval. As a result, the reports generated by GenomeChronicler are intended for research purposes only and include information relating to potentially beneficial and potentially harmful variants, but without clinical curation. GenomeChronicler can be used with data from whole genome or whole exome sequencing, producing a genome report containing information on variant statistics, ancestry and known associated phenotypic traits. Example reports are available from the PGP-UK data page (personalgenomes.org.uk/data). The objective of this method is to leverage existing resources to find known phenotypes associated with the genotypes detected in each sample. The trait data provided are based primarily on information available in SNPedia, and are supplemented with data collated from ClinVar, GETevidence, and gnomAD to provide additional details on potential health implications, the presence of each genotype in other PGP participants, and the population frequency of each genotype. The analysis can be run in a self-contained environment without requiring internet access, making it a good choice for cases where privacy is essential or desired: any third-party project can embed GenomeChronicler within its off-line safe-haven environment. GenomeChronicler can be run for one sample at a time, or in parallel using the Nextflow workflow manager.
The source code is available from GitHub (https://github.com/PGP-UK/GenomeChronicler). Container recipes are available for Docker and Singularity, and a pre-built container is available from SingularityHub (https://singularity-hub.org/collections/3664), enabling easy deployment in a variety of settings. Users without access to the computational resources needed to run GenomeChronicler can access the software from the Lifebit CloudOS platform (https://lifebit.ai/cloudos), enabling the production of reports and variant calls from raw sequencing data in a scalable fashion.
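
The core idea described above, matching the genotypes detected in a sample against a locally stored table of known genotype-phenotype associations, can be illustrated with a minimal Python sketch. This is not GenomeChronicler's actual code: the `TraitAnnotation` record, the function names, the rsIDs and the trait descriptions are all hypothetical placeholders, and the sketch assumes that genotypes are reported on the same strand as the annotation table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraitAnnotation:
    """One hypothetical row of a local, offline genotype-phenotype table."""
    rsid: str
    genotype: str  # unordered allele pair, e.g. "AG"
    summary: str
    effect: str    # "beneficial", "harmful" or "neutral"


def normalise(genotype: str) -> str:
    """Sort alleles so that "GA" and "AG" compare as equal."""
    return "".join(sorted(genotype.upper()))


def build_index(annotations):
    """Index annotations by (rsID, normalised genotype) for fast lookup."""
    return {(a.rsid, normalise(a.genotype)): a for a in annotations}


def annotate(sample_genotypes, index):
    """Return the annotations whose rsID and genotype both match the sample."""
    hits = []
    for rsid, genotype in sample_genotypes.items():
        hit = index.get((rsid, normalise(genotype)))
        if hit is not None:
            hits.append(hit)
    return hits


# Made-up placeholder data: these rsIDs and traits are NOT real associations.
LOCAL_TRAITS = [
    TraitAnnotation("rs0000001", "AG", "placeholder trait A", "beneficial"),
    TraitAnnotation("rs0000002", "TT", "placeholder trait B", "harmful"),
]
SAMPLE = {"rs0000001": "GA", "rs0000002": "CT", "rs0000003": "CC"}

index = build_index(LOCAL_TRAITS)
hits = annotate(SAMPLE, index)
for hit in hits:
    print(f"{hit.rsid} {hit.genotype}: {hit.summary} ({hit.effect})")
```

Because the lookup table is held locally, a sketch of this kind needs no network access, which mirrors the off-line use case described in the abstract. A real pipeline would additionally have to handle strand orientation, multi-allelic sites and missing calls.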

Keywords: PGP-UK; cloud computing; genomic report; open consent; open source; participant engagement; personal genomics.

Figures

FIGURE 1
Flow diagram of the GenomeChronicler processing pipeline, illustrating the multiple entry points for the pipeline, the resources integrated by default and the generated outcomes. Any entry point of the pipeline can be run locally on a single machine, as a Nextflow workflow or in the Cloud. All source code and integrations are freely available in their respective GitHub repositories. The stand-alone GenomeChronicler is available at https://github.com/PGP-UK/GenomeChronicler, the integration of GenomeChronicler with Nextflow is available at https://github.com/PGP-UK/GenomeChronicler-nf and the combined GenomeChronicler with Sarek variant calling is available at https://github.com/PGP-UK/GenomeChronicler-Sarek-nf. The recipe files for the Docker and Singularity containers are available within the respective GitHub repositories. The resource logos are reproduced from the respective resource websites and remain copyright of their original owners.
FIGURE 2
Example ancestry PCA plot containing the current reference data from the 1000 Genomes Project used by GenomeChronicler, with shaded areas broadly illustrating the origin of the populations represented.
