Database (Oxford). 2019 Jan 1:2019:baz013.
doi: 10.1093/database/baz013.

A scalable, aggregated genotypic-phenotypic database for human disease variation

Ryan Barrett et al. Database (Oxford).

Abstract

Next-generation sequencing multi-gene panels have greatly improved the diagnostic yield and cost-effectiveness of genetic testing and are rapidly being integrated into the clinic for hereditary cancer risk. With this technology comes a dramatic increase in the volume, type and complexity of data. These invaluable data, though, are too often buried or inaccessible to researchers, especially those without strong analytical or programming skills. To effectively share comprehensive, integrated genotypic-phenotypic data, we built Color Data, a publicly available, cloud-based database that supports broad access and data literacy. The database comprises data from 50 000 individuals who were sequenced for 30 genes associated with hereditary cancer risk. It provides information on allele frequency and variant classification, as well as associated phenotypic information such as demographics and personal and family history. Our user-friendly interface allows researchers to execute their own filtered queries, and the query results can be shared and/or downloaded. The rapid and broad dissemination of these research results will help increase the value of, and reduce the waste in, scientific resources and data. Furthermore, the database can scale quickly and support the integration of additional genes and human hereditary conditions. We hope that this database will help researchers and scientists explore genotype-phenotype correlations in hereditary cancer, identify novel variants for functional analysis and enable data-driven drug discovery and development.

Figures

Figure 1
High-level workflow of the database. The workflow is divided into four subprocesses (‘Data Collection’, ‘Bioinformatics’, ‘Architecture’ and ‘User’), each shown as a rounded rectangle of a different color.
Figure 2
Screenshots of query results for the frequency of pathogenic variants and the age of cancer onset in women with breast cancer. (A, B) Filter by ‘Gender: F’ and ‘Cancer history: Breast’. (C, D) Filter by ‘Classification: Pathogenic or Likely Pathogenic’. (E) Filter by ‘Gene: BRCA1 or BRCA2’. (F) Remove ‘Gene: BRCA1 or BRCA2’ and filter by ‘Gene: PALB2’. Query URL: https://data.color.com/v1/#gender=F&cancer_history=Breast
Figure 3
Screenshots of query results for the Ashkenazi Jewish BRCA founder alleles. (A–E) Filter by ‘Variant: c.68_69delAG, c.5266dupC, or c.5946delT’. The Ashkenazi Jewish BRCA founder alleles are BRCA1 c.68_69delAG, BRCA1 c.5266dupC and BRCA2 c.5946delT. Query URL: https://data.color.com/v1/#variant=c.68_69delAG&variant=c.5266dupC&variant=c.5946delT
Figure 4
Screenshots of query results for the personal and family history of cancer in individuals with Lynch syndrome. (A, B) Filter by ‘Classification: Pathogenic or Likely Pathogenic’ and ‘Gene: MLH1, MSH2, PMS2, MSH6, or EPCAM’. (C) Remove ‘Gene: MSH2, PMS2, MSH6, or EPCAM’. (D) Remove ‘Gene: MLH1’ and filter by ‘Gene: PMS2’. (E) Filter by ‘Gene: MLH1, MSH2, PMS2, MSH6, or EPCAM’. Query URL: https://data.color.com/v1/#classification=Likely%20Pathogenic&classification=Pathogenic&gene=MSH6&gene=MLH1&gene=MSH2&gene=PMS2&gene=EPCAM
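
The query URLs in the figure captions appear to encode each filter as a `field=value` pair in the URL fragment, with a field repeated once per selected value. The following is a minimal Python sketch of that pattern, inferred only from the three URLs above. The helper names `build_query_url` and `parse_query_url` are hypothetical, the field names are limited to those visible in the captions, and nothing here is a documented Color Data API or a guarantee that the site still accepts these URLs.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit

# Base address copied from the figure captions; its current availability is not verified.
BASE_URL = "https://data.color.com/v1/"


def build_query_url(filters):
    """Build a shareable query URL from (field, value) pairs.

    A field may be repeated to select several values, as in the captions
    (e.g. two 'classification' pairs). Spaces are percent-encoded as %20.
    """
    fragment = urlencode(filters, quote_via=quote)
    return f"{BASE_URL}#{fragment}"


def parse_query_url(url):
    """Recover the filters from a query URL as {field: [values, ...]}."""
    grouped = {}
    for field, value in parse_qsl(urlsplit(url).fragment):
        grouped.setdefault(field, []).append(value)
    return grouped


if __name__ == "__main__":
    # Filters from the Lynch syndrome figure caption, in the order they appear in its URL.
    lynch_filters = [
        ("classification", "Likely Pathogenic"),
        ("classification", "Pathogenic"),
        ("gene", "MSH6"),
        ("gene", "MLH1"),
        ("gene", "MSH2"),
        ("gene", "PMS2"),
        ("gene", "EPCAM"),
    ]
    url = build_query_url(lynch_filters)
    print(url)
    print(parse_query_url(url))
```

Judging by the caption wording (‘Gene: BRCA1 or BRCA2’), repeated values of one field seem to be combined with OR and different fields with AND. That reading is an assumption drawn from the captions, not something the URLs themselves confirm.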