Comparative Study
Genet Med. 2019 Feb;21(2):361-372.
doi: 10.1038/s41436-018-0054-0. Epub 2018 Jun 6.

Stargazer: a software tool for calling star alleles from next-generation sequencing data using CYP2D6 as a model

Seung-Been Lee et al. Genet Med. 2019 Feb.

Abstract

Purpose: Genotyping CYP2D6 is important for precision drug therapy because the enzyme it encodes metabolizes approximately 25% of drugs, and its activity varies considerably among individuals. Genotype analysis of CYP2D6 is challenging because the gene is highly polymorphic. Over 100 haplotypes (star alleles) have been defined for CYP2D6, some involving a gene conversion with its nearby nonfunctional but highly homologous paralog CYP2D7. We present Stargazer, a new bioinformatics tool that uses next-generation sequencing (NGS) data to call star alleles for CYP2D6 (https://stargazer.gs.washington.edu/stargazerweb/). Stargazer is currently being extended to other pharmacogenes.

Methods: Stargazer identifies star alleles from NGS data by detecting single nucleotide variants, insertion-deletion variants, and structural variants. Stargazer detects structural variation, including gene deletions, duplications, and conversions, by calculating paralog-specific copy numbers from read depths.
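As an illustration of the read-depth idea described in the Methods, the following is a minimal sketch rather than Stargazer's actual implementation. It assumes that per-position depth in the target gene can be scaled by the mean depth of a control region known to be present in two copies. The function name `copy_number_from_depth`, the `control_copies` parameter, and the toy depth values are hypothetical, and a real pipeline would likely also correct for position-specific capture or mapping bias, which is omitted here.

```python
from statistics import mean


def copy_number_from_depth(target_depth, control_depth, control_copies=2):
    """Estimate per-position copy number for one sample.

    target_depth:   read depths across the target gene (hypothetical input)
    control_depth:  read depths across a control region assumed to be
                    present in `control_copies` copies
    """
    control_mean = mean(control_depth)
    if control_mean <= 0:
        raise ValueError("control region has no coverage")
    # Scale each target depth so that the control mean maps to control_copies.
    return [control_copies * depth / control_mean for depth in target_depth]


# Toy data: the second half of the target has roughly half the depth,
# as would be expected if one copy of that segment were missing.
toy_target = [30, 31, 29, 15, 16, 14]
toy_control = [30, 30, 30, 30]
toy_copy_number = copy_number_from_depth(toy_target, toy_control)
```

In this toy case the first three positions come out near two copies and the last three near one copy, the kind of step that a downstream segmentation method would then summarize.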

Results: We applied Stargazer to the NGS data of 32 ethnically diverse HapMap trios that were genotyped by TaqMan assays, long-range polymerase chain reaction, quantitative multiplex polymerase chain reaction, high-resolution melting analysis, and/or Sanger sequencing. CYP2D6 genotyping by Stargazer was 99.0% concordant with the data obtained by these methods, and showed that 28.1% of the samples had structural variation including CYP2D6/CYP2D7 hybrids.

Conclusion: Accurate genotyping of pharmacogenes with NGS and subsequent allele calling with Stargazer will aid the implementation of precision drug therapy.

Keywords: CYP2D6 genotyping; next-generation sequencing; pharmacogenomics; star alleles; structural variation.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
A schematic diagram of the Stargazer CYP2D6 genotyping pipeline. Stargazer takes as input a VCF file, a target GDF file, and a control GDF file. Stargazer uses the variant information from the VCF file to call star alleles based on SNVs/indels. Using the target and control GDF files, Stargazer converts read depth to copy number for detection of structural variation. The output data of Stargazer include each sample’s CYP2D6 diplotype and plots to visually inspect copy number for CYP2D6 and CYP2D7. Based on called CYP2D6 diplotypes, the program outputs predicted phenotypes as well. Several external software tools, shown in red, are used within and outside of Stargazer.
Figure 2
Examples of structural variation detected by Stargazer in HapMap trios. Grey dots are copy number calculated from read depth. Dots colored purple and orange are the mean copy number for CYP2D6 and CYP2D7, respectively, determined by the changepoint algorithm. Each panel contains scaled CYP2D6 and CYP2D7 gene models, in which the exons and introns are depicted with boxes and lines, respectively. All panels were generated from PGRNseq v2.0 data. (A) shows European sample NA12805 that has a CYP2D6*2/*4 diplotype without structural variation; this sample was included for comparison. (B) shows a gene deletion in Yoruban sample NA18508 with a CYP2D6*2/*5 diplotype. (C) shows a gene duplication in Mexican American sample NA19685 with a CYP2D6*1/*2x2 diplotype. (D) shows a complex structural variation involving a gene duplication and a gene conversion in Peruvian sample HG01979 genotyped as CYP2D6*2/*68+*4. (E) shows a complex structural variation involving multiple gene duplications and gene conversions in Han Chinese sample HG00465 genotyped as CYP2D6*36+*10/*36+*10. (F) shows a complex structural variation involving a gene conversion in Mexican American sample NA19790 genotyped as CYP2D6*1/*78+*2.
Figure 3
Segregation of complex structural variations detected by Stargazer in two HapMap trios. For each trio, data from the father is shown in the top panel, data from the mother is shown in the middle panel, and data from the child is shown in the bottom panel. Grey dots are copy number calculated from read depth. Dots colored purple and orange are the mean copy number for CYP2D6 and CYP2D7, respectively, determined by the changepoint algorithm. Each panel contains scaled CYP2D6 and CYP2D7 gene models, in which the exons and introns are depicted with boxes and lines, respectively. All panels were generated from PGRNseq v2.0 data. (A) shows segregation of CYP2D6*78+*2 in the Mexican American M037 family. (B) shows segregation of CYP2D6*68+*4 in the European 1463 family.
Figure 4
Comparison of custom capture and whole genome sequencing. Grey dots are copy number calculated from read depth. Dots colored purple and orange are the mean copy number for CYP2D6 and CYP2D7, respectively, determined by the changepoint algorithm. Each panel contains scaled CYP2D6 and CYP2D7 gene models, in which the exons and introns are depicted with boxes and lines, respectively. Two subjects, NA19238 and NA12878, were sequenced with (A) PGRNseq v1.1 at ~400X coverage with 100 bp paired-end reads, (B) PGRNseq v2.0 at ~160X coverage with 100 bp paired-end reads, and (C) whole genome sequencing at ~30X coverage with 150 bp paired-end reads. In all three cases, Stargazer called the correct diplotypes, CYP2D6*1/*17 for NA19238 and *3/*68+*4 for NA12878.
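The figure captions refer to mean copy numbers determined by a changepoint algorithm, but do not say which one. To make the idea concrete, here is a minimal sketch that finds a single changepoint by exhaustive least-squares search. This is a stand-in technique chosen for illustration and should not be read as the method used by Stargazer; the function name `best_single_changepoint` and the toy profile are hypothetical.

```python
def best_single_changepoint(values):
    """Find the split that best divides `values` into two flat segments.

    Returns (index, left_mean, right_mean), where the segments are
    values[:index] and values[index:], chosen to minimize the total
    squared deviation from each segment's mean.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to place a changepoint")
    best = None
    for k in range(1, n):
        left, right = values[:k], values[k:]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((v - left_mean) ** 2 for v in left)
        sse += sum((v - right_mean) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, k, left_mean, right_mean)
    return best[1], best[2], best[3]


# Toy per-position copy number profile with a drop from ~2 to ~1 copies.
toy_profile = [2.0, 2.1, 1.9, 2.0, 1.0, 1.1, 0.9, 1.0]
split_index, left_cn, right_cn = best_single_changepoint(toy_profile)
```

On the toy profile the split falls between the fourth and fifth points, with segment means close to two and one copies. Complex events such as those in Figure 2 (D) to (F) would need a method that allows multiple changepoints, which this sketch does not attempt.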
