Bioinformatics. 2022 Nov 30;38(23):5279-5287.
doi: 10.1093/bioinformatics/btac646.

Leveraging a pharmacogenomics knowledgebase to formulate a drug response phenotype terminology for genomic medicine

Yiqing Zhao et al. Bioinformatics.

Abstract

Motivation: Despite increasing evidence for the utility of genomic medicine in clinical practice, systematically integrating genomic medicine information and knowledge into clinical systems with a high level of consistency, scalability and computability remains challenging. Doing so requires a comprehensive terminology for the relevant concepts and an associated knowledge model for representing the relationships among them. In this study, we leveraged PharmGKB, a comprehensive pharmacogenomics (PGx) knowledgebase, to formulate a terminology for drug response phenotypes that can represent relationships between genetic variants and treatments. We evaluated the coverage of the terminology through manual review of a randomly selected subset of 200 sentences extracted from genetic reports that contained concepts for 'Genes and Gene Products' and 'Treatments'.

Results: Our proposed drug response phenotype terminology covered 96% of the drug response phenotypes in genetic reports. Among 18 653 sentences that contained both 'Genes and Gene Products' and 'Treatments', 3011 sentences could be mapped to a drug response phenotype in our proposed terminology; the most frequently discussed drug response phenotypes were response (994), sensitivity (829) and survival (332). In addition, we re-analyzed the genetic report context using the proposed terminology and enriched our previously proposed PGx knowledge model to reveal relationships between genetic variants and treatments. In conclusion, we proposed a drug response phenotype terminology that enhances structured knowledge representation of genomic medicine.
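
The sentence-level mapping and frequency counts reported above can be illustrated with a minimal sketch. It is not the authors' pipeline: the study extracts UMLS-identifiable concepts, whereas this sketch substitutes plain whole-word keyword matching. The term list contains only the three phenotypes named in the abstract, and the sentences, gene names and drug names are invented placeholders.

```python
import re
from collections import Counter

# Assumption: a three-term stand-in for the terminology, using only the
# phenotypes named in the abstract. The full terminology is not reproduced.
PHENOTYPE_TERMS = ["response", "sensitivity", "survival"]

# Assumption: invented placeholder sentences, not taken from any genetic report.
SENTENCES = [
    "GENE_A variants may predict response to DRUG_X.",
    "GENE_B alterations are linked to reduced sensitivity to DRUG_Y.",
    "GENE_C loss was discussed with survival and response on DRUG_Z.",
    "GENE_D status was assessed before DRUG_W therapy.",
]


def map_phenotypes(sentence, terms=PHENOTYPE_TERMS):
    """Return the terms that occur as whole words in the sentence."""
    lowered = sentence.lower()
    return [t for t in terms if re.search(rf"\b{re.escape(t)}\b", lowered)]


def summarize(sentences, terms=PHENOTYPE_TERMS):
    """Count sentences with at least one match, and sentences per term."""
    counts = Counter()
    mapped = 0
    for sentence in sentences:
        hits = map_phenotypes(sentence, terms)
        if hits:
            mapped += 1
            counts.update(hits)
    return mapped, counts


mapped, counts = summarize(SENTENCES)
print(f"{mapped} of {len(SENTENCES)} sentences mapped")
for term, n in counts.most_common():
    print(term, n)
```

A sentence can match more than one term, so the per-term counts may sum to more than the number of mapped sentences.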

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Workflow to analyze semantic structures of variant annotation statements in PharmGKB. (1) PharmGKB source files, (2) example of a PharmGKB variant annotation statement, (3) structured annotation representations by existing terminologies and (4) formulation of the Drug Response Phenotype terminology
Fig. 2.
Workflow for semantic analysis of unstructured text of genetic reports. (a) XML parsing to extract unstructured text from specified sections of the report, (b) extraction of unstructured text from genetic reports, (c) extraction of UMLS-identifiable concepts and drug response phenotypes, (d) grouping of semantic annotations and co-occurrence analysis and (e) network visualization. See text for details
Fig. 3.
Distribution of the number of drug response phenotypes in genetic report sentences
Fig. 4.
(a1) Knowledge model subgraph (top 50 edges); key elements for a patient’s genetic profile are Disorders, Pathway, Treatments, and Genes and Gene Products. (a2) Knowledge model subgraph (top 50 edges); the thickness of the edges in the network represents the frequency of co-occurrence associations between two groups. The size of each node represents node degree, and the color gradient represents the distance (by edge weight) between each node and ‘Drug Phenotype’; darker color represents closer associations, as for Drug Response Phenotype and Procedures. (b1 and b2) Expanded subgraph for the term ‘PIK3CA’ showing the addition of the new node ‘Drug Response Phenotype’. (a1) and (b1) are from a previous publication (Zhao et al., 2020)
Fig. 5.
Dynamic knowledge model for genomic medicine (nodes highlighted in red are significant in the knowledge model)

References

    1. Altman R.B. (2007) PharmGKB: a logical home for knowledge relating genotype to drug response phenotype. Nat. Genet., 39, 426.
    2. Antoniou G., Van Harmelen F. (2004) Web ontology language: OWL. In: Staab S. and Studer R. (eds.) Handbook on Ontologies. Springer, pp. 67–92.
    3. Aronson A.R. (2006) MetaMap: Mapping Text to the UMLS Metathesaurus. NLM, NIH, DHHS, Bethesda, MD, pp. 1–26.
    4. Barbarino J.M. et al. (2018) PharmGKB: a worldwide resource for pharmacogenomic information. Wiley Interdiscip. Rev. Syst. Biol. Med., 10, e1417.
    5. Bodenreider O. (2004) The unified medical language system (UMLS): integrating biomedical terminology. Nucleic Acids Res., 32, D267–D270.
