Review
Chem Rev. 2018 Feb 28;118(4):1599-1663.
doi: 10.1021/acs.chemrev.7b00504. Epub 2018 Jan 11.

Using Genome Sequence to Enable the Design of Medicines and Chemical Probes

Alicia J Angelbello et al. Chem Rev. 2018.

Abstract

Rapid progress in genome sequencing technology has put us firmly into a postgenomic era. A key challenge in biomedical research is harnessing genome sequence to fulfill the promise of personalized medicine. This Review describes how genome sequencing has enabled the identification of disease-causing biomolecules and how these data have been converted into chemical probes of function, preclinical lead modalities, and ultimately U.S. Food and Drug Administration (FDA)-approved drugs. In particular, we focus on the use of oligonucleotide-based modalities to target disease-causing RNAs; small molecules that target DNA, RNA, or protein; the rational repurposing of known therapeutic modalities; and the advantages of pharmacogenetics. Lastly, we discuss the remaining challenges and opportunities in the direct utilization of genome sequence to enable design of medicines.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
Cost of sequencing has decreased dramatically in the past 15 years, and it now only costs ~$1,000 to sequence a genome. Data were obtained from the Genome Sequencing Program of the National Human Genome Research Institute (NHGRI).
Figure 2
Oligonucleotide modifications and delivery strategies. (A) Common oligonucleotide modifications. (B) Lipid nanoparticle delivery systems. (C) GalNAc conjugated oligonucleotides for targeted delivery to the liver.
Figure 3
Mechanisms of action of antisense oligonucleotides (ASOs). (A) ASOs can affect gene expression by recruitment of RNase H, resulting in cleavage and degradation of the RNA target. (B) ASOs with backbone or sugar modifications that prevent recruitment of RNase H can regulate expression by steric blocking of the ribosome and hence translational repression. (C) ASOs that target splice sites in pre-mRNAs can alter pre-mRNA alternative splicing.
Figure 4
ASO therapy for SMA. (A) In a healthy individual, the SMN1 gene produces a functional SMN1 protein. The SMN2 gene has a C to T mutation in exon 7, which results in exon 7 exclusion and a less stable SMN protein. (B) In SMA, mutations in the SMN1 gene result in loss of SMN protein leading to disease. Nusinersen targets a region of intron 7 in the SMN2 pre-mRNA to include exon 7 in the mature mRNA, resulting in production of a stable SMN protein.
Figure 5
Gene regulation by small noncoding RNAs. (A) The RNAi pathway. In RNAi, dsRNAs are cleaved in the cytoplasm by the Dicer-TRBP complex. The short fragments are then incorporated into the RISC complex, which cleaves complementary mRNAs. The RNAi pathway has been exploited therapeutically by introducing exogenous shRNAs, typically produced from a DNA vector. These shRNAs are processed to double-stranded RNAs by Dicer before incorporation into the RISC complex. Likewise, siRNAs (double-stranded RNAs) can be exogenously introduced and incorporated into the RISC complex without processing. (B) Endogenous miRNAs are processed by Drosha (nucleus) and Dicer (cytoplasm) to produce a double-stranded RNA, where one strand is loaded into RISC to induce target mRNA cleavage or translational repression.
Figure 6
Mechanism of action of Miravirsen. Miravirsen, an LNA-antagomiR, sequesters mature miR-122 in a highly stable heteroduplex and represses HCV viral RNA replication. In Miravirsen, the uppercase letters indicate LNA modifications and the lowercase letters indicate DNA nucleotides.
Figure 7
Flowchart of the SELEX process. The process begins with a random library of DNA or RNA sequences that is mixed with a ligand of interest. The bound sequences are separated from unbound sequences, eluted, and amplified to generate a new pool of sequences for another selection cycle.
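The selection cycle described in this caption can be sketched as a toy simulation. This is a minimal illustration only, not the procedure from the Review: the scoring function `toy_affinity`, the motif `"GGA"`, and all parameters (`keep_fraction`, `pool_size`, `rounds`) are hypothetical stand-ins for real binding and amplification, and mutation during amplification is ignored.

```python
import random

def toy_affinity(seq, motif="GGA"):
    """Hypothetical binding score: how many times the motif occurs in the sequence."""
    return seq.count(motif)

def selex_round(pool, affinity, keep_fraction, rng):
    """One cycle: keep the best binders, then amplify them back to the pool size."""
    ranked = sorted(pool, key=affinity, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    bound = ranked[:n_keep]                    # partition bound from unbound sequences
    return [rng.choice(bound) for _ in pool]   # elute and amplify (resample with replacement)

def run_selex(rounds=5, pool_size=500, length=20, keep_fraction=0.1, seed=0):
    """Start from a random RNA library and apply several selection cycles."""
    rng = random.Random(seed)
    pool = ["".join(rng.choice("ACGU") for _ in range(length)) for _ in range(pool_size)]
    initial = list(pool)
    for _ in range(rounds):
        pool = selex_round(pool, toy_affinity, keep_fraction, rng)
    return initial, pool

if __name__ == "__main__":
    initial, final = run_selex()
    mean = lambda seqs: sum(map(toy_affinity, seqs)) / len(seqs)
    print(f"mean score before: {mean(initial):.2f}, after: {mean(final):.2f}")
```

Because each cycle draws only from the retained sequences, the pool can only narrow toward the higher-scoring members of the starting library; a real experiment would also introduce diversity through amplification errors.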
Figure 8
Secondary structures of aptamers. (A) Pegaptanib, (B) REG1 (left) and REG1 with oligonucleotide antidote (right), (C) ARC1779, (D) NOX-A12, and (E) NOX-E36. iT denotes deoxythymidine.
Figure 9
Secondary structures of ribozymes. (A) The hammerhead ribozyme/substrate model and (B) the hairpin ribozyme/substrate model. Red arrows denote the cleavage site.
Figure 10
Design of polyamides to target DNA sequence. (A) Structures of the naturally occurring polyamides netropsin and distamycin. (B) Watson-Crick hydrogen-bonding patterns in the DNA minor groove. The black circles represent lone electron pairs, and circles containing an H represent the 2-amino group of guanine. R represents the sugar backbone of DNA. (C) Binding model between ImHpPyPy-γ-mHpPyPy-β-Dp and a 5′-TGTACA-3′/3′-TGTACA-5′ sequence. Hydrogen bonds are shown as dashed lines.
Figure 11
Chemical structures of the DNA minor groove binding pyrrolobenzodiazepines (PBDs). (A) Structures of PBDs. Anthramycin, tomaymycin, and sibiromycin were discovered in the 1960s and function as chemotherapeutics by forming covalent bonds with DNA (exocyclic amine of guanosine). (B) The PBD dimer SJG-136 conjugated to antibodies has shown promise in clinical trials.
Figure 12
Chemical structures of DNA minor groove binding diamidines. (A) The structures of pentamidine and stilbamidine, diamidines used clinically since the late 1930s and early 1940s to treat a wide variety of diseases. (B) Boykin and Wilson discovered that diamidines bind AT-rich regions in DNA. SAR and structural studies revealed important features of molecular recognition and allowed for the rational design of diamidines that selectively bind GC base pairs. DB75 has shown promise as an anti-trypanosomiasis agent. (C) Dimers, such as RT533 and DB2232, target mixed DNA sequences.
Figure 13
DNA G-quadruplex binding ligands. (A) Structure of DNA G-quadruplex. (B) Different DNA strand directions, antiparallel, mixed-type, and parallel, result in different G-quadruplex structures. (C) Structure of BSU1051 which targets human telomeric G-quadruplex. (D) Structure of TMPyP4 which targets the promoter region of c-Myc. (E) Structure of PDS which targets telomeric G-quadruplex and proto-oncogene tyrosine-protein kinase Src. (F) Structure of quarfloxin which targets the rDNA G-quadruplex and inhibits Pol I transcription.
Figure 14
RNA is a single-stranded biomolecule, the structure of which can be predicted accurately from sequence. RNA adopts conformations that include Watson-Crick base pairs, internal loops, bulges, multibranch loops, and hairpin loops.
Figure 15
Inforna facilitates the design of lead small molecules targeting a disease-causing RNA. Small molecules are identified by comparison of the secondary structural motifs in an RNA target to an annotated database of RNA motif-small molecule interactions.
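The comparison step in this caption amounts to a lookup of target motifs in an annotated table. The sketch below illustrates only that idea and is not Inforna's actual interface or data: the `Motif` type, the motif notation, the database entries, and the compound names are all hypothetical placeholders.

```python
from typing import Dict, List, NamedTuple

class Motif(NamedTuple):
    kind: str       # e.g. "internal_loop", "hairpin_loop", "bulge"
    sequence: str   # loop nucleotides in a made-up notation

# Hypothetical annotated database: motif -> small molecules recorded as binding it.
MOTIF_DB: Dict[Motif, List[str]] = {
    Motif("internal_loop", "1x1 G/G"): ["compound_A"],
    Motif("internal_loop", "1x1 U/U"): ["compound_B", "compound_C"],
    Motif("hairpin_loop", "GAAA"): ["compound_D"],
}

def find_leads(target_motifs: List[Motif], db: Dict[Motif, List[str]]) -> Dict[Motif, List[str]]:
    """Return the motifs of the target that have annotated binders in the database."""
    return {m: list(db[m]) for m in target_motifs if m in db}

if __name__ == "__main__":
    # Motifs that a structure prediction step might report for some RNA target.
    target = [Motif("internal_loop", "1x1 G/G"), Motif("bulge", "A")]
    for motif, compounds in find_leads(target, MOTIF_DB).items():
        print(motif.kind, motif.sequence, "->", ", ".join(compounds))
```

Exact-match lookup keeps the example short; a realistic search would also need to parse motifs out of a predicted secondary structure and rank hits by affinity and selectivity.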
Figure 16
Lead optimization of a lead compound that targets the Drosha site of miR-96 to afford Targaprimir-96, a dimeric small molecule that targets the Drosha site and an adjacent 1 × 1 GG internal loop.
Figure 17
Repeating RNA transcripts contribute to disease by multiple mechanisms. Repeating transcripts often fold into hairpins that display internal loops. The loops sequester RNA-binding proteins, causing disease via an RNA gain-of-function mechanism. Repeating transcripts can also undergo repeat-associated non-ATG (RAN) translation, generating toxic proteins.
Figure 18
Small molecules that bind to RNA repeats can be appended with azide and alkyne functional groups which, when bound to the RNA repeat, are in close enough proximity to react and form potent oligomers in disease-affected cells.
Figure 19
Chemical reactivity and binding isolated by pull-down (ChemReactBIP) is a method to confirm the reaction of “clickable” small molecules and to identify the RNA that served as a catalyst for the reaction. ChemReactBIP uses a small molecule appended with a biotin tag to terminate the click reaction and to allow purification of both reaction products and bound RNAs with streptavidin beads.
Figure 20
Methods to validate the cellular targets of small molecules. (A) Chemical cross-linking and isolation by pulldown (Chem-CLIP) involves a small molecule appended with a biotin tag and nucleic acid cross-linking agent. The cellular RNA targets are captured with biotin and analyzed using qRT-PCR. (B) Chem-CLIP-Map has been used to confirm the binding site of a small molecule within an RNA by digesting fragments of the RNA with antisense oligonucleotides and RNase H and analyzing the bound fragments using qRT-PCR. (C) A cleavage approach can also be used for target validation, where a small molecule is attached to bleomycin A5, a natural product that can cleave nucleic acids. RNA targets are analyzed by target depletion by qRT-PCR.
Figure 21
Philadelphia chromosome and Imatinib. (A) The Philadelphia chromosome results from a translocation between chromosome 9 and chromosome 22, resulting in a BCR-ABL fusion gene. (B) BCR-ABL encodes a hyperactive tyrosine kinase, causing cells to divide uncontrollably in CML. Imatinib binds to the ATP-binding site and inhibits cell signaling and the progression of CML. (C) Chemical structure of Imatinib. (D) Hydrogen-bonding interactions between c-ABL and Imatinib.
Figure 22
Cystic fibrosis and therapeutics. (A) Structure of the CFTR membrane. (B) Ivacaftor acts on a channel gate of defective CFTR. (C) Chemical structures of Ivacaftor and Lumacaftor.

