Cancer Genomics Proteomics. 2024 May-Jun;21(3):238-251.
doi: 10.21873/cgp.20443.

Genomic Frequencies of Dynamic DNA Sequences and Mammalian Lifespan

Marianna Martella et al. Cancer Genomics Proteomics. 2024 May-Jun.

Abstract

Background/aim: Dynamic DNA sequences (i.e. sequences capable of forming hairpins, G-quadruplexes, i-motifs, and triple helices) can cause replication stress and associated mutations. One example of such a sequence occurs in the RACK7 gene in human DNA. Since this sequence forms i-motif structures at neutral pH that cause replication stress and result in spontaneous deletions in prostate cancer cells, our initial aim was to determine its potential utility as a biomarker of prostate cancer.

Materials and methods: We cloned and sequenced the region in RACK7 where i-motif deletions often occur, using DNA obtained from eight individuals. Expressed prostatic secretions were obtained from three individuals with a positive biopsy for prostate cancer and two individuals with a negative biopsy for prostate cancer. Peripheral blood specimens were obtained from two healthy control bone marrow donors, and a marrow specimen was obtained from a third healthy marrow donor. Follow-up computer searches of the genomes of 74 mammalian species available at the NCBI ftp site were subsequently performed, using a program written in Mathematica, for the frequencies of six dynamic sequences known to produce mutations or replication stress.

Results: Deletions were found in RACK7 in specimens from both older normal adults and older patients with cancer, but not in the youngest normal adult. The deletions appeared to show a weak trend toward increasing frequency with patient age. This suggested that endogenous mutations associated with dynamic sequences might accumulate during aging and might serve as biomarkers of biological age rather than direct biomarkers of cancer. To test that hypothesis, we asked whether the genomic frequencies of several dynamic sequences known to produce replication stress or mutations in human DNA were inversely correlated with maximum lifespan in mammals.

Conclusion: Our results confirm this correlation for six dynamic sequences in 74 mammalian genomes studied, thereby suggesting that spontaneously induced replication stress and mutations linked to dynamic sequence frequency may limit lifespan by limiting genome stability.
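The abstract does not state which correlation statistic was used. As a minimal sketch only, the following Python code shows one common way to test for an inverse relationship between motif frequency and maximum lifespan, using a Spearman rank correlation. The choice of statistic, the function names, and the example values are all assumptions for illustration; the numbers are invented and are not data from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values, NOT from the paper: maximum lifespan in years and
# motif frequency (count per sequenced base) for five imaginary species.
example_lifespans = [4, 12, 30, 65, 122]
example_frequencies = [9e-6, 7e-6, 5e-6, 2e-6, 1e-6]

rho = spearman(example_lifespans, example_frequencies)
print(f"Spearman rho = {rho:.3f}")
```

A negative coefficient indicates that species with higher motif frequencies tend to have shorter maximum lifespans. A rank-based statistic is used here because it does not assume a linear relationship; a real analysis would also need a significance test and, as the figure captions indicate, some control for phylogenetic relatedness.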

Keywords: G-Quadruplex; Triplex; cruciform; endogenous mutagenesis; i-motif; lifespan.

Conflict of interest statement

The Authors declare that they have no conflicts of interest.

Figures

Figure 1. Structures formed by dynamic sequence motifs. Triplex: This structure forms in duplex DNA. It is generally formed in AT-rich regions and requires bases to pair in triplets (e.g., T:A:T). C:G:C+ triplets can also occur in a triplex termed H-DNA; however, this generally requires the low pH needed to protonate N3 of cytosine. The fourth strand is not base paired. i-Motif: This is a four-stranded structure that can form on the C-rich strand in regions of GC skew. It contains C:C+ base pairs in parallel strands of DNA. In this structure, C:C+ base pairs are intercalated so as to link sets of two strands of parallel DNA in a four-stranded structure. The term i-motif, or intercalated motif, describes its unique base-pairing property. Like H-DNA, these structures generally form at low pH; however, a significant number of sequences have recently been identified that form at neutral pH. G-quadruplex: This is a four-stranded structure that can form on the G-rich strand in regions of GC skew. In this structure, four G residues are Hoogsteen base-paired in a planar structure. Stacking interactions supplemented by ions chelated between base planes favor the formation of these structures, and they are typically highly stable under physiological conditions. Hairpin: Hairpin or foldback structures can occur on either strand in self-complementary regions in duplex DNA. They form duplexes that contain Watson-Crick base pairs. This figure was modified from reference (42).
Figure 2. Representative Mathematica program code. In this example, the genome of the Asian Elephant was loaded as upper-case letters into Mathematica and searched using StringCount for each of the six step repeating elements studied and for the number of instances of uncalled bases “N” in the genome. StringLength was used to determine the number of base-pairs in the genome.
Figure 3. RACK7 sequences obtained from patients without a diagnosis of cancer. (A) DNA from the bone marrow of a healthy 22-year-old bone marrow donor. (B) DNA from peripheral blood B-cells from a healthy 41-year-old bone marrow donor. (C) DNA from peripheral blood B-cells from a healthy 60-year-old bone marrow donor. (D) DNA from a 69-year-old prostate patient with a negative biopsy for prostate cancer with high-grade prostatic intraepithelial neoplasia (HGPIN). (E) DNA from a 74-year-old prostate patient with a negative biopsy for prostate cancer with proliferative inflammatory atrophy (PIA). Sequences from the GC-skew region of each cloned representative are shown for clarity. Regions flanking the region of GC-skew match the reference sequence (*) given in GRCh38.p12. However, in the region of the concatemer [(CCTG)8-CC-(TCCC)9-(TTCC)9], none of the sequences from these patients matches the reference sequence (*) for RACK7 from human chromosome 20, GRCh38.p12 primary assembly.
Figure 4. RACK7 sequences obtained from cancer patients. (A) DNA from a 53-year-old prostate cancer patient with a Gleason Score 7 prostate cancer. (B) DNA from a 68-year-old prostate patient with a Gleason Score 8 prostate cancer. (C) DNA from a 68-year-old prostate patient with a Gleason Score 8 prostate cancer. Sequences from the GC-skew region of each cloned representative are shown for clarity. Regions flanking the region of GC-skew match the reference sequence given in GRCh38.p12. However, none of the sequences from these patients matches the reference sequence for RACK7 from human chromosome 20, GRCh38.p12 primary assembly, in the region of the concatemer [(CCTG)n-CC-(TCCC)n-(TTCC)n]. Although deletions characterize the vast majority of mutations at RACK7, the patient in panel C had a short insertion in the region as well.
Figure 5. Length-dependent deletions at (TCCC)n in cancer patient DNA. DNA from a 68-year-old prostate patient with a Gleason Score 8 prostate cancer shows deletions at (TCCC)n elements where n≥7. The BCR region of chromosome 22 (n=7) and the PLA2G4C gene from chromosome 19 (n=15), as well as the RACK7 gene of chromosome 20 (n=9), all show deletions in this patient DNA. Note that the (TCCC)5 region of RACK7 does not show deletions.
Figure 6. Frequencies of (TCCC)6 and (TGG)6 elements in mammalian genomes. The total count of (TCCC)6 elements and the sequenced genome size were determined using a program written in Mathematica. (A) Frequencies determined as the ratio of the total (TCCC)6 count to the sequenced genome size are plotted vs. the Maximum lifespan of each of the 74 mammalian genomes in the study. Many of the mammals in the full data set were from the same phylogenetic family. (B) This graph confines the (TCCC)6 data to a single member of each of 46 mammalian families. (C) Frequencies determined as the ratio of the total (TGG)6 count to the sequenced genome size are plotted vs. the Maximum lifespan of each mammal in the study. (D) This graph confines the (TGG)6 data to a single member of each of 46 mammalian families. Common names for several of the mammals studied are given at their respective lifespans above panels A and B.
Figure 7. Frequencies of (GT)10 and (CG)10 elements in mammalian genomes. The total count of (GT)10 and (CG)10 elements and the sequenced genome size were determined using a program written in Mathematica. (A) Frequencies determined as the ratio of the total (GT)10 count to the sequenced genome size are plotted vs. the Maximum lifespan of each mammal in the study. Many of the mammals in the full data set were from the same phylogenetic family. (B) The graph confines the (GT)10 data to a single member of each of 46 mammalian families. (C) Frequencies determined as the ratio of the total (CG)10 count to the sequenced genome size are plotted vs. the Maximum lifespan of each mammal in the study. (D) This graph confines the (CG)10 data to a single member of each of 46 mammalian families.
Figure 8. Frequencies of (GA)10 and (GGA)6 elements in mammalian genomes. The total count of (GA)10 and (GGA)6 elements and the sequenced genome size were determined using a program written in Mathematica. (A) Frequencies determined as the ratio of the total (GA)10 count to the sequenced genome size are plotted vs. the Maximum lifespan of each mammal in the study. Many of the mammals in the full data set were from the same phylogenetic family. (B) This graph confines the (GA)10 data to a single member of each of 46 mammalian families. (C) Frequencies determined as the ratio of the total (GGA)6 count to the sequenced genome size are plotted vs. the Maximum lifespan of each mammal in the study. (D) The graph confines the (GGA)6 data to a single member of each of 46 mammalian families.
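The counting procedure described in the captions for Figure 2 and Figures 6 to 8 can be sketched as follows. This is a rough Python analog, not the authors' Mathematica program: `str.count` stands in for StringCount and `len` for StringLength (both count non-overlapping matches by default). The function name `motif_frequencies`, the dictionary layout, and the toy sequence are hypothetical. The sketch also assumes that "sequenced genome size" means total length minus uncalled "N" bases, which the captions do not state explicitly, and it searches only the strand given, not the reverse complement.

```python
# The six repeat elements named in the figure captions, expanded to strings.
MOTIFS = {
    "(TCCC)6": "TCCC" * 6,
    "(TGG)6": "TGG" * 6,
    "(GT)10": "GT" * 10,
    "(CG)10": "CG" * 10,
    "(GA)10": "GA" * 10,
    "(GGA)6": "GGA" * 6,
}


def motif_frequencies(genome: str) -> dict:
    """Count each motif and report its frequency per sequenced base."""
    genome = genome.upper()          # captions describe upper-case input
    total_length = len(genome)       # analog of StringLength
    uncalled = genome.count("N")     # uncalled bases
    # Assumption: sequenced size excludes uncalled bases.
    sequenced = total_length - uncalled

    motifs = {}
    for name, pattern in MOTIFS.items():
        count = genome.count(pattern)  # non-overlapping matches
        frequency = count / sequenced if sequenced else 0.0
        motifs[name] = {"count": count, "frequency": frequency}

    return {
        "length": total_length,
        "uncalled": uncalled,
        "sequenced": sequenced,
        "motifs": motifs,
    }


# Toy sequence for illustration only; a real genome would be read from FASTA.
toy_genome = "AC" + "TCCC" * 6 + "NNNN" + "GT" * 10 + "AC"
summary = motif_frequencies(toy_genome)
for name, stats in summary["motifs"].items():
    print(name, stats["count"], stats["frequency"])
```

For a whole mammalian genome, holding the full sequence in one string costs several gigabytes of memory, so a practical version would process one chromosome or scaffold at a time and sum the counts.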

References

    1. Yeeles JT, Poli J, Marians KJ, Pasero P. Rescuing stalled or damaged replication forks. Cold Spring Harb Perspect Biol. 2013;5(5):a012815. doi: 10.1101/cshperspect.a012815.
    2. Amparo C, Clark J, Bedell V, Murata-Collins JL, Martella M, Pichiorri F, Warner EF, Abdelhamid MAS, Waller ZAE, Smith SS. Duplex DNA from sites of helicase-polymerase uncoupling links non-B DNA structure formation to replicative stress. Cancer Genomics Proteomics. 2020;17(2):101–115. doi: 10.21873/cgp.20171.
    3. Martella M, Pichiorri F, Chikhale RV, Abdelhamid MAS, Waller ZAE, Smith SS. i-Motif formation and spontaneous deletions in human cells. Nucleic Acids Res. 2022;50(6):3445–3455. doi: 10.1093/nar/gkac158.
    4. Shen H, Xu W, Guo R, Rong B, Gu L, Wang Z, He C, Zheng L, Hu X, Hu Z, Shao ZM, Yang P, Wu F, Shi YG, Shi Y, Lan F. Suppression of enhancer overactivation by a RACK7-histone demethylase complex. Cell. 2016;165(2):331–342. doi: 10.1016/j.cell.2016.02.064.
    5. Edgren H, Murumagi A, Kangaspeska S, Nicorici D, Hongisto V, Kleivi K, Rye IH, Nyberg S, Wolf M, Borresen-Dale AL, Kallioniemi O. Identification of fusion genes in breast cancer by paired-end RNA-sequencing. Genome Biol. 2011;12(1):R6. doi: 10.1186/gb-2011-12-1-r6.
