2024 Dec 3;22(5):qzae076.
doi: 10.1093/gpbjnl/qzae076.

DeOri 10.0: An Updated Database of Experimentally Identified Eukaryotic Replication Origins


Yu-Hao Zeng et al. Genomics Proteomics Bioinformatics.

Abstract

DNA replication is a complex and crucial biological process in eukaryotes. To facilitate the study of eukaryotic replication events, we present a database of eukaryotic DNA replication origins (DeOri), which collects the currently available genome-wide data on eukaryotic DNA replication origins. With the rapid development of high-throughput experimental technology in recent years, the number of datasets in the new release, DeOri 10.0, increased from 10 to 151 and the number of sequences increased from 16,145 to 9,742,396. Besides nucleotide sequences and browser extensible data (BED) files, corresponding annotation files covering coding sequences (CDSs), mRNAs, and other biological elements within replication origins are also provided. The experimental techniques used for each dataset, as well as related statistical data, are presented on the web page. Differences in experimental methods, cell lines, and sequencing technologies have resulted in distinct sets of replication origins, making it challenging to differentiate between cell-specific and non-specific replication origins. Based on multiple replication origin datasets at the species level, we scored and screened replication origins in Homo sapiens, Gallus gallus, Mus musculus, Drosophila melanogaster, and Caenorhabditis elegans. The screened regions with high scores were considered species-conservative origins, which are integrated and presented as reference replication origins (rORIs). Additionally, we analyzed the genome-level distribution of genomic elements associated with replication origins, such as CpG islands (CGIs), transcription start sites (TSSs), and G-quadruplexes (G4s). These analysis results can be browsed and downloaded as needed at http://tubic.tju.edu.cn/deori/.

Keywords: DNA replication; Database; DeOri; Eukaryote; Replication origin.

Conflict of interest statement

The authors have declared no competing interests.

Figures

Figure 1
Extraction process of rORIs. Prepare data: convert datasets from different genome versions of an organism to the same version. Process data: (1) input the converted files of the same genome version and use multiinter of BEDTools to score the overlapping parts of the datasets; (2) obtain the scored file; (3) screen the regions whose overlap count is no less than 1/2 of the maximum overlap count in the chromosome (rounded up) as conservative replication origins; (4) merge adjacent fragments and output the BED file and sequences.
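The scoring, screening, and merging steps in this caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: it replaces the BEDTools multiinter call with a pure-Python sweep over interval boundaries, reads "no less than 1/2 of the maximum overlapping number (rounded up)" as a threshold of ceil(max / 2), and assumes all inputs already share one genome version. The function names (merge_intervals, score_overlaps, reference_origins) and the toy coordinates are hypothetical.

```python
from collections import defaultdict
from math import ceil


def merge_intervals(intervals):
    """Merge overlapping or directly adjacent half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def score_overlaps(datasets):
    """Score each genomic segment by the number of datasets covering it.

    datasets: list of datasets, each a list of (chrom, start, end) in
    BED-style half-open coordinates. Returns {chrom: [(start, end, count)]}.
    This is a stand-in for what a multi-file intersection reports.
    """
    events = defaultdict(list)
    for dataset in datasets:
        by_chrom = defaultdict(list)
        for chrom, start, end in dataset:
            by_chrom[chrom].append((start, end))
        for chrom, intervals in by_chrom.items():
            # Merge within a dataset first so each dataset counts at most once.
            for start, end in merge_intervals(intervals):
                events[chrom].append((start, 1))
                events[chrom].append((end, -1))

    scored = {}
    for chrom, chrom_events in events.items():
        chrom_events.sort()
        segments, depth, prev = [], 0, None
        for pos, delta in chrom_events:
            if prev is not None and pos > prev and depth > 0:
                segments.append((prev, pos, depth))
            depth += delta
            prev = pos
        scored[chrom] = segments
    return scored


def reference_origins(datasets):
    """Keep segments with count >= ceil(max count / 2) per chromosome, then merge."""
    result = []
    for chrom, segments in sorted(score_overlaps(datasets).items()):
        if not segments:
            continue
        threshold = ceil(max(count for _, _, count in segments) / 2)
        kept = [(s, e) for s, e, count in segments if count >= threshold]
        result.extend((chrom, s, e) for s, e in merge_intervals(kept))
    return result


# Toy example with three made-up datasets on one chromosome.
dataset_a = [("chr1", 100, 200), ("chr1", 500, 600)]
dataset_b = [("chr1", 150, 250)]
dataset_c = [("chr1", 180, 300), ("chr1", 550, 650)]
rori = reference_origins([dataset_a, dataset_b, dataset_c])
```

In the toy example the maximum overlap on chr1 is 3, so the threshold is 2, and the regions retained are those covered by at least two of the three datasets. A real run would read BED files and would likely keep the BEDTools step for scale; the sweep here only serves to make the scoring rule explicit.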
Figure 2
General information on datasets in DeOri. A. Heatmap on the home page, comparing the CGI distribution among different datasets. B. Charts of a dataset: visualization of statistics of the GR00030030 dataset, such as the distributions of sequences and genomic elements. C. Statistical information: basic information and statistical results for the GR00030030 dataset. CGI, CpG island; TSS, transcription start site; G4, G-quadruplex.
Figure 3
Characteristic analysis of human replication-related sequences. A. Statistics of human replication-related datasets in DeOri 10.0. B. GC content in three types of sequences. C. Venn diagram of the three types of sequences. D. Distributions of four genomic elements around the replication-related sequences. E. Statistics on the distributions of genes around the replication-related sequences in each dataset.

