Nucleic Acids Res. 2025 Jan 6;53(D1):D30-D44.
doi: 10.1093/nar/gkae978.

Database Resources of the National Genomics Data Center, China National Center for Bioinformation in 2025

Collaborators

CNCB-NGDC Members and Partners. Nucleic Acids Res.

Abstract

The National Genomics Data Center (NGDC), which is a part of the China National Center for Bioinformation (CNCB), offers a comprehensive suite of database resources to support the global scientific community. Amidst the unprecedented accumulation of multi-omics data, CNCB-NGDC is committed to continually evolving and updating its core database resources through big data archiving, integrative analysis and value-added curation. Over the past year, CNCB-NGDC has expanded its collaborations with international databases and established new subcenters focusing on biodiversity, traditional Chinese medicine and tumor genetics. Substantial efforts have been made toward encompassing a broad spectrum of multi-omics data, developing innovative resources and enhancing existing resources. Notably, new resources have been developed for single-cell omics (scTWAS Atlas), genome and variation (VDGE), health and disease (CVD Atlas, CPMKG, Immunosenescence Inventory, HemAtlas, Cyclicpepedia, IDeAS), biodiversity and biosynthesis (RefMetaPlant, MASH-Ocean) and research tools (CCLHunter). All resources and services are publicly accessible at https://ngdc.cncb.ac.cn.

Figures

Graphical Abstract
Figure 1.
The core database resources of CNCB-NGDC are organized into various categories. These database resources are publicly accessible and searchable through the CNCB-NGDC home page at https://ngdc.cncb.ac.cn. A full list of data resources is shown at https://ngdc.cncb.ac.cn/databases.
Figure 2.
The interconnectivity of CNCB-NGDC’s core databases. The data submission system, multi-omics databases, analytical tools and knowledge repositories are interconnected, allowing users to navigate between databases and access relevant information. For instance, a multi-omics lung cancer study is registered under the BioProject ID PRJCA016612 (https://ngdc.cncb.ac.cn/bioproject/browse/PRJCA016612), which corresponds to omics data in GSA-human (https://ngdc.cncb.ac.cn/gsa-human/browse/HRA004887). The related lung cancer gene, aryl-hydrocarbon receptor repressor (AHRR), is cross-referenced in GenBase (https://ngdc.cncb.ac.cn/genbase/search/gb/NM_001377236.1). Leveraging these data, CNCB-NGDC has developed omics databases covering lung cancer, including a genome variation database (GVM), databases for single-cell and spatial transcriptomics (GEN, CSEM and CROST), and databases for epigenetics (MethBank and EWAS Open Platform). Users can apply bioinformatics toolkits such as BIT to mine multi-omics data associated with lung cancer. Data analysis and publication curation have further identified changes in AHRR methylation linked to lung cancer (https://ngdc.cncb.ac.cn/ewas/browse?target=traits, https://ngdc.cncb.ac.cn/ewas/datahub/gene/15524), along with a knowledge graph illustrating changes in AHRR expression (https://ngdc.cncb.ac.cn/twas/knowledgegraph). Additionally, literature in OpenLB is associated with the application of the AHRR-based lung cancer risk model (https://ngdc.cncb.ac.cn/openlb/publication/OLB-PM-37150141).
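The cross-references in this caption follow a regular pattern: an accession identifies both the database and the page to browse. The sketch below is a minimal illustration of that pattern, not an official CNCB-NGDC client. The URL templates are copied from the examples in the caption; the helper name `browse_url`, the prefix table, and the assumption that other accessions with the same prefix use the same path are hypothetical.

```python
# Minimal sketch: map a CNCB-NGDC accession to its browse URL.
# The templates come from the example URLs in the Figure 2 caption.
# ASSUMPTION: other accessions sharing a prefix follow the same path pattern.
BASE = "https://ngdc.cncb.ac.cn"

# Hypothetical prefix table; only the three prefixes seen in the caption.
TEMPLATES = {
    "PRJCA": BASE + "/bioproject/browse/{acc}",      # BioProject
    "HRA": BASE + "/gsa-human/browse/{acc}",         # GSA-human
    "OLB-PM-": BASE + "/openlb/publication/{acc}",   # OpenLB
}


def browse_url(accession: str) -> str:
    """Return the browse URL for an accession, based on its prefix."""
    for prefix, template in TEMPLATES.items():
        if accession.startswith(prefix):
            return template.format(acc=accession)
    raise ValueError(f"Unrecognized accession prefix: {accession!r}")


if __name__ == "__main__":
    # The three accessions mentioned in the caption.
    for acc in ("PRJCA016612", "HRA004887", "OLB-PM-37150141"):
        print(acc, "->", browse_url(acc))
```

The function only builds strings and makes no network requests, so a returned URL is not a guarantee that the record exists.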
Figure 3.
Statistics of data submissions to CNCB-NGDC. (A) Data statistics of BioProject and BioSample. (B) Data statistics of Experiments and Runs in GSA. (C) Timeline of data growth in GSA. (D) Statistics of genome assemblies in GWH. All statistics are regularly updated and publicly accessible at https://ngdc.cncb.ac.cn/bioproject, https://ngdc.cncb.ac.cn/biosample, https://ngdc.cncb.ac.cn/gsa and https://ngdc.cncb.ac.cn/gwh.
