The status of the human gene catalogue
- PMID: 37794265
- PMCID: PMC10575709
- DOI: 10.1038/s41586-023-06490-x
Abstract
Scientists have been trying to identify every gene in the human genome since the initial draft was published in 2001. In the years since, much progress has been made in identifying protein-coding genes, currently estimated to number fewer than 20,000, with an ever-expanding number of distinct protein-coding isoforms. Here we review the status of the human gene catalogue and the efforts to complete it in recent years. Besides the ongoing annotation of protein-coding genes, their isoforms and pseudogenes, the invention of high-throughput RNA sequencing and other technological breakthroughs have led to rapid growth in the number of reported non-coding RNA genes. For most of these non-coding RNAs, the functional relevance is currently unclear; we look at recent advances that offer paths forward to identifying their functions and towards eventually completing the human gene catalogue. Finally, we examine the need for a universal annotation standard that includes all medically significant genes and maintains their relationships with different reference genomes, so that the human gene catalogue can be used in clinical settings.
© 2023. Springer Nature Limited.
Update of
- The status of the human gene catalogue. ArXiv [Preprint]. 2023 Mar 24:arXiv:2303.13996v1. Update in: Nature. 2023 Oct;622(7981):41-47. doi: 10.1038/s41586-023-06490-x. PMID: 36994150.