GeneBase 1.1: a tool to summarize data from NCBI gene datasets and its application to an update of human gene statistics
- PMID: 28025344
- PMCID: PMC5199132
- DOI: 10.1093/database/baw153
Abstract
We release GeneBase 1.1, a local tool with a graphical interface for parsing, structuring and indexing data from the National Center for Biotechnology Information (NCBI) Gene data bank. Compared to its predecessor, GeneBase 1.0, GeneBase 1.1 allows dynamic calculation and summarization in terms of median, mean, standard deviation and total for many quantitative parameters associated with genes, gene transcripts and gene features (exons, introns, coding sequences, untranslated regions). GeneBase 1.1 thus makes it possible to analyze the main gene structure parameters for any set of genes returned by a search for the desired characteristics, a functionality not provided by NCBI Gene itself. To show the potential of our tool for local parsing, structuring and dynamic summarizing of publicly available databases for data retrieval, analysis and testing of biological hypotheses, we provide as a sample application a revised set of statistics for human nuclear genes, gene transcripts and gene features. In contrast with previous estimates, which strongly underestimated the length of human genes, a 'mean' human protein-coding gene is 67 kbp long and has eleven 309 bp long exons and ten 6355 bp long introns. Median, mean and extreme values are provided for many other features, offering an updated reference source for human genome studies, data useful for setting parameters of bioinformatic tools, and interesting clues to the biomedical meaning of the gene features themselves.
Database URL: http://apollo11.isto.unibo.it/software/
© The Author(s) 2016. Published by Oxford University Press.
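The four summary statistics named in the abstract (median, mean, standard deviation, total) can be illustrated with a minimal Python sketch. This is not GeneBase code: the `summarize` helper and the exon lengths are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption, since the abstract does not say which one the tool reports. The last lines check that the quoted 'mean' gene is internally consistent: eleven 309 bp exons plus ten 6355 bp introns add up to 66,949 bp, which rounds to the stated 67 kbp.

```python
import statistics


def summarize(values):
    """Return median, mean, sample standard deviation and total of a list of lengths.

    Hypothetical helper for illustration; not part of GeneBase.
    """
    return {
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        # Assumption: sample standard deviation; undefined for a single value.
        "sd": statistics.stdev(values) if len(values) > 1 else 0.0,
        "total": sum(values),
    }


# Invented exon lengths in bp, for illustration only (not data from the paper).
exon_lengths = [120, 150, 309, 95, 871]
exon_summary = summarize(exon_lengths)

# Consistency check of the 'mean' protein-coding gene quoted in the abstract:
# 11 exons of 309 bp and 10 introns of 6355 bp.
mean_gene_length_bp = 11 * 309 + 10 * 6355
mean_gene_length_kbp = round(mean_gene_length_bp / 1000)

print(exon_summary)
print(mean_gene_length_bp, "bp, about", mean_gene_length_kbp, "kbp")
```

The example data show why both median and mean are worth reporting: a single long exon pulls the mean well above the median.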
Similar articles
- Identification of minimal eukaryotic introns through GeneBase, a user-friendly tool for parsing the NCBI Gene databank. DNA Res. 2015 Dec;22(6):495-503. doi: 10.1093/dnares/dsv028. Epub 2015 Nov 17. PMID: 26581719. Free PMC article.
- Human protein-coding genes and gene feature statistics in 2019. BMC Res Notes. 2019 Jun 4;12(1):315. doi: 10.1186/s13104-019-4343-8. PMID: 31164174. Free PMC article.
- search GenBank: interactive orchestration and ad-hoc choreography of Web services in the exploration of the biomedical resources of the National Center For Biotechnology Information. BMC Bioinformatics. 2013 Mar 1;14:73. doi: 10.1186/1471-2105-14-73. PMID: 23452691. Free PMC article.
- MitoRes: a resource of nuclear-encoded mitochondrial genes and their products in Metazoa. BMC Bioinformatics. 2006 Jan 24;7:36. doi: 10.1186/1471-2105-7-36. PMID: 16433928. Free PMC article.
- Advances in the Exon-Intron Database (EID). Brief Bioinform. 2006 Jun;7(2):178-85. doi: 10.1093/bib/bbl003. Epub 2006 Mar 9. PMID: 16772261. Review.
Cited by
- Cost and time-efficient construction of a 3'-end mRNA library from unpurified bulk RNA in a single tube. Exp Mol Med. 2024 Feb;56(2):453-460. doi: 10.1038/s12276-024-01164-8. Epub 2024 Feb 27. PMID: 38413820. Free PMC article.
- Structural Characterization of the Highly Restricted Down Syndrome Critical Region on 21q22.13: New KCNJ6 and DSCR4 Transcript Isoforms. Front Genet. 2021 Dec 8;12:770359. doi: 10.3389/fgene.2021.770359. eCollection 2021. PMID: 34956324. Free PMC article.
- Characterization of transcript enrichment and detection bias in single-nucleus RNA-seq for mapping of distinct human adipocyte lineages. Genome Res. 2022 Feb;32(2):242-257. doi: 10.1101/gr.275509.121. Epub 2022 Jan 18. PMID: 35042723. Free PMC article.
- Determinants of gastric cancer immune escape identified from non-coding immune-landscape quantitative trait loci. Nat Commun. 2024 May 21;15(1):4319. doi: 10.1038/s41467-024-48436-5. PMID: 38773080. Free PMC article.
- A survey to assess the incidence of Down syndrome risk in rural southwestern Romania. Exp Ther Med. 2021 Oct;22(4):1066. doi: 10.3892/etm.2021.10500. Epub 2021 Jul 27. PMID: 34434280. Free PMC article.