NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy

doi:10.1093/nar/gkr1079

Nucleic Acids Res. 2012 Jan;40(Database issue):D130-5.

doi: 10.1093/nar/gkr1079. Epub 2011 Nov 24.

Kim D Pruitt¹, Tatiana Tatusova, Garth R Brown, Donna R Maglott

Affiliation

¹ National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA. pruitt@ncbi.nlm.nih.gov

PMID: 22121212
PMCID: PMC3245008
DOI: 10.1093/nar/gkr1079

Kim D Pruitt et al. Nucleic Acids Res. 2012 Jan.

Abstract

The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database is a collection of genomic, transcript and protein sequence records. These records are selected and curated from public sequence archives and represent a significant reduction in redundancy compared to the volume of data archived by the International Nucleotide Sequence Database Collaboration. The database includes over 16,000 organisms, 2.4 × 10^6 genomic records, 13 × 10^6 proteins and 2 × 10^6 RNA records spanning prokaryotes, eukaryotes and viruses (RefSeq release 49, September 2011). The RefSeq database is maintained by a combination of automated analyses, collaboration and manual curation to generate an up-to-date representation of each sequence, its features, names and cross-links to related sources of information. We report here on recent growth, the status of curating the human RefSeq data set, more extensive feature annotation and current policy for eukaryotic genome annotation via the NCBI annotation pipeline. More information about the resource is available online (see http://www.ncbi.nlm.nih.gov/RefSeq/).

Figures

**Figure 1.**
NM_145204.3 is shown in the Nucleotide Graphical display format. The display was configured to show the six-frame translation track restricted to the sense strand, and to add three markers highlighting the annotated upstream in-frame stop codon, the translation initiation codon and a second in-frame AUG codon located further downstream. The presence of a stop codon upstream of, and in the same reading frame as, the translation initiation codon suggests that the annotated CDS is 5′ complete.
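
The reasoning in the caption can be illustrated with a minimal sketch. It assumes a transcript given as a plain string and a known 0-based CDS start offset; the function names and the toy sequence are hypothetical and are not taken from NM_145204.3 or from any NCBI tool. The first function walks upstream from the initiation codon in steps of three and reports the nearest in-frame stop codon; the second looks downstream for the next in-frame AUG.

```python
from typing import Optional

# Standard stop codons, written in DNA letters (RNA input is normalised below).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def _normalise(seq: str) -> str:
    """Upper-case the sequence and map RNA 'U' to DNA 'T'."""
    return seq.upper().replace("U", "T")


def find_upstream_inframe_stop(mrna: str, cds_start: int) -> Optional[int]:
    """Return the 0-based index of the nearest stop codon that lies upstream
    of cds_start and in the same reading frame, or None if there is none."""
    seq = _normalise(mrna)
    if seq[cds_start:cds_start + 3] != "ATG":
        raise ValueError("cds_start does not point at an ATG/AUG codon")
    pos = cds_start - 3
    while pos >= 0:
        if seq[pos:pos + 3] in STOP_CODONS:
            return pos
        pos -= 3
    return None


def find_next_inframe_start(mrna: str, cds_start: int) -> Optional[int]:
    """Return the 0-based index of the next in-frame ATG/AUG downstream of
    cds_start, stopping at the first in-frame stop codon."""
    seq = _normalise(mrna)
    for pos in range(cds_start + 3, len(seq) - 2, 3):
        codon = seq[pos:pos + 3]
        if codon in STOP_CODONS:
            return None
        if codon == "ATG":
            return pos
    return None


# Hypothetical toy transcript: CC TAG GCC ATG AAA ATG CCC TAA
TOY_MRNA = "CCTAGGCCATGAAAATGCCCTAA"
TOY_CDS_START = 8

upstream_stop = find_upstream_inframe_stop(TOY_MRNA, TOY_CDS_START)
second_start = find_next_inframe_start(TOY_MRNA, TOY_CDS_START)
print(upstream_stop, second_start)
```

If an upstream in-frame stop is found, no longer open reading frame can extend 5′ of the annotated start in that frame, which is the sense in which the caption describes the CDS as 5′ complete. A result of `None` does not by itself show that the CDS is incomplete.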

Cited by

Demographic History, Adaptation, and NRAP Convergent Evolution at Amino Acid Residue 100 in the World Northernmost Cattle from Siberia.
Buggiotti L, Yurchenko AA, Yudin NS, Vander Jagt CJ, Vorobieva NV, Kusliy MA, Vasiliev SK, Rodionov AN, Boronetskaya OI, Zinovieva NA, Graphodatsky AS, Daetwyler HD, Larkin DM. Mol Biol Evol. 2021 Jul 29;38(8):3093-3110. doi: 10.1093/molbev/msab078. PMID: 33784744.
Consequences of normalizing transcriptomic and genomic libraries of plant genomes using a duplex-specific nuclease and tetramethylammonium chloride.
Matvienko M, Kozik A, Froenicke L, Lavelle D, Martineau B, Perroud B, Michelmore R. PLoS One. 2013;8(2):e55913. doi: 10.1371/journal.pone.0055913. Epub 2013 Feb 8. PMID: 23409088.
Towards precision medicine.
Ashley EA. Nat Rev Genet. 2016 Aug 16;17(9):507-22. doi: 10.1038/nrg.2016.86. PMID: 27528417. Review.
ATF3-Induced Mammary Tumors Exhibit Molecular Features of Human Basal-Like Breast Cancer.
Yan L, Gaddis S, Coletta LD, Repass J, Powell KL, Simper MS, Chen Y, Byrom M, Zhong Y, Lin K, Liu B, Lu Y, Shen J, MacLeod MC. Int J Mol Sci. 2021 Feb 26;22(5):2353. doi: 10.3390/ijms22052353. PMID: 33652981.
The Analysis, Description, and Examination of the Maize LAC Gene Family's Reaction to Abiotic and Biotic Stress.
Wang T, Liu Y, Zou K, Guan M, Wu Y, Hu Y, Yu H, Du J, Wu D. Genes (Basel). 2024 Jun 6;15(6):749. doi: 10.3390/genes15060749. PMID: 38927685.

Grants and funding

Intramural NIH HHS/United States
