Base-calling of automated sequencer traces using phred. I. Accuracy assessment
- PMID: 9521921
- DOI: 10.1101/gr.8.3.175
Abstract
The availability of massive amounts of DNA sequence information has begun to revolutionize the practice of biology. As a result, current large-scale sequencing output, while impressive, is not adequate to keep pace with growing demand and, in particular, is far short of what will be required to obtain the 3-billion-base human genome sequence by the target date of 2005. To reach this goal, improved automation will be essential, and it is particularly important that human involvement in sequence data processing be significantly reduced or eliminated. Progress in this respect will require both improved accuracy of the data processing software and reliable accuracy measures to reduce the need for human involvement in error correction and make human review more efficient. Here, we describe one step toward that goal: a base-calling program for automated sequencer traces, phred, with improved accuracy. phred appears to be the first base-calling program to achieve a lower error rate than the ABI software, averaging 40%-50% fewer errors in the data sets examined independent of position in read, machine running conditions, or sequencing chemistry.