Appl Plant Sci. 2014 Dec 4;2(12):apps.1400044.
doi: 10.3732/apps.1400044. eCollection 2014 Dec.

Gene prediction and annotation in Penstemon (Plantaginaceae): A workflow for marker development from extremely low-coverage genome sequencing

Paul D Blischak et al. Appl Plant Sci. 2014.

Abstract

Premise of the study: Penstemon (Plantaginaceae) is a large and diverse genus endemic to North America. However, determining the phylogenetic relationships among its 280 species has been difficult due to its recent evolutionary radiation. The development of a large, multilocus data set can help to resolve this challenge.

Methods: Drawing on both previously sequenced genomic libraries and our own low-coverage whole-genome shotgun sequencing libraries, we used the MAKER2 Annotation Pipeline to identify gene regions for the development of sequencing loci from six extremely low-coverage Penstemon genomes (∼0.005×–0.007×). We also compared this approach to BLAST searches and conducted analyses to characterize sequence divergence across the species sequenced.
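To give a sense of what a coverage figure such as ∼0.005× means, the short sketch below computes expected coverage as total sequenced bases divided by genome size. The function name and the read and genome sizes are hypothetical values chosen for illustration; they are not taken from the study.

```python
def sequencing_coverage(total_bases: int, genome_size_bp: int) -> float:
    """Return expected fold coverage: total sequenced bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return total_bases / genome_size_bp


# Hypothetical numbers for illustration only (not from the study):
# 5 Mb of sequence data from a genome of 1 Gb.
example_bases = 5_000_000
example_genome = 1_000_000_000

coverage = sequencing_coverage(example_bases, example_genome)
print(f"{coverage:.3f}x")  # prints 0.005x
```

At this depth most positions in the genome are not sequenced at all, which is why the workflow operates on individual contigs rather than on an assembled genome.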

Results: Annotations and gene predictions were successfully added to more than 10,000 contigs for potential use in downstream primer design. Primers were then designed for chloroplast, mitochondrial, and nuclear loci from these annotated sequences. MAKER2 identified longer gene regions in all six Penstemon genomes when compared with BLASTN and BLASTX searches. The average level of sequence divergence among the six species was 7.14%.

Discussion: Combining bioinformatics tools into a workflow that produces annotations can be useful for creating potential phylogenetic markers from thousands of sequences even when genome coverage is extremely low and reference data are only available from distant relatives. Furthermore, the output from MAKER2 contains information about important gene features, such as exon boundaries, and can be easily integrated with visualization tools to facilitate the process of marker development.
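As a rough illustration of how exon boundaries might be pulled from annotation output, the sketch below reads GFF3-formatted text and collects the start and end coordinates of every `exon` feature per contig. It assumes the standard nine-column, tab-separated GFF3 layout with 1-based inclusive coordinates; the function name, contig names, and feature lines are made up for the example and are not output from the study.

```python
from typing import Dict, List, Tuple


def exon_boundaries(gff3_text: str) -> Dict[str, List[Tuple[int, int]]]:
    """Map each sequence ID to a sorted list of (start, end) exon coordinates."""
    exons: Dict[str, List[Tuple[int, int]]] = {}
    for line in gff3_text.splitlines():
        if line.startswith("##FASTA"):
            break  # an embedded FASTA section ends the feature table
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # skip malformed rows rather than guessing
        seqid, _source, feature_type, start, end = fields[:5]
        if feature_type == "exon":
            exons.setdefault(seqid, []).append((int(start), int(end)))
    for coords in exons.values():
        coords.sort()
    return exons


# Made-up GFF3 lines for illustration only.
EXAMPLE_GFF3 = "\n".join([
    "##gff-version 3",
    "contig_1\tmaker\tgene\t100\t900\t.\t+\t.\tID=gene1",
    "contig_1\tmaker\texon\t500\t900\t.\t+\t.\tID=exon2;Parent=mrna1",
    "contig_1\tmaker\texon\t100\t300\t.\t+\t.\tID=exon1;Parent=mrna1",
])

print(exon_boundaries(EXAMPLE_GFF3))
# prints {'contig_1': [(100, 300), (500, 900)]}
```

Coordinates like these could then guide primer placement, for example by keeping both primers within a single exon or by deliberately spanning an intron.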

Keywords: 454 pyrosequencing; BLAST; MAKER2; Penstemon; bioinformatics.

Figures

Fig. 1.
Workflow used for marker development from six low-coverage Penstemon genomes using the MAKER2 Annotation Pipeline.
Fig. 2.
Comparing MAKER2 annotations to best-hit BLAST searches against ESTs (BLASTN) and protein sequences (BLASTX) for the six species of Penstemon sequenced. Mean sequence lengths are plotted as dashed, vertical lines. Means ± SEs are also given in the upper right corner of each graph.
Fig. 3.
Plot of pairwise comparisons of sequence variation among the six low-coverage genomes using BLASTN (Cent = P. centranthifolius, Cyan = P. cyananthus, Davs = P. davidsonii, Diss = P. dissectus, Frut = P. fruticosus, Grin = P. grinnellii). Rows represent the species used as the database, and columns represent the species used as the query (e.g., row Cent, column Grin represents a BLASTN search with P. grinnellii as the query and P. centranthifolius as the database). Mean sequence variation ± SEs and sample size are shown in the upper right corner of each graph. Note that the matrix is not symmetric because a BLAST search can return different results depending on which set of sequences serves as the query and which as the database (e.g., Frut vs. Cyan ≠ Cyan vs. Frut).
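One cell of a pairwise matrix like the one in Fig. 3 could be summarized along the following lines. This is a sketch under stated assumptions, not the authors' procedure: it assumes BLASTN results in the default 12-column tabular format of BLAST+ (`-outfmt 6`), keeps the highest-bitscore hit per query, and treats sequence variation as 100 minus percent identity. The function names and the hit lines are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def best_hit_divergence(tabular: str) -> List[float]:
    """Return 100 - pident for the highest-bitscore hit of each query.

    Assumes default BLAST+ tabular columns: qseqid, sseqid, pident, length,
    mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    best: Dict[str, Tuple[float, float]] = {}  # query -> (bitscore, pident)
    for line in tabular.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        query, pident, bitscore = cols[0], float(cols[2]), float(cols[11])
        if query not in best or bitscore > best[query][0]:
            best[query] = (bitscore, pident)
    return [100.0 - pident for _, pident in best.values()]


def mean_and_se(values: List[float]) -> Tuple[float, float]:
    """Return the mean and standard error (sample SD / sqrt(n))."""
    n = len(values)
    mean = sum(values) / n
    if n < 2:
        return mean, float("nan")
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


# Made-up hits for one query/database direction (illustration only).
EXAMPLE_HITS = "\n".join([
    "q1\ts1\t95.0\t200\t10\t0\t1\t200\t1\t200\t1e-50\t300",
    "q1\ts2\t90.0\t150\t15\t0\t1\t150\t1\t150\t1e-30\t180",
    "q2\ts3\t91.0\t220\t20\t0\t1\t220\t1\t220\t1e-45\t280",
])

divergences = best_hit_divergence(EXAMPLE_HITS)
print(divergences)               # prints [5.0, 9.0]
print(mean_and_se(divergences))  # prints (7.0, 2.0)
```

Running the same summary with the query and database roles swapped would use a different hit table, which is one way the two directions of a comparison can end up with different means and sample sizes.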
