Comparative Study
BMC Genomics. 2006 Aug 14:7:207.
doi: 10.1186/1471-2164-7-207.

A general pipeline for the development of anchor markers for comparative genomics in plants

Jakob Fredslund et al. BMC Genomics.

Abstract

Background: Complete or near-complete genomic sequence information is presently only available for a few plant species representing a large phylogenetic diversity among plants. In order to effectively transfer this information to species lacking sequence information, comparative genomic tools need to be developed. Molecular markers permitting cross-species mapping along co-linear genomic regions are central to comparative genomics. These "anchor" markers, defining unique loci in genetic linkage maps of multiple species, are gene-based and possess a number of features that make them relatively sparse. To identify potential anchor marker sequences more efficiently, we have established an automated bioinformatic pipeline that combines multi-species Expressed Sequence Tags (EST) and genome sequence data.

Results: Taking advantage of sequence data from related species, the pipeline identifies evolutionarily conserved sequences that are likely to define unique orthologous loci in most species of the same phylogenetic clade. The key features are the identification of evolutionarily conserved sequences followed by automated design of intron-flanking Polymerase Chain Reaction (PCR) primer pairs. Polymorphisms can subsequently be identified by size- or sequence variation of PCR products, amplified from mapping parents or populations. We illustrate our procedure in legumes and grasses and exemplify its application in legumes, where model plant studies and the genome- and EST-sequence data available have a potential impact on the breeding of crop species and on our understanding of the evolution of this large and diverse family.
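The size-based polymorphism check mentioned above can be illustrated with a minimal sketch. It is not the authors' procedure: the function name, the marker names, the product sizes, and the 10 bp resolution threshold are all hypothetical values chosen for illustration, assuming PCR product sizes have already been measured for two mapping parents.

```python
def size_polymorphic(product_sizes, min_diff_bp=10):
    """Return markers whose PCR products differ in size between two parents.

    product_sizes maps a marker name to (size in parent A, size in parent B),
    both in base pairs. min_diff_bp is an assumed detection threshold, not a
    value taken from the paper.
    """
    return sorted(
        marker
        for marker, (size_a, size_b) in product_sizes.items()
        if abs(size_a - size_b) >= min_diff_bp
    )


# Hypothetical measurements for three candidate markers.
sizes = {
    "marker_1": (480, 480),  # identical products: not informative by size
    "marker_2": (512, 540),  # 28 bp difference: size polymorphism
    "marker_3": (300, 304),  # 4 bp difference: below the assumed threshold
}
polymorphic = size_polymorphic(sizes)
```

Markers that fall below the threshold, such as `marker_3` here, would be candidates for the sequence-based comparison the abstract also mentions.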

Conclusion: We provide a database of 459 candidate anchor loci which have the potential to serve as map anchors in more than 18,000 legume species, a number of which are of agricultural importance. For grasses, the database contains 1335 candidate anchor loci. Based on this database, we have evaluated 76 candidate anchor loci with respect to marker development in legume species with no sequence information available, demonstrating the validity of this approach.

Figures

Figure 1
The pipeline of the marker candidate algorithm. In the first step, EST collections of selected species are compared with the proteome of the reference species in order to estimate the copy number. Sequences with one or two homologs in the Arabidopsis proteome are considered because Arabidopsis has undergone a recent whole genome duplication whereas legumes have not. EST sequences passing this criterion are compared to L. japonicus and M. truncatula genomic sequences in order to score the presence and length of introns. Sequences with the same Arabidopsis reference are then aligned and primers are designed using this alignment as input. For this purpose, the PriFi software [13] is used.
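The copy-number filter and grouping steps of this caption can be sketched as follows. This is a simplified illustration under stated assumptions, not the published implementation: the input is assumed to be a precomputed mapping from each EST to its Arabidopsis homologs (best hit first), the best hit is assumed to serve as the "Arabidopsis reference", the `min_species` requirement is an assumption, and all identifiers are toy data. The intron-scoring step and the PriFi primer design are omitted.

```python
from collections import defaultdict


def candidate_groups(est_hits, est_species, max_copies=2, min_species=2):
    """Group ESTs by Arabidopsis reference after a copy-number filter.

    est_hits: EST id -> list of Arabidopsis homolog ids, best hit first.
    est_species: EST id -> species label.
    Only ESTs with 1..max_copies distinct homologs are kept; groups must
    contain ESTs from at least min_species species (an assumed criterion).
    """
    groups = defaultdict(list)
    for est_id, homologs in est_hits.items():
        # Drop duplicate ids while preserving the best-first order.
        distinct = list(dict.fromkeys(homologs))
        if 1 <= len(distinct) <= max_copies:
            groups[distinct[0]].append(est_id)
    return {
        ref: sorted(ests)
        for ref, ests in groups.items()
        if len({est_species[e] for e in ests}) >= min_species
    }


# Toy data: identifiers are made up for illustration.
est_hits = {
    "Lj_EST1": ["AT1G01010"],
    "Mt_EST1": ["AT1G01010", "AT4G01550"],
    "Gm_EST1": ["AT2G02020", "AT2G02030", "AT2G02040"],  # three homologs: rejected
    "Lj_EST2": ["AT3G03030"],  # passes the filter but has no partner species
}
est_species = {
    "Lj_EST1": "L. japonicus",
    "Mt_EST1": "M. truncatula",
    "Gm_EST1": "G. max",
    "Lj_EST2": "L. japonicus",
}
groups = candidate_groups(est_hits, est_species)
```

Each surviving group is the set of sequences that would then be aligned and passed to primer design.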
Figure 2
Phylogenetic trees of legumes and grasses. Phylogenetic relationship of a) legumes and b) grasses. Species with sequence information used in this study are shown together with selected other species. (Modified after [40])
Figure 3
Distribution of CATS on a) L. japonicus and b) M. truncatula chromosomes. Red and green triangles indicate positions of markers with one and two homologous gene sequences in Arabidopsis, respectively. Chromosomes are scaled according to their genetic length.
Figure 4
Distribution of CATS on rice chromosomes. Red and blue marks indicate the positions of markers with one and two rice homologous gene sequences, respectively. The scale of chromosome diagrams reflects their relative physical sizes.

References

    1. McCouch SR. Genomics and synteny. Plant Physiol. 2001;125:152–5. doi: 10.1104/pp.125.1.152.
    2. Schmidt R. Synteny: recent advances and future prospects. Curr Opin Plant Biol. 2000;3:97–102. doi: 10.1016/S1369-5266(99)00048-5.
    3. Delseny M. Re-evaluating the relevance of ancestral shared synteny as a tool for crop improvement. Curr Opin Plant Biol. 2004;7:126–31. doi: 10.1016/j.pbi.2004.01.005.
    4. The Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000;408:796–815. doi: 10.1038/35048692.
    5. Yu J, Hu S, Wang J, Wong GK, Li S, Liu B, Deng Y, Dai L, Zhou Y, Zhang X, Cao M, Liu J, Sun J, Tang J, Chen Y, Huang X, Lin W, Ye C, Tong W, Cong L, Geng J, Han Y, Li L, Li W, Hu G, Huang X, Li W, Li J, Liu Z, Li L, Liu J, Qi Q, Liu J, Li L, Li T, Wang X, Lu H, Wu T, Zhu M, Ni P, Han H, Dong W, Ren X, Feng X, Cui P, Li X, Wang H, Xu X, Zhai W, Xu Z, Zhang J, He S, Zhang J, Xu J, Zhang K, Zheng X, Dong J, Zeng W, Tao L, Ye J, Tan J, Ren X, Chen X, He J, Liu D, Tian W, Tian C, Xia H, Bao Q, Li G, Gao H, Cao T, Wang J, Zhao W, Li P, Chen W, Wang X, Zhang Y, Hu J, Wang J, Liu S, Yang J, Zhang G, Xiong Y, Li Z, Mao L, Zhou C, Zhu Z, Chen R, Hao B, Zheng W, Chen S, Guo W, Li G, Liu S, Tao M, Wang J, Zhu L, Yuan L, Yang H. A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science. 2002;296:79–92. doi: 10.1126/science.1068037.