Comparative Study
BMC Genomics. 2008 Feb 27;9:103.
doi: 10.1186/1471-2164-9-103.

Sequencing and analysis of the gene-rich space of cowpea

Michael P Timko et al. BMC Genomics. 2008.

Abstract

Background: Cowpea, Vigna unguiculata (L.) Walp., is one of the most important food and forage legumes in the semi-arid tropics because of its drought tolerance and ability to grow on poor quality soils. Approximately 80% of cowpea production takes place in the dry savannahs of tropical West and Central Africa, mostly by poor subsistence farmers. Despite its economic and social importance in the developing world, cowpea remains to a large extent an underexploited crop. Among the major goals of cowpea breeding and improvement programs is the stacking of desirable agronomic traits, such as disease and pest resistance and response to abiotic stresses. Implementation of marker-assisted selection and breeding programs is severely limited by a paucity of trait-linked markers and a general lack of information on gene structure and organization. With a nuclear genome size estimated at ~620 Mb, the cowpea genome is an ideal target for reduced representation sequencing.

Results: We report here the sequencing and analysis of the gene-rich, hypomethylated portion of the cowpea genome selectively cloned by methylation filtration (MF) technology. Over 250,000 gene-space sequence reads (GSRs) with an average length of 610 bp were generated, yielding ~160 Mb of sequence information. The GSRs were assembled, annotated by BLAST homology searches of four public protein annotation databases and four plant proteomes (A. thaliana, M. truncatula, O. sativa, and P. trichocarpa), and analyzed using various domain and gene modeling tools. A total of 41,260 GSR assemblies and singletons were annotated, of which 19,786 have unique GenBank accession numbers. Within the GSR dataset, 29% of the sequences were annotated using the Arabidopsis Gene Ontology (GO). The largest categories of assigned function were catalytic activity and metabolic processes, groups that include the majority of cellular enzymes and components of amino acid, carbohydrate and lipid metabolism. A total of 5,888 GSRs had homology to genes encoding transcription factors (TFs) and transcription associated factors (TAFs), representing about 5% of the total annotated sequences in the dataset. Sixty-two of the 64 well-characterized plant TF gene families are represented in the cowpea GSRs, and these families are of similar size and phylogenetic organization to those characterized in other plants. The cowpea GSRs also provide a rich source of genes involved in photoperiodic control, symbiosis, and defense-related responses. Comparisons to available databases revealed that about 74% of cowpea ESTs and 70% of all legume ESTs were represented in the GSR dataset. As approximately 12% of all GSRs contain an identifiable simple-sequence repeat, the dataset is a powerful resource for the design of microsatellite markers.
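To make the simple-sequence repeat (SSR) statistic above concrete, the following is a minimal sketch of how perfect SSRs might be located in a read and how the fraction of SSR-containing reads could be tallied. The abstract does not state which tool or repeat thresholds were used, so the function names and the `MIN_REPEATS` values below are illustrative assumptions only.

```python
import re

# Hypothetical thresholds (unit length -> minimum number of tandem copies).
# The abstract does not give the criteria actually used.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a tandem repeat of a shorter unit."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_copies in sorted(min_repeats.items()):
        # One captured unit followed by at least (min_copies - 1) more copies.
        pattern = re.compile(
            r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1)
        )
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip e.g. "AA" x 6, which is really a mononucleotide run.
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


def ssr_fraction(reads):
    """Fraction of reads that contain at least one SSR."""
    reads = list(reads)
    if not reads:
        return 0.0
    return sum(1 for r in reads if find_ssrs(r)) / len(reads)


if __name__ == "__main__":
    example = "TTGC" + "AG" * 8 + "CCGT" + "A" * 12 + "GCT"
    print(find_ssrs(example))
```

This sketch only finds perfect repeats; imperfect or compound microsatellites, which dedicated SSR tools typically report, would need additional logic.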

Conclusion: The availability of extensive public genomic data for cowpea, a non-model legume with significant importance in the developing world, represents a significant step forward in legume research. Not only does the gene space sequence enable the detailed analysis of gene structure, gene family organization and phylogenetic relationships within cowpea, but it also facilitates the characterization of syntenic relationships with other cultivated and model legumes, and will contribute to determining patterns of chromosomal evolution in the Leguminosae. The micro- and macrosyntenic relationships detected between cowpea and other cultivated and model legumes should simplify the identification of informative markers for marker-assisted trait selection and map-based gene isolation necessary for cowpea improvement.


Figures

Figure 1
Distribution of molecular function assignment for cowpea GSRs by GO annotation. Gene Ontology (GO) annotations of cowpea GSRs were generated by Arabidopsis refseq BLAST searches and GSRs were assigned molecular functions using the complex search function, level 3 in the tree. A total of 77,591 cowpea sequences were annotated. Shown next to each functional category is the percentage of GSRs in each named category, followed in parentheses by the number of annotated GSRs in the group.
Figure 2
Mapping of cowpea assemblies and singletons to the M. truncatula pseudomolecules. GSR assemblies and singletons were mapped by tblastx searches to the M. truncatula chromosome-scale pseudomolecules available on the TIGR M. truncatula database. The broad green lines represent tblastx alignments; narrow lines connect High-scoring Segment Pairs (HSPs) derived from the same cowpea sequence. An HSP consists of two sequence fragments of arbitrary but equal length whose alignment is locally maximal and for which the alignment score meets or exceeds a threshold or cutoff score. A: An example of mapping cowpea contigs and singletons to a 40 kb region of chromosome 0 (which represents BACs that have not been anchored to the genetic map). B: A closer view of the same region from 396 k to 404 k. C: A region of M. truncatula chromosome 6 where a single cowpea GSR spans and has high quality tblastx matches to three distinct IMGAG gene models, indicating microsynteny. M. truncatula gene model AC134521_19 has no match in that region of the cowpea genome. D: A region of M. truncatula chromosome 2 where there are several GSR matches, but no M. truncatula gene model.
Figure 3
The ERF gene family of cowpea transcription factors. GSRs encoding the conserved DNA binding domain of ERFs were identified, the 111 cowpea ERF genes were arbitrarily assigned names, and the conserved domains were aligned using ClustalW. An unrooted phylogenetic tree was produced using the PHYLIP program based on the neighbor-joining method and presented using PhyloDraw. The cowpea ERF family is separated into two major clades. A line divides the CBF/DREB subfamily from the ERF subfamily. Subgroups, indicated by roman numerals, were identified as described in [58]. For additional information see Additional file 5 and Additional file 6.
Figure 4
The WRKY gene family of cowpea transcription factors. GSRs encoding the conserved WRKY domains were identified, the 79 WRKY genes were arbitrarily assigned names, and the conserved domains were aligned using ClustalW. An unrooted phylogenetic tree was produced using the PHYLIP program based on the neighbor-joining method and presented using PhyloDraw. The comparison includes a small number of Arabidopsis WRKY TF genes representative of each group. Cowpea WRKY genes are indicated by the prefix Vu, and Arabidopsis genes by the prefix At followed by their number and group. Groups are indicated by roman numerals. Group I sequences include both the N- and C-terminal domains (I NTF and I CTD, respectively); subgroup IIb* is an artifact in the ClustalW sequence alignment caused by the truncated nature of some of the domains. For additional information see Additional file 7 and Additional file 8.
Figure 5
The CONSTANS (CO) and CO-like gene family from cowpea. GSRs encoding the conserved DNA binding domains of CONSTANS (CO) and CO-like TFs were identified and assembled into contigs, the putative genes were arbitrarily assigned names, and the B1 and/or B2 domains (depending on the gene) were manually excised and aligned using ClustalW. An unrooted phylogenetic tree was produced using the PHYLIP program based on the neighbor-joining method and presented using PhyloDraw. The comparison includes a small number of Arabidopsis, barley, pea, rice and M. truncatula CO-like genes. Cowpea genes are indicated by the prefix Vu, Arabidopsis genes by the prefix At, M. truncatula genes by the prefix Mt, pea genes by the prefix Ps, barley genes by the prefix Hv, and rice genes by the prefix Hd. The major groups are indicated by roman numerals. The bar indicates the percent sequence divergence. For additional information see Additional file 9.
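The HSP definition given in the Figure 2 caption (two equal-length fragments whose ungapped alignment is locally maximal and scores at or above a cutoff) can be illustrated with a small sketch. This is not how tblastx works internally: tblastx compares six-frame translations with a protein scoring matrix and uses seeded extension, whereas the toy below scans every diagonal of two nucleotide strings with an assumed +1/-1 match/mismatch score. The function name, scoring values and cutoff are all hypothetical.

```python
def best_segment_pairs(a, b, match=1, mismatch=-1, cutoff=8):
    """Find the top-scoring ungapped segment pair on each diagonal.

    Returns a list of (score, a_start, b_start, length) tuples, highest
    score first, keeping only pairs whose score meets the cutoff.
    The scoring scheme and cutoff are illustrative assumptions.
    """
    hsps = []
    # Each diagonal is identified by offset = (index in a) - (index in b).
    for offset in range(-(len(b) - 1), len(a)):
        i, j = max(offset, 0), max(-offset, 0)
        best = (0, 0, 0, 0)
        run_score, run_start = 0, 0
        k = 0
        while i + k < len(a) and j + k < len(b):
            # Restart the running segment once it can no longer help
            # (maximum-scoring-subarray scan along the diagonal).
            if run_score <= 0:
                run_score, run_start = 0, k
            run_score += match if a[i + k] == b[j + k] else mismatch
            if run_score > best[0]:
                best = (run_score, i + run_start, j + run_start,
                        k - run_start + 1)
            k += 1
        if best[0] >= cutoff:
            hsps.append(best)
    return sorted(hsps, reverse=True)


if __name__ == "__main__":
    query = "TTTTACGTACGTACGGGGG"
    subject = "CCACGTACGTACCC"
    for score, q_start, s_start, length in best_segment_pairs(query, subject):
        print(score, query[q_start:q_start + length], q_start, s_start)
```

Because both fragments lie on one diagonal, they necessarily have equal length, and extending the reported segment in either direction would not raise its score, which is the sense of "locally maximal" in the caption.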

References

    1. Singh BB. Cowpea Vigna unguiculata (L.) Walp. In: Singh RJ, Jauhar PP, editors. Genetic Resources, Chromosome Engineering and Crop Improvement. Vol. 1. Boca Raton: CRC Press; 2005. pp. 117–162.
    2. Timko MP, Ehlers JD, Roberts PA. Cowpea. In: Kole C, editor. Genome Mapping and Molecular Breeding in Plants, Pulses, Sugar and Tuber Crops. Vol. 3. Berlin: Springer-Verlag; 2007. pp. 49–68.
    3. Phillips RD, McWatters KH, Chinnan J, Komey NS, Liu K, Mensa-Wilmot Y, Nnanna IA, Okeke C, Prinyawiwatkul W, Saalia FK. Utilization of cowpea for human food. Field Crops Res. 2003;82:193–213.
    4. Lewis G, Schire B, Mackinder B, Lock M. Legumes of the World. London: Kew Publishing; 2005.
    5. Lavin M, Herendeen PS, Wojciechowski MF. Evolutionary rates analysis of Leguminosae implicates a rapid diversification of lineages during the Tertiary. Syst Biol. 2005;54:575–594.
