BMC Genomics. 2017 Aug 23;18(1):654.
doi: 10.1186/s12864-017-4062-2.

High-density 80 K SNP array is a powerful tool for genotyping G. hirsutum accessions and genome analysis

Caiping Cai et al. BMC Genomics.

Abstract

Background: High-throughput genotyping platforms play important roles in plant genomic studies. Cotton (Gossypium spp.) is an important natural textile fiber and oil crop worldwide. Upland cotton accounts for more than 90% of the world's cotton production; however, modern upland cotton cultivars have narrow genetic diversity. The large amount of genomic sequencing and re-sequencing data now released makes it possible to develop a high-quality single nucleotide polymorphism (SNP) array for intraspecific genotyping in cotton.

Results: Here we report a high-throughput CottonSNP80K array and its use in genotyping different cotton accessions. A total of 82,259 SNP markers were selected from the re-sequencing data of 100 cotton cultivars and used to produce the array on the Illumina Infinium platform. Of these, 77,774 SNP loci (94.55%) were successfully synthesized on the array. Of the synthesized loci, 77,252 (99.33%) had call rates of >95% in 352 cotton accessions and 59,502 (76.51%) were polymorphic. Application tests using 22 cotton accessions with parent/F1 combinations or with similar genetic backgrounds showed that the CottonSNP80K array had high genotyping accuracy, good repeatability, and wide applicability. Phylogenetic analysis of 312 cotton cultivars and landraces with wide geographical distribution showed that they could be classified into ten groups, irrespective of their origins. We found that the different landraces were clustered in different subgroups, indicating that these landraces were major contributors to the development of different breeding populations of modern G. hirsutum cultivars in China. For genome-wide association studies (GWAS), we used a total of 54,588 SNPs (MAF > 0.05) genotyped in 288 G. hirsutum accessions together with 10 salt stress traits, and detected eight significant SNPs associated with three of these traits.
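The call-rate (>95%) and minor allele frequency (MAF > 0.05) thresholds quoted above can be illustrated with a minimal sketch. It assumes a simple diploid-style coding in which each genotype is the count of one allele (0, 1 or 2) and `None` marks a failed call. The function name `snp_qc` and this coding are illustrative assumptions, not the authors' GenomeStudio workflow.

```python
def snp_qc(genotypes, min_call_rate=0.95, min_maf=0.05):
    """Per-SNP call rate, MAF and a pass/fail flag.

    genotypes: list of rows, one row per SNP, one entry per accession.
               Each entry is 0, 1 or 2 (count of the B allele) or None (no call).
               This coding is an assumption made for illustration.
    Returns a list of (call_rate, maf, keep) tuples, one per SNP.
    """
    results = []
    for row in genotypes:
        called = [g for g in row if g is not None]
        # Call rate: fraction of accessions with a usable genotype call.
        call_rate = len(called) / len(row) if row else 0.0
        if called:
            # B-allele frequency among called accessions (two alleles each).
            b_freq = sum(called) / (2 * len(called))
            maf = min(b_freq, 1 - b_freq)
        else:
            maf = 0.0
        # Strict inequalities mirror the ">95%" and ">0.05" wording above.
        keep = call_rate > min_call_rate and maf > min_maf
        results.append((call_rate, maf, keep))
    return results


if __name__ == "__main__":
    # Toy data: 3 SNPs x 4 accessions (not taken from the study).
    toy = [
        [0, 1, 2, 2],     # fully called, polymorphic
        [0, 0, 0, 0],     # fully called, monomorphic
        [2, None, 1, 0],  # one missing call
    ]
    for i, (cr, maf, keep) in enumerate(snp_qc(toy)):
        print(f"SNP{i}: call_rate={cr:.2f} maf={maf:.3f} keep={keep}")
```

With only four accessions, a single missing call already drops a SNP below the 95% call-rate threshold. On a panel of several hundred accessions the same filter tolerates a small number of failed calls per locus.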

Conclusions: We developed the CottonSNP80K array, which has high polymorphism and can distinguish upland cotton accessions. Diverse application tests indicated that the CottonSNP80K array can play important roles in germplasm genotyping, variety verification, functional genomics studies, and molecular breeding in cotton.

Keywords: Array; Genome-wide association studies (GWAS); Genotyping identification; Molecular breeding; Single nucleotide polymorphism (SNP); Upland cotton.

Conflict of interest statement

Ethics approval and consent to participate

A total of 352 cotton materials, including 332 G. hirsutum cotton accessions, two G. barbadense cotton accessions, five wild cotton materials and 13 semi-wild G. hirsutum race materials, were collected in this study. All necessary permits for collecting the 332 G. hirsutum and two G. barbadense cotton accessions were obtained from Nanjing Agricultural University, China. All necessary permits for the five wild cotton and 13 semi-wild G. hirsutum race materials were obtained from the Institute of Cotton Research, Chinese Academy of Agricultural Science, China. These 352 materials were planted in the Jiangpu experimental station of Nanjing Agricultural University, Nanjing, Jiangsu Province, China, for reproduction and sampling. All necessary permits for the field evaluations of these accessions were obtained from Nanjing Agricultural University, China. None of the field evaluations involved human subjects or animal research, nor did they involve any endangered or protected species.

Consent for publication

Not applicable.

Competing interests

The authors declared that they had no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
Description of SNPs in the CottonSNP80K array. a SNP distribution on the 26 chromosomes of upland cotton. A01-A13 and D01-D13 on the vertical axis are the identifiers of the 26 chromosomes; the horizontal axis shows chromosome length (Mb); the red region depicts SNP density (the number of SNPs per window). b Distances between the SNPs. The vertical axis represents the distance ranges (Kb) between SNPs. c Distribution of the selected SNPs across genic and intergenic regions
Fig. 2
Typical cluster graphs of SNP markers in the CottonSNP80K array. a-d Bi-allelic SNPs, which can be accurately recognized by GenomeStudio software. e CNV, noted as “NG”. f InDel, noted as “--”. g-h Complex cluster graphs that are difficult to group accurately and are treated as missing data. i-l Corrected SNPs; 1 and 2 indicate default clustering using GenomeStudio software and adjusted clustering, respectively
Fig. 3
Verification of heterozygous loci in three F1 combinations. a (V1 × TM-1) F1; b (V3 × TM-1) F1; c (V8 × TM-1) F1. The red labels on the chromosomes show the expected heterozygous loci, at which the two parents carry different homozygous alleles. All other loci are labeled in blue
Fig. 4
Phylogenetic analysis of 332 cotton accessions based on the CottonSNP80K genotyping array. A neighbor-joining tree was constructed using 57,071 polymorphic SNP markers
Fig. 5
PCA and linkage disequilibrium analysis of the 332 cotton accessions. a PCA for the 332 cotton accessions based on CottonSNP80K genotyping data. b LD decay, shown as r² against physical distance between SNP markers, in different cotton groups. The 332 cotton accessions were classified into the introduced landraces (named Introduced), the Chinese modern improved cultivars from the Yellow River Valley, the Yangtze River Valley, the Northwestern inland region and the Northern specifically early maturation region (named Cultivated), and the outgroup (named Outgroup). Samples from the same group are represented by the same color
Fig. 6
GWAS analysis for salt-tolerance related traits in cotton. Local Manhattan plot (top) and LD heatmap (bottom) surrounding the peak of candidate loci. The significant SNPs (P < 1 × 10⁻⁵) are marked in red. The pair-wise LD between the SNP markers is indicated as D’ values, where dark red indicates a value of 1 and light yellow indicates 0. a SNPs associated with relative chlorophyll content (RCC) at the peak region (8.41–9.41 Mb) on chromosome D05. b SNPs associated with relative MDA content (RMDA) at the peak region (79.77–80.77 Mb) on chromosome A02. c SNPs associated with relative MDA content (RMDA) at the peak region (3.10–4.31 Mb) on chromosome D09. d SNPs associated with relative germination rate (RGR) at the peak region (83.73–84.73 Mb) on chromosome A12
