BMC Genomics. 2018 Sep 17;19(1):679.
doi: 10.1186/s12864-018-5055-5.

Association mapping by aerial drone reveals 213 genetic associations for Sorghum bicolor biomass traits under drought

Jennifer E Spindel et al. BMC Genomics. 2018.

Abstract

Background: Sorghum bicolor is the fifth most commonly grown cereal worldwide and is remarkable for its drought and abiotic stress tolerance. For these reasons and the large size of biomass varieties, it has been proposed as a bioenergy crop. However, little is known about the genes underlying sorghum's abiotic stress tolerance and biomass yield.

Results: To uncover the genetic basis of drought tolerance in sorghum at a genome-wide level, we undertook a high-density phenomics genome-wide association study (GWAS) in which 648 diverse sorghum lines were phenotyped by drone once per week at two locations in California over the course of a growing season. Biomass, height, and leaf area were measured by drone for individual field plots subjected to two drought treatments and a well-watered control. The resulting dataset of ~171,000 phenotypic data points was analyzed along with 183,989 genotyping-by-sequencing markers to reveal 213 high-quality, replicated, and conserved GWAS associations.

Conclusions: The genomic intervals defined by the associations contain many strong candidate genes, among them genes encoding heat shock proteins, antifreeze proteins, and other domains recognized as important to plant stress responses. The markers identified by our study can be used for marker-assisted selection for drought tolerance and biomass. In addition, our results are a significant step toward identifying specific sorghum genes controlling drought tolerance and biomass yield.

Keywords: Biomass; Drone; Drought; GWAS; Phenomics; Sorghum.


Conflict of interest statement

Ethics approval and consent to participate

Seeds were obtained from the USDA-ARS Plant Genetic Resources Conservation Unit and increased at UC-ANR-KARE in Parlier, CA and by Chromatin, Inc. at winter facilities in Puerto Rico. No field permission was required to collect plant samples.

Consent for publication

Not applicable.

Competing interests

M.C. works for Blue River Technology, which plans to provide the drone phenotyping technology utilized in this study as a commercial service. S.S. works for Chromatin Inc., which seeks to commercialize some of the biomass sorghum lines included in the research project. All other authors declare no competing financial interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
Genetic relationship between lines. Neighbor joining trees for 648 diverse Sorghum bicolor lines including 622 inbred sorghum lines and 26 proprietary Chromatin Inc. hybrids colored by morphology type (a) and country of origin of exotic parents (b). Together, morphology type and country of origin explain most of the genetic structure of this sorghum panel
Fig. 2
Linkage disequilibrium. Gaussian kernel smoothed pairwise linkage disequilibrium (LD), r², by SNP pair distance (bp) for chromosomes 1 and 6. Pairwise LD was calculated for all pairs of SNPs on each chromosome using PLINK v1.9, and a Gaussian kernel smoother (σ = 500) was fit to model the relationship between SNP distance and pairwise LD on each chromosome. Most chromosomes resembled chromosome 1 (top) in that LD quickly decayed to a baseline of r² ≈ 0.1. Several chromosomes such as chromosome 6 (bottom), however, contained large linkage blocks as a result of linkage drag around conversion loci on these chromosomes. Similar plots for the other chromosomes can be found in Additional file 2: Figure S2
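The smoothing step in this caption can be illustrated with a minimal sketch. It assumes a Nadaraya-Watson style weighted average with a Gaussian kernel; the caption does not say which smoother implementation was used or what unit σ is expressed in, so the function name, the choice of base pairs for the bandwidth, and the toy values below are all hypothetical.

```python
import math

def gaussian_kernel_smooth(distances, r2_values, grid, sigma=500.0):
    """Kernel-weighted mean of r^2 at each grid distance.

    distances: SNP-pair distances (assumed here to be in bp).
    r2_values: pairwise LD (r^2) for the same SNP pairs.
    grid:      distances at which to evaluate the smoothed curve.
    sigma:     kernel bandwidth (500 as in the caption; units assumed).
    """
    smoothed = []
    for x in grid:
        weights = [math.exp(-0.5 * ((x - d) / sigma) ** 2) for d in distances]
        total = sum(weights)
        if total == 0.0:
            # No pair is close enough to contribute at this grid point.
            smoothed.append(float("nan"))
        else:
            weighted = sum(w * r for w, r in zip(weights, r2_values))
            smoothed.append(weighted / total)
    return smoothed

# Toy data (invented): LD decaying with distance.
distances = [100, 400, 900, 1500, 2500, 4000]
r2_values = [0.9, 0.7, 0.4, 0.2, 0.12, 0.1]
grid = [0, 1000, 2000, 3000, 4000]
curve = gaussian_kernel_smooth(distances, r2_values, grid)
```

With real data the number of SNP pairs per chromosome is large, so a binned or vectorized smoother would likely be preferable to this quadratic loop.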
Fig. 3
Overview of HDP-GWAS peak definition pipeline. To define preliminary GWAS peaks, all SNPs identified as significant in at least one out of 460 individual GWAS were consolidated into a single file. Local pairwise LD (r²) was determined by first modeling the relationship between pairwise LD and SNP pair distance using a Gaussian kernel smoother (σ = 500), after which, for every SNP in a particular linkage block, the SNP position was found in the pairwise LD table and all linked SNPs identified (a, 1–2). A SNP was considered linked if r² ≥ 0.2 for all chromosomes except chromosomes 6 and 9, for which a SNP was considered linked if r² ≥ 0.3 (a, 2). Max distance (max dist) was then defined as the largest bp distance between linked SNPs (a, 4). This process was repeated for all linkage blocks. Once max dist was defined for each significant SNP, the upper boundary of each preliminary GWAS peak could be defined as SNP position + max dist, and the lower boundary of each GWAS peak could be defined as SNP position – max dist (a, 5–7). All SNPs falling between the boundaries were then considered to be within the same GWAS peak (a, 8). This process was repeated for all peaks. In the event that more than one peak contained the same SNPs, they were merged into a single peak (a, 9). After defining preliminary peaks in this manner, peaks were refined by drawing ‘zoomed’ Manhattan plots around peaks, i.e., SNPs +/− 50 Kb from the preliminary peak boundaries (b, 10). Each zoomed Manhattan plot was then assessed visually to determine if the peak was, in fact, a single peak, or if the pattern of linkage indicated that the peak should be split into two or more peaks (b, 11). If it was determined that a preliminary peak should be split into two or more peaks, the diagnosis was confirmed by drawing a second zoomed Manhattan plot including SNPs +/− 2 Mb around the peak boundaries (b, 12).
After peaks were refined in this way, each individual zoomed Manhattan plot was rated 1, 2, or 3 by visual assessment, according to the strength of the evidence that the peak was not an artifact: a rating of 1 indicated a peak with no evidence to suggest it was not an artifact, and a rating of 3 indicated a peak with very strong evidence it was not an artifact. All other peaks were rated as 2 (c, 14–15). Any GWAS peaks with only ‘1’ ratings were removed from the final set of significant GWAS peaks (c, 16). The final step of the pipeline is results analysis, i.e., identifying the combinations of trait, treatment, time point, and location that resulted in each significant GWAS peak (d, 17)
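The automated part of the pipeline (panel a) can be sketched as follows. This is a simplified illustration, not the authors' code: it reads linked SNPs directly from a pairwise LD table, it reports each peak as the list of significant SNPs it contains, and it omits the visual refinement and rating steps entirely. The function names and toy values are hypothetical; only the r² thresholds (0.3 for chromosomes 6 and 9, 0.2 otherwise) come from the caption.

```python
def ld_threshold(chromosome):
    """r^2 cutoff for calling two SNPs linked, per the caption."""
    return 0.3 if chromosome in (6, 9) else 0.2

def max_linked_distance(pos, ld_pairs, threshold):
    """Largest bp distance from `pos` to any SNP linked to it.

    ld_pairs: iterable of (pos_a, pos_b, r2) rows from a pairwise LD table.
    Returns 0 when no SNP passes the threshold.
    """
    dists = [abs(a - b) for a, b, r2 in ld_pairs
             if pos in (a, b) and r2 >= threshold]
    return max(dists, default=0)

def define_peaks(sig_snps):
    """Group significant SNPs into preliminary peaks.

    sig_snps: list of (position, max_dist) for one chromosome.
    Each SNP claims every significant SNP within position +/- max_dist;
    peaks that share at least one SNP are merged.
    """
    positions = sorted(pos for pos, _ in sig_snps)
    peaks = []  # list of disjoint sets of SNP positions
    for pos, max_dist in sorted(sig_snps):
        members = {p for p in positions
                   if pos - max_dist <= p <= pos + max_dist}
        for existing in [pk for pk in peaks if pk & members]:
            members |= existing
            peaks.remove(existing)
        peaks.append(members)
    return sorted(sorted(pk) for pk in peaks)

# Toy example on a hypothetical chromosome 1.
ld_pairs = [(1000, 1150, 0.50), (1000, 1200, 0.25),
            (1000, 3000, 0.05), (5000, 5300, 0.40)]
significant = [1000, 1150, 5000, 9000]
cutoff = ld_threshold(1)
sig_snps = [(p, max_linked_distance(p, ld_pairs, cutoff)) for p in significant]
peaks = define_peaks(sig_snps)
```

Here the SNPs at 1000 and 1150 fall within each other's boundaries and end up in one peak, while the SNPs at 5000 and 9000 each form their own.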
Fig. 4
Manhattan blots (M-blots) for chromosomes 1–10. We designed the Manhattan blot as a new method for viewing the results of a large number of single-variate GWAS, as might be performed to utilize HDP-GWAS data (460 single-variate GWAS in the case of the current study). Each point on an M-blot represents a SNP that was significant in at least one of the 460 GWAS, where the x-axis is the physical position of the SNP in Mb (by chromosome), the left y-axis gives the median of the –log(FDR-corrected p-value) across all GWAS in which the SNP was significant, and the size of the point is proportional to the number of independent GWAS in which the SNP registered as significant (N). These M-blots show the combined results for all trait by treatment by time-point by location combinations. The alternating blue and green vertical lines delineate the physical positions of distinct peaks defined using our peak definition pipeline. The right y-axis gives the value of the red star, i.e., the highest median percent variance explained (PVE) calculated for the SNPs within the interval of each peak
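The two per-SNP quantities plotted on an M-blot (the median –log p-value and N) can be computed with a short aggregation. The sketch below assumes a base-10 logarithm, which the caption does not specify, and a hypothetical input layout of one dictionary of significant SNPs per GWAS; plotting and the PVE axis are left out.

```python
import math
from collections import defaultdict
from statistics import median

def summarize_for_mblot(gwas_hits):
    """Collapse many single-variate GWAS into one record per SNP.

    gwas_hits: iterable of dicts, one per GWAS, mapping SNP id to its
               FDR-corrected p-value; each dict holds only the SNPs that
               were significant in that GWAS.
    Returns {snp: {"median_neg_log_p": float, "n_gwas": int}}.
    """
    scores = defaultdict(list)
    for hits in gwas_hits:
        for snp, p_value in hits.items():
            scores[snp].append(-math.log10(p_value))  # base 10 is assumed
    return {
        snp: {"median_neg_log_p": median(values), "n_gwas": len(values)}
        for snp, values in scores.items()
    }

# Toy input (invented): three GWAS, two SNPs.
gwas_hits = [
    {"S1": 1e-3, "S2": 1e-2},
    {"S1": 1e-5},
    {"S1": 1e-4},
]
summary = summarize_for_mblot(gwas_hits)
```

In a plot, `median_neg_log_p` would drive the left y-axis and `n_gwas` the point size.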
Fig. 5
Average allele effects. Heat maps showing the average allele effects across locations and time-points of significant SNPs on chromosome 1 for B65 (left) and B65 deviation (right, where post-flowering = control – post-flowering data and pre-flowering = control – pre-flowering data), for each treatment. Note that each heat map has its own scale, but in all cases, darker red indicates that the minor allele confers an increase in the trait measurement (i.e., increased drought tolerance), darker blue indicates that the minor allele confers a decrease in the trait measurement (i.e., decreased drought tolerance), and yellow indicates an effect close to or at 0. SNPs between dashed lines are in the same GWAS peak
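The deviation phenotype in this caption is a simple difference, control minus stress treatment, taken per line. A minimal sketch, assuming phenotypes are stored as dictionaries keyed by line name (a hypothetical layout) and that lines missing from either treatment are skipped:

```python
def deviation(control, stress):
    """Per-line deviation phenotype: control value minus stress value.

    control, stress: dicts mapping line name -> trait measurement for the
    same trait, location, and time point. Lines absent from either
    treatment are dropped (an assumption, not stated in the caption).
    """
    shared = control.keys() & stress.keys()
    return {line: control[line] - stress[line] for line in shared}

# Toy values (invented) for one trait at one time point.
control = {"L1": 10.0, "L2": 8.0, "L3": 5.0}
pre_flowering = {"L1": 7.0, "L2": 8.5}
pre_flowering_deviation = deviation(control, pre_flowering)
```

A positive value means the line measured higher under the control than under the stress treatment.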
Fig. 6
Distribution of effect alleles among genetic subgroups. Average percent of genotypes with effect alleles at GWAS loci by genetic subgroup across 1673 SNPs within 213 GWAS peaks. For each genetic subgroup (K = 1–5), for each SNP, the number and percent of genotypes homozygous for the effect (minor) allele were tabulated. The mean, median, range, and interquartile range (IQR) were then calculated for each subgroup. Box whiskers show range, green triangles show mean, lines show median, and box outlines the IQR. The difference between the medians by one-sample ANOVA was highly significant (f = 16.1, p = 3.84E-13). Box for group K = 6 shows statistics across all individuals
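The tabulation described in this caption can be sketched as below. The data layout (effect-allele dosage of 0, 1, or 2 per line and SNP), the function name, and the quartile method are assumptions; the ANOVA step is not reproduced, and missing genotype calls are not handled.

```python
from statistics import mean, median, quantiles

def effect_allele_summary(genotypes, subgroup_of):
    """Percent of lines homozygous for the effect allele, by subgroup.

    genotypes:   dict snp -> dict line -> effect-allele dosage (0, 1, 2).
    subgroup_of: dict line -> subgroup label.
    Returns per-subgroup mean, median, range, and IQR of the per-SNP
    percentages.
    """
    per_group = {}
    for calls in genotypes.values():
        totals, homozygous = {}, {}
        for line, dosage in calls.items():
            group = subgroup_of[line]
            totals[group] = totals.get(group, 0) + 1
            homozygous[group] = homozygous.get(group, 0) + (dosage == 2)
        for group, n in totals.items():
            per_group.setdefault(group, []).append(100.0 * homozygous[group] / n)

    summary = {}
    for group, pcts in per_group.items():
        if len(pcts) >= 2:
            # Quartile convention is an assumption; others give other IQRs.
            q1, _, q3 = quantiles(pcts, n=4, method="inclusive")
            iqr = q3 - q1
        else:
            iqr = 0.0
        summary[group] = {
            "mean": mean(pcts),
            "median": median(pcts),
            "range": (min(pcts), max(pcts)),
            "iqr": iqr,
        }
    return summary

# Toy data (invented): two SNPs, four lines, two subgroups.
subgroup_of = {"A": 1, "B": 1, "C": 2, "D": 2}
genotypes = {
    "snp1": {"A": 2, "B": 2, "C": 0, "D": 2},
    "snp2": {"A": 2, "B": 0, "C": 0, "D": 0},
}
summary = effect_allele_summary(genotypes, subgroup_of)
```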
Fig. 7
Distribution of GWAS peaks. Pie charts showing the proportion of GWAS peaks identified using deviation (DV) versus non-deviation (NDV) phenotype data (a), at location = KARE (UC Kearney) versus location = WREC (Westside Research and Extension) (b), and for all combinations of DV vs NDV at the two locations (c). (a) All peaks were classified as DV and/or NDV, where DV peaks were those identified using the phenotype data calculated as the deviation between either the pre-flowering stress treatment or the post-flowering stress treatment and the control, and the NDV peaks were those identified using the raw phenotype data. The majority of peaks were identified regardless of whether DV or NDV data were used. (c) Each wedge shows the proportion of the peaks that were only identified for a particular combination of data, e.g., the largest proportion of peaks (~20%) were identified only using NDV data from KARE, DV data from WREC and NDV data from WREC
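The wedges in panel c correspond to counting peaks by the exact set of data sources in which each peak was found. A minimal sketch, assuming each peak is annotated with a set of (data type, location) pairs; the peak identifiers and annotations below are invented for illustration.

```python
from collections import Counter

def combination_proportions(peak_sources):
    """Share of peaks found in each exact combination of data sources.

    peak_sources: dict peak id -> set of (data_type, location) tuples,
                  e.g. {("NDV", "KARE"), ("DV", "WREC")}.
    Returns {frozenset of sources: proportion of all peaks}.
    """
    counts = Counter(frozenset(sources) for sources in peak_sources.values())
    total = sum(counts.values())
    return {combo: n / total for combo, n in counts.items()}

# Toy annotations (invented).
peak_sources = {
    "peak1": {("NDV", "KARE"), ("DV", "WREC")},
    "peak2": {("DV", "WREC"), ("NDV", "KARE")},
    "peak3": {("NDV", "KARE")},
    "peak4": {("DV", "KARE"), ("NDV", "WREC")},
}
proportions = combination_proportions(peak_sources)
```

Using `frozenset` keys makes the grouping independent of the order in which sources were recorded for a peak.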
Fig. 8
Distribution of peaks by date and location. Number of DV and NDV peaks identified at each location, by time-point. All peaks were classified as either DV, NDV, or as belonging to both groups, where DV peaks were those identified using the phenotype data calculated as the deviation between either the pre-flowering stress treatment or the post-flowering stress treatment and the control, and the NDV peaks were those identified using the raw phenotype data. Times surrounded by a red box and underlined with a red line are dates where the pre-flowering drought stress plots received no irrigation, and all other plots were irrigated. During the times not surrounded by the red boxes, the post-flowering drought stress plots received no irrigation, while all other plots were irrigated
