Mol Ecol Resour. 2021 Jul;21(5):1460-1474.
doi: 10.1111/1755-0998.13351. Epub 2021 Mar 9.

Regarding the F-word: The effects of data filtering on inferred genotype-environment associations


Collin W Ahrens et al. Mol Ecol Resour. 2021 Jul.

Abstract

Genotype-environment association (GEA) methods have become part of the standard landscape genomics toolkit, yet we know little about how best to filter genotype-by-sequencing data to provide robust inferences of environmental adaptation. In many cases, default filtering thresholds for minor allele frequency and missing data are applied regardless of sample size, with unknown impacts on the results and potentially negative consequences for management strategies. Here, we investigate the effects of filtering on GEA results and the potential implications for the assessment of adaptation to environment. We use empirical and simulated data sets derived from two widespread tree species to assess the effects of filtering on GEA outputs. Critically, we find that the level of filtering for missing data and minor allele frequency affects the identification of true positives. Even slight adjustments to these thresholds can change the rate of true positive detection. Using conservative thresholds for missing data and minor allele frequency substantially reduces the size of the data set, lessening the power to detect adaptive variants (i.e., simulated true positives) under both strong and weak selection. Regardless, strength of selection was a good predictor of GEA detection, but even some SNPs under strong selection went undetected. False positive rates varied depending on the species and GEA method, and filtering significantly affected the predictions of adaptive capacity in downstream analyses. We make several recommendations regarding filtering for GEA methods. Ultimately, there is no filtering panacea, but some choices are better than others, depending on the study system, availability of genomic resources, and desired objectives.

Keywords: Eucalyptus; GEA; SNP analysis; climate adaptation; genome sequencing; genomic simulation; reduced representation.
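
To make the two filters discussed in the abstract concrete, the following is a minimal sketch of per-SNP filtering on missing data and minor allele frequency (MAF). It is not the authors' pipeline: the function names, the toy genotype matrix, and the threshold values are all hypothetical, and genotypes are assumed to be diploid calls coded as 0, 1, or 2 copies of the alternate allele, with `None` for a missing call.

```python
from typing import List, Optional, Sequence

# Assumed coding: 0, 1, or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def missing_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of individuals with no genotype call at this SNP."""
    return sum(c is None for c in calls) / len(calls)


def minor_allele_frequency(calls: Sequence[Genotype]) -> float:
    """Minor allele frequency computed from the non-missing diploid calls."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1.0 - alt_freq)


def filter_snps(
    genotypes: Sequence[Sequence[Genotype]],
    max_missing: float,
    min_maf: float,
) -> List[int]:
    """Return the indices of SNPs (rows) that pass both thresholds."""
    return [
        i
        for i, calls in enumerate(genotypes)
        if missing_rate(calls) <= max_missing
        and minor_allele_frequency(calls) >= min_maf
    ]


# Hypothetical toy data: 4 SNPs (rows) scored in 10 individuals (columns).
example_snps = [
    [0, 1, 2, 1, 0, 1, 0, 2, 1, 0],              # common allele, no missing calls
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],              # rare allele (MAF = 0.05)
    [0, 1, None, None, None, 1, 0, 2, None, 1],  # 40% missing calls
    [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],              # monomorphic (MAF = 0)
]

# Illustrative thresholds only; these are not values recommended by the paper.
strict = filter_snps(example_snps, max_missing=0.2, min_maf=0.1)
relaxed = filter_snps(example_snps, max_missing=0.5, min_maf=0.01)
print(strict, relaxed)
```

In this toy example the stricter thresholds retain only the first SNP, while the relaxed thresholds also retain the rare-allele and high-missingness SNPs. This mirrors, on a very small scale, the abstract's point that conservative thresholds shrink the data set and can discard candidate loci.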


