Review
Front Big Data. 2021 Oct 18:4:734990.
doi: 10.3389/fdata.2021.734990. eCollection 2021.

Testing a Generalizable Machine Learning Workflow for Aquatic Invasive Species on Rainbow Trout (Oncorhynchus mykiss) in Northwest Montana

S Carter et al. Front Big Data. 2021.

Abstract

Biological invasions are accelerating worldwide, causing major ecological and economic impacts in aquatic ecosystems. The urgent decision-making needs of invasive species managers can be better met by the integration of biodiversity big data with large-domain models and data-driven products. Remotely sensed data products can be combined with existing invasive species occurrence data via machine learning models to provide the proactive spatial risk analysis necessary for implementing coordinated and agile management paradigms across large scales. We present a workflow that generates rapid spatial risk assessments on aquatic invasive species using occurrence data, spatially explicit environmental data, and an ensemble approach to species distribution modeling using five machine learning algorithms. For proof of concept and validation, we tested this workflow using extensive spatial and temporal hybridization and occurrence data from a well-studied, ongoing, and climate-driven species invasion in the upper Flathead River system in northwestern Montana, USA. Rainbow Trout (RBT; Oncorhynchus mykiss), an introduced species in the Flathead River basin, compete and readily hybridize with native Westslope Cutthroat Trout (WCT; O. clarkii lewisii), and the spread of RBT individuals and their alleles has been tracked for decades. We used remotely sensed and other geospatial data as key environmental predictors for projecting resultant habitat suitability to geographic space. The ensemble modeling technique yielded high-accuracy predictions, with an 87% accuracy score under 30-fold cross-validation. Both top predictors and model performance relative to these predictors matched current understanding of the drivers of RBT invasion and habitat suitability, indicating that temperature is a major factor influencing the spread of invasive RBT and hybridization with native WCT. The congruence between more time-consuming modeling approaches and our rapid machine-learning approach suggests that this workflow could be applied more broadly to provide data-driven management information for early detection of potential invaders.

Keywords: big data analytics; early detection and rapid response; invasive species; machine learning; remote sensing; species distribution modeling.
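
The abstract describes an ensemble of machine learning algorithms scored by 30-fold cross-validation. The sketch below illustrates that general pattern only; it is not the authors' workflow. The abstract does not name the five algorithms, so three deliberately simple stand-in learners are used, and the data, predictor names (`temp`, `flow`), and all function names are hypothetical.

```python
import random
from statistics import mean


def make_records(n=300, seed=0):
    """Synthetic presence/absence records; NOT the study's data."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        temp = rng.uniform(5.0, 20.0)  # hypothetical temperature predictor
        flow = rng.uniform(0.0, 1.0)   # hypothetical uninformative predictor
        X.append((temp, flow))
        y.append(1 if temp + rng.gauss(0.0, 1.5) > 12.0 else 0)
    return X, y


def dist2(a, b):
    """Squared Euclidean distance between two predictor tuples."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def fit_centroid(X, y):
    """Stand-in learner 1: relative distance to the two class centroids.
    Assumes both classes occur in the training fold."""
    cents = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = tuple(mean(col) for col in zip(*rows))

    def predict(x):
        d0, d1 = dist2(x, cents[0]), dist2(x, cents[1])
        return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)
    return predict


def fit_knn(X, y, k=7):
    """Stand-in learner 2: presence rate among the k nearest training records."""
    def predict(x):
        nearest = sorted(zip(X, y), key=lambda r: dist2(x, r[0]))[:k]
        return mean(t for _, t in nearest)
    return predict


def fit_binned_rate(X, y, feature=0, bins=6):
    """Stand-in learner 3: presence rate within equal-width bins of one predictor."""
    vals = [x[feature] for x in X]
    lo, width = min(vals), (max(vals) - min(vals)) / bins

    def bin_of(v):
        return max(0, min(bins - 1, int((v - lo) / width)))

    hits = [[] for _ in range(bins)]
    for v, t in zip(vals, y):
        hits[bin_of(v)].append(t)
    overall = mean(y)
    rates = [mean(h) if h else overall for h in hits]
    return lambda x: rates[bin_of(x[feature])]


LEARNERS = (fit_centroid, fit_knn, fit_binned_rate)


def cross_validated_accuracy(X, y, k=30, seed=1):
    """Mean accuracy of the averaged ensemble over k held-out folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    fold_scores = []
    for f in range(k):
        test = idx[f::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        Xtr, ytr = [X[i] for i in train], [y[i] for i in train]
        models = [fit(Xtr, ytr) for fit in LEARNERS]
        correct = 0
        for i in test:
            score = mean(m(X[i]) for m in models)  # unweighted ensemble average
            correct += int((score >= 0.5) == bool(y[i]))
        fold_scores.append(correct / len(test))
    return mean(fold_scores)


X, y = make_records()
accuracy = cross_validated_accuracy(X, y, k=30)
print(f"30-fold cross-validated accuracy: {accuracy:.2f}")
```

Averaging the learners' scores before thresholding is one common way to combine models; a weighted average or majority vote would be equally plausible, and the abstract does not say which the study used.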

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Overview of study area, including a sample data product (Land Surface Temperature) aggregated by hydrologic units.
FIGURE 2
Overall workflow, model inputs, and model outputs. Yellow box indicates model inputs. Green boxes indicate steps as referenced in the methods. Purple box indicates each model output. RBT presence and absence observation locations are denoted by respective red and blue points on the associated study area maps.
FIGURE 3
Predicted RBT hybridization for the second decade (2010–2018) composite, with dimensionless hybridization levels ranging from low (0) to high (1); black lines delineate individual HUCs within the larger study basin.
FIGURE 4
Top predictors of RBT occurrence as identified by the occurrence model.
FIGURE 5
(A) Predicted RBT relative suitability of first decade (2002–2010) and (B) second decade (2010–2018) vector composites within the Flathead basin study region; black lines delineate individual HUCs within the larger basin.
FIGURE 6
Normalized predicted relative RBT suitability change between the second and first decades of the study period (2002–2018) within the Flathead basin. The Salish Mountains and Lewis Range sub-regions decreased in suitability (blue-green shades; blue arrow), while suitability marginally increased in other regions and increased more drastically in portions of the northern Mission Range and east Glacier National Park regions (red shades; pink arrow).
FIGURE 7
Multivariate Environmental Similarity Surface in the Flathead basin for the second decade (2010–2018) vector composite, which was consistent with the first decade (2002–2010) composite. Greener shades in the similarity surface indicate that most HUCs fall within a reasonable extrapolation distance from RBT training locations.
FIGURE 8
Partial dependency plots for surface water flashiness in both the RBT occurrence ensemble (A) and the hybridization ensemble (B) models.
FIGURE 9
Partial dependency plots for Heat Insolation Load in both the RBT occurrence ensemble (A) and the hybridization ensemble (B) models.
FIGURE 10
Partial dependency plot showing RBT occurrence model performance against stream-temperature-adjusted Land Surface Temperature in the Flathead River basin (A) versus predicted water temperature (wtemp) niche requirements of RBT (B) from Wenger et al. (2011).
FIGURE 11
Partial dependency plot showing RBT occurrence model performance against Gross Primary Productivity in the Flathead basin study region.
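
Figure 7 reports a Multivariate Environmental Similarity Surface (MESS). The page does not say how it was computed, so the following is a minimal sketch of the commonly used MESS definition, applied to made-up values. The function names and the two predictors are hypothetical. For each predictor, a site is scored by where its value falls within the training (reference) values, and the site's MESS is the lowest of those per-predictor scores. Negative values flag extrapolation beyond the training range.

```python
from bisect import bisect_left


def variable_similarity(p, ref_sorted):
    """Similarity of value p to one predictor's sorted reference values."""
    lo, hi = ref_sorted[0], ref_sorted[-1]
    if hi == lo:
        raise ValueError("reference values must vary")
    # f: percent of reference values strictly below p
    f = 100.0 * bisect_left(ref_sorted, p) / len(ref_sorted)
    if f == 0:
        return 100.0 * (p - lo) / (hi - lo)  # at or below the minimum: <= 0
    if f <= 50:
        return 2.0 * f
    if f < 100:
        return 2.0 * (100.0 - f)
    return 100.0 * (hi - p) / (hi - lo)      # above the maximum: < 0


def mess(site, reference):
    """MESS for one site: the minimum similarity across all predictors.

    site: tuple of predictor values; reference: list of such tuples
    (for example, predictor values at training locations)."""
    columns = [sorted(col) for col in zip(*reference)]
    return min(variable_similarity(p, col) for p, col in zip(site, columns))


# Hypothetical reference set: (temperature, flow) at five training locations
reference = [(8, 0.1), (10, 0.2), (12, 0.3), (14, 0.4), (16, 0.5)]
print(mess((12, 0.3), reference))  # inside the training range: positive
print(mess((20, 0.3), reference))  # temperature above the range: negative
```

Taking the minimum means a single out-of-range predictor is enough to mark a site as extrapolated, which is why a MESS map is usually read alongside the suitability maps rather than in place of them.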

References

    1. Abatzoglou J. T. (2013). Development of gridded surface meteorological data for ecological applications and modelling. Int. J. Climatol. 33, 121–131. doi: 10.1002/joc.3413
    2. Allendorf F. W., Leary R. F., Hitt N. P., Knudsen K. L., Lundquist L. L., Spruell P. (2004). Intercrosses and the U.S. Endangered Species Act: Should Hybridized Populations be Included as Westslope Cutthroat Trout? Conservation Biol. 18, 1203–1213. doi: 10.1111/j.1523-1739.2004.00305.x
    3. Altmann A., Toloşi L., Sander O., Lengauer T. (2010). Permutation importance: a corrected feature importance measure. Bioinformatics 26, 1340–1347. doi: 10.1093/bioinformatics/btq134
    4. Bayliss H., Stewart G., Wilcox A., Randall N. (2013). A perceived gap between invasive species research and stakeholder priorities. NeoBiota 19, 67–82. doi: 10.3897/neobiota.19.4897
    5. Bear E. A., McMahon T. E., Zale A. V. (2007). Comparative Thermal Requirements of Westslope Cutthroat Trout and Rainbow Trout: Implications for Species Interactions and Development of Thermal Protection Standards. Trans. Am. Fish. Soc. 136, 1113–1121. doi: 10.1577/t06-072.1