PeerJ. 2023 Jun 13;11:e15425.
doi: 10.7717/peerj.15425. eCollection 2023.

Comparing quantile regression spline analyses and supervised machine learning for environmental quality assessment at coastal marine aquaculture installations

Kleopatra Leontidou et al. PeerJ.

Abstract

Organic enrichment associated with marine finfish aquaculture is a local stressor of marine coastal ecosystems. To maintain ecosystem services, the implementation of biomonitoring programs focusing on benthic diversity is required. Traditionally, impact indices are determined by extracting and identifying benthic macroinvertebrates from samples. However, this is a time-consuming and expensive method with low upscaling potential. A more rapid, inexpensive, and robust method to infer the environmental quality of marine environments is eDNA metabarcoding of bacterial communities. To infer the environmental quality of coastal habitats from metabarcoding data, two taxonomy-free approaches have been successfully applied for different geographical regions and monitoring goals, namely quantile regression splines (QRS) and supervised machine learning (SML). However, their comparative performance remains untested for monitoring the impact of organic enrichment introduced by aquaculture on marine coastal environments. We compared the performance of QRS and SML using bacterial metabarcoding data to infer the environmental quality of 230 aquaculture samples collected from seven farms in Norway and seven farms in Scotland along an organic enrichment gradient. As a measure of environmental quality, we used the Infaunal Quality Index (IQI) calculated from benthic macrofauna data (reference index). The QRS analysis plotted the abundance of amplicon sequence variants (ASVs) as a function of the IQI; ASVs with a defined abundance peak were then assigned to eco-groups, and a molecular IQI was subsequently calculated. In contrast, the SML approach built a random forest model to directly predict the macrofauna-based IQI. Our results show that both QRS and SML perform well in inferring the environmental quality, with 89% and 90% accuracy, respectively. For both geographic regions, there was high correspondence between the reference IQI and both inferred molecular IQIs (p < 0.001), with the SML model showing a higher coefficient of determination than QRS. Among the 20 most important ASVs identified by the SML approach, 15 were congruent with the good-quality spline ASV indicators identified via QRS for both Norwegian and Scottish salmon farms. More research on the response of the ASVs to organic enrichment and the co-influence of other environmental parameters is necessary to eventually select the most powerful stressor-specific indicators. Even though both approaches are promising for inferring environmental quality from metabarcoding data, SML proved to be more powerful in handling the natural variability. To improve the SML model, new samples still need to be added, because this can reduce the background noise introduced by high spatio-temporal variability. Overall, we recommend the development of a powerful SML approach that will subsequently be applied for monitoring the impact of aquaculture on marine ecosystems based on eDNA metabarcoding data.
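The eco-group assignment step of the QRS approach can be illustrated with a minimal sketch. It is not the authors' implementation: instead of fitting quantile regression splines, it uses a crude stand-in that bins samples along the IQI gradient and takes the bin with the highest upper-quantile abundance as the peak. The function names, the number of bins, the 0.9 quantile, and the IQI boundaries between eco-groups are all assumptions made for illustration, and the sketch does not check whether an ASV has a well-defined peak at all.

```python
# Hypothetical lower IQI bounds for each eco-group (an assumption, not taken
# from the abstract). Eco-Group I (very sensitive taxa) peaks at high IQI,
# Eco-Group V (opportunistic taxa) at low IQI.
ECO_GROUP_LOWER_BOUNDS = [(0.75, "I"), (0.64, "II"), (0.44, "III"), (0.24, "IV"), (0.0, "V")]


def upper_quantile(values, q=0.9):
    """Return a simple empirical quantile (no interpolation)."""
    ordered = sorted(values)
    return ordered[int(q * (len(ordered) - 1))]


def peak_iqi(iqi, abundance, n_bins=5, q=0.9):
    """Estimate the IQI at which an ASV's upper-quantile abundance peaks.

    Binned stand-in for a fitted quantile regression spline: samples are
    grouped into equal-width IQI bins and the mid-point of the bin with the
    highest upper-quantile abundance is returned.
    """
    lo, hi = min(iqi), max(iqi)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width if all IQI values are equal
    bins = [[] for _ in range(n_bins)]
    for x, y in zip(iqi, abundance):
        idx = min(int((x - lo) / width), n_bins - 1)
        bins[idx].append(y)
    occupied = [i for i in range(n_bins) if bins[i]]
    best = max(occupied, key=lambda i: upper_quantile(bins[i], q))
    return lo + (best + 0.5) * width


def eco_group(peak):
    """Map a peak position on the IQI axis to an eco-group label."""
    for lower, group in ECO_GROUP_LOWER_BOUNDS:
        if peak >= lower:
            return group
    return "V"


# Made-up example: an ASV that is abundant only in high-IQI samples.
sample_iqi = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
sensitive_asv = [0, 0, 0, 1, 2, 5, 40, 60]
print(eco_group(peak_iqi(sample_iqi, sensitive_asv)))
```

In a real analysis the peak would come from the fitted spline and only ASVs with a clear peak would be kept as indicators; the binning here only conveys the idea of locating an abundance optimum along the gradient.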

Keywords: Bacterial indicators; Benthic monitoring; Organic enrichment; Quantile regression splines; Salmon farms; Supervised machine learning; eDNA metabarcoding.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1. Number of indicators assigned to each eco-group in (A) Norway (n = 138 samples) and (B) Scotland (n = 92 samples), with Eco-Group I corresponding to very sensitive taxa and Eco-Group V to opportunistic ones.
Figure 2. Top 20 ASVs with the highest importance values assigned by random forest (RF) for (A) Norway and (B) Scotland. ASVs shown in grey were also identified as indicators by quantile regression splines (QRS).
Figure 3. Linear regression plots showing the relationship between the infaunal quality index (IQI) and the molecular IQI as estimated by quantile regression splines (QRS) and random forest (RF) for (A) Norway and (B) Scotland salmon farms. The boxes indicate the two environmental quality categories to which the IQI assigns the samples (i.e., blue for very good to good environmental quality and grey for moderate to poor environmental quality). Samples that fall inside the boxes were accurately predicted by the molecular IQI. The regression equation and the corresponding R² value are given for each regression plot.
Figure 4. Samples erroneously predicted by quantile regression splines (QRS), random forest (RF) and both methods (RF+QRS) for (A) Norway and (B) Scotland salmon farms. The vertical dotted line corresponds to the IQI threshold set to the 0.64 IQIMA good/moderate threshold.
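
The two evaluation measures behind Figures 3 and 4 can be sketched as follows, assuming that a sample counts as accurately predicted when the reference IQI and the molecular IQI fall on the same side of the 0.64 good/moderate threshold, and that R² is that of a simple linear regression between the two indices. The function names and the choice to treat a value of exactly 0.64 as "good" are assumptions for illustration, not details confirmed by the captions.

```python
GOOD_MODERATE_THRESHOLD = 0.64  # threshold quoted in the Figure 4 caption


def status(iqi):
    """Two-category status; treating exactly 0.64 as 'good' is an assumption."""
    return "very good to good" if iqi >= GOOD_MODERATE_THRESHOLD else "moderate to poor"


def classification_accuracy(reference, predicted):
    """Share of samples whose reference and molecular IQI share a category."""
    hits = sum(status(r) == status(p) for r, p in zip(reference, predicted))
    return hits / len(reference)


def r_squared(reference, predicted):
    """R^2 of a simple linear regression, i.e. the squared Pearson correlation."""
    n = len(reference)
    mean_x = sum(reference) / n
    mean_y = sum(predicted) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(reference, predicted))
    sxx = sum((x - mean_x) ** 2 for x in reference)
    syy = sum((y - mean_y) ** 2 for y in predicted)
    return sxy * sxy / (sxx * syy)


# Made-up values for four samples, purely for illustration.
reference_iqi = [0.80, 0.70, 0.60, 0.30]
molecular_iqi = [0.75, 0.60, 0.50, 0.40]
print(classification_accuracy(reference_iqi, molecular_iqi))
print(r_squared(reference_iqi, molecular_iqi))
```

The second sample in the made-up data sits above the threshold in the reference index but below it in the molecular index, which is the kind of erroneous prediction Figure 4 plots against the threshold line.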
