Sci Total Environ. 2024 Feb 20:912:168872.
doi: 10.1016/j.scitotenv.2023.168872. Epub 2023 Nov 25.

Setting nutrient boundaries to protect aquatic communities: The importance of comparing observed and predicted classifications using measures derived from a confusion matrix

Free article

Geoff Phillips et al. Sci Total Environ.

Abstract

Defining nutrient thresholds that protect and support the ecological integrity of aquatic ecosystems is a fundamental step in maintaining their natural biodiversity and preserving their resilience. With increasing catchment pressures and climate change, it is more important than ever to develop clear methods to establish thresholds for status classification and management of waters. This must often be achieved using complex data and should be robust to interference from additional pressures as well as ameliorating or confounding conditions. We use both artificial and real data to examine challenges in setting nutrient thresholds in unbalanced and skewed data. We found significant advantages to using binary logistic regression over other techniques. However, one of the key challenges is objectively selecting a probability from which to derive the nutrient threshold. For this purpose, the examination of the proportions of matching and mismatching status classifications of nutrients and a biological quality element using a confusion matrix is a key step that should be more widely adopted in threshold selection. We examined a large array of statistical measures of classification accuracy and their performance over combinations of skewness and imbalance in the data. The most appropriate threshold probability is a compromise between maximising overall classification accuracy and reducing mismatches expressed as commission (false positives) without excessive omission (false negatives). An application to a lake type indicated total phosphorus thresholds that would be around 50 μg l⁻¹ lower than the threshold achieved by an 'unguided' approach, indicating that this approach is a very significant development meriting attention from national authorities responsible for water management.

Keywords: Accuracy; Binary logistic regression; Categorical models; Classification measures; Lakes; Nutrient standards; Rivers; WFD.
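
The idea in the abstract, deriving a nutrient threshold from a chosen probability and then checking it against a confusion matrix, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the model form logit(P) = b0 + b1 * log10(TP), the coefficients, the eight data points, the convention that a "positive" is a site where the nutrient exceeds the threshold, and the choice to express commission and omission as proportions of all sites.

```python
import math


def threshold_from_logistic(b0, b1, p):
    """Nutrient concentration at which the modelled probability equals p.

    Assumes a (hypothetical) fitted model of the form
    logit(P) = b0 + b1 * log10(nutrient).
    """
    logit = math.log(p / (1.0 - p))
    return 10 ** ((logit - b0) / b1)


def confusion_measures(nutrient, bio_fail, threshold):
    """Compare nutrient-based and biology-based classifications.

    nutrient : nutrient concentrations, one per site
    bio_fail : 1 if the biological quality element fails, else 0
    A site is predicted to fail when its nutrient value exceeds threshold.
    """
    tp = fp = tn = fn = 0
    for x, fail in zip(nutrient, bio_fail):
        predicted_fail = x > threshold
        if predicted_fail and fail:
            tp += 1  # both classifications fail
        elif predicted_fail and not fail:
            fp += 1  # commission: nutrient fails, biology passes
        elif not predicted_fail and fail:
            fn += 1  # omission: nutrient passes, biology fails
        else:
            tn += 1  # both classifications pass
    n = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / n,
        "commission": fp / n,
        "omission": fn / n,
    }


# Illustrative values only (not data or coefficients from the paper).
tp_ug_l = [10, 20, 30, 45, 60, 80, 120, 200]
bio_fail = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = -6.0, 3.0

for p in (0.25, 0.5, 0.75):
    t = threshold_from_logistic(b0, b1, p)
    print(p, round(t, 1), confusion_measures(tp_ug_l, bio_fail, t))
```

Scanning several candidate probabilities in this way shows the trade-off the abstract describes: with a positive slope, lowering the probability lowers the threshold, which tends to reduce omission at the cost of more commission.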

Conflict of interest statement

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.