Front Big Data. 2025 Jan 15:7:1469809.
doi: 10.3389/fdata.2024.1469809. eCollection 2024.

Analysis and prediction of atmospheric ozone concentrations using machine learning

Stephan Räss et al. Front Big Data.

Abstract

Atmospheric ozone chemistry involves many substances and reactions, which makes it a complex system. We analyzed data recorded by Switzerland's National Air Pollution Monitoring Network (NABEL) to showcase the capabilities of machine learning (ML) for predicting daily average ozone concentrations and to document a general approach that can be followed by anyone facing similar problems. We evaluated various artificial neural networks and compared them with linear and non-linear models derived with ML. The main analyses and the training of the models were performed on atmospheric air data recorded from 2016 to 2023 at the NABEL station Lugano-Università in Lugano, TI, Switzerland. As a first step, we used techniques such as best subset selection to determine the measurement parameters that might be relevant for predicting ozone concentrations; in general, the parameters identified by these methods agree with atmospheric ozone chemistry. Based on these results, we constructed various models and used them to predict ozone concentrations in Lugano for the period between January 1, 2024, and March 31, 2024. We then compared the output of our models with the actual measurements and repeated this procedure for two NABEL stations in northern Switzerland (Dübendorf-Empa and Zürich-Kaserne). For these stations, predictions were made for the aforementioned period and for the period between January 1, 2023, and December 31, 2023. In most cases, the lowest mean absolute errors (MAE) were provided by a non-linear model with 12 components (different powers and linear combinations of NO₂, NOₓ, SO₂, non-methane volatile organic compounds, temperature, and radiation); the MAE of predicted ozone concentrations in Lugano was as low as 9 μg m⁻³. For the stations in Zürich and Dübendorf, the lowest MAEs were around 11 μg m⁻³ and 13 μg m⁻³, respectively. For the tested periods, the accuracy of the best models was approximately 1 μg m⁻³. Since the aforementioned values are all lower than the standard deviations of the observations, we conclude that using ML for complex data analyses can be very helpful and that artificial neural networks do not necessarily outperform simpler models.
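
The comparison the abstract draws between the MAE and the standard deviation of the observations can be illustrated with a minimal sketch. It is not the authors' code: the function names and the five data points are hypothetical, and it assumes the sample standard deviation, since the abstract does not say which variant was used.

```python
from statistics import mean, stdev


def mean_absolute_error(observed, predicted):
    """Average absolute difference between paired observations and predictions."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one observation is required")
    return mean(abs(o - p) for o, p in zip(observed, predicted))


def mae_below_spread(observed, predicted):
    """True if the MAE is smaller than the sample standard deviation of the observations."""
    return mean_absolute_error(observed, predicted) < stdev(observed)


# Made-up daily averages in μg m⁻³, for illustration only (not NABEL data).
observed = [40.0, 55.0, 62.0, 48.0, 70.0]
predicted = [45.0, 50.0, 60.0, 52.0, 66.0]

mae = mean_absolute_error(observed, predicted)
print(f"MAE: {mae:.1f}")
print(f"Standard deviation of observations: {stdev(observed):.1f}")
print(f"MAE below spread: {mae_below_spread(observed, predicted)}")
```

A model whose MAE exceeds the spread of the observations would do no better than always predicting the mean, which is why this comparison is a useful sanity check.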

Keywords: air pollution monitoring; Keras; artificial neural networks; atmospheric ozone; data analysis; machine learning; multilayer perceptron.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Daily averages of ozone concentrations recorded at the NABEL station in Lugano between January 1, 2016, and December 31, 2023.
Figure 2
Correlations of daily averages of all measurement parameters recorded at the NABEL station in Lugano between January 1, 2023, and December 31, 2023. The parameters are abbreviated as indicated in Section 1.
Figure 3
Ozone concentrations and residuals (absolute differences between predictions and targets). Predictions were made for the period between January 1, 2024, and October 31, 2024, for the NABEL station in Lugano using (A) Model 8 (recurrent neural network) and (B) Model 4 (non-linear model with 12 predictors). For model training, data recorded between January 1, 2016, and December 31, 2023, at the same station were used. The 50-day averages of the residuals were calculated using the 50 days preceding the time at which they are plotted.
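
The residuals and their trailing averages described in the Figure 3 caption could be computed along the following lines. This is a sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, and the window is assumed to end at (and include) the day being plotted, which the caption does not specify. The demonstration uses a window of 3 instead of 50 to keep the numbers readable.

```python
def residuals(observed, predicted):
    """Absolute differences between predictions and targets."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return [abs(o - p) for o, p in zip(observed, predicted)]


def trailing_mean(values, window):
    """Mean of the `window` values ending at each index (inclusive).

    Positions with fewer than `window` values available are set to None.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # Drop the value that has just left the window.
            running_sum -= values[i - window]
        averages.append(running_sum / window if i >= window - 1 else None)
    return averages


# Made-up values for illustration only (not NABEL data).
observed = [50.0, 52.0, 47.0, 60.0, 58.0]
predicted = [49.0, 50.0, 50.0, 56.0, 53.0]

res = residuals(observed, predicted)
smoothed = trailing_mean(res, window=3)
print(res)
print(smoothed)
```

With daily data, calling `trailing_mean(res, window=50)` would leave the first 49 days without a value, so the smoothed curve starts later than the raw residuals.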