2025 Sep 27. doi: 10.1021/acs.est.5c09687. Online ahead of print.

Advancing Air Pollution Exposure Models with Open-Vocabulary Object Detection and Semantic Segmentation of Street-View Images

Zhendong Yuan et al. Environ Sci Technol.

Abstract

Mobile monitoring campaigns combined with land use regression (LUR) models effectively capture fine-scale spatial variations in urban air pollution. However, traditional predictor variables often fail to capture the nuances of the built environment and undocumented emission sources. To address this, we developed a framework integrating customizable object-level and segmentation-level visual features from street-view images into stepwise regression and random-forest-based LUR models. Using 5.7 million mobile air pollution measurements (2019-2020) and 0.37 million street-view images (2008-2024), we mapped nitrogen dioxide (NO₂), black carbon (BC), and ultrafine particles (UFP) across 46,664 road segments in Amsterdam, The Netherlands. Incorporating street-view images improved model performance, increasing R² by 0.01-0.05 and reducing mean absolute errors by 0.7-10.3%. Sensitivity analyses indicated that key street-view-derived visual features remained stable across years and seasons. Using images from nearby years expanded training instances, thereby enhancing alignment with mobile measurements at fine granularity. Our open-vocabulary object detection module identified influential but previously unrecognized object predictors, such as chimneys, traffic lights, and shops. Combined with segmentation-derived features (e.g., walls, roads, grass), street-view images contributed 8-18% feature importance to model predictions. These findings highlight the potential of visual data in enhancing hyperlocal air pollution mapping and exposure assessment.

Keywords: air pollution; deep learning; exposure assessment; land use regression (LUR); mobile sensing; street-view image; vision-language model (VLM); vision-transformer models (ViT).
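
The abstract reports three kinds of summary numbers: a gain in R², a relative reduction in mean absolute error (MAE), and the share of feature importance attributable to street-view features. The sketch below shows one plausible way such quantities could be computed. It is illustrative only: the abstract does not state the exact evaluation protocol, all values are made up and do not reproduce the paper's results, and the function names, feature names, and the `sv_` prefix for street-view features are hypothetical.

```python
# Illustrative sketch only: toy values, hypothetical names, not the authors' code.

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observations and predictions."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def relative_mae_reduction(mae_baseline, mae_augmented):
    """Percentage by which the augmented model lowers MAE versus the baseline."""
    return 100.0 * (mae_baseline - mae_augmented) / mae_baseline


def group_importance_share(importances, prefix):
    """Percentage of total feature importance held by features with a given prefix."""
    total = sum(importances.values())
    group = sum(v for name, v in importances.items() if name.startswith(prefix))
    return 100.0 * group / total


# Made-up concentrations for four road segments (arbitrary units).
observed = [20.0, 30.0, 40.0, 50.0]
baseline_pred = [24.0, 28.0, 44.0, 46.0]    # hypothetical LUR without street-view features
augmented_pred = [22.0, 29.0, 42.0, 48.0]   # hypothetical LUR with street-view features

r2_gain = r_squared(observed, augmented_pred) - r_squared(observed, baseline_pred)
mae_drop = relative_mae_reduction(
    mean_absolute_error(observed, baseline_pred),
    mean_absolute_error(observed, augmented_pred),
)

# Made-up importances; the "sv_" prefix marks assumed street-view-derived features.
importances = {
    "traffic_intensity": 0.5,
    "population_density": 0.375,
    "sv_chimney": 0.0625,
    "sv_wall": 0.0625,
}
sv_share = group_importance_share(importances, "sv_")

print(f"R2 gain: {r2_gain:.3f}")
print(f"MAE reduction: {mae_drop:.1f}%")
print(f"Street-view importance share: {sv_share:.1f}%")
```

In practice these metrics would typically be evaluated on held-out data for each pollutant and model type, which is an assumption here rather than something the abstract specifies.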
