Advancing Air Pollution Exposure Models with Open-Vocabulary Object Detection and Semantic Segmentation of Street-View Images
- PMID: 41014621
- PMCID: PMC12509304
- DOI: 10.1021/acs.est.5c09687
Abstract
Mobile monitoring campaigns combined with land use regression (LUR) models effectively capture fine-scale spatial variations in urban air pollution. However, traditional predictor variables often fail to capture the nuances of the built environment and undocumented emission sources. To address this, we developed a framework integrating customizable object-level and segmentation-level visual features from street-view images into stepwise regression and random-forest-based LUR models. Using 5.7 million mobile air pollution measurements (2019-2020) and 0.37 million street-view images (2008-2024), we mapped nitrogen dioxide (NO2), black carbon (BC), and ultrafine particles (UFP) across 46,664 road segments in Amsterdam, The Netherlands. Incorporating street-view images improved model performance, increasing R2 by 0.01-0.05 and reducing mean absolute errors by 0.7-10.3%. Sensitivity analyses indicated that key street-view-derived visual features remained stable across years and seasons. Using images from nearby years expanded training instances, thereby enhancing alignment with mobile measurements at fine granularity. Our open-vocabulary object detection module identified influential but previously unrecognized object predictors, such as chimneys, traffic lights, and shops. Combined with segmentation-derived features (e.g., walls, roads, grass), street-view images contributed 8-18% feature importance to model predictions. These findings highlight the potential of visual data in enhancing hyperlocal air pollution mapping and exposure assessment.
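The segmentation-level features mentioned above (e.g., walls, roads, grass) are typically per-class pixel fractions computed from a semantic-segmentation label mask of each street-view image. A minimal sketch of that feature-extraction step is below; the class IDs, the `pixel_fractions` helper, and the toy mask are illustrative assumptions, not the authors' actual segmentation model or label scheme.

```python
import numpy as np

# Illustrative class-id mapping; real segmentation models define their own.
CLASS_IDS = {"wall": 0, "road": 1, "grass": 2, "sky": 3}

def pixel_fractions(mask: np.ndarray, class_ids: dict) -> dict:
    """Fraction of pixels in a label mask belonging to each named class."""
    total = mask.size
    return {name: float((mask == cid).sum()) / total
            for name, cid in class_ids.items()}

# Toy 4x4 label mask: 4 wall px, 8 road px, 2 grass px, 2 sky px.
mask = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [2, 2, 3, 3],
])
feats = pixel_fractions(mask, CLASS_IDS)
print(feats)  # {'wall': 0.25, 'road': 0.5, 'grass': 0.125, 'sky': 0.125}
```

Each road segment's images would yield one such feature vector (averaged over its images), which then enters the LUR model alongside traditional predictors.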
Keywords: air pollution; deep learning; exposure assessment; land use regression (LUR); mobile sensing; street-view image; vision-language model (VLM); vision-transformer models (ViT).
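To make the modeling step concrete, the sketch below fits a random-forest LUR model on traditional predictors plus street-view-derived object counts and segmentation fractions, then reads off the share of feature importance attributable to the street-view features (the abstract reports 8-18%). All feature names, the synthetic data, and the coefficients generating the toy NO2 target are hypothetical stand-ins, not the paper's dataset or fitted model.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
n_segments = 500  # stand-in for road segments

# Traditional LUR predictors plus street-view features:
# object counts (detector) and pixel fractions (segmentation).
X = pd.DataFrame({
    "traffic_intensity":   rng.gamma(2.0, 500.0, n_segments),
    "dist_major_road_m":   rng.uniform(0, 1000, n_segments),
    "chimney_count":       rng.poisson(0.5, n_segments),
    "traffic_light_count": rng.poisson(0.3, n_segments),
    "wall_fraction":       rng.uniform(0.0, 0.4, n_segments),
    "road_fraction":       rng.uniform(0.1, 0.5, n_segments),
    "grass_fraction":      rng.uniform(0.0, 0.3, n_segments),
})
# Synthetic NO2 target with an assumed dependence on the predictors.
y = (
    0.01 * X["traffic_intensity"]
    - 0.005 * X["dist_major_road_m"]
    + 2.0 * X["chimney_count"]
    + 10.0 * X["road_fraction"]
    + rng.normal(0.0, 1.0, n_segments)
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X, y)

# Share of total impurity-based importance carried by street-view features.
sv_cols = ["chimney_count", "traffic_light_count",
           "wall_fraction", "road_fraction", "grass_fraction"]
importance = pd.Series(model.feature_importances_, index=X.columns)
sv_share = float(importance[sv_cols].sum())
print(f"street-view feature importance share: {sv_share:.2f}")
```

Impurity-based importances sum to 1 across all features, so `sv_share` directly gives the street-view contribution; a permutation-importance variant would be a more robust alternative when predictors are correlated.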
