Neural networks for geospatial data
- PMID: 40255678
- PMCID: PMC12007808
- DOI: 10.1080/01621459.2024.2356293
Neural networks for geospatial data
Abstract
Analysis of geospatial data has traditionally been model-based, with a mean model, customarily specified as a linear regression on the covariates, and a Gaussian process covariance model, encoding the spatial dependence. While nonlinear machine learning algorithms like neural networks are increasingly being used for spatial analysis, current approaches depart from the model-based setup and cannot explicitly incorporate spatial covariance. We propose NN-GLS, embedding neural networks directly within the traditional Gaussian process (GP) geostatistical model to accommodate non-linear mean functions while retaining all other advantages of GP, like explicit modeling of the spatial covariance and predicting at new locations via kriging. In NN-GLS, estimation of the neural network parameters for the non-linear mean of the Gaussian Process explicitly accounts for the spatial covariance through use of the generalized least squares (GLS) loss, thus extending the linear case. We show that NN-GLS admits a representation as a special type of graph neural network (GNN). This connection facilitates the use of standard neural network computational techniques for irregular geospatial data, enabling novel and scalable mini-batching, backpropagation, and kriging schemes. We provide methodology to obtain uncertainty bounds for estimation and predictions from NN-GLS. Theoretically, we show that NN-GLS will be consistent for irregularly observed spatially correlated data processes. We also provide a finite sample concentration rate, which quantifies the need to accurately model the spatial covariance in neural networks for dependent data. To our knowledge, these are the first large-sample results for any neural network algorithm for irregular spatial data. We demonstrate the methodology through numerous simulations and an application to air pollution modeling. We develop a software implementation of NN-GLS in the Python package geospaNN.
Keywords: Gaussian process; consistency; geostatistics; graph neural networks; kriging; machine learning; neural networks.
References
-
- Abramowitz M and Stegun IA (1948), Handbook of mathematical functions with formulas, graphs, and mathematical tables, Vol. 55, US Government printing office.
-
- Banerjee S, Carlin BP and Gelfand AE (2014), Hierarchical modeling and analysis for spatial data, CRC press.
-
- Breiman L. (2001), ‘Random forests’, Machine learning 45(1), 5–32.
-
- Chen W, Li Y, Reich BJ and Sun Y (2024), ‘Deepkriging: Spatially dependent deep neural networks for spatial prediction’, Statistica Sinica 34, 291–311.
-
- Cressie N and Wikle CK (2015), Statistics for spatio-temporal data, John Wiley & Sons.
Grants and funding
LinkOut - more resources
Miscellaneous