A Novel Flexible Geographically Weighted Neural Network for High-Precision PM2.5 Mapping across the Contiguous United States

Wang, Dongchao; Cao, Jianfei; Zhang, Baolei; Zhang, Ye; Xie, Lei

doi:10.3390/ijgi13070217


by Dongchao Wang ¹, Jianfei Cao ¹, Baolei Zhang ¹,*, Ye Zhang ² and Lei Xie ²

¹ College of Geography and Environment, Shandong Normal University, Jinan 250358, China

² Shandong Provincial Territorial Spatial Ecological Restoration Center, Jinan 250014, China

* Author to whom correspondence should be addressed.

ISPRS Int. J. Geo-Inf. 2024, 13(7), 217; https://doi.org/10.3390/ijgi13070217

Submission received: 6 May 2024 / Revised: 15 June 2024 / Accepted: 20 June 2024 / Published: 22 June 2024

(This article belongs to the Special Issue HealthScape: Intersections of Health, Environment, and GIS&T)


Abstract:

Air quality degradation has triggered a large-scale public health crisis globally. Machine learning techniques have been applied to estimate PM2.5 from remote sensing data. However, many machine learning models ignore the spatial non-stationarity of predictive variables. To address this issue, this study introduces a Flexible Geographically Weighted Neural Network (FGWNN) to estimate PM2.5 based on multi-source remote sensing data. FGWNN incorporates the Flexible Geographical Neuron (FGN) and the Geographically Weighted Activation Function (GWAF) within the framework of an Artificial Neural Network (ANN) to capture the intricate spatially non-stationary relationships among predictive variables. A robust air quality remote sensing estimation model was constructed using remote sensing data of Aerosol Optical Depth (AOD), Normalized Difference Vegetation Index (NDVI), Temperature (TMP), Specific Humidity (SPFH), Wind Speed (WIND), and Terrain Elevation (HGT) as inputs, and ground-based PM2.5 as the observation. The results indicate that FGWNN successfully generates PM2.5 remote sensing data with a 2.5 km spatial resolution for the contiguous United States (CONUS) in 2022. It exhibits higher regression accuracy than traditional ANN and Geographically Weighted Regression (GWR) models. FGWNN holds potential for applications in high-precision and high-resolution remote sensing scenarios.

Keywords:

FGWNN; spatial non-stationarity; high-precision PM2.5 inversion; multi-source remote sensing

1. Introduction

PM2.5 particles are a primary component of atmospheric pollution; they can persist in the atmosphere and exert widespread and profound impacts on human health and the environment [1]. Environmental remote sensing provides a means of retrieving PM2.5 concentrations globally, continuously, and in real time [2]. The PM2.5 data from the United States (U.S.) Environmental Protection Agency (EPA), as a product of the U.S. ground-based air quality monitoring network, have undergone rigorous quality validation [3], and are widely utilized in environmental science [4], atmospheric science [5], public health [6], disaster management [7], and other fields. This dataset furnishes researchers and policymakers with crucial data support, helping them to better understand and address atmospheric pollution issues [8]. Nevertheless, existing remote sensing-derived PM2.5 products commonly suffer from low spatial resolution and fail to delineate local details [9]. Acquiring high-resolution PM2.5 spatial distribution data is therefore of paramount importance for the dynamic monitoring and control of atmospheric PM2.5 pollution [10,11,12].

In the task of retrieving PM2.5 concentrations, the selection of predictive factors is a critical step. The choice of predictors should be based on atmospheric physical and chemical processes, ecological environment quality, meteorological factors, and geographic information. Common parameters include Aerosol Optical Depth (AOD), Normalized Difference Vegetation Index (NDVI), meteorological conditions, and topographic features [13]. AOD provides global atmospheric optical information and is a significant indicator of the atmospheric physical and chemical evolution of air pollutants [14]. Numerous studies have demonstrated a significant correlation between AOD and surface PM2.5 concentration, making AOD one of the most reliable explanatory factors in PM2.5 prediction [15,16,17]. NDVI reflects vegetation health, land use changes, and ecosystem productivity, serving as an essential measure of ecological environment quality. Studies have shown that NDVI significantly impacts PM2.5 concentrations by reducing dust, adsorbing particles, improving microclimate conditions, reducing pollution sources, and enhancing ecosystem purification functions [18]. Meteorological conditions actively influence the dispersion, dilution, and deposition of pollutants, significantly affecting the spatiotemporal distribution of PM2.5 concentrations [19]. Topographic features alter atmospheric morphology by adjusting air flow, forming the temperature inversion layers, influencing meteorological conditions, and creating urban heat island effects, indirectly affecting the spatial distribution of PM2.5 concentrations [20].

The estimation of PM2.5 through remote sensing involves methods such as AOD inversion [21], atmospheric chemical transport models [22], spatiotemporal interpolation [23], data assimilation [24], among others. The most widely used parameter is the satellite-monitored AOD. To estimate ground-level PM2.5 from AOD, a typical strategy is to establish the statistical relationship between AOD and PM2.5 [25]. The accuracy of these methods is constrained by the number of monitoring stations, remote sensing data resolution, and model quality [26], and a comprehensive approach incorporating multiple data sources and methods is often necessary to enhance inversion accuracy. However, the reality is that traditional spatial statistical tools tend to focus on detecting spatial relationships in sample data [27], and when the spatial density and uniformity of sampling points are insufficient, the estimation accuracy and confidence significantly decrease [28]. The air quality products derived from ground-based stations can solve these problems.

In recent years, new research has emerged in which geostatistical tools and machine learning methods are used for PM2.5 inversion. Scholars have designed geographically weighted regression (GWR) models [29,30] and mixed-effects models [31] for detecting geographical relationships between PM2.5 and data such as AOD, meteorological parameters, and land use information. The novel convolutional neural network (CNN) model can utilize the spatial correlation between predictor variables to increase the ground-level PM2.5 estimation accuracy to some extent [32]. Combining AOD and big data, the PM2.5 regression model using the random forest algorithm can assess the risk of air pollution exposure in the Yangtze River Delta urban agglomeration region during COVID-19 [33]. These inverse models are not perfectly compatible with nonlinear fitting and spatial relationship detection. Traditional geostatistical models cannot fit complex nonlinear relationships, while machine learning methods cannot express spatial non-stationarity.

As studies on PM2.5 spatial patterns increase, various machine learning-related methods (e.g., CNN, Artificial Neural Network (ANN), and Generalized Regression Neural Network (GRNN)) have gradually been introduced. To calculate geographically weighted kernels more accurately, the Geographically Neural Network Weighted Regression (GNNWR) combines Ordinary Least Squares (OLS) and neural networks to estimate complex geographical processes [34]. In addition to spatial relationships, time series are also important research objects in the field of GWR. The Geographically and Temporally Weighted Neural Network (GTWNN) accounts for both spatial and temporal non-stationarity and has been applied in high-precision crop yield prediction modeling [35]. To address nonlinearity and spatiotemporal heterogeneity, researchers have proposed another GTWNN using GRNN, which shows superior performance in exploring the spatiotemporal relationship between AOD and PM2.5 [36]. However, these GWR-ANN methods mainly focus on improving the accuracy of regression relationships without considering the impact of training samples on the accuracy of spatial dependence, which degrades their predictive performance to some degree.

Many studies have used GWR or neural networks for PM2.5 inversion, with their data processing methods being essentially similar. When faced with imperfect training samples, the common approach is to first train the optimal regression model [37]. If the prediction samples are also not ideal, one can choose to enhance the density of prediction samples using interpolation techniques, effectively filling the entire target resolution space, or proceed without any further adjustments [38]. Finally, the prediction samples are input into the regression model to obtain the prediction results. If non-ideal prediction samples are left unprocessed, interpolation methods are used to complete the prediction data [39]. This posteriori method results in predictions with high specificity, greatly limiting the model’s generalization capability [40]. In contrast, constructing a uniform and dense spatial network would lead to a more comprehensive and accurate understanding of spatial non-stationarity.

This study endeavors to incorporate spatial non-stationarity into a machine learning model for the high-precision estimation of PM2.5 via remote sensing data. The proposed Flexible Geographically Weighted Neural Network (FGWNN) model is designed with the Flexible Geographical Neuron (FGN) and Geographically Weighted Activation Function (GWAF) to mitigate the negative impacts of non-uniform and sparse samples on regression accuracy. It enables the simultaneous learning of spatial non-stationarity and global non-linear relationships within the neural network. FGWNN can predict PM2.5 over the contiguous U.S. (CONUS) at a 2.5 km spatial resolution from conventional satellite remote sensing products. The organization of this paper is as follows. Section 2 elaborates on the study region and data materials associated with this study. Section 3 provides a detailed description of the FGWNN model design and evaluation. Section 4 demonstrates the FGWNN’s performance and the spatiotemporal patterns of PM2.5. Recommendations and further discussions based on this research are presented in Section 5. Finally, we conclude in Section 6.

2. Study Region and Material

2.1. Study Area

The CONUS (Figure 1), excluding Alaska and Hawaii, comprises 48 states and the District of Columbia, with a total area of approximately 7.6 million square kilometers, representing over 80% of the nation’s land area. The terrain generally exhibits a west-to-east elevation gradient, featuring high mountains and plateaus such as the Rocky Mountains, the Cascade Range, and the Colorado Plateau in the west, and low mountains and plains including the Appalachian Mountains, the Great Plains, and coastal plains in the east. The population of the CONUS is predominantly concentrated in the eastern and western coastal regions, as well as some inland states in the south and west. California, Texas, Florida, and New York are the four most populous states in the CONUS [41].

The influence factors of air quality exhibit regional variations in the CONUS (Figure 2). These variations are attributed to factors such as meteorological conditions, terrain features, and the distribution of emission sources [42]. Extreme events, such as forest fires, dust storms, and volcanic eruptions, can also impact air quality in the CONUS. In recent years, large-scale forest fires in Canada have led to a surge in PM2.5 concentrations, affecting millions of people in the CONUS [43].

2.2. Data Sources

In the field of air pollution research, obtaining large-scale and long-term remote sensing data is crucial for understanding the spatiotemporal patterns of air pollutants. Remote sensing datasets from MODIS, Landsat, Sentinel, and others, combined with ground-level air quality monitoring data, have greatly facilitated collaboration and research in the field of air pollution [44]. However, existing air quality data sources exhibit significant differences in spatial resolution, with low-resolution data diminishing the utility and quality of high-resolution data.

2.2.1. EPA PM2.5

Ground-level PM2.5 monitoring data for the CONUS are derived from the EPA’s Outdoor Air Quality Data (https://aqs.epa.gov/aqsweb/airdata/download_files.html, accessed on 21 March 2024) [45]. We selected data records from 20 March 2022, to 21 March 2023 as the initial dataset for our study. The annual and seasonal averages of pollutant concentrations were calculated based on 24 h average PM2.5 values (pollutant standard: PM25 24 h 2012). Furthermore, sites included in the average calculation were required to have monitoring data for more than 100 days. A total of 473 valid PM2.5 ground-level monitoring sites were obtained within the study area. The average PM2.5 concentrations for all sites are depicted in Figure 1, reflecting the overall spatial distribution of PM2.5 in the CONUS in 2022.

2.2.2. MODIS AOD

The MCD19A2 V6.1 data product is a Level-2 gridded product of land-based AOD from the MODIS Terra and Aqua instruments, generated daily at a pixel resolution of 1 km (https://lpdaac.usgs.gov/products/mcd19a2v061/, accessed on 21 March 2024) [46]. The product includes relevant AOD layers, such as 0.47 μm blue band AOD, 0.55 μm green band AOD, and AOD uncertainty. In this study, the 0.47 μm blue band AOD is selected as the research data.

2.2.3. MODIS NDVI

The MODIS Vegetation Index (MYD13Q1) V6.1 data are generated every 16 days (https://lpdaac.usgs.gov/products/myd13q1v061/, accessed on 21 March 2024) [47] with a spatial resolution of 250 m for Level-3 products. MODIS NDVI products are calculated from atmospherically corrected bi-directional surface reflectance and are processed to mask water, clouds, heavy aerosols, and cloud shadows.

2.2.4. NOAA RTMA

The Real-Time Mesoscale Analysis (RTMA) is part of the NOAA Analysis of Record (AoR) project (https://www.nco.ncep.noaa.gov/pmb/products/rtma/, accessed on 21 March 2024) [48]. It is a near-surface weather analysis system with high spatiotemporal resolution. The product provides hourly analysis data at 2.5 km resolution for CONUS grid cells. The analysis product includes surface-observable weather elements and accounts for terrain effects. It also provides analysis for total cloud cover and visibility. In this study, RTMA provides the weather conditions (temperature, specific humidity, and wind speed) and the model terrain elevation used for PM2.5 inversion.

2.3. Data Preprocessing and Integration

Before modeling, it is necessary to preprocess and integrate the acquired remote sensing and ground-level datasets to ensure data quality and consistency. Firstly, all remote sensing datasets were reprojected to a common spatial reference, the Albers Equal Area Conic projection (ESRI:102003). Secondly, the meteorological and topographic datasets were processed using the nearest neighbor method, sampling their variables at the coordinates of the PM2.5 monitoring points. Then, a square buffer matching the 2.5 km spatial resolution was created around each PM2.5 monitoring station, and the mean values of AOD and NDVI within the buffer were assigned to the corresponding PM2.5 monitoring point. These preprocessing and integration steps are crucial for generating reliable and robust input data for subsequent modeling work.
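The two sampling steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each raster is already reprojected and stored as a NumPy array whose cell centers lie at `(x0 + col * cell, y0 + row * cell)`, and the function names and the 2500 m buffer side are hypothetical choices made for the example.

```python
import numpy as np

def nearest_neighbor_sample(grid, x0, y0, cell, xs, ys):
    """Sample a raster at station coordinates using the nearest cell center.

    Assumed layout: grid[row, col] has its center at
    (x0 + col * cell, y0 + row * cell) in projected coordinates.
    """
    cols = np.rint((np.asarray(xs) - x0) / cell).astype(int)
    rows = np.rint((np.asarray(ys) - y0) / cell).astype(int)
    # Clamp stations that fall just outside the raster to the edge cells.
    rows = np.clip(rows, 0, grid.shape[0] - 1)
    cols = np.clip(cols, 0, grid.shape[1] - 1)
    return grid[rows, cols]

def square_buffer_mean(grid, x0, y0, cell, x, y, side):
    """Mean of all cells whose centers fall inside a square buffer.

    The buffer is centered on the station (x, y) and has the given
    side length; NaN cells (e.g. cloud-masked AOD) are ignored.
    """
    cx = x0 + cell * np.arange(grid.shape[1])
    cy = y0 + cell * np.arange(grid.shape[0])
    in_x = np.abs(cx - x) <= side / 2
    in_y = np.abs(cy - y) <= side / 2
    window = grid[np.ix_(in_y, in_x)]
    if window.size == 0 or np.all(np.isnan(window)):
        return np.nan
    return float(np.nanmean(window))
```

For a 1 km AOD raster, a call such as `square_buffer_mean(aod, x0, y0, 1000.0, station_x, station_y, 2500.0)` would average the few cells around a station, while `nearest_neighbor_sample` would be applied to the 2.5 km RTMA fields.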

2.4. AOD-PM2.5 Model Structure

In this study, PM2.5 is used as the dependent variable, and AOD, NDVI, TMP, SPFH, WIND, and HGT are used as independent variables to construct the PM2.5 inversion model as follows:

\mathrm{PM}_{2.5} \sim \mathrm{AOD} + \mathrm{NDVI} + \mathrm{TMP} + \mathrm{SPFH} + \mathrm{WIND} + \mathrm{HGT}

(1)

We used the 2022 average data to conduct a multicollinearity diagnosis (Table 1). The variance inflation factors (VIFs) of all independent variables are less than 10, suggesting no significant collinearity among the variables. According to the standardized regression coefficients (Beta), TMP, NDVI, and AOD have the most significant positive impacts on PM2.5.
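As an illustration of the diagnosis, the VIF of a predictor can be obtained by regressing it on the remaining predictors and computing 1 / (1 − R²). The sketch below uses plain NumPy least squares; the function name is hypothetical and the software actually used for Table 1 is not stated in the text.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of every column of X (rows = samples).

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from an OLS regression
    of column j on all other columns plus an intercept.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        # 1 / (1 - R^2) equals ss_tot / ss_res; guard exact collinearity.
        vifs[j] = ss_tot / ss_res if ss_res > 0 else np.inf
    return vifs
```

Applied to a matrix whose columns are AOD, NDVI, TMP, SPFH, WIND, and HGT at the monitoring sites, values below 10 would correspond to the conclusion drawn above.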

3. FGWNN Model for PM2.5 Estimation

3.1. Model Development

To reduce the influence of the spatial distribution and density of the training data on modeling, this study designed a specific network structure for FGWNN (Figure 3). In the input layer, n denotes the number of training samples during training and the number of prediction samples during prediction. Note that m in the hidden layer always denotes the number of FGNs and does not change with the working mode. The new network architecture brings two significant advantages: savings in computer storage space and in computation time.

The input layer is responsible for assembling the independent variables (AOD, NDVI, TMP, SPFH, WIND, and HGT) as inputs. The hidden layer stores FGNs, and the GWAF is synchronously set above the corresponding FGNs. The output layer contains only one neuron corresponding to the dependent variable (PM2.5) output. The network weights (w_{j}^{[1]}) and biases (b_{j}^{[1]}) in the input layer are consistent with ANN.

X_{i} = \begin{bmatrix} 1 & x_{i1} & \dots & x_{ip} \end{bmatrix}

(2)

W^{[1]} = \begin{bmatrix} b_{1}^{[1]} & \dots & b_{m}^{[1]} \\ w_{11}^{[1]} & \dots & w_{1m}^{[1]} \\ \vdots & \ddots & \vdots \\ w_{p1}^{[1]} & \dots & w_{pm}^{[1]} \end{bmatrix}

(3)

W_{\mathrm{origin}}^{[2]} = \begin{bmatrix} w_{1}^{[2]} \\ \vdots \\ w_{m}^{[2]} \end{bmatrix}

(4)

Existing neural network activation functions primarily focus on global function transformations of input signals without considering the influence of spatial location on output signals. The GWAF can utilize spatial weighting to establish spatial connections between neurons on the same layer, where different input samples activate the neurons differently. It is a local activation function, and it controls the activation level of neurons through geographical weighting. The specific formula is as follows:

\phi^{GW}(\mathrm{net}_{ij}) = \mathrm{net}_{ij} \times gw_{ij}

(5)

The GWAF uses spatial weights to measure spatial correlation, so the activation level of a neuron depends on the spatial distance between the neuron and the target sample. This achieves both feature transformation and spatial smoothing of the input signals. The GWAF couples neurons with spatial positions, transforming ordinary neurons into FGNs. The contribution of each output signal to the result depends on the magnitude of its spatial weight. In other words, the GWAF reflects the spatial non-stationarity of the samples.

The regression process of FGWNN can be represented by a matrix equation as follows:

y_{i} = X_{i} W^{[1]} \left( GW_{i}^{T} \otimes W_{\mathrm{origin}}^{[2]} \right) + b^{[2]}

(6)

In Equation (6), the symbol ⊗ denotes element-wise multiplication of the corresponding entries of the two matrices, which yields a matrix with the same dimensions as the operands.
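The regression process in Equations (2) to (6) can be written as a short vectorized forward pass. The sketch below is an illustration under stated assumptions rather than the authors' code: the function name is hypothetical, the first row of `W1` is taken to hold the biases as in Equation (3), and `GW` is assumed to hold one row of spatial weights per sample and one column per FGN.

```python
import numpy as np

def fgwnn_forward(X, W1, GW, W2, b2):
    """Forward pass of the FGWNN regression for n samples at once.

    X  : (n, p) predictors (AOD, NDVI, TMP, SPFH, WIND, HGT -> p = 6)
    W1 : (p + 1, m) input-layer biases (row 0) and weights
    GW : (n, m) spatial weight between each sample and each FGN
    W2 : (m,) output-layer weights
    b2 : scalar output bias
    """
    n = X.shape[0]
    X_aug = np.column_stack([np.ones(n), X])  # prepend the constant 1
    net = X_aug @ W1                          # (n, m) neuron inputs
    activated = net * GW                      # GWAF: net_ij * gw_ij
    return activated @ W2 + b2                # (n,) predictions
```

Because `(net * GW) @ W2` equals `net @ (GW_i * W2)` for each sample, the vectorized form matches the per-sample matrix expression. With all spatial weights equal to 1, the model reduces to an ordinary linear hidden layer, which makes the role of the GWAF easy to isolate.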

3.2. Model Hyperparameters Setting

The FGWNN model has various hyperparameters required for neural network learning, and including the spatial bandwidth (bw) among them would increase the computational cost of model training. Considering that the new model inherits the characteristics of GWR, we can use GWR for bandwidth selection, which can save a significant amount of computational resources and time.

In GWR, spatial weighting is a tool used to measure geographical proximity. It defines the strength of the relationship between each geographical location and its surrounding neighboring locations, reflecting the spatial correlation of geographical data in the study area. The commonly used Fixed-Gaussian spatial weight calculation formula is as follows:

gw_{ij} = \exp\left( - \left( \frac{\mathrm{dist}_{ij}}{bw} \right)^{2} \right)

(7)
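A minimal sketch of the Fixed-Gaussian weight in Equation (7), assuming projected coordinates so that Euclidean distance is meaningful; the function and argument names are illustrative only.

```python
import numpy as np

def gaussian_weights(sample_xy, neuron_xy, bw):
    """Fixed-Gaussian spatial weights between samples and neurons.

    sample_xy : (n, 2) projected coordinates of samples
    neuron_xy : (m, 2) projected coordinates of neurons
    bw        : bandwidth, in the same units as the coordinates
    Returns an (n, m) matrix with entries exp(-(dist_ij / bw)^2).
    """
    sample_xy = np.asarray(sample_xy, dtype=float)
    neuron_xy = np.asarray(neuron_xy, dtype=float)
    diff = sample_xy[:, None, :] - neuron_xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return np.exp(-(dist / bw) ** 2)
```

A sample located exactly on a neuron receives weight 1, and the weight falls to exp(−1) ≈ 0.37 at a distance of one bandwidth.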

The choice of the optimal bw directly determines the estimation accuracy of the GWR model, and different diagnostic indicators yield different optimal bandwidths. In the GWR field, the AICc criterion is typically used for bandwidth selection [49], and the mathematical formula is as follows:

\mathrm{AICc} = 2n \ln(\hat{\sigma}^{2}) + n \ln(2\pi) + n \left[ \frac{n + \mathrm{tr}(S)}{n - 2 - \mathrm{tr}(S)} \right]

(8)

Feature recognition results differ across spatial scales: a finer spatial resolution provides more detail but consumes more computational resources. When conducting geographical observations, the observation scale needs to match the spatial scale of the geographical phenomenon. The number of FGNs in FGWNN should therefore be set according to the actual application. To construct the ideal state of the geographical neural network, the FGN positions need to be homogeneous (uniformly distributed) and compact (moderate density) in the study area.
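One simple way to obtain homogeneous FGN positions is a regular grid over the bounding box of the study area. The text does not specify how the positions are generated, so the sketch below is only an assumption for illustration; the function name is hypothetical, and in practice the points would still need to be clipped to the CONUS boundary.

```python
import numpy as np

def regular_grid_points(xmin, ymin, xmax, ymax, target):
    """Place roughly `target` points on a regular grid in a bounding box.

    The spacing is chosen so that each point represents an equal,
    approximately square share of the box; points sit at cell centers.
    """
    width, height = xmax - xmin, ymax - ymin
    spacing = np.sqrt(width * height / target)
    nx = max(1, int(round(width / spacing)))
    ny = max(1, int(round(height / spacing)))
    xs = xmin + (np.arange(nx) + 0.5) * width / nx
    ys = ymin + (np.arange(ny) + 0.5) * height / ny
    gx, gy = np.meshgrid(xs, ys)
    return np.column_stack([gx.ravel(), gy.ravel()])
```

The returned coordinates could then serve as neuron positions when computing spatial weights, so that the density of FGNs no longer follows the density of monitoring stations.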

FGWNN uses a single-neuron output layer with an L2 norm loss function. Many gradient descent algorithms have been developed; in practice, there is no strict categorization among them, and multiple algorithms can be used together. Mini-batch stochastic gradient descent (Mini-Batch SGD) [50] is widely used to split the training set into batches. The NAdam algorithm [51] imposes stronger constraints on the learning rate and has a more direct impact on gradient updates. The learning rate controls the step size of parameter updates in neural networks. A high value can lead to non-convergence, while a low value can increase the learning cost. In the FGWNN model, the learning rate between network layers is used to fine-tune the overall learning rate of the neural network.

3.3. Model Evaluation

The error function compares the predicted output with the expected output, calculating the difference between them. Commonly used error functions include Residual Sum of Squares (RSS), Mean Squared Error (MSE), and Cross-Entropy Error. When employing an ANN to handle regression problems, the model’s loss function typically uses the RSS. The factor of 1/2 in the formula simplifies the derivative. The specific formula is as follows:

\mathrm{LOSS} = \frac{1}{2} \sum_{i=1}^{n} \left( \mathrm{target}_{i} - \mathrm{output}_{i} \right)^{2}

(9)

where LOSS denotes the loss value, target_i denotes the observed (target) value of sample i, and output_i denotes the corresponding model output.

Common evaluation metrics for machine learning regression problems include Root Mean Squared Error (RMSE), LOSS, R², and adjusted R² (R_{adj}^{2}). These evaluation metrics suit different scenarios in regression problems, and the choice of metric depends on the nature of the problem and the specific requirements for model performance. Given FGWNN’s ability to detect spatial non-stationarity, we additionally introduce local R² and local RMSE to jointly evaluate the model.
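For reference, the global metrics named above can be computed as in the following sketch. The function names are illustrative; the adjusted R² uses the standard definition with p predictors, which the text does not spell out, and the local variants are not reproduced here because their neighborhood definition is not given.

```python
import numpy as np

def half_rss(target, output):
    """LOSS of Equation (9): half the residual sum of squares."""
    return 0.5 * float(np.sum((target - output) ** 2))

def rmse(target, output):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((target - output) ** 2)))

def r_squared(target, output):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((target - output) ** 2)
    ss_tot = np.sum((target - np.mean(target)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def adjusted_r_squared(target, output, p):
    """Standard adjusted R^2 for n samples and p predictors."""
    n = len(target)
    return 1.0 - (1.0 - r_squared(target, output)) * (n - 1) / (n - p - 1)
```

For the model in Equation (1), `p` would be 6.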

This study uses K-Fold cross-validation [52] to evaluate the model’s performance. In this process, the dataset is divided into K subsets, and the model undergoes K rounds of training and validation. K-Fold cross-validation can reduce the risk of overfitting, improve model robustness, and provide reliable performance estimates. During each round of K-Fold training, FGWNN employs the early-stopping strategy [53]. It stops the model’s training when the performance on the validation set no longer improves, helping to avoid local optima and overfitting traps. The combination of K-Fold cross-validation and the early-stopping strategy significantly enhances the model’s performance and stability. It ensures that the model not only learns effective features on the training data but also generalizes well to new data.
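The evaluation procedure can be sketched with two small helpers: a K-fold index generator and an early-stopping monitor. Both are illustrative assumptions, not the authors' code; in particular, the values of K, `patience`, and `min_delta` are not reported in the text and are hypothetical parameters here.

```python
import numpy as np

def kfold_indices(n, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k shuffled folds of n samples."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]

class EarlyStopping:
    """Signal a stop once validation loss has not improved for
    `patience` consecutive epochs (improvement must exceed min_delta)."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.wait = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True to stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
        return self.wait >= self.patience
```

In each fold, a fresh `EarlyStopping` instance would be created, `step` would be called with the validation loss after every epoch, and training would end when it returns `True`.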

4. Results

4.1. Model Parameter Optimization

4.1.1. Uniformization Strategy Performance

The FGWNN model can operate in two modes: non-uniform mode and uniform mode. Before activating the uniformization strategy, the geographic neurons (GNs) in the neural network are aligned with the positions of ground monitoring stations (473 in total). Their spatial distribution is uneven, with dense coverage along coastal areas and sparse coverage in inland regions (Figure 4a). After uniformization, GNs are uniformly distributed across the CONUS (Figure 4b). Comparing the learning curves under the two modes (Figure 4c,d) shows that the non-uniform mode converges later than the uniform mode and fluctuates more during learning. The optimal global R² for the uniform mode is slightly higher than that for the non-uniform mode. To quantify the fitting performance of the FGWNN model under different strategies, we introduce the local R² metric in Figure 4e,f. The results show that, under the non-uniform mode, the local R² values are higher in densely sampled areas but significantly lower in sparsely sampled areas. After activating the uniformization strategy, the difference in local R² is notably reduced, with a decrease in local R² in dense areas and an increase in local R² in sparse areas. In other words, the uniformization strategy transforms GNs into FGNs, substantially reducing the regression disparities caused by uneven samples.

4.1.2. FGN Number Optimization

Figure 5 illustrates the diagnostic results of the training effectiveness of the FGWNN model under different numbers of FGNs after activating the uniformization strategy. The upper part of the figure shows the change patterns of LOSS and R². As the number of FGNs increases, LOSS initially decreases rapidly, reaches a turning point at 5000 FGNs, and then gradually converges; R² follows the opposite pattern. To quantify the memory consumption and runtime associated with different numbers of FGNs, this experiment focuses on the data storage of the spatial weight matrix and the learning time of the FGWNN model. The results indicate that both memory consumption and runtime follow a logarithmic growth pattern. When the number of FGNs is set to 5000, both memory consumption and runtime remain at relatively low levels. Optimizing the number of FGNs therefore requires balancing learning effectiveness against training costs, minimizing the dependence on computational resources while ensuring high learning effectiveness.

4.2. Comparison with Other Models

4.2.1. Overall Comparison of Different Models

Table 2 presents the regression results of five models (MLR, ANN, GWR, GNNWR, and FGWNN) on the PM2.5 data across the CONUS region in 2022. The learning rate of the three neural network models (ANN, GNNWR, and FGWNN) is 0.001, with 5000 neurons in the hidden layer of ANN and FGWNN, compared to 473 in GNNWR. A comparison of the computational time across models reveals that FGWNN consumes the most time at 161 s, followed by GNNWR at 109 s, ANN at 93 s, GWR at 86 s, and MLR with the lowest consumption at 28 s. The RMSE and LOSS metrics decrease in the order MLR, ANN, GWR, GNNWR, and FGWNN. Conversely, R² and R_{adj}^{2} increase progressively in the same order. Overall, the regression performance of FGWNN notably surpasses that of the MLR, ANN, and GWR models, with a small improvement over GNNWR.

Figure 6 depicts the scatter plots of PM2.5 estimated values versus observed values for the four models, along with their evaluation metric scores and linear fitting equations. Both GWR and FGWNN exhibit the ability to detect spatial non-stationarity, with significantly improved fitting accuracy compared to MLR and ANN. In comparison to the traditional GWR approach, FGWNN demonstrates an exceptional fitting capability. Its R_{adj}^{2} increases from less than 0.72 to nearly 0.92, with LOSS reduced by over 180 and RMSE decreased to within 0.60 μg/m³. These results indicate that the FGWNN method possesses a robust generalization performance, accurately reproducing the original state of the PM2.5-AOD model. In summary, FGWNN outperforms MLR, ANN, and GWR models in the 2022 assessment.

4.2.2. Seasonal Performance Comparison for Various Models

Figure 7 presents the seasonal performance of the MLR, ANN, GWR, and FGWNN models. From the graph, it is evident that, for an equal sample dataset, the FGWNN model outperforms the other three models in all four seasons. Following FGWNN, the GWR and ANN models rank next, while the MLR model performs the poorest, with R² values of 0.41, 0.35, 0.46, and 0.20 for the four seasons, respectively. The seasonal performance trend of the MLR model aligns with that of the ANN model, with the highest R² value in autumn and the lowest in winter. Conversely, the performance of the GWR model, like the FGWNN model, initially rises, stabilizes, and then declines. Despite the results shown in the RMSE figures, which are contrary to those of the R² metrics, the FGWNN model maintains optimal performance.

Utilizing the local R² and local RMSE results of each model across the four seasons, we constructed boxplots to illustrate the spatial performance of each model. Regarding the local RMSE results (Figure 8a,b,e,f), the average RMSE values for the four seasons are 0.64, 0.53, 0.68, and 0.99 for the FGWNN model. Additionally, minor spatial variability can be observed in the boxplots of the FGWNN model. By comparing the local R² values (Figure 8c,d,g,h), the FGWNN model once again demonstrates significant superiority, with average R² values of 0.89, 0.92, 0.88, and 0.81, respectively. Hence, from a statistical perspective, the FGWNN model is more suitable for satellite-based PM2.5 mapping compared to the MLR, ANN, and GWR models.

4.2.3. Annual and Seasonal Performance of FGWNN Model

To further understand the annual and seasonal performance of the FGWNN model, we evaluated its spatial performance. Table 3 presents the RMSE results for both the annual and seasonal assessments in 2022, with global RMSE values consistently below 1.0 μg/m³ for all four seasons. From the results, the order of global RMSE magnitudes is as follows: summer < annual < spring < autumn < winter. Local RMSE values between observed and estimated PM2.5 values were calculated for each monitoring station (see Figure 9). Overall, the FGWNN model demonstrates a reliable spatial prediction capability. Specifically, the FGWNN model performs well in the central regions of station clusters but relatively poorly in the peripheral areas of these clusters. This phenomenon may be attributed to the FGWNN model’s ability to smooth regression variations between station points. Despite the uneven spatial distribution of the model performance, the FGWNN model demonstrates an excellent predictive capability overall, with over 64% of sites reporting local RMSE values below 1 μg/m³. A comparative analysis suggests that the FGWNN model proposed in this study holds significant potential for satellite-based PM2.5 mapping.

4.3. PM2.5 Prediction over CONUS

Once the predictive capability of the FGWNN model is sufficiently validated, continuous predictions of PM2.5 spatial concentrations in the CONUS region can be made. Figure 10 displays the annual and seasonal distribution of ground-level PM2.5, based on the inversion of FGWNN (with 5000 FGNs). Overall, the average annual PM2.5 concentration is 7.45 μg/m³, representing a 37.9% decrease compared to the Level 1 standard (12 μg/m³) defined by the US-EPA in 2016. Additionally, approximately 43% of pixel cells in CONUS are predicted to have annual PM2.5 concentrations exceeding 12 μg/m³. These findings suggest that CONUS still experiences mild PM2.5 pollution, and that incorporating satellite remote sensing can provide more detailed spatial distribution information on atmospheric pollutants than ground-based monitoring alone [54].

The results from the five periods (annual and four seasons) show that PM2.5 concentrations generally decrease from coastal to inland areas, with the western and southeastern regions being more susceptible to air pollution than the central and northern regions (Figure 10). PM2.5 concentrations are generally higher in regions with a Mediterranean climate or a tropical desert climate. The western, northeastern, and southern regions constitute the three major industrial zones in the United States and emit significant amounts of particulate pollutants. Across the first three seasons, air quality in CONUS shows a worsening trend, with PM2.5 concentrations increasing from 6.16 μg/m³ to 7.80 μg/m³. In winter, air pollution peaks at 7.81 μg/m³ and covers almost the entire CONUS region. Air quality deteriorates progressively in the west as the seasons advance, while pollutants in the east shift from south to north. Additionally, Canadian forest wildfires are a significant source of air pollution in North America, with smoke particles transported into CONUS airspace by atmospheric movements [55]. The northeastern and western coastal regions are major population and urban centers, yet PM2.5 accumulates continuously there during this period, increasing health risks for local residents [56].

5. Discussion

The spatial distance between geographical objects determines the strength of their spatial relationships, referred to as spatial dependency [57]. Spatial weighting in GWR [58] describes how the spatial dependency between an individual object and all other objects varies. In this paper, we define this pattern of strong and weak dependency across the region as the spatial dependency field (SDF). Although GWR can effectively detect spatial non-stationarity, the SDF used in the model has two limitations. First, in the sparse state (Figure 11a), training samples are homogeneously distributed in space but with low sample density. The SDF can then only roughly reflect the spatial dependency pattern of the original data, overlooking the finer details. Second, in the biased state (Figure 11b), spatial density is insufficient and training samples are heterogeneously distributed in space (eccentric and uneven). This situation can cause significant deformation of the SDF, impairing the accurate representation of the original spatial dependency pattern and significantly diminishing the quality of the final model. In summary, if an ideal SDF (Figure 11c) adapted to the target spatial scale is constructed, the spatial non-stationarity it learns will be more comprehensive and accurate.
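As a rough illustration of distance-based spatial weighting, the sketch below computes the weights of all samples relative to one target location. It assumes a Gaussian distance-decay kernel, which is a common choice in GWR but is not necessarily the kernel used in this study; the function name, the planar coordinates, and the bandwidth value are hypothetical.

```python
import numpy as np

def gaussian_weights(target_xy, sample_xy, bandwidth):
    """Distance-decay weights of all samples relative to one target location.

    Assumption: Gaussian kernel w = exp(-0.5 * (d / bandwidth)**2), where d is
    the Euclidean distance in the same unit as the bandwidth.
    """
    target_xy = np.asarray(target_xy, dtype=float)
    sample_xy = np.asarray(sample_xy, dtype=float)
    distances = np.linalg.norm(sample_xy - target_xy, axis=1)
    return np.exp(-0.5 * (distances / bandwidth) ** 2)

# Hypothetical planar coordinates in km: samples at 0, 50 and 500 km from the target.
samples = np.array([[0.0, 0.0], [30.0, 40.0], [300.0, 400.0]])
weights = gaussian_weights([0.0, 0.0], samples, bandwidth=100.0)
```

With a 100 km bandwidth the sample 500 km away receives a weight close to zero. When samples are sparse or clustered on one side of the target, few of them carry meaningful weight, which is one way to picture the sparse and biased SDF states described above.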

Large-scale remote sensing images with low spatial resolution inevitably suffer from insufficient spatial detail, difficult target identification, limited image quality, and application constraints [59]. The high-definition images inferred and predicted using FGWNN can facilitate the precise identification of areas with air pollution anomalies, providing strong evidence for the analysis of air pollution driving factors [60]. The quality of the SDF constructed by traditional geographical detectors depends on the density and uniformity of the spatial distribution of samples, which are often sparse and non-homogeneous in real-world data [61]. To overcome this limitation, FGWNN automatically allocates a moderate number of homogeneously distributed FGNs to the hidden layer, achieving an ideal SDF state. The FGWNN method proposed in this paper detects spatial relationships effectively by establishing a flexible SDF, which can accurately reconstruct the real features of geographical data. Additionally, it significantly enhances the regression accuracy and spatial resolution of PM2.5 inversion, making it a reliable and efficient remote sensing mapping technique.

In exploring the spatial and temporal patterns of PM2.5 using high-precision remote sensing products from FGWNN inversion, different levels of the study area require products with a compatible spatial resolution, so as to strike a balance between inversion accuracy and computational efficiency. Figure 12 shows the study area at four administrative levels: national level (CONUS), division level (Pacific Division), state level (California State), and county level (Los Angeles County). When the spatial resolution of the remote sensing image is at least as fine as 20 km, PM2.5 data within the CONUS can be obtained with clear image details and no obvious jaggedness. Increasing the spatial resolution of remote sensing products to 10 km can fully demonstrate the spatial distribution characteristics of PM2.5 in the Pacific Division. To clearly detect the air quality distribution pattern in California State, the spatial resolution of the remote sensing images must be finer than 5 km. Remote sensing products inverted from existing data, with a maximum spatial resolution of 2.5 km, can roughly reflect the general situation in Los Angeles County. Theoretically, the FGWNN model can invert remote sensing products at an arbitrary target resolution, provided that the spatial resolutions of the independent variables meet the requirements.
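The trade-off between resolution and computational cost can be made concrete by counting grid cells. The sketch below is only an illustration: the rectangular extent is a hypothetical round figure rather than the CONUS extent used in the study, and the number of cells is used as a simple proxy for the number of locations at which the model must be evaluated.

```python
import math

def grid_shape(width_km, height_km, resolution_km):
    """Rows and columns needed to cover a rectangular extent at a given cell size."""
    cols = math.ceil(width_km / resolution_km)
    rows = math.ceil(height_km / resolution_km)
    return rows, cols

# Hypothetical extent (width, height) in km; illustrative only, not from the paper.
EXTENT_KM = (4500.0, 2800.0)

cell_counts = {}
for res in (20.0, 10.0, 5.0, 2.5):
    rows, cols = grid_shape(EXTENT_KM[0], EXTENT_KM[1], res)
    cell_counts[res] = rows * cols  # number of pixels to predict at this resolution
```

Each halving of the cell size quadruples the number of cells, so moving from 20 km to 2.5 km multiplies the prediction workload by 64 over the same extent. This is why matching the product resolution to the administrative level, rather than always using the finest grid, saves computation.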

Although the FGWNN has made some progress, it still has limitations that should be considered in subsequent research or when applying the method more widely. The choice of spatial bandwidth relies on GWR calculations [62], which may limit its practicality in spatiotemporal analysis scenarios. In high-resolution remote sensing data contexts, model training and geographical activation processes incur substantial computational costs [63]. In the future, we plan to optimize the acquisition of spatial bandwidth by embedding this process within FGWNN. At the same time, we aim to refine the FGWNN network structure to reduce its dependence on computing resources. It is our hope that the FGWNN model can be extended to the spatiotemporal analysis field, further advancing the development of remote sensing spatiotemporal mapping technology.

6. Conclusion

In-depth research into the spatiotemporal patterns of air pollution risk in a region can help alleviate concerns about public health crises. Building upon the foundations of GWR and ANN, this study has introduced a novel neural network structure, FGWNN. It can automatically allocate the positions and quantities of FGNs based on the characteristics of the study area, providing simultaneous analysis of the spatial non-stationarity and global nonlinear relationships within the original data. Notably, the ideal-state SDF constructed by the new method can closely fit complex spatial non-stationary relationships. We successfully predicted PM2.5 concentrations across the CONUS in 2022, with the regression model's fitting accuracy (R²) improved to above 0.90. Despite variations in model performance across different seasons, the PM2.5 products generated at a 2.5 km resolution maintain a high level of fidelity. The remote sensing data produced by FGWNN possess a high spatial resolution, meeting the needs of researchers for air quality assessments at different scales. In the future, we plan to enhance the model's temporal detection capability and expand its application prospects in the spatiotemporal remote sensing domain.

Author Contributions

Conceptualization, Dongchao Wang; methodology, Dongchao Wang; validation, Dongchao Wang; resources, Dongchao Wang, Ye Zhang and Lei Xie […]

References

[…]: Can Spatial Pattern Recognition Come with Modeling Accuracy? ISPRS J. Photogramm. Remote Sens. 2022, 184, 31–44.

Ghasempour, F.; Sekertekin, A.; Kutoglu, S.H. Google Earth Engine Based Spatio-Temporal Analysis of Air Pollutants before and during the First Wave COVID-19 Outbreak over Turkey via Remote Sensing. J. Clean. Prod. 2021, 319, 128599.

Shen, L.; Hu, W.; Zhao, T.; Bai, Y.; Wang, H.; Kong, S.; Zhu, Y. Changes in the Distribution Pattern of PM2.5 Pollution over Central China. Remote Sens. 2021, 13, 4855.

Deng, C.; Qin, C.; Li, Z.; Li, K. Spatiotemporal Variations of PM2.5 Pollution and Its Dynamic Relationships with Meteorological Conditions in Beijing-Tianjin-Hebei Region. Chemosphere 2022, 301, 134640.

Aguilera, R.; Luo, N.; Basu, R.; Wu, J.; Clemesha, R.; Gershunov, A.; Benmarhnia, T. A Novel Ensemble-Based Statistical Approach to Estimate Daily Wildfire-Specific PM2.5 in California (2006–2020). Environ. Int. 2023, 171, 107719.

Gui, K.; Che, H.; Wang, Y.; […] of Ambient PM2.5 and PM10 over China from Sentinel-5P and Assimilated Datasets: Considering the Precursors and Chemical Compositions. Sci. Total Environ. 2021, 793, 148535.

Xu, X.; Zhang, C.; Liang, Y. Review of Satellite-Driven Statistical Models PM2.5 Concentration Estimation with Comprehensive Information. Atmos. Environ. 2021, 256, 118302.

Feng, Y.; Fan, S.; […] of Ground-Level PM2.5. ISPRS J. Photogramm. Remote Sens. 2020, 167, 178–188.

[…], X.; Ding, J.; Ge, X.; Liu, J.; […]: Convergence and Consistency. Ann. Statist. 2005, 33, 1538–1579.

Wei, J.; Li, Z.; Lyapustin, A.; Wang, J.; Dubovik, O.; Schwartz, J.; Sun, L.; Li, C.; Liu, S.; Zhu, T. First Close Insight into Global Daily Gapless 1 Km PM2.5 Pollution, Variability, and Health Impact. Nat. Commun. 2023, 14, 8349.

Wang, Z.; Wang, Z.; Zou, Z.; Chen, X.; Wu, H.; Wang, W.; Su, H.; Li, F.; Xu, W.; Liu, Z.; et al. Severe Global Environmental Issues Caused by Canada’s Record-Breaking Wildfires in 2023. Adv. Atmos. Sci. 2024, 41, 565–571.

Liu, C.; Hu, H.; Zhou, S.; Chen, X.; Hu, Y.; Hu, J. Change of Composition, Source Contribution, and Oxidative Effects of Environmental PM2.5 in the Respiratory Tract. Environ. Sci. Technol. 2023, 57, 11605–11611.

Legendre, P.; Fortin, M.J. Spatial Pattern and Ecological Analysis. Vegetatio 1989, 80, 107–138.

Brunsdon, C.; Fotheringham, A.S.; Charlton, M.E. Geographically Weighted Regression: A Method for Exploring Spatial Nonstationarity. Geogr. Anal. 1996, 28, 281–298.

Wen, D.; Huang, X.; Bovolo, F.; Li, J.; Ke, X.; Zhang, A.; Benediktsson, J.A. Change Detection from Very-High-Spatial-Resolution Optical Remote Sensing Images: Methods, Applications, and Future Directions. IEEE Geosci. Remote Sens. Mag. 2021, 9, 68–101.

Sokhi, R.S.; Singh, V.; Querol, X.; Finardi, S.; Targino, A.C.; Andrade, M.D.F.; Pavlovic, R.; Garland, R.M.; Massagué, J.; Kong, S.; et al. A Global Observational Analysis to Understand Changes in Air Quality during Exceptionally Low Anthropogenic Emission Conditions. Environ. Int. 2021, 157, 106818.

Tian, Z.; Wei, J.; Li, Z. How Important Is Satellite-Retrieved Aerosol Optical Depth in Deriving Surface PM2.5 Using Machine Learning? Remote Sens. 2023, 15, 3780.

Yang, X.; Yang, Y.; Xu, S.; Han, J.; Chai, Z.; Yang, G. A New Algorithm for Large-Scale Geographically Weighted Regression with K-Nearest Neighbors. ISPRS Int. J. Geo-Inf. 2023, 12, 295.

Zhang, C.; Yue, P.; Tapete, D.; Shangguan, B.; Wang, M.; Wu, Z. A Multi-Level Context-Guided Classification Method with Object-Based Convolutional Neural Network for Land Cover Classification Using Very High Resolution Remote Sensing Images. Int. J. Appl. Earth Obs. Geoinf. 2020, 88, 102086.

Figure 1. The spatial distribution of PM2.5 concentration monitored by ground-based stations over the CONUS in 2022.

Figure 2. Mean distribution of PM2.5 driving factors across the CONUS in 2022. (a) AOD: Aerosol Optical Depth; (b) NDVI: Normalized Difference Vegetation Index; (c) TMP: temperature; (d) SPFH: specific humidity; (e) WIND: wind speed; (f) HGT: terrain elevation.

Figure 3. The FGWNN network architecture for PM2.5 inversion.

Figure 4. Comparison of the effects before and after implementing the uniformization strategy. (a) Non-uniformization GN; (b) Uniformization GN; (c) Learning curve under non-uniformization mode; (d) Learning curve under uniformization mode; (e) Estimated local R² for non-uniformization mode; (f) Estimated local R² for uniformization mode.

Figure 5. FGN number optimization (marked by red dotted circle) and comparison of computational cost changes.

Figure 6. Scatter plots of estimated versus observed PM2.5 for the (a) MLR, (b) ANN, (c) GWR, and (d) FGWNN models.

Figure 7. Trends in global regression metrics across the four seasons of 2022.

Figure 8. Boxplots of local regression metrics across the four seasons of 2022. (a) Local RMSE in Spring; (b) Local RMSE in Summer; (c) Local R² in Spring; (d) Local R² in Summer; (e) Local RMSE in Autumn; (f) Local RMSE in Winter; (g) Local R² in Autumn; (h) Local R² in Winter.

Figure 9. Local RMSE results of the FGWNN model for the full year and each season of 2022.

Figure 10. Satellite-derived mapping of ground-level PM2.5 concentration over CONUS in 2022.

Figure 11. Three states of SDF. (a) Sparse state; (b) Biased state; (c) Ideal state: uniform and dense placement of FGNs.

Figure 12. Spatial resolutions corresponding to the different study region levels. (a) CONUS; (b) Pacific Division; (c) California State; (d) Los Angeles County.

Table 1. Multicollinearity diagnosis result.

Variable	Unstandardized Beta	Std. Error	Standardized Beta	t	Sig.	Tolerance	VIF
(Constant)	−12.468	1.016		−12.268	0.000
AOD	70.799	2.727	0.589	25.958	0.000	0.510	1.960
NDVI	18.214	0.888	0.845	20.509	0.000	0.154	6.476
TMP	0.878	0.029	1.256	30.649	0.000	0.156	6.399
SPFH	−1656.648	69.137	−1.167	−23.962	0.000	0.111	9.039
WIND	0.192	0.095	0.051	2.014	0.044	0.408	2.452
HGT	0.000	0.000	0.095	3.554	0.000	0.364	2.746

Table 2. Comparison of PM2.5 regression with different models over the CONUS in 2022.

Model	Learning rate	Time	RMSE	Loss	R²	Adjusted R²
MLR		28	1.794	741.982	0.168	0.158
ANN	0.001	93	1.512	527.249	0.409	0.401
GWR		86	1.061	259.493	0.709	0.705
GNNWR	0.001	109	0.604	84.085	0.906	0.905
FGWNN	0.001	161	0.581	77.887	0.913	0.912

Note: Time is in seconds (s), and RMSE is in μg/m³.

Table 3. RMSE results for the full year and each season of 2022.

	Annual	Spring	Summer	Autumn	Winter
Stations	473	403	401	396	410
Global RMSE (μg/m³)	0.581	0.644	0.534	0.684	0.989
Local RMSE < 1.0 (%)	82.7	80.1	86.0	78.5	64.6

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wang, D.; Cao, J.; Zhang, B.; Zhang, Y.; Xie, L. A Novel Flexible Geographically Weighted Neural Network for High-Precision PM2.5 Mapping across the Contiguous United States. ISPRS Int. J. Geo-Inf. 2024, 13, 217. https://doi.org/10.3390/ijgi13070217

