Article

Inversion of Forest Aboveground Biomass in Regions with Complex Terrain Based on PolSAR Data and a Machine Learning Model: Radiometric Terrain Correction Assessment

by Yonghui Nie 1, Rula Sa 1, Sergey Chumachenko 2, Yifan Hu 1, Youzhu Wang 1 and Wenyi Fan 1,*
1 Key Laboratory of Sustainable Forest Ecosystem Management (Ministry of Education), School of Forestry, Northeast Forestry University, Harbin 150040, China
2 Head of Department of Forest Management, GIS of Bauman Moscow State Technical University, 2-Ya Baumanskaya Ulitsa, 5c1, Moscow 105005, Russia
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(12), 2229; https://doi.org/10.3390/rs16122229
Submission received: 2 May 2024 / Revised: 10 June 2024 / Accepted: 18 June 2024 / Published: 19 June 2024
(This article belongs to the Special Issue SAR for Forest Mapping III)

Abstract

The accurate estimation of forest aboveground biomass (AGB) in areas with complex terrain is very important for quantifying the carbon sequestration capacity of forest ecosystems and studying the regional or global carbon cycle. In our previous research, we proposed the radiometric terrain correction (RTC) process for introducing normalized correction factors, which has strong effectiveness and robustness in terms of the backscattering coefficient of polarimetric synthetic aperture radar (PolSAR) data and the monadic model. However, the impact of RTC on the correctness of feature extraction and the performance of regression models requires further exploration in the retrieval of forest AGB based on a machine learning multiple regression model. In this study, based on PolSAR data provided by ALOS-2, 117 feature variables were accurately extracted using the RTC process, and then Boruta and recursive feature elimination with cross-validation (RFECV) algorithms were used to perform multi-step feature selection. Finally, 10 machine learning regression models and the Optuna algorithm were used to evaluate the effectiveness and robustness of RTC in improving the quality of the PolSAR feature set and the performance of the regression models. The results revealed that, compared with the situation without RTC treatment, RTC can effectively and robustly improve the accuracy of PolSAR features (the Pearson correlation R between the PolSAR features and measured forest AGB increased by 0.26 on average) and the performance of regression models (the coefficient of determination R2 increased by 0.14 on average, and the rRMSE decreased by 4.20% on average), but there is a certain degree of overcorrection in the RTC process. In addition, in situations where the data exhibit linear relationships, linear models remain a powerful and practical choice due to their efficient and stable characteristics. 
For example, the optimal regression model in this study is the Bayesian Ridge linear regression model (R2 = 0.82, rRMSE = 18.06%).

1. Introduction

Forest aboveground biomass (AGB) is a core parameter for monitoring forest ecosystems and assessing carbon sink capacity. Therefore, it plays an important role in quantifying the carbon sequestration capacity of forest ecosystems and studying regional and global carbon cycles [1,2]. However, it is challenging to accurately estimate forest AGB, especially in regions with complex and variable topography, which increases the uncertainty of assessments of forest ecosystem productivity. Accuracy is important for improving forest resource status surveys [3,4]. Remote sensing technology, including optical remote sensing, microwave remote sensing, and light detection and ranging (LiDAR), has become the most practical way to estimate forest AGB because it is long-term, efficient, and economical for acquiring large-scale spatial distribution information about forests [5,6,7]. Among the potential techniques, synthetic aperture radar (SAR) has been widely used for the dynamic monitoring of forest AGB in large-scale areas due to its good performance in all weathers, sustainable observation, and cloud penetration [8,9,10]. Although multi-source remote sensing data can significantly improve the inversion accuracy of forest AGB [11,12,13,14,15], it is still crucial to explore AGB estimation methods relying only on SAR data. Especially in areas where optical remote sensing is limited due to frequent cloud and fog, SAR becomes an irreplaceable and effective tool [16].
At present, there are four common technologies for estimating forest AGB based on SAR data: interferometric SAR (InSAR) [17], polarimetric SAR (PolSAR) [18,19], polarimetric interferometric SAR (PolInSAR) [20,21,22], and tomographic SAR (TomoSAR) [23,24]. PolSAR technology is widely used in large-scale inversion of forest AGB because it does not rely on interferometry and therefore does not need to meet the strict requirements of repeat-orbit observation and accurate time synchronization, which relaxes the conditions for data acquisition. In addition, the modeling methods for estimating forest AGB based on PolSAR data can be divided into scattering mechanism methods [25], machine learning methods [26,27,28], and deep learning methods [29,30,31]. Scattering mechanism methods (such as the water cloud model (WCM)) [32,33] are limited because a simplified physical model can hardly describe the real scattering characteristics of a forest with a complex structure. Therefore, the universal applicability of this kind of model may be poor in a forest area with strong heterogeneity. Deep learning methods require a large amount of labeled data for model training. However, the actual measurement of forest biomass is often costly and difficult, so such models are prone to overfitting when samples are scarce, and they are also relatively difficult to interpret. Therefore, this study is based on PolSAR data, using a flexible and more adaptable machine learning method to estimate forest AGB.
The process of inversion of forest AGB by the machine learning method based on PolSAR data includes three key stages: firstly, accurate extraction and derivation of PolSAR feature parameters; secondly, selection of the optimal feature subset; and, finally, selection of the most suitable regression model and algorithm. Accurate extraction of PolSAR features allows us to mine the information contained in the original data and generate new features based on physical mechanisms or statistical methods. This enhances the ability of the statistical model to reflect the complex relationship between forest AGB and PolSAR features [34]. Backscattering coefficient and polarization decomposition parameters are the most commonly used PolSAR features in estimating forest AGB, with their extraction theories having been extensively studied in literature [35,36,37]. However, due to the side-view imaging principle of the SAR system and the characteristics of tilt distance being inferred based on echo delay, the actual topographic relief and other factors can cause serious interference to the scattered echo information received by SAR. This can limit the accuracy and reliability of the features extracted from PolSAR data [38]. Therefore, fully considering the influence of terrain and implementing effective terrain correction is essential to improving the accuracy of PolSAR features. At present, the radiometric terrain correction (RTC) process designed for PolSAR data is a relatively complete solution for terrain correction [39], which includes polarization orientation angle correction (POAC), effective scattering area correction (ESAC), and angular variation effect correction (AVEC). We introduce normalized ESAC factors on the basis of RTC to further improve the RTC process [40]. 
Although the RTC method has shown excellent performance in the correction of backscattering coefficient and its effectiveness and robustness have been widely recognized [41,42], there is a lack of in-depth research on the correction effect and quantitative analysis of the polarization decomposition parameters of PolSAR data by RTC. In addition, the effect of the fusion of the RTC process with different machine learning algorithms and potential improvements need to be discussed further.
Feature selection is essential to achieve robust and high-precision estimation of forest AGB based on PolSAR data [43,44,45]. The data and features determine the upper limit of machine learning, while models and algorithms only approach this upper limit [46]. The purpose of selecting the optimal feature subset is to remove invalid, redundant, and interference-producing variables from many potential feature variables, and solve the multicollinearity problem between the features synchronously. The aim is to select the feature set with the greatest information value and prediction efficiency so as to improve the efficiency and accuracy of the model construction [47,48,49,50]. At present, the methods of feature selection are mainly divided into three categories: filter, wrapper, and embedded algorithms [51,52]. In addition, a hybrid feature selection algorithm combining the advantages of multiple feature selection methods through multi-step screening has also been proposed [53,54,55].
The selection of appropriate statistical models can help to build models that accurately reflect the complex relationship between AGB and remote sensing data, thus improving the reliability and accuracy of carbon storage assessment. At present, according to the different construction methods of statistical models and the characteristics of parameter assumptions, regression models based on machine learning can be divided into two categories: parametric models and non-parametric models [5,56]. Non-parametric models can flexibly adapt to various complex data structures without pre-setting the specific form of the model, and thus show excellent adaptability and robustness. However, the parametric model is still a competitive regression analysis tool because of its clear explanations and high computational efficiency [57]. In addition, in the inversion of forest AGB based on PolSAR characteristics, a widely accepted view is that forest AGB has the best correlation with the backscattering coefficient of HV polarization channels, but there is no direct and simple linear relationship between the two; in fact, a logarithmic function may be the more appropriate relationship [58]. Moreover, our previous research focused on the potential of RTC processes to improve the accuracy of forest AGB inversion based on univariate models [40]. Therefore, this study will not focus on the construction of monadic linear or nonlinear models, but instead will seek multivariate and high-level nonlinear model structures that can capture this complex relationship.
This research aims to achieve three key tasks: (1) quantitatively evaluate the ability of the RTC process to enhance the correlation between forest AGB and PolSAR features (backscattering coefficient, polarization decomposition parameters, and related derived features); (2) explore the effect of RTC on polarization decomposition parameters and the underlying mechanism of action; and (3) evaluate the potential of the RTC process to improve the performance of forest AGB inversion based on machine learning models, and identify the optimal model by comparing the performance of different models.

2. Materials

2.1. Study Area

The study area is located in Saihanba Forest Farm (116°53′~117°43′E, 41°55′~42°35′N, Figure 1) in the border area of the northern mountains of Weichang County, Hebei Province, China, with an altitude of 1010 to 1940 m. The study area covers a wide range of vegetation types, including deciduous coniferous forest, evergreen coniferous forest, broad-leaved forest, scrubland, and grassland. Moreover, the topography of the study area is complex, with the forest cover being a combination of flat woodlands and steep hillsides with slopes up to 55.47°. This variety of terrain conditions provides an ideal natural laboratory for conducting comparative studies of complex terrain. In addition, the study area contains a large-scale artificial forest covering about 93,461 hectares, which makes it convenient for researchers to collect rich field data on site.

2.2. PolSAR Data and Pre-Processing

The PolSAR data used in this study are L-band full-polarization single-look complex (SLC) data (Table 1) obtained by the Advanced Land Observing Satellite-2 (ALOS-2) (developed by the Japan Aerospace Exploration Agency (JAXA)) on 11 July, 25 July, and 8 August 2020. As the only platform currently equipped with a satellite-borne L-band SAR sensor, ALOS-2’s four-polarization observation mode can significantly enhance its ability to capture complex forest structures and scattering mechanisms. Moreover, compared with C-band, L-band has stronger penetration and can more accurately reveal the details of the forest internal structure, which is conducive to improving the accuracy of forest AGB.
In this study, the pre-processing of the PolSAR data was completed in the Sentinel Application Platform (SNAP_v9.0.0) software developed by the European Space Agency and comprised the following steps: cross-channel SNR correction, Faraday rotation correction, radiometric calibration for complex data (Equation (2), derived from the radiometric calibration of real data, Equation (1)) [35,40], multi-look averaging of 4 × 9 (range × azimuth), refined Lee filtering, and Range Doppler terrain correction. The ensemble average size for the covariance matrix is crucial, especially when studying natural vegetation. For natural vegetation, when the reflection symmetry condition remains unchanged, the volume scattering power induced by vegetation will improve with the increase in the average ensemble size [59]. The purpose of choosing 4 × 9 (range × azimuth) as the core parameter of multi-look processing is to make the pixels of the PolSAR data approximately square and as close as possible in size to the reference DEM data and the measured plots.
$$\sigma^{0}_{slc} = 10\log_{10}\left(I^{2} + Q^{2}\right) + CF_{1} - A, \tag{1}$$
where $\sigma^{0}_{slc}$ is the backscattering coefficient after radiometric calibration; I and Q represent the real and imaginary parts of SLC 1.1 products, respectively; CF1 is −83.0 dB; and A is 32.0 dB.
$$I_{cal} = \frac{I}{10^{5.75}}; \quad Q_{cal} = \frac{Q}{10^{5.75}}, \tag{2}$$
where Ical and Qcal are the real and imaginary parts of SLC 1.1 products after radiometric calibration.
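The link between Equations (1) and (2) can be checked numerically. The sketch below is only an illustration: the function names and the sample I/Q values are hypothetical, and it assumes that the exponent 5.75 in Equation (2) comes from (A − CF1)/20 = 115/20, so that the calibrated complex samples reproduce the dB value of Equation (1).

```python
import math

CF1_DB = -83.0  # calibration factor quoted in the text (dB)
A_DB = 32.0     # conversion factor quoted in the text (dB)


def sigma0_db(i: float, q: float) -> float:
    """Equation (1): backscattering coefficient (dB) from uncalibrated I and Q."""
    return 10.0 * math.log10(i * i + q * q) + CF1_DB - A_DB


def calibrate_iq(i: float, q: float):
    """Equation (2): scale I and Q by 10**-5.75 (assumed to equal 10**((CF1 - A) / 20))."""
    scale = 10.0 ** ((CF1_DB - A_DB) / 20.0)
    return i * scale, q * scale


# Hypothetical digital numbers for one SLC pixel.
i_raw, q_raw = 30000.0, -40000.0
i_cal, q_cal = calibrate_iq(i_raw, q_raw)

direct_db = sigma0_db(i_raw, q_raw)
via_complex_db = 10.0 * math.log10(i_cal ** 2 + q_cal ** 2)
print(round(direct_db, 3), round(via_complex_db, 3))
```

Calibrating the complex samples rather than the intensity keeps the phase information that the later polarimetric decompositions rely on.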

2.3. Ground-Measured Forest AGB

The measured forest AGB data were collected through a field survey conducted at Saihanba Forest Farm in August 2020 for the training and validation of the forest AGB inversion model. Through screening the key stand characteristics of the plots, including tree species composition, understory environment, topographic slope, and stand thinning density, 132 rhomboid plots (Figure 1) with an area of 0.06 ha (24.49 m × 24.49 m) were established and investigated, including 28 plots distributed on a 1.0 km grid and 104 plots with a relatively uniform distribution. All plots were located using differential GPS (Unistrong RTK-G10, Unistrong, Beijing, China), and plot information such as plot type, plant number, and slope, as well as individual information of trees with a diameter at breast height (DBH) ≥ 5.0 cm, was recorded, including DBH, tree height, and tree species. Finally, the biomass per plant was calculated separately according to the allometric equation of different tree species (Table 2); then the total forest AGB of the plot was obtained by summing (Equation (3)) [40] all the biomass per plant. Table 3 provides statistical details of the field data for this study.
$$AGB_{j} = \sum_{i=1}^{m} W_{i} / \left(1000 \times S\right), \tag{3}$$
where AGBj is the AGB (t/ha) of the jth sample plot, m is the number of trees in the plot, Wi is the AGB of the ith tree (kg), and S is the area of the sample plot (S = 0.06 ha).
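As a small worked example of Equation (3), the sketch below converts per-tree biomass in kilograms into plot-level AGB in t/ha. The tree values are invented for illustration; in the study they would come from the species-specific allometric equations of Table 2.

```python
PLOT_AREA_HA = 0.06  # plot area S stated in the text


def plot_agb_t_per_ha(tree_biomass_kg, area_ha=PLOT_AREA_HA):
    """Equation (3): sum tree biomass (kg), convert to tonnes, divide by plot area (ha)."""
    return sum(tree_biomass_kg) / (1000.0 * area_ha)


# Hypothetical per-tree AGB values (kg) for one plot.
trees_kg = [85.0, 120.5, 64.5, 230.0]
agb = plot_agb_t_per_ha(trees_kg)
print(f"plot AGB = {agb:.2f} t/ha")
```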

2.4. SRTM DEM

Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM) data with a resolution of 30 m were used to complement the PolSAR data to implement geocoding and calculate RTC-related factors.

3. Methods

The processing framework of this study (Figure 2) includes the following steps: (1) radiometric terrain correction (RTC) of PolSAR data, including polarization orientation angle correction (POAC), effective scattering area correction (ESAC), and angular variation effect correction (AVEC); (2) feature extraction and feature derivation based on PolSAR data, including backscattering coefficient, polarization decomposition parameters, and their related derived feature variables; and (3) feature selection and model training. All machine learning algorithms were executed in the Python 3.10 development environment. The programs that involved partitioning the dataset used the default parameter (test_size = 0.25), which was 75% for the training set and 25% for the test set. The following sections present the necessary theories and related formulas for each stage in the flowchart in more detail.

3.1. Radiometric Terrain Correction for PolSAR

3.1.1. Polarization Orientation Angle Correction

When there is topographic relief in the target area, the polarization orientation angle (POA)—that is, the angle between the long axis of the polarization ellipse and the horizontal axis of the ground plane—will be biased due to the slope (mainly the azimuth slope). In order to compensate for this deviation of POA caused by terrain, the POA extracted by the circular polarization method [39,65] is used first, and then the three-dimensional polarization covariance (C3) matrix is compensated (Equation (4)).
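$$
\begin{aligned}
C_{3\_POAC} &= U_{3}(\eta)\, C_{3}\, U_{3}(\eta)^{-1} \\
U_{3}(\eta) &= \frac{1}{2}\begin{bmatrix} 1+\cos 2\eta & \sqrt{2}\sin 2\eta & 1-\cos 2\eta \\ -\sqrt{2}\sin 2\eta & 2\cos 2\eta & \sqrt{2}\sin 2\eta \\ 1-\cos 2\eta & -\sqrt{2}\sin 2\eta & 1+\cos 2\eta \end{bmatrix} \\
\eta &= \frac{1}{4}\left[-1 \times \mathrm{Arg}\left(\frac{4\,\mathrm{Re}\left(\left(S_{HH}-S_{VV}\right)S_{HV}^{*}\right)}{-\left|S_{HH}-S_{VV}\right|^{2}+4\left|S_{HV}\right|^{2}}\right)+\pi\right],
\end{aligned} \tag{4}
$$
where C3 and C3_POAC are the C3 matrix before and after polarization orientation angle correction (POAC), respectively; η is the polarization orientation angle (POA); Arg(·) is a phase function; and when η > π/4, η = η − π/2.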
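The rotation in Equation (4) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the processing chain used in the study: the rotation matrix is taken in its commonly published form (with √2 off-diagonal terms and the signs shown in the code), the sign convention of the rotation depends on how η is defined, and the sample C3 matrix and angle are hypothetical. Because U3(η) is real and orthogonal, its inverse is its transpose.

```python
import math


def rotation_u3(eta):
    """Rotation matrix U3(eta) in the commonly published form (assumed here)."""
    c, s = math.cos(2.0 * eta), math.sin(2.0 * eta)
    r2 = math.sqrt(2.0)
    return [
        [0.5 * (1.0 + c), 0.5 * r2 * s, 0.5 * (1.0 - c)],
        [-0.5 * r2 * s, c, 0.5 * r2 * s],
        [0.5 * (1.0 - c), -0.5 * r2 * s, 0.5 * (1.0 + c)],
    ]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def wrap_poa(eta):
    """Wrap the estimated angle as stated in the text: eta > pi/4 becomes eta - pi/2."""
    return eta - math.pi / 2.0 if eta > math.pi / 4.0 else eta


def poa_compensate(c3, eta):
    """C3_POAC = U3 C3 U3^-1, using U3^-1 = U3^T for a real orthogonal matrix."""
    u = rotation_u3(eta)
    return matmul(matmul(u, c3), transpose(u))


# Hypothetical Hermitian C3 matrix for a single pixel.
c3 = [
    [1.0, 0.10 + 0.05j, 0.30 - 0.20j],
    [0.10 - 0.05j, 0.2, 0.02 + 0.01j],
    [0.30 + 0.20j, 0.02 - 0.01j, 0.8],
]
c3_rot = poa_compensate(c3, wrap_poa(0.3))
span_before = sum(c3[i][i] for i in range(3))
span_after = sum(c3_rot[i][i] for i in range(3))
print(abs(span_before), abs(span_after))
```

The rotation redistributes power among the channels while leaving the total power (the trace of C3) unchanged, which is a convenient sanity check for any implementation.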

3.1.2. Effective Scattering Area Correction

The effective scattering area is the projected area of the actual ground element on the isophase surface. However, due to the difference in terrain slope, ground units of the same SAR data in the geographical coordinate system correspond to different effective scattering areas. Therefore, effective scattering area correction (ESAC) takes flat land as a reference to correct the area effect of SAR data under different slope conditions so as to eliminate the influence of terrain slope on scattering information [39,40]. The C3 matrix can be corrected using Equation (5):
$$C_{3\_ESAC} = C_{3} \cdot \frac{\sin\theta_{loc}}{\sin\theta_{ref}}, \tag{5}$$
where C3_ESAC is the C3 matrix after ESAC; C3 refers to the C3 matrix without ESAC; θloc is the local incidence angle; and θref is the radar incidence angle.
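A minimal sketch of the area correction follows, reading Equation (5) as a per-pixel scaling of every C3 element by sin θloc / sin θref. That reading, the function names, and the angles used are assumptions made for illustration.

```python
import math


def esac_factor(theta_loc_deg, theta_ref_deg):
    """Scaling factor sin(theta_loc) / sin(theta_ref); equals 1 on flat terrain."""
    return math.sin(math.radians(theta_loc_deg)) / math.sin(math.radians(theta_ref_deg))


def apply_esac(c3, theta_loc_deg, theta_ref_deg):
    """Scale every element of a 3x3 C3 matrix by the ESAC factor."""
    factor = esac_factor(theta_loc_deg, theta_ref_deg)
    return [[factor * value for value in row] for row in c3]


# Hypothetical diagonal C3 matrix and angles (degrees).
c3_pixel = [[0.20, 0.0, 0.0], [0.0, 0.05, 0.0], [0.0, 0.0, 0.15]]
toward_radar = apply_esac(c3_pixel, theta_loc_deg=20.0, theta_ref_deg=35.0)
away_from_radar = apply_esac(c3_pixel, theta_loc_deg=50.0, theta_ref_deg=35.0)
print(toward_radar[1][1], away_from_radar[1][1])
```

Under this reading, slopes facing the sensor (small local incidence angle) are scaled down and slopes facing away are scaled up, with flat land left unchanged as the reference.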

3.1.3. Angular Variation Effect Correction

The angular variation effect (AVE) refers to the phenomenon whereby the local incidence angle changes due to a change in terrain, which leads to a change in the scattering mechanism of the target object (such as forest). The angular variation effect correction (AVEC) is typically based on the cosine model (Equation (6)) to correct forest-covered areas. In this study, the forest cover area was first extracted by Freeman3 decomposition and Wishart unsupervised classification [66], and then the optimal n value matrix of the C3 matrix (different elements correspond to different optimal n values) was obtained from the optimal n values of the main diagonal elements of the C3 matrix. For each of the three main diagonal elements, the optimal n value was determined by evaluating f(n) over a range of candidate n values and selecting the n that minimizes f(n), i.e., the absolute value of the correlation coefficient between the local incidence angle and the corrected intensity [39].
$$
\begin{aligned}
C_{3\_AVEC} &= C_{3} \odot \left(\frac{\cos\theta_{ref}}{\cos\theta_{loc}}\right)^{n_{pq}} = C_{3} \odot \begin{bmatrix} k^{n_{HH}} & k^{\frac{n_{HH}+n_{HV}}{2}} & k^{\frac{n_{HH}+n_{VV}}{2}} \\ k^{\frac{n_{HH}+n_{HV}}{2}} & k^{n_{HV}} & k^{\frac{n_{HV}+n_{VV}}{2}} \\ k^{\frac{n_{HH}+n_{VV}}{2}} & k^{\frac{n_{HV}+n_{VV}}{2}} & k^{n_{VV}} \end{bmatrix} \\
n_{pq} &= \operatorname{argmin}\{f(n)\} = \operatorname{argmin}\left\{\left|\rho\left(\theta_{loc}, C_{3\_AVEC\_pq}\right)\right|\right\},
\end{aligned} \tag{6}
$$
where p, q ∈ {H, V} denote the polarization modes of incidence and scattering of electromagnetic waves; k = cos θref / cos θloc; npq is the optimal n value of different polarization channels; ⊙ is the Hadamard product; argmin{·} is the operation that returns the value of n at which its argument is smallest; f(n) is the absolute value of the correlation coefficient between the local incidence angle θloc and C3_AVEC_pq; C3_AVEC_pq is the decibel state of the intensity value after AVEC is applied to one of the principal diagonal elements of the C3 matrix at a certain value of n; and ρ(·) is the correlation coefficient function.
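The search for the optimal n of one diagonal element can be illustrated with a simple grid search. The sketch assumes the cosine model (cos θref / cos θloc)^n and takes f(n) as the absolute Pearson correlation between θloc and the corrected intensity in dB, as described above. The candidate grid, the reference angle, and the synthetic pixel values are hypothetical; the synthetic data are built with a known exponent of 1.5 so that the result can be verified.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def avec_db(intensity, theta_loc_deg, theta_ref_deg, n):
    """Apply the cosine-model correction with exponent n and return values in dB."""
    cos_ref = math.cos(math.radians(theta_ref_deg))
    return [
        10.0 * math.log10(v * (cos_ref / math.cos(math.radians(t))) ** n)
        for v, t in zip(intensity, theta_loc_deg)
    ]


def optimal_n(intensity, theta_loc_deg, theta_ref_deg, candidates):
    """Return the candidate n that minimizes f(n) = |rho(theta_loc, corrected dB)|."""
    return min(
        candidates,
        key=lambda n: abs(pearson(theta_loc_deg, avec_db(intensity, theta_loc_deg, theta_ref_deg, n))),
    )


# Synthetic forest pixels: a terrain trend with exponent 1.5 plus a small
# perturbation (dB) constructed to be uncorrelated with the incidence angle.
theta_ref = 35.0
theta_loc = [20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0, 55.0]
noise_db = [0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5, 0.5]
cos_ref = math.cos(math.radians(theta_ref))
intensity = [
    0.05 * 10.0 ** (nd / 10.0) * (math.cos(math.radians(t)) / cos_ref) ** 1.5
    for t, nd in zip(theta_loc, noise_db)
]

candidates = [i / 10.0 for i in range(0, 31)]  # n = 0.0, 0.1, ..., 3.0
best_n = optimal_n(intensity, theta_loc, theta_ref, candidates)
print("optimal n:", best_n)
```

In the full correction, this search would be repeated for each of the three diagonal elements, with the off-diagonal exponents taken as the mean of the two corresponding diagonal exponents, as in the matrix of Equation (6).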

3.2. Feature Extraction and Feature Derivation of PolSAR

All 118 PolSAR features extracted in this study were based on the Polarimetric SAR Data Processing and Educational Toolbox (PolSARpro_v5.1.1) and the Sentinel Application Platform (SNAP_v9.0.0).

3.2.1. Backscattering Coefficient Features of PolSAR

This study extracted the intensity values ($\sigma^{0}_{HH}$, $\sigma^{0}_{HV}$, and $\sigma^{0}_{VV}$) and decibel state values ($\sigma^{0}_{HH}$_db, $\sigma^{0}_{HV}$_db, and $\sigma^{0}_{VV}$_db) of the three main diagonal elements of the C3 matrix as the six original backscatter coefficient features, and then calculated 20 classical derived features (Table 4).

3.2.2. Polarization Decomposition Features of PolSAR

In this study, 72 original polarization decomposition-related features in linear units and decibel units were extracted based on 12 polarization decomposition methods (Table 5), and then we calculated 20 derived features (Table 6). The Volume scattering component (Vol) of the Pauli three-component (PAU3) in this study is exactly the same as σ0HV, so only the Surface scattering component (Odd) and the Double-bounce scattering component (Dbl) of the PAU3 decomposition method were used.

3.3. Forest AGB Regression Modeling Algorithms and Model Evaluation

In this study, 10 multiple regression models (Table 7) were selected to evaluate the effectiveness and robustness of RTC on the performance of different machine learning regression models. Multiple regression models are statistical and machine learning methods used to analyze the relationship between a continuous dependent variable (response variable) and two or more independent variables (feature variables), which can be divided into multiple linear regression models and non-parametric regression models. Non-parametric regression models do not depend on the form of the model, while the multiple linear regression model has an explicit parametric form (Equation (7)) [67]:
$$Y = \alpha_{0} + \alpha_{1}X_{1} + \alpha_{2}X_{2} + \cdots + \alpha_{n}X_{n} + \varepsilon, \tag{7}$$
where Y is the target variable of the prediction (decibels of forest AGB); X1, X2, …, Xn are the predictor variables; α0 is a constant; α1, α2, …, αn are the regression coefficients associated with the corresponding variables; n is the number of the predictor variables; and ε is the error term.
The machine learning process in this study included the following steps (implementation details shown in Table 8): (1) The random forest regression model (RF) based on the default parameters served as the base model for Boruta’s algorithm [87] and performed the initial feature filtering on the entire dataset instead of the training set. The goal is to identify and remove features that are irrelevant or weakly related to the target variable. (2) First, the Optuna algorithm (n_trials = 500, including 10-fold cross-validation) was used to obtain the optimal hyperparameters of 10 regression models on the feature set after the initial feature selection. Then, the 10 regression models based on the optimal hyperparameters corresponding to different models were used as the base model for recursive feature elimination with cross-validation (RFECV) feature selection, and the feature selection of the second step was completed. The aim is to iteratively search the optimal feature subset of the base model. (3) Again, the Optuna algorithm was used to obtain and output the corresponding evaluation indicators (based on the test set, i.e., 25% of all sample plots) of optimal performance of 10 regression models on the feature set after two-step feature selection. These evaluation indicators included the coefficient of determination (R2, Equation (8)), the root mean square error (RMSE, Equation (9)), and the relative root mean square error (rRMSE, Equation (10)) [18,27,42]. This was repeated 10 times and we calculated the average of the above indicators. This is because the performance of the model largely depends on the selected hyperparameters; however, the values of these hyperparameters are often sensitive to the dataset, so the hyperparameters of Optuna need to be tuned separately for different feature sets. In addition, since random states cannot be completely controlled, they were not fixed in this study. 
This means that the initial state of model training and possible random factors in each experiment followed a natural (random) process, which was more in line with the real situation. However, through cross-validation and repeated tests, randomness can be reduced and a more stable and representative evaluation index can be obtained.
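The second feature selection step and the final evaluation can be sketched with scikit-learn. This is a simplified illustration, not the study's pipeline: it uses synthetic data in place of the PolSAR features, omits the Boruta pre-filter and the Optuna hyperparameter search, fits RFECV on the training split only, and fixes the random seeds so the example is reproducible (the study states that random states were not fixed). BayesianRidge is chosen as the base model because it is reported as the best-performing model in the abstract.

```python
import numpy as np
from sklearn.feature_selection import RFECV
from sklearn.linear_model import BayesianRidge
from sklearn.model_selection import KFold, train_test_split

# Synthetic stand-in for the feature table: 132 plots, 8 candidate features,
# of which only the first two actually drive the target.
rng = np.random.default_rng(0)
n_samples, n_features = 132, 8
X = rng.normal(size=(n_samples, n_features))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.3, size=n_samples)

# 75% / 25% split, matching test_size = 0.25 mentioned in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Recursive feature elimination with 10-fold cross-validation.
selector = RFECV(
    estimator=BayesianRidge(),
    step=1,
    cv=KFold(n_splits=10, shuffle=True, random_state=0),
    scoring="r2",
)
selector.fit(X_train, y_train)

# Refit on the selected features and score on the held-out test set.
model = BayesianRidge().fit(X_train[:, selector.support_], y_train)
test_r2 = model.score(X_test[:, selector.support_], y_test)
print("selected features:", np.flatnonzero(selector.support_), "test R2:", round(test_r2, 3))
```

Repeating the whole procedure several times without fixed seeds and averaging the test metrics, as described above, would smooth out the variation introduced by the random split.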
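$$R^{2} = 1 - \frac{\sum_{i=1}^{N}\left(y_{i} - \hat{y}_{i}\right)^{2}}{\sum_{i=1}^{N}\left(y_{i} - \bar{y}\right)^{2}}, \tag{8}$$
$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_{i} - \hat{y}_{i}\right)^{2}}, \tag{9}$$
$$rRMSE = RMSE / \bar{y} \times 100\%, \tag{10}$$
where N is the sample size, $y_{i}$ represents the true observed value, $\bar{y}$ represents the average of the true observed value, and $\hat{y}_{i}$ represents the predicted value.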
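The three evaluation indicators of Equations (8) to (10) can be written directly in plain Python. The observed and predicted values below are made up for illustration only.

```python
import math


def r_squared(y_true, y_pred):
    """Equation (8): coefficient of determination."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Equation (9): root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def rrmse(y_true, y_pred):
    """Equation (10): RMSE relative to the mean observed value, in percent."""
    return rmse(y_true, y_pred) / (sum(y_true) / len(y_true)) * 100.0


# Hypothetical observed and predicted AGB values (t/ha).
observed = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]
print(r_squared(observed, predicted), rmse(observed, predicted), rrmse(observed, predicted))
```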

4. Results

4.1. The Impact of Radiometric Terrain Correction on PolSAR Features

In order to quantitatively analyze the impact of the radiometric terrain correction (RTC) process on the correlation between forest AGB and PolSAR features, we first set the non-RTC (NRTC) data as the blank control and the RTC data as the test data. The core correction factors of the RTC process are shown in our previous study [40]. Secondly, we extracted a total of 117 effective PolSAR features (as mRFDI_db is invalid) based on NRTC and RTC data. Subsequently, Pearson correlation coefficients (R) were calculated separately between forest AGB and each of the PolSAR features (Table A1, Appendix B). Finally, in order to compare and visually present the changes in the R-values of the NRTC data and RTC data and, at the same time, reveal the relationship between various PolSAR features and forest AGB correlation strength after RTC, we sorted and plotted them according to the absolute R-values of the data (25 July 2020) after RTC (Figure 3).
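The correlation screening described above amounts to computing Pearson's R between measured AGB and each feature, then ordering the features by |R|. A minimal sketch follows; the feature names are used only as labels and all numbers are invented.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical plot-level AGB (t/ha) and three hypothetical feature columns.
agb = [50.0, 100.0, 150.0, 200.0, 250.0]
features = {
    "sigma0_HV_db": [-18.0, -16.5, -15.2, -14.6, -14.0],
    "R_HH/HV": [6.5, 5.2, 6.0, 5.1, 4.9],
    "alpha": [40.0, 42.0, 39.0, 41.0, 40.0],
}

# Correlate every feature with AGB, then sort by absolute R (largest first).
r_values = {name: pearson_r(agb, values) for name, values in features.items()}
ranked = sorted(r_values, key=lambda name: abs(r_values[name]), reverse=True)
for name in ranked:
    print(f"{name}: R = {r_values[name]:+.2f}")
```

Sorting by the absolute value keeps strongly negative correlations near the top, which matters for ratio-type features whose values may decrease as biomass increases.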
The R-values of the RTC data increased to varying degrees in all but three of the original features (α, α_db, and A_db) (Figure 3a,b). The R-values of the 75 original features with an increased R-value rose by an average of 0.26, and the largest R-value increase was 0.48 ($\sigma^{0}_{HV}$), indicating that RTC processing can effectively improve the accuracy of most PolSAR features and the sensitivity of AGB inversion. However, among the original features, the three (H, H_db, and A) that belong to the same H-A-alpha (HAα) decomposition method as α, α_db, and A_db were relatively less affected by RTC. This shows that HAα decomposition can alleviate terrain disturbance. This is because the three parameters of the HAα decomposition are defined as functions of ratios of the three eigenvalues (the relative scattering power of different scattering mechanisms). Moreover, the R-values of the RTC data in all but four derived features (Span, Span_db, BMI, and BMI_db) (Figure 3c) were reduced to varying degrees. This suggests that the ratio operation (feature derived algorithm) of features can eliminate some multiplicative interference (including terrain interference). It also indicates that there is an overcorrection problem in the RTC process. Although the overcorrection problem of RTC is small, it causes an error of 0.05 in the mean value of the absolute value of the Pearson correlation coefficient between forest AGB and these 35 derived features.
In addition, the feature R_HH/HV (R = 0.76) has the largest R-value among all the features derived through ratio operations, but it is still 0.1 lower than $\sigma^{0}_{HV}$_db (R = 0.86), which has the highest R-value of all 117 PolSAR features. This suggests that, based only on the backscattering coefficient, the HV polarization mode carries more polarization information related to forest AGB. Although the ratio algorithm of features can mitigate the topographic effects, it will also weaken some AGB-related information. The second-largest R-value of all 117 PolSAR features is Y4V_db (R = 0.84), which indicates that the YAM4 decomposition method can decompose the scattering characteristics of the target into different scattering components relatively more accurately. Moreover, among the different components of the same polarization decomposition method, the Volume scattering component (Vol) usually has the largest R-value, indicating that the relationship between Vol and the scattering information in vegetation is relatively close. Furthermore, regardless of the decomposition method, the R-value of volume scattering in decibels is higher than that in linear units, which is consistent with the trend of previous studies [58].
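The observation that ratio features cancel multiplicative interference can be shown with a two-line calculation: if terrain scales both channels of a pixel by the same factor, the ratio of the channels is unchanged. The intensities and the gain below are hypothetical, and the assumption of an identical gain in both channels is an idealization.

```python
def ratio_feature(hh, hv):
    """Ratio of two linear-unit backscatter intensities, e.g. an HH/HV feature."""
    return hh / hv


# Hypothetical flat-terrain intensities and a hypothetical terrain-induced gain
# that multiplies both channels equally.
hh_flat, hv_flat = 0.20, 0.05
terrain_gain = 1.7

ratio_flat = ratio_feature(hh_flat, hv_flat)
ratio_sloped = ratio_feature(terrain_gain * hh_flat, terrain_gain * hv_flat)
print(ratio_flat, ratio_sloped)
```

The same cancellation removes any AGB signal that is common to both channels, which is in line with the lower R-value of the ratio feature compared with the HV backscatter in decibels.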

4.2. The Impact of Radiometric Terrain Correction on Polarization Decomposition Component

In order to analyze the influence of different stages of RTC on the removal of topographic factors in the polarization decomposition component, we took the PolSAR data from 25 July 2020 as an example and plotted a scatter density plot between the decibel values of the three components of the Freeman three-component decomposition (Freeman3) method and the local incidence angle. As can be seen from Figure 4, there is a relatively obvious linear relationship between each component and the local incidence angle in the NRTC and POAC stages; but with the implementation of ESAC and AVEC, this linear relationship becomes significantly weaker, which is consistent with the results of our previous research regarding the backscattering coefficient [40]. The results show that RTC is also effective at removing the effect of terrain on the polarization decomposition.
In order to further analyze the specific mechanism of influence of different stages of RTC (POAC, ESAC, and AVEC) on each component of polarization decomposition, taking PolSAR data from 25 July 2020 as an example, we plotted the scatter density (Figure 5) of each component of Freeman3 decomposition at each stage of different RTC relative to the previous stage.
In the comparison stage between POAC and NRTC (Figure 5a,e,i), POAC caused the intensity values of almost all pixels of the Vol component (Figure 5a) to decrease, while the intensity values of almost all pixels of the Odd (Figure 5e) and Dbl components (Figure 5i) showed an increasing trend. This suggests that the absence of POAC will lead to an overestimation of Vol and an underestimation of Odd and Dbl. In addition, the correction of the ESAC and AVEC stages did not cause the pixel intensity value of the polarization decomposition component to undergo an overall increase or decrease, but there was an increase or decrease in the pixel value in each component. This is because the correction factors of ESAC and AVEC stages depend on both the value of the local incidence angle of each pixel and its relationship with the radar incidence angle, which also indicates that the correction effects of ESAC and AVEC stages are closely related to the ground conditions in the study area. However, in the ESAC phase, most of the pixels that cause a large change in the intensity value of the pixel (away from the 1:1 line) were decreasing, while in the AVEC phase, the situation was the opposite. This indicates that going without ESAC will lead to a more serious overestimation of some pixels, while going without AVEC will lead to a more serious underestimation of some pixels. Moreover, the effect of RTC on the three polarization decomposition components cannot be described by a simple law of increase and decrease because the correction effect of RTC is the superposition of the three stages (Figure 5d,h,l).
In addition, in order to comprehensively evaluate the universality and robustness of the effect of RTC on polarization decomposition, we used the Yamaguchi three-component (YAM3) method for analysis and mapping (Figure A1, Appendix A). On the whole, the trend of the scatter density plots of the two polarization decomposition methods was the same in each stage except for slight differences caused by the different decomposition algorithms. This shows that the effect of RTC on the components of polarization decomposition is relatively robust in different polarization decomposition methods.

4.3. The Impact of Radiometric Terrain Correction on Regression Model Performance

To analyze the impact of RTC on the performance of machine learning models and to select the optimal regression model, we conducted feature selection (Figure A2, Appendix A) and model training (Table A2, Appendix B) for 10 regression models based on the NRTC data and the RTC data (25 July 2020), respectively. We then ranked the models in descending order of the R2 value obtained from training on the RTC data and plotted the results (Figure 6).
Compared with training using the RTC data (Figure 6a), R2 values obtained by different models based on training using the NRTC data are generally lower (0.14 lower on average), while rRMSE values are generally higher (4.2% higher on average), indicating that not implementing the RTC process will cause greater estimation error in forest AGB inversion. In addition, from the R2 values of different models trained on the NRTC data and RTC data, it can be found that the RTC process generally improves the performance of linear models more than non-parametric models, indicating that the existence of terrain factors will cause the real relationship between SAR information and forest AGB to become more complex and difficult to describe by a simple linear relationship. Meanwhile, RTC can effectively reduce the signal distortion caused by terrain factors and make SAR data more closely reflect the real situation of forest biomass, thus reducing the complexity of data.
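The averages quoted above can be checked against the per-model means listed in Table A2. The sketch below is a simple recomputation; the grouping of models into linear and non-parametric families is an assumption based on the usual classification of these algorithms. With the tabulated values, the mean R2 gap comes out at roughly 0.13 and the mean rRMSE gap at about 4.20 percentage points, close to the figures quoted in the text.

```python
# Values copied from Table A2: (R2_RTC, R2_NRTC, rRMSE_RTC %, rRMSE_NRTC %).
table_a2 = {
    "BysRidge":   (0.86334, 0.69453, 18.11756, 22.89453),
    "ARD":        (0.86213, 0.71534, 18.31867, 22.57860),
    "Lasso":      (0.85586, 0.68453, 18.13603, 23.37506),
    "ElasticNet": (0.84783, 0.68124, 18.51316, 22.56761),
    "Ridge":      (0.84704, 0.64753, 18.74611, 23.45273),
    "CatBoost":   (0.82521, 0.73013, 18.14501, 22.43760),
    "XGBoost":    (0.82202, 0.74273, 19.01550, 21.37638),
    "AdaBoost":   (0.81934, 0.69574, 18.37307, 22.45734),
    "ET":         (0.79859, 0.70750, 18.56786, 22.89376),
    "RF":         (0.78988, 0.68545, 18.53375, 22.45703),
}
# Assumed grouping: regularized linear models vs. tree-based ensembles.
linear_models = {"BysRidge", "ARD", "Lasso", "ElasticNet", "Ridge"}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def mean_r2_gain(names):
    """Average of (R2 with RTC) minus (R2 without RTC) over the given models."""
    return mean(table_a2[n][0] - table_a2[n][1] for n in names)

r2_gain_all = mean_r2_gain(table_a2)
rrmse_drop_all = mean(row[3] - row[2] for row in table_a2.values())
r2_gain_linear = mean_r2_gain(linear_models)
r2_gain_nonparam = mean_r2_gain(set(table_a2) - linear_models)

print(f"mean R2 gain: {r2_gain_all:.3f}, mean rRMSE drop: {rrmse_drop_all:.2f}")
print(f"linear: {r2_gain_linear:.3f}, non-parametric: {r2_gain_nonparam:.3f}")
```

Under this grouping the linear models gain more R2 from RTC than the non-parametric models, in line with the observation in the paragraph above.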
Figure 6b is a scatter plot of the measured forest AGB of 132 plots and the predicted value obtained using the BysRidge regression model (the largest R2 value in 10 repetitions of Optuna hyperparameter optimizations, Equation (11)), which had the best performance on the data of this study, indicating that the model has a strong ability to explain the data trend (R2 = 0.82). Its prediction error was relatively small (rRMSE = 18.06%), but there were still some problems of overestimation and underestimation. In addition, the multivariate model has a higher R2 value (0.82) compared to the R2 value of the univariate model (0.73 for the data dated 25 July 2020 in our previous study [40]), indicating that the multivariate model has a stronger advantage in retrieving forest AGB. Moreover, we evaluated the generalization ability of the model (Equation (11)) with respect to the PolSAR data from different dates (11 July and 8 August 2020), and created a scatter plot of the measured values and the predicted values of forest AGB calculated based on the model (Figure A3). Finally, the forest AGB in the whole range of the study area was predicted and mapped (Figure 6c) based on the trained model.
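For readers who want to reproduce the two evaluation indices from a set of measured and predicted values, the following is a minimal sketch. It assumes that rRMSE is the root-mean-square error divided by the mean of the measured values, expressed as a percentage; the four sample values are hypothetical and are not taken from the 132 plots.

```python
import math

def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rrmse_percent(observed, predicted):
    """Relative RMSE in percent (assumed normalization: mean of observed)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (sum(observed) / n)

# Hypothetical measured and predicted AGB values (t/ha).
measured = [50.0, 80.0, 110.0, 140.0]
predicted = [55.0, 75.0, 115.0, 135.0]

r2 = r2_score(measured, predicted)
rrmse = rrmse_percent(measured, predicted)
print(f"R2 = {r2:.3f}, rRMSE = {rrmse:.2f}%")
```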
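$$
\begin{aligned}
AGB(\mathrm{db}) = {} & 56.4598 + 3.5369 \times \sigma^{0}_{HV}\_\mathrm{db} + 1.5128 \times \mathrm{BMI\_db} + 3.5913 \times \mathrm{V3R\_2\_db} \\
& + 23.2417 \times H + 2.0974 \times \mathrm{mRFDI} + 2.7412 \times \mathrm{F3V\_db} - 6.1481 \times \mathrm{RVI} \\
& - 2.9493 \times \mathrm{Span\_db} - 0.5524 \times \mathrm{R\_VV/HV\_db} - 26.397 \times \mathrm{A3V}
\end{aligned}
\tag{11}
$$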
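Equation (11) can be evaluated directly once the ten features are available. The sketch below is a minimal illustration, assuming that the four terms written without a plus sign (RVI, Span_db, R_VV/HV_db, and A3V) are subtracted; the dictionary keys are hypothetical names for the features, and the input values are placeholders chosen only to exercise the function, not measurements from the study.

```python
INTERCEPT = 56.4598

# Coefficients of Equation (11); the negative signs are an assumption.
COEFFICIENTS = {
    "sigma0_HV_db": 3.5369,
    "BMI_db": 1.5128,
    "V3R_2_db": 3.5913,
    "H": 23.2417,
    "mRFDI": 2.0974,
    "F3V_db": 2.7412,
    "RVI": -6.1481,
    "Span_db": -2.9493,
    "R_VV/HV_db": -0.5524,
    "A3V": -26.397,
}

def predict_agb_db(features):
    """Return AGB in decibel units for one pixel or plot.

    `features` maps every name in COEFFICIENTS to its value.
    """
    return INTERCEPT + sum(
        coef * features[name] for name, coef in COEFFICIENTS.items()
    )

# Placeholder inputs: all features set to zero except entropy H.
placeholder = {name: 0.0 for name in COEFFICIENTS}
placeholder["H"] = 1.0
print(predict_agb_db(placeholder))
```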

5. Discussion

5.1. The Significance of Radiometric Terrain Correction

As an important part of the terrestrial ecosystem carbon pool [88], the accurate estimation of forest AGB is of great significance for quantifying the value of forest carbon storage and assessing the dynamics of the global carbon cycle, but the interference of topographic factors cannot be ignored [89]. This study found that the effective implementation of the RTC process can significantly suppress the interference of terrain factors with PolSAR data, thereby improving the quality and application potential of PolSAR data. This is specifically reflected in the following points. First, compared to the R-value between forest AGB and SAR features extracted from the NRTC data, the average R-value of 75 original features extracted from the RTC data increased by 0.26, and the maximum improvement was 0.48 (σ0HV). Second, RTC is highly reasonable and robust at correcting components obtained by different polarization decomposition methods (for example, FRE3 and YAM3). Moreover, results consistent with those of other researchers [90,91,92] were obtained during the POAC stage: the Vol component decreased, while the Odd and Dbl components increased. However, the correction effect on each ground unit during the ESAC and AVEC stages was closely related to the ground conditions in the study area and did not exhibit a monotonic trend. Finally, the fitting performance of 10 regression models based on RTC data was generally superior to that based on NRTC data (R2 increased by 0.14 and rRMSE decreased by 4.2% on average). These results indicate that RTC is effective and robust for improving the accuracy of PolSAR features and the reliability of forest AGB inversion.
This is because RTC is a set of effective step-by-step terrain correction processes designed by in-depth analysis of the SAR physical mechanism, which accurately corrects the interference caused by terrain factors in terms of the scattering mechanism (POAC and AVEC) and intensity information (ESAC). The original PolSAR features after RTC processing carry more accurate polarization scattering information, thus improving the sensitivity of AGB inversion. In addition, if a suitable RTC process is created in the application of other SAR technologies to eliminate the interference of terrain factors, the application effect will be improved to a certain extent.

5.2. The Overcorrection of Radiometric Terrain Correction

This study also found that, without RTC processing, the R-value of the derived features extracted based on ratio arithmetic is significantly higher than that of the original features [86], but the original features showed significant improvement after RTC. This indicates that ratio-derived features can be prioritized for SAR data application when RTC is not feasible [93,94]. However, compared with the derived features without RTC, the average R-value of the 35 ratio-related PolSAR-derived features after RTC decreased by 0.05, indicating that there are some overcorrection issues in the RTC process that need to be improved. In addition, since the ratio operation can eliminate multiplicative noise, there is a greater likelihood of overcorrection due to POAC and AVEC. The extraction of POA needs to distinguish between the offset caused by forest targets and terrain factors, and only correct the offset caused by terrain, while the correction factor in the AVEC stage only considers the local incidence angle and radar incidence angle, without introducing factors such as surface roughness, terrain slope, and complex dielectric coefficient that are related to the backscatter coefficient in the AVEC correction process. Moreover, there is potential space for further improvement of the ESAC method. Therefore, exploring how to effectively solve the overcorrection problems caused by the above factors in the RTC process is a valuable future research direction.
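The reason ratio-derived features are less sensitive to terrain can be shown with a small numerical example: if the same multiplicative factor scales two channels of a pixel, it cancels in their ratio. The sketch below uses hypothetical backscatter values and a hypothetical per-pixel terrain factor; it illustrates only the cancellation, not the RTC procedure itself.

```python
import math

# Hypothetical terrain-free backscatter (linear power) for three pixels.
hh_true = [0.20, 0.15, 0.30]
hv_true = [0.05, 0.03, 0.09]
# Hypothetical multiplicative terrain factor, one value per pixel.
terrain = [0.6, 1.0, 1.7]

# Observed channels: the same factor distorts both polarizations.
hh_obs = [p * t for p, t in zip(hh_true, terrain)]
hv_obs = [p * t for p, t in zip(hv_true, terrain)]

ratio_true = [a / b for a, b in zip(hh_true, hv_true)]
ratio_obs = [a / b for a, b in zip(hh_obs, hv_obs)]

# The single channels change with the terrain factor, the ratio does not.
for t, r_true, r_obs in zip(terrain, ratio_true, ratio_obs):
    print(f"factor {t}: ratio {r_true:.3f} -> {r_obs:.3f}")
```

Because the ratio is already free of the shared factor, any further correction that alters the two channels differently can only move the ratio away from its terrain-free value, which is one way to read the drop in R-value reported for the ratio-derived features.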

5.3. Performance Comparison of Different Regression Models

Improving dataset quality (for example, through RTC) has a greater impact on AGB estimation performance than the choice of regression algorithm [95]. However, once data quality has been significantly improved, the selection of the machine learning regression model becomes particularly important. Selecting an appropriate regression model ensures that the algorithm can fully utilize the features in the data, capture the relationships between variables, and avoid overfitting or underfitting, thereby further improving the prediction accuracy and generalization ability of the model. This study compared 10 different regression models and found that the linear models have lower complexity (feature dimension) and better interpretability than the non-parametric models. This is because non-parametric models do not preset the specific form of the model but adjust flexibly based on the data, and it is precisely this flexibility that makes them more complex. In the feature selection stage, the variance inflation factor (VIF) values of the features of the different linear regression models are all less than 10 (Figure A2d), indicating that regularization methods (L1 or L2) can effectively alleviate the feature collinearity problem in multivariate linear models. However, the VIF values of many features are greater than 5, indicating that a moderate degree of collinearity remains in the feature set. In addition, some degree of collinearity is to be expected in the features selected by the non-parametric models (Figure A2e), but collinearity usually has less effect on non-parametric models.
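The VIF thresholds of 5 and 10 mentioned above can be checked for any feature set with a few lines of code. The sketch below relies on the standard identity that the VIF of feature j equals the j-th diagonal element of the inverse of the feature correlation matrix; the three feature columns are hypothetical and the helper names are illustrative, not the study's implementation.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """Return one VIF per feature column (diagonal of the inverse correlation matrix)."""
    k = len(columns)
    corr = [[pearson_r(columns[i], columns[j]) for j in range(k)] for i in range(k)]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(k)]

# Hypothetical feature columns: the first two are nearly collinear.
feature_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
feature_b = [1.1, 1.9, 3.2, 3.8, 5.1, 6.3]
feature_c = [2.0, -1.0, 0.5, 1.5, -0.5, 0.0]
print([round(v, 2) for v in vif([feature_a, feature_b, feature_c])])
```

Features whose VIF exceeds the chosen threshold are candidates for removal or for handling through L1 or L2 regularization, as discussed above.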
In addition, the R2 and rRMSE indicators of the non-parametric models based on the RTC data in this study are usually worse than those of the linear models. This may be because the target variable of this study is the forest AGB value in decibels, which has a better linear relationship with PolSAR features (higher accuracy) [7,58,96]. However, non-parametric models based on the NRTC data in this study typically have better R2 and rRMSE metrics than linear models. This is because terrain factors cause PolSAR features to have a more obviously nonlinear relationship with forest AGB. Moreover, the setting of hyperparameter spaces for different models directly affects the performance of the model, while the setting of hyperparameter spaces for non-parametric models is more complex, which may be one reason why non-parametric models performed slightly worse on the data in this study. Furthermore, the comparison between parametric and non-parametric models may be unfair because there is not a large amount of data to train the non-parametric model, resulting in a feature selection method that is not ideal. Therefore, k-Nearest Neighbor is a regression algorithm worth trying because it is more suitable for small datasets than random forest. In addition, deep learning models (such as Convolutional Neural Networks (CNN) and Deep Neural Networks (DNN)) have certain limitations (such as a large amount of data being required for training, the consumption of computing resources being large, and the model interpretation being poor) compared to traditional machine learning models. However, because of their strong feature learning ability, high accuracy, and excellent generalization ability, they have shown great potential for solving complex problems, and so are worth putting into use in scenarios with sufficient resources.

5.4. Potential Limitations

In this study, only PolSAR features were applied in AGB inversion. However, it is necessary to include more actual parameters that reflect the geographical environment and physical conditions when constructing a more comprehensive and accurate AGB model [44]. For example, climatic factors (temperature, rainfall, humidity, etc.) may directly affect the state of the land surface and vegetation, thereby altering the reflection characteristics of SAR signals [97]. Moreover, although RTC is carried out in the pre-processing of PolSAR data, it may also be worth taking slope, aspect, and local incidence angle into account in model training. In addition, more polarization parameters also contribute to the application of SAR data [67,98,99]. Through the comprehensive analysis of these multivariate parameters, we can expect a significant improvement in the predictive power of the model and a deeper understanding of the object of study.
In this study, data from only 132 plots were used to build and train the machine learning models because of limited manpower and field conditions. The relatively small forest AGB range of the sample plots (36.39–166.58 t/ha) may not represent the full range of the forest in the study area (the maximum value is 254.6 t/ha). However, adding more plot data through unmanned aerial vehicle light detection and ranging (UAV-LiDAR) systems can significantly improve the generalization and reliability of the model, as larger datasets more accurately reflect the ecological diversity and complexity of the study area, helping the model capture more patterns and trends [5]. Furthermore, the control of data quality is crucial. It is necessary to strictly control field measurement errors, AGB calculation errors, and matching errors between sample location and remote sensing image data to reduce the uncertainty caused by human factors, so as to comprehensively improve the overall quality of data and the accuracy of the model prediction [100].
In addition, this study did not conduct differentiated modeling for different forest types. However, AGB modeling by forest type is a strategy that has attracted much attention in recent years. Because it builds on the diversity and heterogeneity of forest ecosystems, it helps to reveal the distribution of biomass and supports fine-scale modeling of forest AGB, and it therefore has significant advantages in terms of the accuracy and practicability of the AGB inversion model [101,102].

6. Conclusions

Considering that the impact of RTC on the correctness of feature extraction and regression model performance needed to be further evaluated, this study took PolSAR data provided by ALOS-2 as the data source and used 10 machine learning regression models and the Optuna hyperparameter optimization algorithm to verify the effectiveness and robustness of RTC in improving the correctness of the PolSAR feature set and the accuracy of forest AGB inversion. The detailed results are as follows. The RTC process effectively and robustly improved the correlation between the PolSAR features and the measured forest AGB (with an average increase of 0.26 in R-values) and the performance of the regression models (with an average increase of 0.14 in R2 values and an average decrease of 4.2% in rRMSE). The RTC process also showed strong robustness and rationality in correcting different polarization decomposition components. The ratio operation on PolSAR features offsets some multiplicative noise (including terrain effects), and the lower R-values of the ratio-derived features after RTC indicate a certain degree of overcorrection in the RTC process. Ratio-derived features should therefore be given priority when RTC is not feasible; after RTC, σ0HV_db still had the highest R-value with forest AGB in decibels among all PolSAR features. In addition, the PolSAR features had a closer linear relationship with forest AGB expressed in decibels. In this case, the linear model remained a powerful and practical choice due to its efficiency and stability. For example, the optimal regression model in this study was BysRidge (R2 = 0.82, rRMSE = 18.06%).
However, this study treated all forest cover areas as the same type without differentiated modeling of different forest types, and it confirmed that there is an overcorrection problem in the RTC process. More PolSAR features and actual parameters, as well as more accurate sample data, should be applied in the machine learning inversion of forest parameters. Despite these limitations, this study provides insight for future research. Overcoming the above limitations is expected to significantly improve the inversion accuracy of forest AGB, which can provide more reliable data support for the scientific management and protection of forest resources.
In summary, this study validated the effectiveness and robustness of RTC in improving the correct extraction of PolSAR features and accurately inverting forest AGB, and analyzed the potential mechanism of RTC on polarization decomposition parameters. The method used in this study has good robustness and universality. It can therefore provide assistance and reliable reference values for the inversion of forest AGB and related SAR applications based on machine learning models using PolSAR data in complex terrain areas.

Author Contributions

Conceptualization, Y.N. and W.F.; methodology, Y.N.; software, Y.N. and Y.W.; validation, Y.N.; formal analysis, Y.N.; investigation, Y.N. and W.F.; resources, W.F.; data curation, Y.N. and R.S.; writing—original draft preparation, Y.N.; writing—review and editing, Y.N., W.F. and S.C.; visualization, Y.N. and Y.H.; project administration, Y.N. and W.F.; funding acquisition, W.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (contract no. 31971654) and the Civil Aerospace Technology Advance Research Project (contract no. D040114).

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy considerations.

Acknowledgments

Thanks to JAXA for providing ALOS-2 PALSAR-2 PolSAR data (contract no. AL-DQ-200521-01). We are grateful for the use of the Sentinel Application Platform (SNAP_v9.0.0) and Polarimetric SAR Data Processing and Educational Toolbox (PolSARpro_v5.1.1) software developed by the European Space Agency. The authors express their sincere thanks and appreciation to the reviewers, whose comments and expert advice were instrumental in shaping the contents of this paper.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Figure A1. The scatter density plot of each component of Yamaguchi three-component (YAM3) at different radiometric terrain correction (RTC) stages (Y-axis) relative to the previous stage (X-axis), and in AVEC stages (that is, after all processing of the RTC) with respect to non-RTC (NRTC). The three components of YAM3 are the Volume scattering component (Vol), Surface scattering component (Odd), and Double-bounce scattering component (Dbl). The three stages of RTC are polarization orientation angle correction (POAC), effective scattering area correction (ESAC), and angular variation effect correction (AVEC). The red line is a 1:1 line. (a) NRTC vs. POAC of Vol; (b) POAC vs. ESAC of Vol; (c) ESAC vs. AVEC of Vol; (d) NRTC vs. AVEC of Vol; (e) NRTC vs. POAC of Odd; (f) POAC vs. ESAC of Odd; (g) ESAC vs. AVEC of Odd; (h) NRTC vs. AVEC of Odd; (i) NRTC vs. POAC of Dbl; (j) POAC vs. ESAC of Dbl; (k) ESAC vs. AVEC of Dbl; (l) NRTC vs. AVEC of Dbl.
Figure A2. The result of feature selection: (a) the 32 features selected in preliminary feature selection (Boruta algorithm) based on radiometric terrain correction (RTC) data, including the importance score given by the RF of the selected features, and absolute values of Pearson correlation coefficients (R) between the selected features and measured forest AGB; (b) the 21 features selected in preliminary feature selection (Boruta algorithm) based on non-RTC (NRTC) data, including the importance score given by the RF of the selected features, and absolute values of Pearson correlation coefficients (R) between the selected features and measured forest AGB; (c) the number of features selected in the second step feature selection (RFECV algorithm) based on RTC and NRTC data; (d) the features selected in different multivariate linear models and the variance inflation factor (VIF) value corresponding to each feature; (e) the features selected in different non-parametric models.
Figure A3. Scatter plot of measured forest AGB and predicted forest AGB. The prediction model is an optimal regression model based on the PolSAR data processed by radiometric terrain correction (RTC) from 25 July 2020. (a) The independent variable of the prediction model was derived from the PolSAR data (after RTC processing) from 11 July 2020. (b) The independent variable of the prediction model was derived from the PolSAR data (after RTC processing) from 8 August 2020.

Appendix B

Table A1. Pearson correlation coefficient (R) between forest AGB and the 118 PolSAR features based on the data with radiometric terrain correction (RTC) and non-RTC data (NRTC), obtained from 11 July, 25 July, and 8 August 2020.
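| ID | Feature | R_200711 (NRTC/RTC) | R_200725 (NRTC/RTC) | R_200808 (NRTC/RTC) | Sig. (NRTC) | Sig. (RTC) |
|---|---|---|---|---|---|---|
| 1 | σ0HH | 0.12/0.58 | 0.16/0.63 | 0.14/0.60 | - | ** |
| 2 | σ0HV | 0.25/0.70 | 0.28/0.73 | 0.27/0.71 | * | ** |
| 3 | σ0VV | 0.15/0.61 | 0.18/0.66 | 0.17/0.62 | - | ** |
| 4 | Span | 0.18/0.66 | 0.21/0.71 | 0.20/0.69 | - | ** |
| 5 | R_HH/VV | −0.09/−0.09 | −0.10/−0.10 | −0.10/−0.10 | - | - |
| 6 | R_HH/HV | −0.74/−0.73 | −0.77/−0.76 | −0.76/−0.75 | ** | ** |
| 7 | R_VV/VH | −0.70/−0.69 | −0.74/−0.73 | −0.72/−0.70 | ** | ** |
| 8 | BMI | 0.14/0.62 | 0.17/0.66 | 0.15/0.65 | - | ** |
| 9 | VSI | 0.70/0.67 | 0.73/0.71 | 0.71/0.68 | ** | ** |
| 10 | RVI | 0.70/0.67 | 0.73/0.71 | 0.71/0.68 | ** | ** |
| 11 | CSI | 0.091/0.089 | 0.096/0.094 | 0.096/0.095 | - | - |
| 12 | RFDI | −0.65/−0.64 | −0.68/−0.66 | −0.67/−0.65 | ** | ** |
| 13 | mRFDI | −0.63/−0.61 | −0.66/−0.64 | −0.65/−0.62 | ** | ** |
| 14 | F2O | 0.14/0.51 | 0.19/0.56 | 0.16/0.53 | - | ** |
| 15 | F2V | 0.14/0.51 | 0.17/0.54 | 0.14/0.52 | - | ** |
| 16 | F3D | 0.15/0.30 | 0.17/0.31 | 0.19/0.34 | - | * |
| 17 | F3O | 0.04/0.11 | 0.05/0.14 | 0.03/0.10 | - | - |
| 18 | F3V | 0.30/0.72 | 0.32/0.74 | 0.31/0.74 | * | ** |
| 19 | Y3D | 0.22/0.34 | 0.26/0.36 | 0.24/0.35 | - | * |
| 20 | Y3O | −0.03/0.10 | −0.04/0.15 | −0.04/0.11 | - | - |
| 21 | Y3V | 0.25/0.69 | 0.29/0.73 | 0.28/0.70 | - | ** |
| 22 | Y4D | −0.04/0.21 | −0.05/0.23 | −0.04/0.21 | - | - |
| 23 | Y4O | −0.24/−0.44 | −0.25/−0.44 | −0.27/−0.45 | - | * |
| 24 | Y4V | 0.31/0.71 | 0.30/0.71 | 0.30/0.70 | * | ** |
| 25 | Y4H | 0.08/0.15 | 0.09/0.16 | 0.09/0.18 | - | - |
| 26 | M3D | 0.21/0.49 | 0.23/0.51 | 0.20/0.51 | - | ** |
| 27 | M3O | 0.01/0.25 | 0.01/0.29 | 0.01/0.28 | - | - |
| 28 | M3V | 0.32/0.72 | 0.30/0.71 | 0.35/0.74 | * | ** |
| 29 | M4D | 0.25/0.60 | 0.24/0.58 | 0.22/0.57 | - | ** |
| 30 | M4O | 0.01/0.28 | 0.01/0.28 | 0.01/0.30 | - | - |
| 31 | M4V | 0.29/0.71 | 0.30/0.72 | 0.32/0.74 | * | ** |
| 32 | M4H | 0.06/0.13 | 0.06/0.12 | 0.09/0.15 | - | - |
| 33 | A3D | 0.22/0.45 | 0.20/0.43 | 0.18/0.40 | - | * |
| 34 | A3O | 0.03/0.32 | 0.03/0.31 | 0.03/0.30 | - | - |
| 35 | A3V | 0.31/0.64 | 0.29/0.63 | 0.30/0.63 | - | ** |
| 36 | V3D | 0.20/0.46 | 0.22/0.49 | 0.22/0.51 | - | * |
| 37 | V3O | 0.15/0.54 | 0.16/0.54 | 0.18/0.55 | - | ** |
| 38 | V3V | 0.32/0.68 | 0.31/0.66 | 0.31/0.65 | * | ** |
| 39 | H | 0.61/0.62 | 0.61/0.63 | 0.60/0.63 | ** | ** |
| 40 | A | −0.07/−0.23 | −0.06/−0.23 | −0.07/−0.25 | - | - |
| 41 | α | 0.55/0.51 | 0.57/0.55 | 0.57/0.56 | ** | ** |
| 42 | C3D | 0.24/0.48 | 0.24/0.49 | 0.26/0.50 | - | * |
| 43 | C3O | 0.30/0.60 | 0.32/0.63 | 0.31/0.60 | - | ** |
| 44 | C3V | −0.01/0.15 | −0.02/0.19 | −0.02/0.20 | - | - |
| 45 | K3S | 0.18/0.52 | 0.20/0.55 | 0.21/0.55 | - | ** |
| 46 | K3D | 0.31/0.70 | 0.30/0.68 | 0.30/0.67 | * | ** |
| 47 | K3H | 0.05/0.08 | 0.05/0.08 | 0.05/0.07 | - | - |
| 48 | P3D | 0.20/0.60 | 0.22/0.63 | 0.22/0.65 | - | ** |
| 49 | P3O | 0.13/0.52 | 0.14/0.54 | 0.15/0.55 | - | ** |
| 50 | A3R_1 | 0.60/0.55 | 0.60/0.56 | 0.61/0.58 | ** | ** |
| 51 | A3R_2 | 0.62/0.59 | 0.62/0.58 | 0.63/0.60 | ** | ** |
| 52 | F2R_1 | −0.31/−0.12 | −0.33/−0.13 | −0.37/−0.15 | - | - |
| 53 | F2R_2 | −0.32/−0.15 | −0.34/−0.15 | −0.38/−0.17 | - | - |
| 54 | F3R_1 | 0.61/0.58 | 0.63/0.59 | 0.59/0.55 | ** | ** |
| 55 | F3R_2 | 0.69/0.60 | 0.70/0.61 | 0.71/0.61 | ** | ** |
| 56 | V3R_1 | 0.70/0.66 | 0.71/0.68 | 0.69/0.66 | ** | ** |
| 57 | V3R_2 | 0.72/0.70 | 0.73/0.70 | 0.71/0.69 | ** | ** |
| 58 | Y3R_1 | 0.65/0.61 | 0.65/0.62 | 0.64/0.60 | ** | ** |
| 59 | Y3R_2 | 0.70/0.64 | 0.72/0.64 | 0.70/0.63 | ** | ** |
| 60 | σ0HH_db | 0.30/0.70 | 0.31/0.72 | 0.31/0.71 | * | ** |
| 61 | σ0HV_db | 0.50/0.81 | 0.54/0.86 | 0.52/0.82 | ** | ** |
| 62 | σ0VV_db | 0.32/0.72 | 0.34/0.75 | 0.34/0.73 | * | ** |
| 63 | Span_db | 0.38/0.79 | 0.40/0.82 | 0.40/0.80 | * | ** |
| 64 | R_HH/VV_db | −0.01/−0.01 | −0.098/−0.096 | −0.10/−0.10 | - | - |
| 65 | R_HH/HV_db | −0.57/−0.65 | −0.71/−0.69 | −0.70/−0.67 | ** | ** |
| 66 | R_VV/VH_db | −0.62/−0.60 | −0.66/−0.65 | −0.64/−0.63 | ** | ** |
| 67 | BMI_db | 0.30/0.73 | 0.32/0.76 | 0.30/0.75 | * | ** |
| 68 | VSI_db | 0.75/0.74 | 0.77/0.75 | 0.78/0.76 | ** | ** |
| 69 | RVI_db | 0.75/0.74 | 0.77/0.75 | 0.78/0.76 | ** | ** |
| 70 | CSI_db | 0.10/0.09 | 0.099/0.097 | 0.10/0.10 | - | - |
| 71 | RFDI_db | −0.62/−0.60 | −0.64/−0.61 | −0.65/−0.62 | ** | ** |
| 72 | mRFDI_db | NaN/NaN | NaN/NaN | NaN/NaN | - | - |
| 73 | F2O_db | 0.36/0.66 | 0.40/0.69 | 0.38/0.67 | * | ** |
| 74 | F2V_db | 0.23/0.53 | 0.25/0.56 | 0.25/0.54 | - | ** |
| 75 | F3D_db | 0.35/0.45 | 0.37/0.46 | 0.38/0.48 | * | * |
| 76 | F3O_db | −0.05/0.14 | −0.06/0.16 | −0.04/0.11 | - | - |
| 77 | F3V_db | 0.45/0.81 | 0.47/0.83 | 0.46/0.83 | * | ** |
| 78 | Y3D_db | 0.36/0.47 | 0.39/0.49 | 0.37/0.47 | * | * |
| 79 | Y3O_db | −0.03/0.10 | −0.05/0.14 | −0.05/0.12 | - | - |
| 80 | Y3V_db | 0.42/0.72 | 0.45/0.75 | 0.43/0.72 | * | ** |
| 81 | Y4D_db | −0.05/0.27 | −0.07/0.29 | −0.05/0.27 | - | - |
| 82 | Y4O_db | −0.39/−0.65 | −0.40/−0.65 | −0.42/−0.67 | * | ** |
| 83 | Y4V_db | 0.54/0.84 | 0.53/0.84 | 0.53/0.83 | ** | ** |
| 84 | Y4H_db | 0.16/0.17 | 0.18/0.19 | 0.18/0.20 | - | - |
| 85 | M3D_db | 0.39/0.60 | 0.40/0.62 | 0.38/0.62 | * | ** |
| 86 | M3O_db | 0.04/0.32 | 0.05/0.35 | 0.05/0.34 | - | * |
| 87 | M3V_db | 0.55/0.83 | 0.54/0.82 | 0.58/0.85 | ** | ** |
| 88 | M4D_db | 0.41/0.63 | 0.40/0.62 | 0.39/0.60 | * | ** |
| 89 | M4O_db | 0.05/0.35 | 0.05/0.35 | 0.05/0.37 | - | * |
| 90 | M4V_db | 0.55/0.82 | 0.55/0.83 | 0.56/0.85 | ** | ** |
| 91 | M4H_db | 0.12/0.17 | 0.12/0.16 | 0.15/0.18 | - | - |
| 92 | A3D_db | 0.39/0.57 | 0.38/0.56 | 0.35/0.52 | * | ** |
| 93 | A3O_db | 0.05/0.38 | 0.05/0.37 | 0.04/0.35 | - | * |
| 94 | A3V_db | 0.50/0.79 | 0.49/0.79 | 0.49/0.80 | ** | ** |
| 95 | V3D_db | 0.38/0.55 | 0.41/0.57 | 0.41/0.59 | * | ** |
| 96 | V3O_db | 0.28/0.63 | 0.29/0.63 | 0.30/0.64 | - | ** |
| 97 | V3V_db | 0.46/0.79 | 0.46/0.78 | 0.45/0.76 | ** | ** |
| 98 | H_db | 0.62/0.63 | 0.62/0.64 | 0.61/0.64 | ** | ** |
| 99 | A_db | −0.09/−0.06 | 0.08/−0.06 | 0.09/−0.08 | - | - |
| 100 | α_db | 0.54/0.50 | 0.56/0.53 | 0.56/0.54 | ** | ** |
| 101 | C3D_db | 0.33/0.57 | 0.33/0.58 | 0.35/0.59 | * | ** |
| 102 | C3O_db | 0.37/0.70 | 0.40/0.72 | 0.38/0.70 | * | ** |
| 103 | C3V_db | 0.02/0.23 | 0.04/0.26 | 0.04/0.27 | - | - |
| 104 | K3S_db | 0.31/0.58 | 0.34/0.61 | 0.35/0.61 | * | ** |
| 105 | K3D_db | 0.47/0.78 | 0.46/0.76 | 0.46/0.75 | * | ** |
| 106 | K3H_db | 0.09/0.09 | 0.089/0.092 | 0.09/0.09 | - | - |
| 107 | P3D_db | 0.35/0.63 | 0.38/0.65 | 0.38/0.67 | * | ** |
| 108 | P3O_db | 0.27/0.55 | 0.28/0.57 | 0.28/0.58 | - | ** |
| 109 | A3R_1_db | 0.65/0.59 | 0.65/0.60 | 0.66/0.63 | ** | ** |
| 110 | A3R_2_db | 0.66/0.63 | 0.66/0.62 | 0.67/0.64 | ** | ** |
| 111 | F2R_1_db | −0.32/−0.15 | −0.36/−0.16 | −0.39/−0.18 | * | - |
| 112 | F2R_2_db | −0.39/−0.19 | −0.39/−0.19 | −0.39/−0.19 | * | - |
| 113 | F3R_1_db | 0.64/0.62 | 0.66/0.63 | 0.61/0.60 | ** | ** |
| 114 | F3R_2_db | 0.71/0.64 | 0.72/0.65 | 0.73/0.65 | ** | ** |
| 115 | V3R_1_db | 0.73/0.69 | 0.75/0.71 | 0.73/0.70 | ** | ** |
| 116 | V3R_2_db | 0.75/0.72 | 0.76/0.72 | 0.75/0.71 | ** | ** |
| 117 | Y3R_1_db | 0.69/0.64 | 0.69/0.65 | 0.67/0.62 | ** | ** |
| 118 | Y3R_2_db | 0.73/0.67 | 0.75/0.67 | 0.73/0.66 | ** | ** |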
Note: ** means extremely significant (p ≤ 0.01), * means significant (p ≤ 0.05), and - means the significance test was not passed (p > 0.05). NaN means Not a Number: because mRFDI takes negative values for some samples, its decibel form (mRFDI_db) cannot be calculated.
Table A2. The average value of the corresponding evaluation index of the optimal performance obtained by repeating 10 times: two-step feature selection was carried out using the radiometric terrain correction (RTC) and non-RTC (NRTC) datasets, and then 10 regression models and the Optuna algorithm were used to obtain the corresponding evaluation indicators of optimal performance.
| Model | R2_RTC | R2_NRTC | rRMSE_RTC (%) | rRMSE_NRTC (%) |
|---|---|---|---|---|
| BysRidge | 0.86334 | 0.69453 | 18.11756 | 22.89453 |
| ARD | 0.86213 | 0.71534 | 18.31867 | 22.57860 |
| Lasso | 0.85586 | 0.68453 | 18.13603 | 23.37506 |
| ElasticNet | 0.84783 | 0.68124 | 18.51316 | 22.56761 |
| Ridge | 0.84704 | 0.64753 | 18.74611 | 23.45273 |
| CatBoost | 0.82521 | 0.73013 | 18.14501 | 22.43760 |
| XGBoost | 0.82202 | 0.74273 | 19.01550 | 21.37638 |
| AdaBoost | 0.81934 | 0.69574 | 18.37307 | 22.45734 |
| ET | 0.79859 | 0.70750 | 18.56786 | 22.89376 |
| RF | 0.78988 | 0.68545 | 18.53375 | 22.45703 |

References

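  1. Le Toan, T.; Quegan, S.; Davidson, M.W.J.; Balzter, H.; Paillou, P.; Papathanassiou, K.; Plummer, S.; Rocca, F.; Saatchi, S.; Shugart, H.; et al. The BIOMASS Mission: Mapping Global Forest Biomass to Better Understand the Terrestrial Carbon Cycle. Remote Sens. Environ. 2011, 115, 2850–2860.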
  2. Lu, D.; Chen, Q.; Wang, G.; Liu, L.; Li, G.; Moran, E. A Survey of Remote Sensing-Based Aboveground Biomass Estimation Methods in Forest Ecosystems. Int. J. Digit. Earth 2016, 9, 63–105. [Google Scholar] [CrossRef]
  3. Migolet, P.; Goïta, K.; Pambo, A.F.K.; Mambimba, A.N. Estimation of the Total Dry Aboveground Biomass in the Tropical Forests of Congo Basin Using Optical, LiDAR, and Radar Data. Gisci. Remote Sens. 2022, 59, 431–460. [Google Scholar] [CrossRef]
  4. Chen, Q.; McRoberts, R.E.; Wang, C.; Radtke, P.J. Forest Aboveground Biomass Map** and Estimation across Multiple Spatial Scales Using Model-Based Inference. Remote Sens. Environ. 2016, 184, 350–360. [Google Scholar] [CrossRef]
  5. Moreira, A.; Prats-Iraola, P.; Younis, M.; Krieger, G.; Hajnsek, I.; Papathanassiou, K.P. A Tutorial on Synthetic Aperture Radar. IEEE Geosci. Remote Sens. Mag. 2013, 1, 6–43. [Google Scholar] [CrossRef]
  6. Soja, M.J.; Quegan, S.; d’Alessandro, M.M.; Banda, F.; Scipal, K.; Tebaldini, S.; Ulander, L.M.H. Map** Above-Ground Biomass in Tropical Forests with Ground-Cancelled P-Band SAR and Limited Reference Data. Remote Sens. Environ. 2021, 253, 112153. [Google Scholar] [CrossRef]
  7. Tsokas, A.; Rysz, M.; Pardalos, P.M.; Dipple, K. SAR Data Applications in Earth Observation: An Overview. Expert Syst. Appl. 2022, 205, 117342. [Google Scholar] [CrossRef]
  8. Yan, X.; Li, J.; Smith, A.R.; Yang, D.; Ma, T.; Su, Y.; Shao, J. Evaluation of Machine Learning Methods and Multi-Source Remote Sensing Data Combinations to Construct Forest above-Ground Biomass Models. Int. J. Digit. Earth 2023, 16, 4471–4491. [Google Scholar] [CrossRef]
  9. Zhang, R.; Zhou, X.; Ouyang, Z.; Avitabile, V.; Qi, J.; Chen, J.; Giannico, V. Estimating Aboveground Biomass in Subtropical Forests of China by Integrating Multisource Remote Sensing and Ground Data. Remote Sens. Environ. 2019, 232, 111341. [Google Scholar] [CrossRef]
  10. Huang, H.; Liu, C.; Wang, X.; Zhou, X.; Gong, P. Integration of Multi-Resource Remotely Sensed Data and Allometric Models for Forest Aboveground Biomass Estimation in China. Remote Sens. Environ. 2019, 221, 225–234. [Google Scholar] [CrossRef]
  11. Feng, Y.; Lu, D.; Chen, Q.; Keller, M.; Moran, E.; dos-Santos, M.N.; Bolfe, E.L.; Batistella, M. Examining Effective Use of Data Sources and Modeling Algorithms for Improving Biomass Estimation in a Moist Tropical Forest of the Brazilian Amazon. Int. J. Digit. Earth 2017, 10, 996–1016. [Google Scholar] [CrossRef]
  12. Shao, Z.; Zhang, L.; Wang, L. Stacked Sparse Autoencoder Modeling Using the Synergy of Airborne LiDAR and Satellite Optical and SAR Data to Map Forest Above-Ground Biomass. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 2017, 10, 5569–5582. [Google Scholar] [CrossRef]
  13. Liao, Z.; He, B.; Quan, X.; Van Dijk, A.I.J.M.; Qiu, S.; Yin, C. Biomass Estimation in Dense Tropical Forest Using Multiple Information from Single-Baseline P-Band PolInSAR Data. Remote Sens. Environ. 2019, 221, 489–507. [Google Scholar] [CrossRef]
  14. Solberg, S.; Hansen, E.H.; Gobakken, T.; Næsset, E.; Zahabu, E. Biomass and InSAR Height Relationship in a Dense Tropical Forest. Remote Sens. Environ. 2017, 192, 166–175. [Google Scholar] [CrossRef]
  15. Zeng, P.; Zhang, W.; Li, Y.; Shi, J.; Wang, Z. Forest Total and Component Above-Ground Biomass (AGB) Estimation through C- and L-Band Polarimetric SAR Data. Forests 2022, 13, 442. [Google Scholar] [CrossRef]
  16. Luo, H.; Qin, S.; Li, J.; Lu, C.; Yue, C.; Ou, G. High-Density Forest AGB Estimation in Tropical Forest Integrated with PolInSAR Multidimensional Features and Optimized Machine Learning Algorithms. Ecol. Indic. 2024, 160, 111878. [Google Scholar] [CrossRef]
  17. Chowdhury, T.A.; Thiel, C.; Schmullius, C. Growing Stock Volume Estimation from L-Band ALOS PALSAR Polarimetric Coherence in Siberian Forest. Remote Sens. Environ. 2014, 155, 129–144. [Google Scholar] [CrossRef]
  18. Thiel, C.; Schmullius, C. The Potential of ALOS PALSAR Backscatter and InSAR Coherence for Forest Growing Stock Volume Estimation in Central Siberia. Remote Sens. Environ. 2016, 173, 258–273. [Google Scholar] [CrossRef]
  19. Eini-Zinab, S.; Maghsoudi, Y.; Sayedain, S.A. Assessing the Performance of Indicators Resulting from Three-Component Freeman–Durden Polarimetric SAR Interferometry Decomposition at P-and L-Band in Estimating Tropical Forest Aboveground Biomass. Int. J. Remote Sens. 2020, 41, 433–454. [Google Scholar] [CrossRef]
  20. Tebaldini, S. Single and Multipolarimetric SAR Tomography of Forested Areas: A Parametric Approach. IEEE Trans. Geosci. Remote Sens. 2010, 48, 2375–2387. [Google Scholar] [CrossRef]
  21. Li, W.; Zhang, Y.; Zhang, J.; Chen, H.; Chen, E.; Zhao, L.; Zhao, D. Tropical Forest AGB Estimation Based on Structure Parameters Extracted by TomoSAR. Int. J. Appl. Earth Obs. Geoinf. 2023, 121, 103369. [Google Scholar] [CrossRef]
  22. Zhang, H.; Wang, C.; Zhu, J.; Fu, H.; Han, W.; ** Forest Aboveground Biomass in the Northeastern United States with ALOS PALSAR Dual-Polarization L-Band. Remote Sens. Environ. 2012, 124, 466–478. [Google Scholar] [CrossRef]
  23. Peregon, A.; Yamagata, Y. The Use of ALOS/PALSAR Backscatter to Estimate above-Ground Forest Biomass: A Case Study in Western Siberia. Remote Sens. Environ. 2013, 137, 139–146. [Google Scholar] [CrossRef]
  24. Kumar, B.; Dikshit, O.; Gupta, A.; Singh, M.K. Feature Extraction for Hyperspectral Image Classification: A Review. Int. J. Remote Sens. 2020, 41, 6248–6287. [Google Scholar] [CrossRef]
  25. Nesha, M.K.; Hussin, Y.A.; Van Leeuwen, L.M.; Sulistioadi, Y.B. Modeling and Mapping Aboveground Biomass of the Restored Mangroves Using ALOS-2 PALSAR-2 in East Kalimantan, Indonesia. Int. J. Appl. Earth Obs. Geoinf. 2020, 91, 102158. [Google Scholar] [CrossRef]
  26. Bouvet, A.; Mermoz, S.; Le Toan, T.; Villard, L.; Mathieu, R.; Naidoo, L.; Asner, G.P. An Above-Ground Biomass Map of African Savannahs and Woodlands at 25 m Resolution Derived from ALOS PALSAR. Remote Sens. Environ. 2018, 206, 156–173. [Google Scholar] [CrossRef]
  27. Englhart, S.; Keuck, V.; Siegert, F. Aboveground Biomass Retrieval in Tropical Forests—The Potential of Combined X- and L-Band SAR Data Use. Remote Sens. Environ. 2011, 115, 1260–1271. [Google Scholar] [CrossRef]
  28. Loew, A.; Mauser, W. Generation of Geometrically and Radiometrically Terrain Corrected SAR Image Products. Remote Sens. Environ. 2007, 106, 337–349. [Google Scholar] [CrossRef]
  29. Zhao, L.; Chen, E.; Li, Z.; Zhang, W.; Gu, X. Three-Step Semi-Empirical Radiometric Terrain Correction Approach for PolSAR Data Applied to Forested Areas. Remote Sens. 2017, 9, 269. [Google Scholar] [CrossRef]
  30. Nie, Y.; Hu, Y.; Sa, R.; Fan, W. Inversion of Forest above Ground Biomass in Mountainous Region Based on PolSAR Data after Terrain Correction: A Case Study from Saihanba, China. Remote Sens. 2024, 16, 846. [Google Scholar] [CrossRef]
  31. Zhang, H.; Zhu, J.; Wang, C.; Lin, H.; Long, J.; Zhao, L.; Fu, H.; Liu, Z. Forest Growing Stock Volume Estimation in Subtropical Mountain Areas Using PALSAR-2 L-Band PolSAR Data. Forests 2019, 10, 276. [Google Scholar] [CrossRef]
  32. Shi, J.; Zhang, W.; Marino, A.; Zeng, P.; Ji, Y.; Zhao, H.; Huang, G.; Wang, M. Forest Total and Component Biomass Retrieval via GA-SVR Algorithm and Quad-Polarimetric SAR Data. Int. J. Appl. Earth Obs. Geoinf. 2023, 118, 103275. [Google Scholar] [CrossRef]
  33. Luo, M.; Wang, Y.; ** Tropical Forest Biomass with Radar and Spaceborne LiDAR in Lopé National Park, Gabon: Overcoming Problems of High Biomass and Persistent Cloud. Biogeosciences 2012, 9, 179–191. [Google Scholar] [CrossRef]
  34. Chhabra, A.; Rüdiger, C.; Yebra, M.; Jagdhuber, T.; Hilton, J. RADAR-Vegetation Structural Perpendicular Index (R-VSPI) for the Quantification of Wildfire Impact and Post-Fire Vegetation Recovery. Remote Sens. 2022, 14, 3132. [Google Scholar] [CrossRef]
  35. Freeman, A. Fitting a Two-Component Scattering Model to Polarimetric SAR Data From Forests. IEEE Trans. Geosci. Remote Sens. 2007, 45, 2583–2592. [Google Scholar] [CrossRef]
  36. Freeman, A.; Durden, S.L. A Three-Component Scattering Model for Polarimetric SAR Data. IEEE Trans. Geosci. Remote Sens. 1998, 36, 963–973. [Google Scholar] [CrossRef]
  37. Cui, Y.; Yamaguchi, Y.; Yang, J.; Park, S.-E.; Kobayashi, H.; Singh, G. Three-Component Power Decomposition for Polarimetric SAR Data Based on Adaptive Volume Scatter Modeling. Remote Sens. 2012, 4, 1559–1572. [Google Scholar] [CrossRef]
  38. Yamaguchi, Y.; Moriyama, T.; Ishido, M.; Yamada, H. Four-Component Scattering Model for Polarimetric SAR Image Decomposition. IEEE Trans. Geosci. Remote Sens. 2005, 43, 1699–1706. [Google Scholar] [CrossRef]
  39. Van Zyl, J.J.; Arii, M.; Kim, Y. Model-Based Decomposition of Polarimetric SAR Covariance Matrices Constrained for Nonnegative Eigenvalues. IEEE Trans. Geosci. Remote Sens. 2011, 49, 3452–3459. [Google Scholar] [CrossRef]
  40. Yang, J.; Peng, Y.-N.; Yamaguchi, Y.; Yamada, H. On Huynen’s Decomposition of a Kennaugh Matrix. IEEE Geosci. Remote Sens. Lett. 2006, 3, 369–372. [Google Scholar] [CrossRef]
  41. Dey, S.; Bhattacharya, A.; Ratha, D.; Mandal, D.; Frery, A.C. Target Characterization and Scattering Power Decomposition for Full and Compact Polarimetric SAR Data. IEEE Trans. Geosci. Remote Sens. 2021, 59, 3981–3998. [Google Scholar] [CrossRef]
  42. Dey, S.; Bhattacharya, A.; Frery, A.C.; Lopez-Martinez, C.; Rao, Y.S. A Model-Free Four Component Scattering Power Decomposition for Polarimetric SAR Data. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 2021, 14, 3887–3902. [Google Scholar] [CrossRef]
  43. Cloude, S.R. Target Decomposition Theorems in Radar Scattering. Electron. Lett. 1985, 21, 22–24. [Google Scholar] [CrossRef]
  44. Krogager, E. New Decomposition of the Radar Target Scattering Matrix. Electron. Lett. 1990, 26, 1525. [Google Scholar] [CrossRef]
  45. Cloude, S.R.; Pottier, E. An Entropy Based Classification Scheme for Land Applications of Polarimetric SAR. IEEE Trans. Geosci. Remote Sens. 1997, 35, 68–78. [Google Scholar] [CrossRef]
  46. Demirci, S.; Kirik, O.; Ozdemir, C. Interpretation and Analysis of Target Scattering from Fully-Polarized ISAR Images Using Pauli Decomposition Scheme for Target Recognition. IEEE Access 2020, 8, 155926–155938. [Google Scholar] [CrossRef]
  47. Liu, Z.; Michel, O.O.; Wu, G.; Mao, Y.; Hu, Y.; Fan, W. The Potential of Fully Polarized ALOS-2 Data for Estimating Forest Above-Ground Biomass. Remote Sens. 2022, 14, 669. [Google Scholar] [CrossRef]
  48. Chang, Q.; Zwieback, S.; DeVries, B.; Berg, A. Application of L-Band SAR for Mapping Tundra Shrub Biomass, Leaf Area Index, and Rainfall Interception. Remote Sens. Environ. 2022, 268, 112747. [Google Scholar] [CrossRef]
  49. Kursa, M.B.; Rudnicki, W.R. Feature Selection with the Boruta Package. J. Stat. Softw. 2010, 36, 1–13. [Google Scholar] [CrossRef]
  50. Pan, Y.; Birdsey, R.A.; Fang, J.; Houghton, R.; Kauppi, P.E.; Kurz, W.A.; Phillips, O.L.; Shvidenko, A.; Lewis, S.L.; Canadell, J.G.; et al. A Large and Persistent Carbon Sink in the World’s Forests. Science 2011, 333, 988–993. [Google Scholar] [CrossRef]
  51. Park, S.-E. The Effect of Topography on Target Decomposition of Polarimetric SAR Data. Remote Sens. 2015, 7, 4997–5011. [Google Scholar] [CrossRef]
  52. Lee, J.-S.; Ainsworth, T.L. The Effect of Orientation Angle Compensation on Coherency Matrix and Polarimetric Target Decompositions. IEEE Trans. Geosci. Remote Sens. 2011, 49, 53–64. [Google Scholar] [CrossRef]
  53. Li, H.; Li, Q.; Wu, G.; Chen, J.; Liang, S. The Impacts of Building Orientation on Polarimetric Orientation Angle Estimation and Model-Based Decomposition for Multilook Polarimetric SAR Data in Urban Areas. IEEE Trans. Geosci. Remote Sens. 2016, 54, 5520–5532. [Google Scholar] [CrossRef]
  54. Chowdhury, T.; Thiel, C.; Schmullius, C.; Stelmaszczuk-Górska, M. Polarimetric Parameters for Growing Stock Volume Estimation Using ALOS PALSAR L-Band Data over Siberian Forests. Remote Sens. 2013, 5, 5725–5756. [Google Scholar] [CrossRef]
  55. Soja, M.J.; Sandberg, G.; Ulander, L.M.H. Regression-Based Retrieval of Boreal Forest Biomass in Sloping Terrain Using P-Band SAR Backscatter Intensity Data. IEEE Trans. Geosci. Remote Sens. 2013, 51, 2646–2665. [Google Scholar] [CrossRef]
  56. Sandberg, G.; Ulander, L.M.H.; Wallerman, J.; Fransson, J.E.S. Measurements of Forest Biomass Change Using P-Band Synthetic Aperture Radar Backscatter. IEEE Trans. Geosci. Remote Sens. 2014, 52, 6047–6061. [Google Scholar] [CrossRef]
  57. De Almeida, C.T.; Galvão, L.S.; Aragão, L.E.d.O.C.e.; Ometto, J.P.H.B.; Jacon, A.D.; Pereira, F.R.d.S.; Sato, L.Y.; Lopes, A.P.; Graça, P.M.L.d.A.; Silva, C.V.d.J.; et al. Combining LiDAR and Hyperspectral Data for Aboveground Biomass Modeling in the Brazilian Amazon Using Different Regression Algorithms. Remote Sens. Environ. 2019, 232, 111323. [Google Scholar] [CrossRef]
  58. Luckman, A. A Study of the Relationship between Radar Backscatter and Regenerating Tropical Forest Biomass for Spaceborne SAR Instruments. Remote Sens. Environ. 1997, 60, 1–13. [Google Scholar] [CrossRef]
  59. Kc, Y.B.; Liu, Q.; Saud, P.; Xu, C.; Gaire, D.; Adhikari, H. Driving Factors and Spatial Distribution of Aboveground Biomass in the Managed Forest in the Terai Region of Nepal. Forests 2024, 15, 663. [Google Scholar] [CrossRef]
  60. Narvaes, I.d.S.; dos Santos, J.R.; Bispo, P.d.C.; Graça, P.M.d.A.; Guimarães, U.S.; Gama, F.F. Estimating Forest Above-Ground Biomass in Central Amazonia Using Polarimetric Attributes of ALOS/PALSAR Images. Forests 2023, 14, 941. [Google Scholar] [CrossRef]
  61. Waqar, M.; Sukmawati, R.; Ji, Y.; Sri Sumantyo, J. Tropical PeatLand Forest Biomass Estimation Using Polarimetric Parameters Extracted from RadarSAT-2 Images. Land 2020, 9, 193. [Google Scholar] [CrossRef]
  62. Tian, L.; Wu, X.; Tao, Y.; Li, M.; Qian, C.; Liao, L.; Fu, W. Review of Remote Sensing-Based Methods for Forest Aboveground Biomass Estimation: Progress, Challenges, and Prospects. Forests 2023, 14, 1086. [Google Scholar] [CrossRef]
  63. Li, Y.; Li, C.; Li, M.; Liu, Z. Influence of Variable Selection and Forest Type on Forest Aboveground Biomass Estimation Using Machine Learning Algorithms. Forests 2019, 10, 1073. [Google Scholar] [CrossRef]
  64. Gao, Y.; Lu, D.; Li, G.; Wang, G.; Chen, Q.; Liu, L.; Li, D. Comparative Analysis of Modeling Algorithms for Forest Aboveground Biomass Estimation in a Subtropical Region. Remote Sens. 2018, 10, 627. [Google Scholar] [CrossRef]
Figure 1. Overview of study sites: (a) the location of Saihanba Forest Farm in relation to the provinces and counties in China; (b) the spatial location of ALOS-2 data relative to Weichang County; (c) the Pauli RGB image (R: |HH-VV|, G: |HV|, B: |HH + VV|) based on PolSAR data and the location of the measured samples; the basemap is the optical image of Tianditu.
Figure 2. A flowchart of the proposed forest AGB mapping scheme.
Figure 3. Absolute value of the Pearson correlation coefficient (R) between forest AGB and the PolSAR features based on the data (25 July 2020) with radiometric terrain correction (RTC, olive) and non-RTC data (NRT, red). Features are sorted by R_RTC (i.e., the absolute value of R between forest AGB and the SAR features extracted from the RTC data). (a) The first set of the extracted original PolSAR features; (b) the second set of the extracted original PolSAR features (39 in total); (c) derived features based on the original PolSAR features (39 in total).
Figure 4. Scatter density plots, using the PolSAR data from 25 July 2020 as an example, of the decibel values of the three components of the Freeman three-component decomposition (volume scattering component (Vol), surface scattering component (Odd), and double-bounce scattering component (Dbl)) against the local incidence angle θloc at different terrain correction stages: non-radiometric terrain correction (NRTC), polarization orientation angle correction (POAC), effective scattering area correction (ESAC), and angular variation effect correction (AVEC). (a) NRTC_Vol; (b) POAC_Vol; (c) ESAC_Vol; (d) AVEC_Vol; (e) NRTC_Odd; (f) POAC_Odd; (g) ESAC_Odd; (h) AVEC_Odd; (i) NRTC_Dbl; (j) POAC_Dbl; (k) ESAC_Dbl; (l) AVEC_Dbl.
Figure 5. Scatter density plots, using the PolSAR data from 25 July 2020 as an example, of each component of the Freeman three-component decomposition (FRE3) at each radiometric terrain correction (RTC) stage (Y-axis) against the previous stage (X-axis), and at the AVEC stage (that is, after all RTC processing was completed) against the non-RTC (NRTC) data. The three components of FRE3 are the volume scattering component (Vol), surface scattering component (Odd), and double-bounce scattering component (Dbl). The three stages of RTC are polarization orientation angle correction (POAC), effective scattering area correction (ESAC), and angular variation effect correction (AVEC). The red line is the 1:1 line. (a) NRTC vs. POAC of Vol; (b) POAC vs. ESAC of Vol; (c) ESAC vs. AVEC of Vol; (d) NRTC vs. AVEC of Vol; (e) NRTC vs. POAC of Odd; (f) POAC vs. ESAC of Odd; (g) ESAC vs. AVEC of Odd; (h) NRTC vs. AVEC of Odd; (i) NRTC vs. POAC of Dbl; (j) POAC vs. ESAC of Dbl; (k) ESAC vs. AVEC of Dbl; (l) NRTC vs. AVEC of Dbl.
Figure 6. Analysis of the effectiveness of RTC and the optimal regression model of this study, taking the SAR data from 25 July 2020 as an example. (a) The training results of the NRTC and RTC data, where the black dots are the results of the corresponding single training; (b) scatter plot of the measured forest AGB and the AGB predicted by the optimal regression model (BysRidge); (c) spatial distribution map of forest AGB in the study area based on optimal model prediction.
Table 1. The main parameters of the PolSAR data.
| Parameter | Value | Parameter | Value |
| --- | --- | --- | --- |
| Data formats and processing level | CEOS level 1.1 | Observation mode | HBQ |
| Observation date of scene center | 27.8054° | Radar wavelength | 0.2424525 m |
| Length of range direction | 49.8 km | Range resolution | 2.860844 m |
| Length of azimuth direction | 69.3 km | Azimuth resolution | 2.642742 m |
| Orbit direction | Ascending | Range pixel | 8392 |
| Observation direction | Right looking | Azimuth pixel | 26,105 |
Note: HBQ is High-sensitive mode Full (Quad.) polarimetry.
Table 2. Allometric growth equation of different tree species.
| Tree Species | Allometric Equation | Ref. |
| --- | --- | --- |
| Larix principis-rupprechtii | $W = 0.1431 \cdot DBH^{2.2193}$ | [60] |
| Betula platyphylla | $W = 0.0330 \cdot DBH^{2.9314}$ | [61] |
| Pinus sylvestris var. mongolica | $W = 0.0930 \cdot DBH^{2.3429}$ | [62] |
| Pinus tabuliformis | $W = 0.0520 \cdot DBH^{2.5830}$ | [63] |
| Acer truncatum | $W = 0.1260 \cdot DBH^{2.3830}$ | [64] |
Note: W is the AGB (kg) per plant, and DBH is the diameter at breast height (cm).
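The allometric equations in Table 2 can be applied tree by tree and then aggregated to a plot-level value in t/ha. The following is a minimal sketch under stated assumptions: the coefficients are copied from Table 2, while the function names, the input layout (a list of species and DBH pairs), and the plot area argument are hypothetical and not taken from the article.

```python
from typing import Iterable, Tuple

# Coefficients (a, b) of W = a * DBH**b, copied from Table 2.
# W is the AGB per tree in kg; DBH is the diameter at breast height in cm.
ALLOMETRIC_COEFFS = {
    "Larix principis-rupprechtii": (0.1431, 2.2193),
    "Betula platyphylla": (0.0330, 2.9314),
    "Pinus sylvestris var. mongolica": (0.0930, 2.3429),
    "Pinus tabuliformis": (0.0520, 2.5830),
    "Acer truncatum": (0.1260, 2.3830),
}


def tree_agb_kg(species: str, dbh_cm: float) -> float:
    """AGB of a single tree (kg) from its species and DBH (cm)."""
    a, b = ALLOMETRIC_COEFFS[species]
    return a * dbh_cm ** b


def plot_agb_t_per_ha(trees: Iterable[Tuple[str, float]], plot_area_m2: float) -> float:
    """Plot-level AGB density (t/ha).

    `trees` is an iterable of (species, dbh_cm) pairs; `plot_area_m2` is a
    hypothetical input, since the plot size is not stated in this section.
    """
    total_kg = sum(tree_agb_kg(species, dbh) for species, dbh in trees)
    # kg -> t (divide by 1000); m^2 -> ha (divide by 10,000)
    return (total_kg / 1000.0) / (plot_area_m2 / 10000.0)
```

Summing per-tree values before converting units keeps the species-specific coefficients separate from the plot geometry, so the same helper can be reused for plots of different sizes.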
Table 3. AGB statistics of the sample plots.
| Count | Min (t/ha) | Max (t/ha) | Mean (t/ha) | SD (t/ha) |
| --- | --- | --- | --- | --- |
| 132 | 36.39 | 166.58 | 103.88 | 32.31 |
Note: SD is standard deviation; Count is the number of sample plots.
Table 4. The features derived from the backscattering coefficients extracted from the PolSAR data.
| Features | Symbol | Equation | Symbol (dB) | Source |
| --- | --- | --- | --- | --- |
| Span | Span | $\sigma^0_{HH} + 2\sigma^0_{HV} + \sigma^0_{VV}$ | Span_db | [67] |
| Co-Pol HH/VV Ratio | R_HH/VV | $\sigma^0_{HH} / \sigma^0_{VV}$ | R_HH/VV_db | [67] |
| Cross-Pol HH/HV Ratio | R_HH/HV | $\sigma^0_{HH} / \sigma^0_{HV}$ | R_HH/HV_db | [67] |
| Cross-Pol VV/VH Ratio | R_VV/VH | $\sigma^0_{VV} / \sigma^0_{VH}$ | R_VV/VH_db | [68] |
| Biomass Index | BMI | $(\sigma^0_{HH} + \sigma^0_{VV}) / 2$ | BMI_db | [69] |
| Volume Scattering Index | VSI | $\sigma^0_{HV} / (\sigma^0_{HV} + BMI)$ | VSI_db | [69] |
| Canopy Structure Index | CSI | $\sigma^0_{VV} / (\sigma^0_{HH} + \sigma^0_{VV})$ | CSI_db | [69] |
| Radar Vegetation Index | RVI | $8\sigma^0_{HV} / (\sigma^0_{HH} + 2\sigma^0_{HV} + \sigma^0_{VV})$ | RVI_db | [70] |
| Radar Forest Degradation Index | RFDI | $(\sigma^0_{HH} - \sigma^0_{HV}) / (\sigma^0_{HH} + \sigma^0_{HV})$ | RFDI_db | [71] |
| Modified RFDI | mRFDI | $(\sigma^0_{VV} - \sigma^0_{HV}) / (\sigma^0_{VV} + \sigma^0_{HV})$ | mRFDI_db | [72] |

Note: The backscattering coefficient $\sigma^0_{pq}$ in the table is the backscattering intensity in linear units.
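As an illustration of how the indices in Table 4 follow from the linear-unit backscattering coefficients, the sketch below computes them for a single pixel. It assumes the conventional grouping of numerators and denominators for these indices (for example, RFDI as a normalized difference of HH and HV); the function names are hypothetical, and the default of using HV in place of VH when VH is not supplied is an assumption of this sketch, not something stated in the article.

```python
import math
from typing import Dict, Optional


def to_db(value: float) -> float:
    """Convert a positive linear-unit value to decibels (10 * log10)."""
    return 10.0 * math.log10(value)


def backscatter_features(hh: float, hv: float, vv: float,
                         vh: Optional[float] = None) -> Dict[str, float]:
    """Derived features from linear-unit backscattering coefficients.

    Assumption of this sketch: if `vh` is not given, `hv` is used instead.
    """
    if vh is None:
        vh = hv
    span = hh + 2.0 * hv + vv
    bmi = (hh + vv) / 2.0
    return {
        "Span": span,
        "R_HH/VV": hh / vv,
        "R_HH/HV": hh / hv,
        "R_VV/VH": vv / vh,
        "BMI": bmi,
        "VSI": hv / (hv + bmi),
        "CSI": vv / (hh + vv),
        "RVI": 8.0 * hv / span,
        "RFDI": (hh - hv) / (hh + hv),
        "mRFDI": (vv - hv) / (vv + hv),
    }


if __name__ == "__main__":
    feats = backscatter_features(hh=0.2, hv=0.05, vv=0.15)
    for name, value in feats.items():
        print(name, value)
```

The `to_db` helper is only defined for positive inputs, so a normalized-difference index such as RFDI would need a check before conversion wherever HV exceeds the co-polarized channel.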
Table 5. The polarization decomposition features extracted from the PolSAR data.
| Decomposition Methods | Abbreviation | Symbol | Source |
| --- | --- | --- | --- |
| Freeman two-component | FRE2 | F2V, F2O | [73] |
| Freeman three-component | FRE3 | F3V, F3D, F3O | [74] |
| Yamaguchi three-component | YAM3 | Y3V, Y3D, Y3O | [75] |
| Yamaguchi four-component | YAM4 | Y4V, Y4D, Y4O, Y4H | [76] |
| Van Zyl three-component | VAZ3 | V3V, V3D, V3O | [77] |
| An and Yang three-component | ANY3 | A3V, A3D, A3O | [78] |
| Model-free three-component | MF3CF | M3V, M3D, M3O | [79] |
| Model-free four-component | MF4CF | M4V, M4D, M4O, M4H | [80] |
| Cloude three-component | CLD3 | C3V, C3D, C3O | [81] |
| Krogager three-component | KRO3 | K3S, K3D, K3H | [82] |
| H-A-alpha | HAα | H, A, α | [83] |
| Pauli three-component | PAU3 | P3D, P3O | [84] |
Note: In this table, the symbol of each component of the different polarization decomposition methods (except H-A-alpha) has a consistent marking method: the first character (letter) is the first letter of the abbreviation of the polarization decomposition method, the second character (number) is the number of its components, and the third character (letter) is the first letter of the abbreviation of the different component types, including Volume scattering component (Vol), Surface scattering component (Odd), Double-bounce scattering component (Dbl), Helix scattering component (Hel), and Sphere scattering component (Sph). In addition, the decibel symbol of each polarization decomposition component adds a suffix (“_db”) to the symbol of its intensity unit, which is not listed in this table. For example, the decibel symbol of F3V is F3V_db.
Table 6. The features derived from the polarization decomposition extracted from the PolSAR data.
| Symbol | Equation | Symbol (dB) | Source |
| --- | --- | --- | --- |
| F2R_1 | $F2V / F2O$ | F2R_1_db | [85] |
| F3R_1 | $F3V / (F3D + F3O)$ | F3R_1_db | [85] |
| A3R_1 | $A3V / (A3D + A3O)$ | A3R_1_db | [85] |
| V3R_1 | $V3V / (V3D + V3O)$ | V3R_1_db | [85] |
| Y3R_1 | $Y3V / (Y3D + Y3O)$ | Y3R_1_db | [85] |
| F2R_2 | $F2V / (F2V + F2O)$ | F2R_2_db | [86] |
| F3R_2 | $F3V / (F3V + F3D + F3O)$ | F3R_2_db | [86] |
| A3R_2 | $A3V / (A3V + A3D + A3O)$ | A3R_2_db | [86] |
| V3R_2 | $V3V / (V3V + V3D + V3O)$ | V3R_2_db | [86] |
| Y3R_2 | $Y3V / (Y3V + Y3D + Y3O)$ | Y3R_2_db | [86] |
Note: The meanings of the symbols in the equations in this table are shown in Table 5.
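The two families of ratios in Table 6 share one pattern: the volume component divided by the sum of the remaining components (the "_1" features), and the volume component divided by the total power of all components (the "_2" features). A minimal sketch of that pattern is shown below; the function name and argument layout are hypothetical, and the grouping of terms in the denominators is assumed to be the sum of the listed components.

```python
from typing import Sequence, Tuple


def decomposition_ratios(vol: float, others: Sequence[float]) -> Tuple[float, float]:
    """Return (R_1, R_2) for one decomposition.

    vol    -- volume scattering power (e.g., F3V), linear units
    others -- remaining component powers (e.g., [F3D, F3O]), linear units
    R_1 = vol / sum(others); R_2 = vol / (vol + sum(others)).
    """
    rest = sum(others)
    r1 = vol / rest
    r2 = vol / (vol + rest)
    return r1, r2


if __name__ == "__main__":
    # Illustrative values only; not taken from the article's data.
    f3r_1, f3r_2 = decomposition_ratios(vol=0.3, others=[0.1, 0.2])
    print(f3r_1, f3r_2)
```

Under this reading the two ratios are monotonically related, since R_2 = R_1 / (1 + R_1), so they carry the same ranking information but differ in scale (R_2 is bounded between 0 and 1).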
Table 7. The 10 regression models used in this study.
| Model | Python Package | Module | Estimator | Category |
| --- | --- | --- | --- | --- |
| Ridge | scikit-learn (v1.4.2) | sklearn.linear_model | Ridge | linear |
| Lasso | scikit-learn (v1.4.2) | sklearn.linear_model | Lasso | linear |
| ElasticNet | scikit-learn (v1.4.2) | sklearn.linear_model | ElasticNet | linear |
| BysRidge | scikit-learn (v1.4.2) | sklearn.linear_model | BayesianRidge | linear |
| ARD | scikit-learn (v1.4.2) | sklearn.linear_model | ARDRegression | linear |
| RF | scikit-learn (v1.4.2) | sklearn.ensemble | RandomForestRegressor | non-parametric |
| ET | scikit-learn (v1.4.2) | sklearn.ensemble | ExtraTreesRegressor | non-parametric |
| AdaBoost | scikit-learn (v1.4.2) | sklearn.ensemble | AdaBoostRegressor | non-parametric |
| XGBoost | xgboost (v2.0.3) | xgboost | XGBRegressor | non-parametric |
| CatBoost | catboost (v1.2.3) | catboost | CatBoostRegressor | non-parametric |
Table 8. The Python packages on which the implementation of two-step feature selection depends.
| Algorithm | Boruta | RFECV | RF | Optuna |
| --- | --- | --- | --- | --- |
| Package | Boruta (v0.3) | scikit-learn (v1.4.2) | scikit-learn (v1.4.2) | Optuna (v3.6.0) |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

MDPI and ACS Style

Nie, Y.; Sa, R.; Chumachenko, S.; Hu, Y.; Wang, Y.; Fan, W. Inversion of Forest Aboveground Biomass in Regions with Complex Terrain Based on PolSAR Data and a Machine Learning Model: Radiometric Terrain Correction Assessment. Remote Sens. 2024, 16, 2229. https://doi.org/10.3390/rs16122229