1. Introduction
Light Detection and Ranging (LiDAR), also referred to as laser scanning, is a widely used three-dimensional information-acquisition technology that provides highly accurate, high-quality data [
1]. LiDAR can directly provide centimeter-level-accuracy data for measuring and characterizing the structure of land surface objects, particularly in forest inventory management and urban surveying [
2,
3,
4].
Spaceborne LiDAR technology can penetrate the atmosphere and obtain accurate measurements of Earth’s surface [
5]. Full-waveform LiDAR records the complete echo waveform with multiple peaks in the ground sample range and collects discrete points distributed spatially adjacent to each other. The Global Ecosystem Dynamics Investigation (GEDI) instrument is the latest spaceborne full-waveform LiDAR system with three lasers and eight sample tracks [
6]. After its launch in December 2018, GEDI was deployed on the International Space Station (ISS) and has been acquiring Earth surface height data continuously since April 2019. One of the most important scientific objectives of the GEDI mission is to obtain three-dimensional vertical forest structure parameters for land carbon cycle modeling [
6]. Much previous research assessed the performance of the initial version of the GEDI product. With the release of the second version, the geolocation error of GEDI footprints was reduced from ≈20 m to ≈10 m, and most studies have since focused on the assessment and application of the GEDI product.
Spaceborne LiDAR data should be properly corrected before application due to complex environmental factors such as atmospheric scattering and spacecraft platform instability [
7,
8]. According to the GEDI team, the spatial geolocation accuracy of the second version of GEDI footprints is 10.2 m, resulting in an elevation error of 17.8 cm on a one-degree slope [
9]. This causes the footprint of GEDI to deviate from the real location and to only partially cover the real surface objects. To precisely locate the whole object, the geolocation error of GEDI spots should be corrected with high accuracy. The geolocation error of the laser footprint consists of system errors and random errors. The system error is mainly caused by sensors’ electronic features, platform attitude, orbit parameters, atmospheric delay, and GNSS positioning accuracy [
10]. Due to the complexity of the satellite operating environment and ground conditions, slight measurement errors in the footprint positioning model parameters will lead to random errors in the position of the laser footprints [
11].
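The quoted elevation error is consistent with simple slope geometry. As a rough check, assuming a planar slope of angle θ and a horizontal offset Δd along the direction of steepest slope (the symbols Δd, θ, and Δh are introduced here for illustration only), the induced elevation error Δh is approximately:

```latex
\Delta h \approx \Delta d \,\tan\theta
         = 10.2\ \mathrm{m} \times \tan(1^{\circ})
         \approx 0.178\ \mathrm{m} = 17.8\ \mathrm{cm}
```

Under the same assumption, the elevation error grows with tan θ, so the same horizontal offset matters far more on steep terrain.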
The geolocation error correction of GEDI footprints usually follows one of two approaches: the on-orbit positioning error correction method or the ground-data-based correction method. The on-orbit positioning error correction method requires extensive satellite information: the footprint position is calculated from the satellite orbit parameters, the spacecraft attitude (pitch, yaw, and roll), GNSS positioning, and the distance to the target [
5]. Because ICESat flies in a higher orbit, it has more stable orbital parameters, with centimeter-level on-orbit geolocation accuracy [
12], while GEDI’s orbit error is 60 m [
13]. There are two main reasons for the low orbit-location accuracy. The first is the lower orbital altitude (about 400 km), which tends to cause instability in the estimation of orbital parameters. The second is the low positioning accuracy of GNSS, mainly due to the reflection of the GNSS signal and the low visibility of the GNSS satellites. These two shortcomings make the ISS orbital position less accurate [
14,
15,
16]. In terms of ISS sensors, Montenbruck et al. [
17] realized the short-term prediction and improved position accuracy of the ISS orbit from 10 m to 1 m based on GNSS receiver data. Dou et al. [
18] utilized a quaternion-based algorithm based on orbit state, the observation vector of the International Space Station Agriculture Camera (ISSAC), and natural topographic data to improve the geolocation accuracy from 800 m to 500 m. Subsequently, the influence of ISS self-rotation was overcome and the residual geolocation error was improved from 1000 m to 500 m [
19]. The on-orbit positioning error correction method is applicable to all GEDI footprints, but it places high demands on technical specifications and measurement parameters. Additionally, it cannot solve other types of error, for example, laser pointing errors; moreover, the on-orbit positioning error correction accuracy of the ISS tends to be at the meter level, or even the ten-meter level.
The correction method based on ground data can reduce the geolocation error and achieve geolocation accuracy at the meter level and even the centimeter level. It can be divided into the field site geolocation error correction method and the non-field geolocation error correction method according to whether there is a calibration field site. In the field site geolocation error correction method, Luthcke et al. [
20] proposed a residual range calibration method to correct aiming and range biases based on spacecraft trim maneuvers and the residual over-ocean range. Magruder et al. [
5] deployed electro-optical detectors that captured the GLAS laser signal in order to correct laser pointing and timing errors. Sirota et al. [
21] improved the laser pointing angle by analyzing the changes in the mass-center coordinates of GLAS footprints. The field site geolocation error correction method can achieve centimeter-level geolocation accuracy but is hard to realize because of spatial and timing restrictions. The non-field geolocation error correction method is divided into two categories. The first category, the terrain matching method, only uses topographic data to correct the system error. Filin [
22] compared the ground vertical profile of reference ground elevation and GLAS laser track observations to correct the system error of GLAS footprints. Schleich et al. [
23] corrected the GEDI footprint location by minimizing the difference between the DTM (Digital Terrain Model) and ground elevation of a single GEDI footprint, improving the RMSE (root-mean-square error)/MAE (mean absolute error) of canopy height from 2.50 m/1.45 m to 2.10 m/1.07 m. The terrain matching method only uses the ground elevation of footprints, and the corrected accuracy is meter-level. The other category, called the waveform matching method, makes full use of waveform shape to remove the geolocation error by comparing real waveforms with reference waveforms. Harding [
24] generated reference waveforms using a Digital Surface Model (DSM) and matched GLAS waveforms pixel-by-pixel to determine the real position of the footprint. Yue et al. [
25] matched the DSM and the waveform based on statistical characteristics; later, the waveform-matching method was extended to different spatial areas and land cover types [
26]. Traditionally, the correction accuracy of the waveform matching method was at the meter level because the ground reference data had meter-level resolution, whereas point-cloud data describe three-dimensional objects with centimeter-level ranging resolution. A waveform matching method based on point-cloud data would therefore improve the geolocation accuracy to the centimeter level.
The waveform simulation and the ground reference data are key parts in the application of the waveform matching method. The GEDI Simulator [
27] was designed for pre-launch testing and algorithm development by the GEDI science team. The GEDI Simulator can simulate the reference waveform using point-cloud data and assess the performance of the GEDI product [
28]. In the past, geolocation error correction for GLAS individual footprints was common due to the lack of point-cloud data [
25]. However, with the ubiquity of laser devices and publicly available point-cloud data [
29], systematic error correction based on multiple laser footprints is becoming more common and easier to apply [
7].
The main objective of this study is to correct the geolocation error of GEDI footprints based on point-cloud data over multiple study areas. Firstly, the best-matched position is determined based on multiple waveform matching between the real waveform and the reference waveform. Then, the positions of all the GEDI footprints are corrected according to the relative distance of the best-matched position and the original footprint position. We mainly aim to solve the following two questions:
Is there a geolocation error in the current GEDI footprint? If one exists, how serious is the geolocation error?
Is it possible to correct the geolocation error for GEDI footprints?
2. Materials and Methods
In this study, we used airborne LiDAR (ALS) data acquired by the National Ecological Observatory Network (NEON) at eight forested sites between 2019 and 2021. Taking these sites as research areas, we collected all the qualified GEDI footprints. Based on the ALS data, we calculated the geolocation accuracy of the GEDI footprints and verified the effectiveness of the error-corrected position.
We first calculated the geolocation error of the GEDI footprints based on the ALS waveform matching method and obtained the error values of different temporal GEDI footprints. Then, we analyzed the relationships between GEDI labels (“degrade_flag” and “solar_elevation”) and geolocation error from a statistical point of view. Next, we evaluated the effect of the waveform matching method by comparing the waveforms before and after the correction. The workflow of this study is shown in
Figure 1.
2.1. Study Area
The study area included eight ALS collection areas with a total area of approximately 1461 km², covering latitudes of 30° to 45°, longitudes of −122° to −81°, and elevations of 14–3776 m. The surface covering in the area mainly includes forests, shrubs, and grasslands without watershed and plant areas. The distribution of the study area is shown in
Figure 2 and details are given in
Table 1.
2.2. Data Collection
2.2.1. GEDI Version 2 Product
GEDI collects global waveform LiDAR between 50°N and 50°S. The GEDI laser system contains three lasers and eight observation sample beams. GEDI scanning beams can be divided into strong and weak beams depending on the intensity of the laser energy. Depending on the requirements, GEDI offers different types of product, including raw transmitting and receiving waveforms (L1 product), ground height and canopy height at the footprint level (L2 product), and height and biomass data in grid form (L3 and L4 products). The available GEDI footprints within the study area can be obtained via GEDI Finder (
https://git.earthdata.nasa.gov/projects/LPDUR/repos/GEDI-finder-tutorial-python/browse, accessed on 31 August 2022). The GEDI L1B [
31] and L2B [
32] products were used in this study, mainly for the real receiving waveforms and geographic location extraction, respectively. As each orbit has a unique orbit identifier (orbit number) in the GEDI footprint, GEDI temporal footprint IDs were determined using the ALS site name and orbit number in this study. The number of multi-temporal GEDI footprints used in this study is listed in
Table 2 and shown in
Figure 2.
Among all the attributes of GEDI footprints, the label “degrade_flag” is the most relevant to geolocation accuracy. Its value is either zero or non-zero. A non-zero value corresponds to a degraded orbital condition during operation, affecting platform stability or GNSS positioning precision. A value of zero indicates a low probability of either of these two degradation conditions. The location accuracy may be better or worse in the periods near the beginning and end of the intervals flagged by “degrade_flag”. We also considered the effect of footprint acquisition time on geolocation error: the daytime/nighttime information was extracted from the “solar_elevation” attribute. Finally, we took the “sensitivity” label into account in the data pre-processing flow.
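The grouping described above can be sketched in a few lines of Python. This is a minimal illustration, not the processing code used in the study: the `Footprint` container, the function names, and the `min_sensitivity` threshold are hypothetical, and the day/night rule (negative solar elevation means night) is an assumption about how “solar_elevation” was interpreted.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Footprint:
    """Hypothetical container for the three GEDI attributes discussed above."""
    degrade_flag: int
    solar_elevation: float  # degrees
    sensitivity: float


def passes_preprocessing(fp, min_sensitivity=0.9):
    # min_sensitivity is an illustrative placeholder, not a value from this study.
    return fp.sensitivity >= min_sensitivity


def group_label(fp):
    # Assumption: a negative solar elevation means the sun is below the horizon.
    state = "nominal" if fp.degrade_flag == 0 else "degraded"
    period = "night" if fp.solar_elevation < 0.0 else "day"
    return state, period


def group_footprints(footprints, min_sensitivity=0.9):
    """Group footprints that pass pre-processing by (degrade state, day/night)."""
    groups = defaultdict(list)
    for fp in footprints:
        if passes_preprocessing(fp, min_sensitivity):
            groups[group_label(fp)].append(fp)
    return dict(groups)
```

Geolocation errors computed per group can then be compared statistically, for example nominal versus degraded orbits or day versus night acquisitions.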
2.2.2. Airborne LiDAR Data
NEON is an ecological observation project (
https://data.neonscience.org/data-products/explore, accessed on 31 August 2022). Among airborne data, airborne LiDAR data play an important role in quantitative information collection on land cover and changes in ecological structure.
The airborne laser scanning (ALS) data provide 3D point clouds around the GEDI footprints with centimeter-level ranging accuracy. The ALS acquisition years were restricted to 2019, 2020, and 2021, with the average point densities across the sites varying from 8 to 60 points/m², as determined using the Optech Gemini scanner. ALS data and GEDI footprints from different years were not used together in this study, to prevent year-to-year physical variability from influencing the experimental results.
Table 1 presents all the ALS datasets used in this study, including spatial location, acquisition time (year-month), and elevation range [
29].
Because the NEON onboard LiDAR platform is calibrated every winter, including horizontal and vertical positioning accuracy, the ALS data can be considered to be without geolocation error and can be used as a reference source for GEDI geolocation error correction. According to the data quality report, the vertical geolocation precision is generally less than 10 cm, and the horizontal geolocation precision is higher. ALS data were used for waveform simulation and correcting the footprint location by comparing the real waveform with the reference waveform, corresponding to part1 and part2 in the data processing flowchart (
Figure 1).
2.3. Calculation of GEDI Footprints Geolocation Error using the Waveform Matching Method
The purpose of this study is to calculate the geolocation system error of temporal GEDI footprints and validate the correction result. Additionally, the main part of the evaluation of geolocation error was conducted using the GEDI Simulator.
The evaluation of GEDI footprint geolocation accuracy comprised two main parts: reference waveform generation and error factor calculation. In the data pre-processing stage (
Figure 1 part1), we selected the high-quality GEDI waveforms, mainly using the attributes of “sensitivity” and “quality” as the training set, for calculating the geolocation error.
The objective of reference waveform generation is to bring the reference data and the GEDI observation data into the same form so that they can be compared. This process requires the spatial position of the real laser spot and the ALS point-cloud data, and converts the subset of ALS points within the corresponding range into a waveform. The waveform simulation has two main steps. The first step distributes the laser pulse energy within the footprint according to a Gaussian distribution: each point in the point cloud is assigned a horizontal weight based on its distance from the center of the footprint. The parameters of the laser pulse are the same as those of the GEDI system. In the second step, the weighted points are convolved vertically to form a continuous waveform. Different land cover and longitudinal changes are often reflected in the 3D point-cloud data as different point densities, which in turn produce different waveforms: the waveform over flat terrain is simple, with a single peak, while an area with complex terrain tends to generate a differently shaped waveform with multiple peaks. Additionally, greater changes in the point cloud cause greater changes in the waveform amplitude, which tends to occur between the ground and above-ground objects.
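The two simulation steps can be illustrated with a short, self-contained Python sketch. It is not the GEDI Simulator: the function name and all default parameter values (`footprint_sigma`, `pulse_sigma`, `bin_size`, and the height range) are illustrative placeholders rather than values taken from this study, and the point coordinates are assumed to be in a projected system in meters.

```python
import math


def simulate_waveform(points, center_x, center_y,
                      footprint_sigma=5.5, pulse_sigma=0.6,
                      z_min=0.0, z_max=50.0, bin_size=0.15):
    """Turn (x, y, z) points into a waveform sampled in vertical bins.

    All default values are illustrative placeholders (assumptions).
    """
    n_bins = int(round((z_max - z_min) / bin_size))
    weighted_hist = [0.0] * n_bins

    # Step 1: Gaussian horizontal weight from the distance to the footprint center.
    for x, y, z in points:
        r2 = (x - center_x) ** 2 + (y - center_y) ** 2
        weight = math.exp(-r2 / (2.0 * footprint_sigma ** 2))
        b = math.floor((z - z_min) / bin_size)
        if 0 <= b < n_bins:
            weighted_hist[b] += weight

    # Step 2: vertical convolution with a normalized Gaussian pulse shape.
    half = int(math.ceil(4.0 * pulse_sigma / bin_size))
    kernel = [math.exp(-((k * bin_size) ** 2) / (2.0 * pulse_sigma ** 2))
              for k in range(-half, half + 1)]
    total = sum(kernel)
    kernel = [k / total for k in kernel]

    waveform = [0.0] * n_bins
    for i, h in enumerate(weighted_hist):
        if h == 0.0:
            continue
        for j, k in enumerate(kernel):
            t = i + j - half
            if 0 <= t < n_bins:
                waveform[t] += h * k
    return waveform
```

With a dense ground layer and a sparser canopy layer, the output shows two peaks; a point far from the footprint center contributes less energy than one at the center.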
The geolocation error is calculated by maximizing the average correlation coefficient between the GEDI waveforms and the reference waveforms as the waveform similarity coefficient (SimiCoef). SimiCoef is calculated from the denoised GEDI waveform G(t) and the reference waveform R(t), as in the following Equation (1):
$$\mathrm{SimiCoef} = \frac{\operatorname{cov}\left(G(t),\, R(t)\right)}{\sigma_{G(t)}\,\sigma_{R(t)}} \tag{1}$$
where cov is the covariance of two curves, and σ is the standard deviation.
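As a minimal sketch, the similarity coefficient can be computed as the covariance of the two waveforms divided by the product of their standard deviations. The function name `simi_coef` is hypothetical, and the sketch assumes both waveforms have already been denoised and resampled onto the same bins; returning 0.0 for a flat waveform is a convention chosen here, not one stated in the text.

```python
import math


def simi_coef(g, r):
    """Correlation coefficient between a GEDI waveform g and a reference waveform r."""
    if len(g) != len(r) or not g:
        raise ValueError("waveforms must be non-empty and of equal length")
    n = len(g)
    mean_g = sum(g) / n
    mean_r = sum(r) / n
    cov = sum((a - mean_g) * (b - mean_r) for a, b in zip(g, r)) / n
    sigma_g = math.sqrt(sum((a - mean_g) ** 2 for a in g) / n)
    sigma_r = math.sqrt(sum((b - mean_r) ** 2 for b in r) / n)
    if sigma_g == 0.0 or sigma_r == 0.0:
        return 0.0  # assumed convention for a flat waveform
    return cov / (sigma_g * sigma_r)
```

Because the coefficient is normalized, it compares waveform shape rather than absolute amplitude.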
The calculation of the best-matched position is based on the criteria for maximizing the average SimiCoef value (
Figure 3). The error calculation strategy used in this study was to carry out global matching first and then local matching. First, we shifted the locations of multiple GEDI footprints by a fixed step and calculated the average SimiCoef matrix from the SimiCoef matrices of the individual footprints. In global matching, the moving step is an important factor affecting the matching result. On the one hand, too large a step leads to low accuracy in the global matching result, which in turn degrades the subsequent local matching; on the other hand, too small a step makes the calculation redundant and may cause the program to fail due to limited computer resources. For a GLAS footprint with a ∼50 m diameter, the recommended step is 4 m [
26]. In this study, the moving step was chosen to be 1 m, considering that the GEDI footprint diameter was 25 m.
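The global matching stage amounts to a brute-force grid search over a common offset. The following is a minimal sketch under stated assumptions: `global_match` and the callback `score_at(index, x, y)`, which is expected to return the SimiCoef of footprint `index` when its center is placed at (x, y), are hypothetical names, and the 10 m search radius is an illustrative placeholder. Only the 1 m step comes from the text.

```python
def global_match(footprints, score_at, search_radius=10.0, step=1.0):
    """Find the common (dx, dy) shift that maximizes the average SimiCoef.

    footprints: list of (x, y) centers in a projected coordinate system (m).
    score_at:   hypothetical callback returning SimiCoef for one footprint.
    """
    n_steps = int(round(search_radius / step))
    best = None
    for i in range(-n_steps, n_steps + 1):
        for j in range(-n_steps, n_steps + 1):
            dx, dy = i * step, j * step
            # Average the similarity over all footprints shifted by (dx, dy).
            mean_score = sum(
                score_at(idx, x + dx, y + dy)
                for idx, (x, y) in enumerate(footprints)
            ) / len(footprints)
            if best is None or mean_score > best[2]:
                best = (dx, dy, mean_score)
    return best
```

The cost grows with the square of `search_radius / step`, which is why the choice of step trades accuracy against computation.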
Then, the best position (i, j) from global matching serves as the initial position for local matching. In the process of local optimization, the footprints’ positions are dynamically adjusted using the simplex optimization method [
33]. The main idea of the simplex algorithm is to treat SimiCoef as the objective function to be maximized: the function value is calculated at a set of positions, the values are sorted, and the element with the smallest function value is iteratively replaced until the simplex converges near the maximum value of the function [
34]. The algorithm calculation process is as follows:
- (1) Initialization: Determine the initial feasible basis and the initial feasible solution, and construct the initial simplex.
- (2) Optimality test: The test number $\sigma_j$ is the coefficient of the non-basis variable $x_j$ in the row corresponding to the objective function. If one of the following two conditions is met, the calculation is stopped and the current feasible solution is output as the optimal solution. Condition 1: in the row corresponding to the objective function of the current table, all the values are non-positive. Condition 2: the number of iterations exceeds the pre-set threshold. Otherwise, go to the next step.
- (3) Convert from one feasible solution to another feasible solution with a larger target value and form a new simplex:
Determine the variable that is swapped into the basis. Select a test number $\sigma_j > 0$ and take the corresponding variable $x_j$ as the substitution variable. When there is more than one test number greater than 0, one should generally select the largest test number, that is, $\sigma_k = \max\{\sigma_j \mid \sigma_j > 0\}$, and take its corresponding $x_k$ as the substitution variable.
Identify the swapped-out variable. Calculate θ according to Equation (2), and select the basis variable corresponding to the smallest ratio as the swapped-out variable:
$$\theta = \min_{i}\left\{ \frac{b_i}{a_{ik}} \;\middle|\; a_{ik} > 0 \right\} \tag{2}$$
where $b_i$ is the right-hand-side item of the ith constraint in the current table, and $a_{ik}$ is the coefficient of the variable k in the ith constraint.
Replace the swapped-out variable in the basis with the swapped-in variable to obtain a new basis. A new feasible solution can be found from the new basis, and a new simplex can be obtained accordingly.
- (4) Repeat steps 2 and 3 until the calculation is complete.
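The selection rules in steps (2) and (3) can be sketched as a single pivot-selection function. This is an illustration of the tableau rules as listed, not the code used in the study: `choose_pivot` is a hypothetical name, `sigma` holds the test numbers of the objective-function row, `b` the right-hand-side items, and `a[i][k]` the coefficient of variable k in the ith constraint.

```python
def choose_pivot(sigma, b, a):
    """Return (k, l): entering column k and leaving row l, or None if optimal."""
    # Optimality test: stop when no test number is positive.
    k = max(range(len(sigma)), key=lambda j: sigma[j])
    if sigma[k] <= 0:
        return None

    # Ratio test: theta = min(b_i / a_ik) over rows with a_ik > 0.
    ratios = [(b[i] / a[i][k], i) for i in range(len(b)) if a[i][k] > 0]
    if not ratios:
        raise ValueError("objective is unbounded along the entering variable")
    theta, l = min(ratios)
    return k, l
```

For the first table of the small problem "maximize 3x1 + 5x2 subject to x1 ≤ 4, 2x2 ≤ 12, 3x1 + 2x2 ≤ 18" (with three slack variables), the largest test number is 5, so x2 enters, and the smallest ratio 12/2 selects the second constraint row.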
The best-matched footprint position is generated in two situations. Case 1 is where we obtain the optimal solution based on the judgment criteria of the simplex algorithm, while Case 2 is where the number of program iterations reaches the maximum and the last solution is considered the output [
35]. In brief, global matching finds the region containing the optimum, and local matching then refines the best footprint location within it. Finally, we obtain the best-matched position, expressed as the system error coefficients in the x and y directions. The distance between the original and final positions is the geolocation error. By combining brute-force search with local optimization, this experimental procedure can achieve centimeter-level horizontal positioning accuracy.
2.4. The Validation of the Geolocation Error Position Correction
After evaluating the geolocation error of the GEDI footprints, we can correct the geolocation error using the system error coefficient. Additionally, we need to validate the correction result. We calculate and compare the average SimiCoef of the original and corrected locations.
The validation consists of four steps. In Step 1, we select complex GEDI waveforms, those with “mode” values greater than two, because the waveform matching method is suited to areas with complex terrain distribution. In Step 2, these complex waveforms are divided into a training set and a test set according to whether or not they were involved in the calculation of the geolocation error. In Step 3, after converting the footprint’s spatial coordinates from the WGS84 geographic coordinate system to the local projection coordinate system, we apply the error distance in the x and y directions from
Section 2.3. In the final step, we calculate the waveform similarity coefficients at the original position and the ideal position, respectively.
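Steps 3 and 4 can be sketched as follows. This is a minimal illustration under stated assumptions: the positions are assumed to be already projected (in meters), the correction is assumed to be applied as corrected = original + (dx, dy), and the function names and the `score_at(index, x, y)` callback returning a footprint's SimiCoef at a given position are hypothetical.

```python
import math


def apply_correction(positions, dx, dy):
    # Assumed sign convention: corrected = original + (dx, dy).
    return [(x + dx, y + dy) for x, y in positions]


def mean_simi_coef(positions, score_at):
    return sum(
        score_at(i, x, y) for i, (x, y) in enumerate(positions)
    ) / len(positions)


def validate_correction(test_positions, dx, dy, score_at):
    """Compare the average SimiCoef of the test set before and after correction."""
    corrected = apply_correction(test_positions, dx, dy)
    return {
        "geolocation_error_m": math.hypot(dx, dy),
        "simi_coef_original": mean_simi_coef(test_positions, score_at),
        "simi_coef_corrected": mean_simi_coef(corrected, score_at),
    }
```

A higher average coefficient at the corrected positions of footprints that were not used to estimate the offset suggests that the correction generalizes beyond the training set.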