Abstract: Exploring the potential of Global Ecosystem Dynamics Investigation (GEDI) multi-beam LiDAR data for estimating regional forest canopy closure (FCC) is important for assessing forest ecosystem status and stand environment. Shangri-La, a typical ecologically fragile area in northwestern Yunnan, was selected as the study area. GEDI waveform data were used as the information source, from which the parameters of 46,245 forest footprints were extracted. The empirical Bayesian kriging (EBK) method was used to interpolate the footprint parameters into continuous surfaces over the unsampled parts of the study area. Then, combined with data from 54 measured samples, three methods were used to select the characteristic variables: support vector machine recursive feature elimination (SVM-RFE), random forest (RF), and Pearson correlation analysis. FCC estimation models were built with Bayesian-optimized random forest regression (BO-RFR), Bayesian-optimized gradient boosting regression tree (BO-GBRT), and partial least squares regression (PLSR), and the best model was identified. The results showed that: (1) The EBK method had high prediction accuracy and reliable estimation results. Its R² ranged from 0.20 to 0.92, RMSE from 0.004 to 2812.912, MAE from 0.003 to 1996.258, and MRE from 0.007 to 4.423. (2) The characteristic variables selected, and their number, differed slightly among the three selection methods. The SVM-RFE method selected six parameters (cover, pai, sensitivity, rv_a1, rv_a4, rg_a4) with an average cross-validation accuracy of 0.84. The RF method selected five parameters (cover, pai, pgap_theta_error, modis_treecover, modis_nonvegetated), using a contribution of 5% as the threshold. The Pearson method selected five parameters (cover, pai, rv_a5, rg_a5, pgap_theta_error) with a correlation greater than 0.3 that was significant at the 0.01 level.
(3) The parameters selected by the different variable selection methods led to large differences in the prediction accuracy of the estimation models. Models built from the parameters selected by the SVM-RFE and RF methods were more accurate, and the accuracy of the models built from the SVM-RFE parameters was relatively stable. The BO-GBRT model built from the RF-selected parameters was the best FCC estimation model (R²=0.85, RMSE=0.069, P=86.5%). (4) The BO-GBRT model was used to estimate and map FCC across the study area. The estimated FCC had a high spatial correlation (0.53) with the FCC predicted from the GEDI pai parameter. The mean FCC values of the two results were 0.58 and 0.61, and the values were mainly distributed in the range of 0.4-0.7, accounting for 65.45% and 51.79%, respectively. FCC in the study area was mainly moderate, while the northern area was mainly high, which was consistent with the spatial distribution of vegetation coverage in the study area. These results indicate that estimating FCC from GEDI data with the method used in this study is feasible and that the results are reliable. This research lays a foundation for the efficient, timely and low-cost estimation of forest horizontal structural parameters at large spatial scales based on GEDI data.
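
For readers unfamiliar with the accuracy statistics reported above (R², RMSE, MAE, MRE), the following is a minimal sketch of how they are conventionally computed from observed and predicted values. It assumes the standard textbook definitions; the function name `accuracy_metrics` and the sample values are hypothetical, and the exact formulas used in the study (including the P statistic) are not given in this abstract.

```python
import math


def accuracy_metrics(observed, predicted):
    """Return R2, RMSE, MAE and MRE under their conventional definitions.

    Assumptions (not taken from the paper): observed values are non-zero
    (needed for the relative error) and are not all identical (needed for R2).
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_observed = sum(observed) / n

    # Residual and total sums of squares for the coefficient of determination.
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
        "mre": sum(abs(r) / abs(o) for r, o in zip(residuals, observed)) / n,
    }


# Hypothetical canopy closure values, for illustration only.
metrics = accuracy_metrics([0.4, 0.6, 0.8], [0.5, 0.6, 0.7])
print(metrics)
```

Because RMSE and MAE carry the units of the variable being evaluated, their ranges can differ by orders of magnitude between parameters, whereas R² and MRE are unitless.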