Abstract: Oasis ecosystems in arid regions are fragile, and their vegetation dynamics are crucial for regional ecological balance and sustainable water resource management. Traditional methods for predicting vegetation indices often assume linearity or stationarity, and therefore struggle to capture the nonlinearity and spatial dependence present in real environmental conditions. This study focuses on the Qingtu Oasis and uses three remote sensing indices: the Normalized Difference Vegetation Index (NDVI), the Normalized Difference Water Index (NDWI), and the Salinity Index (SI). A deep learning model that fuses a Convolutional Neural Network (CNN) with a Gated Recurrent Unit (GRU), termed CNN-GRU, was developed to predict spatio-temporal NDVI variations from 5×5 neighborhood spatial features and a 3-year time lag window. The CNN-GRU model outperforms standalone CNN and GRU models in terms of the Coefficient of Determination (R²) and Root Mean Square Error (RMSE). Specifically, R² reached 0.88 during the testing period, with particularly high accuracy in high-NDVI areas and oasis-desert transition zones. Incorporating neighborhood spatial information and three-year lagged features significantly enhances the model's ability to capture the complex nonlinearities and spatial correlations in NDVI dynamics. This research provides technical support for long-term vegetation monitoring and ecological management in arid oases, and offers a reference for applying deep learning and remote sensing techniques in ecosystem studies.
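To make the input design concrete, the following is a minimal sketch of how 5×5 neighborhood patches with a 3-year lag might be assembled from stacked index maps, together with the two evaluation metrics named above. It is not the authors' implementation. The function names, the array layout `(years, channels, height, width)`, the channel order (NDVI first), the use of annual time steps, the reflect padding at the image border, and the choice of the centre-pixel NDVI in the following year as the target are all assumptions made for illustration.

```python
import numpy as np

def build_samples(stack, lag=3, window=5):
    """Assemble lagged neighbourhood samples (hypothetical helper).

    stack: array of shape (T, C, H, W) holding one map per year and index;
           channel 0 is assumed to be NDVI, followed by NDWI and SI.
    Returns X with shape (N, lag, C, window, window) and y with shape (N,),
    where y is the NDVI of the patch centre in the year after the lag window.
    """
    T, C, H, W = stack.shape
    r = window // 2
    # Assumption: reflect-pad the spatial axes so border pixels get a full patch.
    padded = np.pad(stack, ((0, 0), (0, 0), (r, r), (r, r)), mode="reflect")
    X, y = [], []
    for t in range(lag, T):
        for i in range(H):
            for j in range(W):
                # Patch centred on pixel (i, j) for years t-lag .. t-1.
                X.append(padded[t - lag:t, :, i:i + window, j:j + window])
                y.append(stack[t, 0, i, j])
    return np.stack(X), np.array(y)

def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of Determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

Under these assumptions, each sample is a sequence of three multi-channel 5×5 patches, a layout in which a CNN could encode each patch and a GRU could then process the resulting three-step sequence.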