Environmental and Sustainability Indicators (Dec 2025)
From vegetation to vulnerability: Integrating remote sensing and AI to combat cheatgrass-induced wildfire hazards in California
Abstract
Wildfire risk is rising around the world. In places like California, this risk is exacerbated by the invasive species cheatgrass (Bromus tectorum). Cheatgrass is highly flammable and benefits from wildfires, which allows it to replace native plant communities. By increasing both the intensity and the frequency of wildfires, it endangers not only its natural environment but also human habitats. Here, we present a novel approach to map the distribution and expansion of cheatgrass and to predict potential wildfire risk zones. Using the open-source CalFlora dataset alongside data from the Sentinel-2 satellites, we created a comprehensive spatial analysis framework. We integrated temporal dynamics via Vegetation Index statistical bands that encapsulate annual vegetation information. We employed semi-supervised learning techniques to refine and filter our data labels, thereby ensuring robust model training, and trained Random Forest and XGBoost models. Our models achieved a test accuracy of 91.1% in multiclass classification and a precision of 91% for the Cheatgrass class. The multiclass model discriminates well between classes and agrees closely with the actual classifications: an ROC-AUC score of 0.99 indicates near-perfect separation of the different classes, and a Cohen's Kappa of 0.89 signifies strong agreement after accounting for chance. We demonstrate the effectiveness of our methodology by leveraging publicly available open-source datasets to map the spread of invasive cheatgrass, which in turn helps identify regions potentially at high risk for wildfires across California's varied landscapes. Our analysis predicts the distribution of cheatgrass and other vegetation using data available only until June, providing insight before the peak forest fire season, which spans from mid-July to September. This capability delivers actionable intelligence for assessing fuel load and connectivity, thus laying the groundwork for targeted wildfire prevention strategies and enhanced ecological management practices in fire-prone areas.
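As an illustration of what "Vegetation Index statistical bands" could look like in practice, the following minimal sketch collapses a time series of images into per-pixel summary bands. It rests on several assumptions: the abstract does not say which index or statistics were used, so the sketch uses NDVI computed from Sentinel-2 red (B4) and near-infrared (B8) reflectance, and it summarizes each pixel with its mean, minimum, maximum, and standard deviation. The function names `ndvi` and `vi_stat_bands` are hypothetical and do not come from the paper.

```python
import numpy as np


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red).

    Pixels whose denominator is zero are returned as NaN.
    """
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out


def vi_stat_bands(nir_stack, red_stack):
    """Collapse a (time, height, width) stack into per-pixel statistics.

    Assumption: mean/min/max/std are illustrative choices, not the
    statistics reported by the authors.
    """
    series = ndvi(nir_stack, red_stack)  # shape: (time, height, width)
    return {
        "mean": np.nanmean(series, axis=0),
        "min": np.nanmin(series, axis=0),
        "max": np.nanmax(series, axis=0),
        "std": np.nanstd(series, axis=0),
    }


# Synthetic example: 6 acquisition dates (e.g. January to June), 4x4 pixels.
rng = np.random.default_rng(0)
nir_stack = rng.uniform(0.2, 0.6, size=(6, 4, 4))
red_stack = rng.uniform(0.05, 0.2, size=(6, 4, 4))
bands = vi_stat_bands(nir_stack, red_stack)
features = np.stack([bands[k] for k in ("mean", "min", "max", "std")], axis=-1)
```

Each pixel then carries a fixed-length feature vector (here, four values) regardless of how many acquisition dates were available, which is one way a tabular classifier such as Random Forest or XGBoost could consume seasonal information.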
Keywords