Lightning is affected by many factors, many of which are not routinely measured, well understood, or accounted for in physical models. Several commonly used machine learning (ML) models were applied to analyze the relationship between Atmospheric Radiation Measurement (ARM) data and lightning data from the Earth Networks Total Lightning Network (ENTLN) in order to identify important variables affecting lightning occurrence in the vicinity of the Southern Great Plains (SGP) ARM site during the summer months (June, July, August and September) of 2012 to 2020. Among the common classifiers tested, the random forest model is the best predictor: when convective clouds are detected, it predicts lightning occurrence with an accuracy of 76.9 % and an area under the curve (AUC) of 0.850. Using this model, we further ranked the variables in terms of their effectiveness in nowcasting lightning and identified geometric cloud thickness, rain rate and convective available potential energy (CAPE) as the most effective predictors. The contrast in meteorological variables between no-lightning and frequent-lightning periods was examined for hours with CAPE values conducive to thunderstorm formation. Beyond the variables considered for the ML models, surface variables (e.g., equivalent potential temperature) and mid-altitude variables (e.g., minimum equivalent potential temperature) show statistically significant contrasts between no-lightning and frequent-lightning hours. For example, the minimum equivalent potential temperature from 700 to 500 hPa is significantly lower during frequent-lightning hours than during no-lightning hours. Finally, a notable positive relationship between the intracloud (IC) flash fraction and the square root of CAPE ($\sqrt{\mathrm{CAPE}}$) was found, suggesting that stronger updrafts increase the height of the electrification zone, resulting in fewer flashes reaching the surface and consequently a greater IC flash fraction.
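
The two skill scores quoted above, accuracy and AUC, can be illustrated with a minimal sketch. It is not the study's code: the function names `accuracy` and `roc_auc`, the 0.5 decision threshold, and the toy labels and probabilities are all assumptions made for illustration, and none of the numbers come from ARM or ENTLN data. The AUC is computed here through its rank interpretation, i.e. the probability that a randomly chosen lightning hour receives a higher predicted probability than a randomly chosen no-lightning hour.

```python
from itertools import product


def accuracy(labels, probs, threshold=0.5):
    """Fraction of hours whose thresholded prediction matches the label.

    The 0.5 threshold is an assumption, not a value taken from the study.
    """
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


def roc_auc(labels, probs):
    """AUC via its rank interpretation.

    Returns the probability that a random positive (lightning) hour is
    scored above a random negative (no-lightning) hour; ties count 0.5.
    """
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a, b in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration: 1 = lightning hour, 0 = no lightning.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
# Hypothetical classifier probabilities for the same eight hours.
toy_probs = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

toy_accuracy = accuracy(toy_labels, toy_probs)
toy_auc = roc_auc(toy_labels, toy_probs)
print(f"accuracy = {toy_accuracy:.3f}, AUC = {toy_auc:.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality over all thresholds, which is one reason both scores are commonly reported together for a binary classifier.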