Scientific Reports (Feb 2021)
A social Beaufort scale to detect high winds using language in social media posts
Abstract
People often talk about the weather on social media, using different vocabulary to describe different conditions. Here we combine a large collection of wind-related Twitter posts (tweets) and UK Met Office wind speed observations to explore the relationship between tweet volume, tweet language and wind speeds in the UK. We find that wind speeds are experienced subjectively relative to the local baseline, so that the same absolute wind speed is reported as stronger or weaker depending on the typical weather conditions in the local area. Different linguistic tokens (words and emojis) are associated with different wind speeds. These associations can be used to create a simple text classifier to detect ‘high-wind’ tweets with reasonable accuracy; this can be used to detect high winds in a locality using only a single tweet. We also construct a ‘social Beaufort scale’ to infer wind speeds based only on the language used in tweets. Together with the classifier, this demonstrates that language alone is indicative of weather conditions, independent of tweet volume. However, the number of high-wind tweets shows a strong temporal correlation with local wind speeds, increasing the ability of a combined language-plus-volume system to successfully detect high winds. Our findings complement previous work in social sensing of weather hazards that has focused on the relationship between tweet volume and severity. These results show that impacts of wind and storms are found in how people communicate and use language, a novel dimension in understanding the social impacts of extreme weather.
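The idea of turning token and wind-speed associations into a single-tweet classifier can be illustrated with a minimal sketch. The abstract does not specify the classifier's form, so the smoothed log-odds scoring below is an assumption, not the authors' method. The function names, the whitespace tokenizer, the smoothing constant `alpha`, the decision threshold and the toy tweets are all hypothetical and serve only to make the example runnable.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace splitting. Emojis count as tokens
    # only when they are separated from words by spaces.
    return text.lower().split()


def train_log_odds(tweets, labels, alpha=1.0):
    """Return a per-token weight: log P(token | high wind) / P(token | not high wind).

    `labels` holds True for tweets posted during high winds. Add-alpha
    smoothing keeps the ratio finite for tokens seen in only one class.
    """
    high, low = Counter(), Counter()
    for text, is_high in zip(tweets, labels):
        (high if is_high else low).update(tokenize(text))
    vocab = set(high) | set(low)
    n_high, n_low, v = sum(high.values()), sum(low.values()), len(vocab)
    weights = {}
    for tok in vocab:
        p_high = (high[tok] + alpha) / (n_high + alpha * v)
        p_low = (low[tok] + alpha) / (n_low + alpha * v)
        weights[tok] = math.log(p_high / p_low)
    return weights


def score(text, weights):
    # Sum the weights of known tokens; unseen tokens contribute nothing.
    return sum(weights.get(tok, 0.0) for tok in tokenize(text))


def is_high_wind(text, weights, threshold=0.0):
    # Hypothetical decision rule: a positive total log-odds means "high wind".
    return score(text, weights) > threshold


# Toy, invented training data (not from the study).
toy_tweets = [
    "gale blowing the fence down",
    "storm gale tonight",
    "light breeze today",
    "gentle breeze in the park",
]
toy_labels = [True, True, False, False]
toy_weights = train_log_odds(toy_tweets, toy_labels)
print(is_high_wind("gale warning", toy_weights))
print(is_high_wind("lovely breeze", toy_weights))
```

In a setting like the one described, the labels would presumably come from pairing each tweet with an observed wind speed judged against the local baseline, and the threshold would be tuned on held-out data rather than fixed at zero.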