Environmental Research Letters (Jan 2019)

How can Big Data and machine learning benefit environment and water management: a survey of methods, applications, and future directions

  • Alexander Y Sun,
  • Bridget R Scanlon

DOI
https://doi.org/10.1088/1748-9326/ab1b7d
Journal volume & issue
Vol. 14, no. 7
p. 073001

Abstract

Read online

Big Data and machine learning (ML) technologies have the potential to impact many facets of environment and water management (EWM). Big Data are information assets characterized by high volume, velocity, variety, and veracity. Fast advances in high-resolution remote sensing techniques, smart information and communication technologies, and social media have contributed to the proliferation of Big Data in many EWM fields, such as weather forecasting, disaster management, smart water and energy management systems, and remote sensing. Big Data brings about new opportunities for data-driven discovery in EWM, but it also requires new forms of information processing, storage, retrieval, as well as analytics. ML, a subdomain of artificial intelligence (AI), refers broadly to computer algorithms that can automatically learn from data. ML may help unlock the power of Big Data if properly integrated with data analytics. Recent breakthroughs in AI and computing infrastructure have led to the fast development of powerful deep learning (DL) algorithms that can extract hierarchical features from data, with better predictive performance and less human intervention. Collectively Big Data and ML techniques have shown great potential for data-driven decision making, scientific discovery, and process optimization. These technological advances may greatly benefit EWM, especially because (1) many EWM applications (e.g. early flood warning) require the capability to extract useful information from a large amount of data in autonomous manner and in real time, (2) EWM researches have become highly multidisciplinary, and handling the ever increasing data volume/types using the traditional workflow is simply not an option, and last but not least, (3) the current theoretical knowledge about many EWM processes is still incomplete, but which may now be complemented through data-driven discovery. A large number of applications on Big Data and ML have already appeared in the EWM literature in recent years. The purposes of this survey are to (1) examine the potential and benefits of data-driven research in EWM, (2) give a synopsis of key concepts and approaches in Big Data and ML, (3) provide a systematic review of current applications, and finally (4) discuss major issues and challenges, and recommend future research directions. EWM includes a broad range of research topics. Instead of attempting to survey each individual area, this review focuses on areas of nexus in EWM, with an emphasis on elucidating the potential benefits of increased data availability and predictive analytics to improving the EWM research.

Keywords