Atmosphere (Sep 2023)
All-Day Cloud Classification via a Random Forest Algorithm Based on Satellite Data from CloudSat and Himawari-8
Abstract
Accurate classification of complex cloud scenes remains challenging because clouds come in many types and are often distributed across multiple layers. In this paper, multi-band radiation information from the geostationary satellite Himawari-8 and the cloud classification product of the polar-orbiting satellite CloudSat from June to September 2018 are investigated. Based on sample sets built by matching the two types of satellite data, a random forest (RF) algorithm was applied to train a model, and a retrieval method was developed for cloud classification. Using this method, the samples were classified as clear sky, low clouds, middle clouds, thin cirrus, thick cirrus, multi-layer clouds and deep convection (cumulonimbus) clouds. The results indicate that the average accuracy for all cloud types during the day is 88.4%, and misclassifications mainly occur between low and middle clouds, thick cirrus clouds and cumulonimbus clouds. The average accuracy is 79.1% at night, with more misclassifications occurring between middle clouds, multi-layer clouds and cumulonimbus clouds. Moreover, Typhoon Muifa (2022) was selected as a sample case, and the cloud type (CLT) product of the FY-4A satellite was used to evaluate the classification method. In the cloud system of Typhoon Muifa, the cumulonimbus area identified by the method corresponded well with a mesoscale convective system (MCS). In comparison with the FY-4A CLT product, the method classifies ice-type (thick cirrus) and multi-layer clouds effectively, and the location, shape and size of these two cloud types are similar in both.
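As an illustration of how accuracy figures and misclassification pairs of this kind can be derived, the following minimal Python sketch computes per-class accuracy, its average over classes, and the most frequent confusions from paired reference and predicted labels. It is not the authors' evaluation code. The abstract does not define "average accuracy", so the sketch assumes it is the mean of the per-class hit rates; the function names and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter

# Class names follow the seven categories listed in the abstract.
CLASSES = [
    "clear sky", "low clouds", "middle clouds", "thin cirrus",
    "thick cirrus", "multi-layer clouds", "deep convection",
]


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (reference, predicted) label pair occurs."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    return Counter(zip(true_labels, predicted_labels))


def per_class_accuracy(counts):
    """Fraction of samples of each reference class predicted correctly.

    Classes with no reference samples are skipped.
    """
    totals = Counter()
    for (true, _pred), n in counts.items():
        totals[true] += n
    return {c: counts[(c, c)] / totals[c] for c in CLASSES if totals[c] > 0}


def average_accuracy(counts):
    """Assumed definition: unweighted mean of the per-class accuracies."""
    acc = per_class_accuracy(counts)
    return sum(acc.values()) / len(acc)


def top_confusions(counts, k=3):
    """Return up to k most frequent off-diagonal (reference, predicted, n)."""
    off_diagonal = [(n, t, p) for (t, p), n in counts.items() if t != p]
    off_diagonal.sort(reverse=True)
    return [(t, p, n) for n, t, p in off_diagonal[:k]]


# Toy labels for illustration only (hypothetical, not from the paper).
toy_true = ["low clouds"] * 4 + ["middle clouds"] * 4 + ["thick cirrus"] * 2
toy_pred = (
    ["low clouds"] * 3 + ["middle clouds"]
    + ["middle clouds"] * 3 + ["low clouds"]
    + ["thick cirrus", "deep convection"]
)
toy_counts = confusion_counts(toy_true, toy_pred)

print(per_class_accuracy(toy_counts))
print(round(average_accuracy(toy_counts), 3))
print(top_confusions(toy_counts))
```

In practice the reference labels would come from the CloudSat classification product and the predictions from the trained RF model, with day and night samples evaluated separately.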
Keywords