Computational Ecology and Software (Apr 2011)

Using data clustering as a method of estimating the risk of establishment of bacterial crop diseases

  • Michael J. Watts

Journal volume & issue
Vol. 1, no. 1
pp. 1 – 13

Abstract

Previous work has investigated clustering regional species assemblages to estimate the relative risk of establishment of insect crop pest species. This paper applies these techniques to estimate the risk posed by bacterial crop plant diseases. Two widely used clustering algorithms, the Kohonen Self-Organising Map (SOM) and k-means, were investigated. A wider variety of SOM architectures than previously used was examined, along with how both algorithms reacted to the addition of small amounts of random 'noise' to the species assemblages. The results indicate that the k-means clustering algorithm is much more computationally efficient, produces better clusters as determined by an objective measure of cluster quality, and is more resistant to noise in the data than the equivalent Kohonen SOM. k-means is therefore considered the better algorithm for this problem.
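
As a rough illustration of the approach the abstract describes, the following Python sketch clusters a small presence/absence matrix with k-means and perturbs it with random noise. It is not the paper's implementation. The data are invented, the function names (kmeans, add_noise) are hypothetical, the farthest-first initialisation is chosen only to make the sketch deterministic, and reading a cluster centroid as a per-species risk score is an assumption about how cluster membership might be turned into a risk estimate.

```python
import numpy as np

def kmeans(data, k, n_iter=50):
    """Plain k-means with Euclidean distance (illustrative sketch only)."""
    data = np.asarray(data, dtype=float)
    # Farthest-first initialisation (an assumption, used for determinism):
    # start from row 0, then add the row farthest from the rows chosen so far.
    chosen = [0]
    for _ in range(1, k):
        d = np.linalg.norm(data[:, None, :] - data[chosen][None, :, :], axis=2)
        chosen.append(int(d.min(axis=1).argmax()))
    centroids = data[chosen].copy()
    labels = np.zeros(len(data), dtype=int)
    for _ in range(n_iter):
        # Assign every region to its nearest centroid.
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its member regions.
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels, centroids

def add_noise(assemblages, flip_fraction, seed=0):
    """Flip roughly `flip_fraction` of the presence/absence entries."""
    rng = np.random.default_rng(seed)
    flips = rng.random(assemblages.shape) < flip_fraction
    return np.where(flips, 1 - assemblages, assemblages)

# Hypothetical data: rows are regions, columns are species (1 = present).
assemblages = np.array([
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
])

labels, centroids = kmeans(assemblages, k=2)

# Assumed risk reading: for region 0, the centroid of its cluster gives the
# fraction of similar regions in which each species is present.
risk_region_0 = centroids[labels[0]]

# Noise test: re-cluster a perturbed copy and compare the assignments.
noisy_labels, _ = kmeans(add_noise(assemblages, flip_fraction=0.05), k=2)
```

In this toy example, a species absent from region 0 but present in some of the regions clustered with it receives a non-zero score, which is the kind of relative ranking the clustering approach is meant to provide. Comparing labels with noisy_labels over many random seeds would give a simple indication of how sensitive the clustering is to noise.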

Keywords