Risks (Sep 2024)

Insurance Analytics with Clustering Techniques

  • Charlotte Jamotton,
  • Donatien Hainaut,
  • Thomas Hames

DOI
https://doi.org/10.3390/risks12090141
Journal volume & issue
Vol. 12, no. 9
p. 141

Abstract


The K-means algorithm and its variants are well-known clustering techniques. In actuarial applications, these partitioning methods can identify clusters of policies with similar attributes. The resulting partitions provide an actuarial framework for creating maps of dominant risks and unsupervised pricing grids. This research article aims to adapt well-established clustering methods to complex insurance datasets containing both categorical and numerical variables. To achieve this, we propose a novel approach based on Burt distance. We begin by reviewing the K-means algorithm to establish the foundation for our Burt distance-based framework. Next, we extend the scope of application of the mini-batch and fuzzy K-means variants to heterogeneous insurance data. Additionally, we adapt spectral clustering, a technique based on graph theory that accommodates non-convex cluster shapes. To mitigate the computational complexity associated with spectral clustering’s O(n³) runtime, we introduce a data reduction method for large-scale datasets using our Burt distance-based approach.
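
As a rough illustration of the kind of pipeline the abstract describes, the sketch below builds a disjunctive (one-hot) table from categorical policy attributes, forms the Burt matrix B = ZᵀZ, and runs a plain K-means on rows rescaled by the square root of the diagonal of B. The abstract does not define the Burt distance, so the chi-square-style scaling used here is an assumption standing in for it, not the authors' definition. The toy data and the function names (`disjunctive_table`, `burt_matrix`, `burt_scaled_rows`, `kmeans`) are hypothetical, and numerical variables are assumed to have been binned into categories beforehand.

```python
import numpy as np

def disjunctive_table(columns):
    """One-hot encode a list of categorical columns into an indicator matrix Z."""
    blocks = []
    for col in columns:
        col = np.asarray(col)
        levels = np.unique(col)  # sorted distinct modalities of this variable
        blocks.append((col[:, None] == levels[None, :]).astype(float))
    return np.hstack(blocks)

def burt_matrix(Z):
    """Burt matrix: cross-tabulation of every pair of modalities."""
    return Z.T @ Z

def burt_scaled_rows(Z):
    """Rescale indicators by modality frequency (ASSUMED stand-in for the paper's distance)."""
    counts = np.diag(burt_matrix(Z))  # diagonal holds modality counts
    return Z / np.sqrt(counts)

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd iterations with Euclidean distance on the rows of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Hypothetical toy portfolio: two categorical attributes per policy.
age_band = ["young", "young", "young", "old", "old", "old"]
vehicle = ["sport", "sport", "sport", "sedan", "sedan", "sedan"]

Z = disjunctive_table([age_band, vehicle])
B = burt_matrix(Z)
labels, centers = kmeans(burt_scaled_rows(Z), k=2)
```

On this toy portfolio the young/sport policies and the old/sedan policies fall into separate clusters. The all-pairs distance computation in `kmeans` is written for readability; a large portfolio would call for the mini-batch variant or the data reduction step mentioned in the abstract.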

Keywords