Names (Sep 2008)
Name Clustering on the Basis of Parental Preferences
Abstract
Parents do not choose first names for their children at random. Using two large datasets, one for the UK and one for the Netherlands, covering the names of children born in the same family over a period of two decades, this paper seeks to identify clusters of names inferred entirely from common parental naming preferences. These name groups can be considered coherent sets of names that have a high probability of being found in the same family. Operational measures for the statistical association between names and clusters are developed, as well as a two-stage clustering technique. The name groups are subsequently merged into a limited set of grand clusters. The results show that clusters emerge along cultural, linguistic, or ethnic parental backgrounds, but also along characteristics inherent in names, such as clusters of names after flowers and gems for girls, abbreviated names for boys, or names ending in -y or -ie.
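
To make the idea of a statistical association between names concrete, the following is a minimal sketch only. It assumes a simple lift ratio (how much more often two names appear in the same family than independence would predict), which is not necessarily the measure developed in the paper. The function name `name_lift` and the toy families are hypothetical and purely illustrative.

```python
from collections import Counter
from itertools import combinations


def name_lift(families):
    """Return the lift of every pair of names that co-occurs in a family.

    lift(a, b) = P(a and b in the same family) / (P(a) * P(b)),
    where each probability is estimated as a share of families.
    This is an assumed, illustrative measure, not the paper's own.
    """
    n = len(families)
    name_counts = Counter()
    pair_counts = Counter()
    for names in families:
        unique = sorted(set(names))  # count each name once per family
        name_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return {
        (a, b): (count / n) / ((name_counts[a] / n) * (name_counts[b] / n))
        for (a, b), count in pair_counts.items()
    }


# Toy data (hypothetical): each inner list holds the children of one family.
families = [
    ["Rose", "Lily"],
    ["Rose", "Lily", "Tom"],
    ["Tom", "Ben"],
    ["Ben", "Rose"],
]
lift = name_lift(families)
```

Under this assumed measure, a lift above 1 suggests that two names share a family more often than chance would predict, so such pairs would be natural candidates for the same cluster. A pairwise score of this kind could serve as the input to a clustering step.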