Silva Fennica (Jan 2013)

An assessment of three variance estimators for the k-nearest neighbour technique

  • Magnussen, Steen

DOI
https://doi.org/10.14214/sf.925
Journal volume & issue
Vol. 47, no. 1

Abstract


A jackknife (JK), a bootstrap (BOOT), and an empirical difference estimator (EDE) of totals and variance were assessed in simulated sampling from three artificial but realistic complex multivariate populations (N = 8000 elements) organized in clusters of four elements. Intra-cluster correlations of the target variables (Y) varied from 0.03 to 0.26. Time-saving implementations of JK and BOOT are detailed. In simple random sampling (SRS), bias in totals was ≤ 0.4% for the two largest sample sizes (n = 200 and 300), but slightly larger for n = 50 and 100. In cluster sampling (CLU), bias was typically 0.1% higher and more variable. EDE had the lowest overall bias. In both SRS and CLU, JK estimates of standard error were slightly (3%) too high, while BOOT estimates were too low (8%). Estimates of error suggested a trend in EDE toward overestimation with increasing sample size. Calculated 95% confidence intervals achieved a coverage that in most cases was fairly close (± 2%) to the nominal level. For estimation of a population total, the EDE estimator appears to be slightly better than the JK estimator.
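
As a rough illustration of the two resampling estimators named in the abstract, the sketch below computes a delete-one jackknife and a naive bootstrap standard error for a simple expansion estimator of a population total, N times the sample mean, under SRS. This is not the k-nearest neighbour estimator or the time-saving implementation described in the paper. The function names, the synthetic normal data, the number of bootstrap replicates, and the omission of a finite-population correction are all assumptions made for the example.

```python
import math
import random
import statistics


def jackknife_total_se(y, N):
    """Delete-one jackknife SE of the expansion estimator N * mean(y)."""
    n = len(y)
    s = sum(y)
    # Leave-one-out totals from the running sum: O(1) per replicate.
    loo = [N * (s - yi) / (n - 1) for yi in y]
    loo_mean = sum(loo) / n
    var_jk = (n - 1) / n * sum((t - loo_mean) ** 2 for t in loo)
    return math.sqrt(var_jk)


def bootstrap_total_se(y, N, reps=2000, seed=1):
    """Naive bootstrap SE: resample n units with replacement, reps times."""
    rng = random.Random(seed)
    n = len(y)
    totals = [N * statistics.fmean(rng.choices(y, k=n)) for _ in range(reps)]
    return statistics.stdev(totals)


# Hypothetical data: n = 200 draws standing in for an SRS from N = 8000.
rng = random.Random(0)
y = [rng.gauss(100.0, 20.0) for _ in range(200)]
N = 8000

se_jk = jackknife_total_se(y, N)
se_boot = bootstrap_total_se(y, N)
# Textbook SE of N * mean(y), ignoring the finite-population correction.
se_analytic = N * statistics.stdev(y) / math.sqrt(len(y))
```

For this linear estimator the jackknife standard error coincides with the textbook formula, while the naive bootstrap is smaller by a factor of about sqrt((n - 1)/n) plus Monte Carlo noise. The larger discrepancies reported in the abstract arise with the k-nearest neighbour estimator and are not reproduced by this toy case.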