Smoothing County-Level Sampling Variances to Improve Small Area Models’ Outputs

Lu Chen; Luca Sartore; Habtamu Benecha; Valbona Bejleri; Balgobin Nandram

doi:10.3390/stats5030052

Stats (Sep 2022)

Affiliations

Lu Chen: National Institute of Statistical Sciences, 1750 K Street NW Suite 1100, Washington, DC 20006, USA
Luca Sartore: National Institute of Statistical Sciences, 1750 K Street NW Suite 1100, Washington, DC 20006, USA
Habtamu Benecha: United States Department of Agriculture, National Agricultural Statistics Service, 1400 Independence Avenue SW, Washington, DC 20250, USA
Valbona Bejleri: United States Department of Agriculture, National Agricultural Statistics Service, 1400 Independence Avenue SW, Washington, DC 20250, USA
Balgobin Nandram: United States Department of Agriculture, National Agricultural Statistics Service, 1400 Independence Avenue SW, Washington, DC 20250, USA

DOI: https://doi.org/10.3390/stats5030052
Journal volume & issue: Vol. 5, no. 3, pp. 898–915

Abstract

The use of hierarchical Bayesian small area models, which take survey estimates along with auxiliary data as input to produce official statistics, has increased in recent years. Survey estimates for small domains are usually unreliable due to small sample sizes, and the corresponding sampling variances can also be imprecise and unreliable. This degrades model performance (the model either fails to produce an estimate or produces a low-quality modeled estimate), which reduces the number of official statistics a government agency can publish. To mitigate unreliable sampling variances, the survey-estimated variances are typically modeled against the direct estimates wherever a relationship between the two is present. However, such a relationship is not always present. This paper explores alternatives for mitigating sampling variances that are unreliable (beyond some threshold). A Bayesian approach under the area-level model set-up and a distribution-free technique based on bootstrap sampling are proposed to update the survey data. An application to county-level corn yield data from the County Agricultural Production Survey of the United States Department of Agriculture’s (USDA’s) National Agricultural Statistics Service (NASS) illustrates the proposed approaches. The final county-level model-based estimates for small area domains, produced from the survey data updated by each method, are compared with county-level model-based estimates produced from the original survey data and with the official statistics published in 2016.

Published in Stats

ISSN: 2571-905X (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Social Sciences: Statistics
Website: https://www.mdpi.com/journal/stats
