Clustering in General Measurement Error Models. - Texas A&M University (TAMU) Scholar

abstract

This paper is dedicated to the memory of Peter G. Hall. It concerns a deceptively simple question: if one observes variables corrupted with measurement error of possibly very complex form, can one recreate asymptotically the clusters that would have been found had there been no measurement error? We show that the answer is yes, and that the solution is surprisingly simple and general. The method itself is to simulate, by computer, realizations with the same distribution as that of the true variables, and then to apply clustering to these realizations. Technically, we show that if one uses K-means clustering or any other risk minimizing clustering, and a multivariate deconvolution device with certain smoothness and convergence properties, then, in the limit, the cluster means based on our method converge to the same cluster means as if there is no measurement error. Along with the method and its technical justification, we analyze two important nutrition data sets, finding patterns that make sense nutritionally.
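The simulate-then-cluster idea in the abstract can be illustrated with a small one-dimensional sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the classical additive model W = X + U with Gaussian error of known standard deviation `sigma_u`, a two-component Gaussian mixture fitted by EM as a stand-in for the paper's multivariate deconvolution device, and the hypothetical helper names `kmeans_1d`, `fit_gaussian_mixture_1d`, `simulate_error_free`, and `demo`.

```python
import numpy as np


def kmeans_1d(x, k, n_iter=100):
    """Plain Lloyd's algorithm in one dimension; returns sorted cluster means."""
    centers = np.quantile(x, (np.arange(k) + 0.5) / k)  # deterministic start
    for _ in range(n_iter):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        updated = np.array([
            x[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return np.sort(centers)


def fit_gaussian_mixture_1d(w, k, n_iter=300):
    """EM for a k-component Gaussian mixture fitted to the observed W."""
    means = np.quantile(w, (np.arange(k) + 0.5) / k)
    variances = np.full(k, w.var())
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities, computed on the log scale for stability.
        log_dens = (-0.5 * (w[:, None] - means) ** 2 / variances
                    - 0.5 * np.log(2.0 * np.pi * variances)
                    + np.log(weights))
        log_dens -= log_dens.max(axis=1, keepdims=True)
        resp = np.exp(log_dens)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: weighted updates of weights, means, and variances.
        nk = resp.sum(axis=0)
        weights = nk / w.size
        means = (resp * w[:, None]).sum(axis=0) / nk
        variances = (resp * (w[:, None] - means) ** 2).sum(axis=0) / nk
        variances = np.maximum(variances, 1e-6)
    return weights, means, variances


def simulate_error_free(weights, means, variances, sigma_u, n_sim, rng):
    """Draw from the fitted mixture after removing the known error variance.

    Assumption: Gaussian error with known sd sigma_u, so each component
    variance of X is the fitted variance of W minus sigma_u**2 (floored).
    """
    var_x = np.maximum(variances - sigma_u ** 2, 1e-6)
    comp = rng.choice(len(weights), size=n_sim, p=weights / weights.sum())
    return rng.normal(means[comp], np.sqrt(var_x[comp]))


def demo(seed=0, n=5000, n_sim=20000, sigma_u=1.5):
    rng = np.random.default_rng(seed)
    # Toy truth (assumed): two groups centered at -2 and +2 with sd 0.5.
    x = rng.normal(rng.choice([-2.0, 2.0], size=n), 0.5)
    w = x + rng.normal(0.0, sigma_u, size=n)  # error-prone observations
    weights, means, variances = fit_gaussian_mixture_1d(w, k=2)
    x_sim = simulate_error_free(weights, means, variances, sigma_u, n_sim, rng)
    return {
        "error_free": kmeans_1d(x, 2),      # clusters from the unobserved X
        "naive": kmeans_1d(w, 2),           # clusters from W directly
        "deconvolved": kmeans_1d(x_sim, 2), # clusters from simulated draws
    }
```

Calling `demo()` returns three sets of cluster means, so the centers from the simulated draws can be compared with those from the error-free data and with the naive clustering of the observed values. The parametric mixture is only a convenient substitute for a general deconvolution estimator; the abstract's result concerns K-means or other risk-minimizing clustering paired with a deconvolution device that satisfies certain smoothness and convergence conditions.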

authors

Carroll, Raymond

published proceedings

Stat Sin

author list (cited authors)

Su, Y., Reedy, J., & Carroll, R. J.

citation count

6

complete list of authors

Su, Ya; Reedy, Jill; Carroll, Raymond J

publication date

October 2018

publisher

Statistica Sinica (Institute of Statistical Science)

published in

Statistica Sinica

keywords

Clustering
Deconvolution
K-means
Measurement Error
Mixtures Of Distributions

PubMed Central ID

30636855

Digital Object Identifier (DOI)

10.5705/ss.202017.0093

start page

2337

end page

2351

volume

28

issue

4

URL

http://dx.doi.org/10.5705/ss.202017.0093
