Grouped Heterogeneous Mixture Modeling for Clustered Data

Shonosuke Sugasawa

Research output: Contribution to journal; Article; peer-review

Abstract

Clustered data are ubiquitous in a variety of scientific fields. In this article, we propose a flexible and interpretable modeling approach, called grouped heterogeneous mixture modeling, for clustered data, which models cluster-wise conditional distributions by mixtures of latent conditional distributions common to all the clusters. In the model, we assume that clusters are divided into a finite number of groups and mixing proportions are the same within the same group. We provide a simple generalized EM algorithm for computing the maximum likelihood estimator, and an information criterion to select the numbers of groups and latent distributions. We also propose structured grouping strategies by introducing penalties on grouping parameters in the likelihood function. Under the settings where both the number of clusters and cluster sizes tend to infinity, we present asymptotic properties of the maximum likelihood estimator and the information criterion. We demonstrate the proposed method through simulation studies and an application to crime risk modeling in Tokyo.
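
The model structure described in the abstract can be sketched in symbols. The notation below is an assumption made for illustration and is not taken from the article: clusters are indexed by i = 1, ..., m, observations within cluster i by j = 1, ..., n_i, latent conditional distributions h_k by k = 1, ..., L, and groups by g = 1, ..., G, with g_i denoting the group to which cluster i belongs.

```latex
% Hypothetical notation, not the article's own.
% Cluster-wise conditional distribution: a mixture of latent conditional
% distributions h_k that are common to all clusters.
f_i(y \mid x) = \sum_{k=1}^{L} \pi_{g_i k}\, h_k(y \mid x; \phi_k),
\qquad g_i \in \{1, \dots, G\},
\qquad \pi_{gk} \ge 0,
\qquad \sum_{k=1}^{L} \pi_{gk} = 1 .

% Mixing proportions depend on cluster i only through its group g_i,
% so clusters in the same group share the same proportions.
% The corresponding log-likelihood over all clusters and observations:
\ell(\pi, \phi, g) = \sum_{i=1}^{m} \sum_{j=1}^{n_i}
  \log \left\{ \sum_{k=1}^{L} \pi_{g_i k}\, h_k(y_{ij} \mid x_{ij}; \phi_k) \right\} .
```

Under this reading, maximizing the log-likelihood over the mixing proportions, the component parameters, and the group assignments is the task the abstract assigns to the generalized EM algorithm, and G and L are the quantities chosen by the information criterion.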

Original language: English
Pages (from-to): 999-1010
Number of pages: 12
Journal: Journal of the American Statistical Association
Volume: 116
Issue number: 534
Publication status: Published - 2021
Externally published: Yes

Keywords

  • EM algorithm
  • Finite mixture
  • Maximum likelihood estimation
  • Mixture of experts
  • Unobserved heterogeneity

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
