Graphical Dirichlet Process for Clustering Non-Exchangeable Grouped Data
- PMID: 39691937
- PMCID: PMC11650374
Abstract
We consider the problem of clustering grouped data with possibly non-exchangeable groups whose dependencies can be characterized by a known directed acyclic graph. To allow the sharing of clusters among the non-exchangeable groups, we propose a Bayesian nonparametric approach, termed graphical Dirichlet process, that jointly models the dependent group-specific random measures by assuming each random measure to be distributed as a Dirichlet process whose concentration parameter and base probability measure depend on those of its parent groups. The resulting joint stochastic process respects the Markov property of the directed acyclic graph that links the groups. We characterize the graphical Dirichlet process using a novel hypergraph representation as well as the stick-breaking representation, the restaurant-type representation, and the representation as a limit of a finite mixture model. We develop an efficient posterior inference algorithm and illustrate our model with simulations and a real grouped single-cell data set.
Keywords: Bayesian nonparametrics; clustering; directed acyclic graph; family-owned restaurant process; non-exchangeable groups.
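To make the idea of sharing clusters across graph-linked groups concrete, the following is a minimal sketch and not the authors' construction. It assumes a truncated stick-breaking draw for a single root group and, as a purely illustrative choice, takes each child's base measure to be the equal-weight average of its parents' measures with one fixed concentration parameter. The names `stick_breaking`, `dirichlet`, and `sample_group_weights` are hypothetical. Because every measure lives on the same finite set of atoms, a Dirichlet process with a finitely supported base measure reduces to a Dirichlet draw over those atoms.

```python
import random


def stick_breaking(alpha, truncation, rng):
    """Truncated stick-breaking weights for a Dirichlet process DP(alpha, H)."""
    weights, remaining = [], 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)   # fraction of the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)             # leftover mass goes to the last atom
    return weights


def dirichlet(params, rng):
    """Dirichlet draw via normalized gamma variates (zero parameters give zero mass)."""
    draws = [rng.gammavariate(a, 1.0) if a > 0 else 0.0 for a in params]
    total = sum(draws)
    return [d / total for d in draws]


def sample_group_weights(parents, alpha, truncation, seed=0):
    """Draw cluster weights for each group over a shared set of atoms.

    `parents` maps each group to its list of parent groups and must be given
    in topological order (parents before children). ASSUMPTIONS, not taken
    from the paper: a single root group, a child's base measure equal to the
    average of its parents' measures, and a common concentration `alpha`.
    """
    rng = random.Random(seed)
    weights = {}
    for group, pa in parents.items():
        if not pa:
            # Root group: ordinary truncated stick-breaking.
            weights[group] = stick_breaking(alpha, truncation, rng)
        else:
            # Child group: base measure is the average of the parents' weights.
            base = [sum(weights[p][k] for p in pa) / len(pa)
                    for k in range(truncation)]
            # DP(alpha, base) with a finitely supported base is Dirichlet(alpha * base).
            weights[group] = dirichlet([alpha * b for b in base], rng)
    return weights


if __name__ == "__main__":
    # Hypothetical four-group DAG: A -> B, A -> C, and B, C -> D.
    dag = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
    for group, w in sample_group_weights(dag, alpha=2.0, truncation=10).items():
        print(group, [round(x, 3) for x in w])
```

In this sketch every group places its mass on the same indexed atoms, so clusters are shared across groups while each group's weights depend only on its parents, which is the kind of Markov structure over the graph that the abstract describes.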