Graphical Dirichlet Process for Clustering Non-Exchangeable Grouped Data
- PMID: 39691937
- PMCID: PMC11650374
Abstract
We consider the problem of clustering grouped data with possibly non-exchangeable groups whose dependencies can be characterized by a known directed acyclic graph. To allow the sharing of clusters among the non-exchangeable groups, we propose a Bayesian nonparametric approach, termed graphical Dirichlet process, that jointly models the dependent group-specific random measures by assuming each random measure to be distributed as a Dirichlet process whose concentration parameter and base probability measure depend on those of its parent groups. The resulting joint stochastic process respects the Markov property of the directed acyclic graph that links the groups. We characterize the graphical Dirichlet process using a novel hypergraph representation as well as the stick-breaking representation, the restaurant-type representation, and the representation as a limit of a finite mixture model. We develop an efficient posterior inference algorithm and illustrate our model with simulations and a real grouped single-cell data set.
Keywords: Bayesian nonparametrics; clustering; directed acyclic graph; family-owned restaurant process; non-exchangeable groups.
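As a rough illustration of the stick-breaking representation mentioned in the abstract, the following Python sketch draws a truncated Dirichlet process and then a child process whose base measure is the parent's discrete measure, so the child reuses the parent's atoms (clusters). This is not the paper's construction. It assumes a single parent, a fixed truncation level, and a standard normal base measure, and it does not attempt the paper's rule for combining several parents in a directed acyclic graph. All function names are hypothetical.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Truncated stick-breaking weights for a DP with concentration alpha."""
    weights = []
    remaining = 1.0  # length of the stick not yet broken off
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # leftover mass goes to the last atom
    return weights


def sample_dp(alpha, base_sampler, truncation, rng):
    """Draw (atoms, weights) approximating G ~ DP(alpha, H)."""
    weights = stick_breaking_weights(alpha, truncation, rng)
    atoms = [base_sampler(rng) for _ in range(truncation)]
    return atoms, weights


def sample_child_dp(alpha, parent_atoms, parent_weights, truncation, rng):
    """Draw a child DP whose base measure is the discrete parent measure.

    Assumption: one parent only. Child atoms are resampled from the
    parent's atoms, which is what lets groups share clusters.
    """
    weights = stick_breaking_weights(alpha, truncation, rng)
    atoms = rng.choices(parent_atoms, weights=parent_weights, k=truncation)
    return atoms, weights


rng = random.Random(0)
parent_atoms, parent_weights = sample_dp(
    alpha=2.0,
    base_sampler=lambda r: r.gauss(0.0, 1.0),  # assumed base measure H
    truncation=50,
    rng=rng,
)
child_atoms, child_weights = sample_child_dp(
    alpha=1.0,
    parent_atoms=parent_atoms,
    parent_weights=parent_weights,
    truncation=50,
    rng=rng,
)
```

Every atom in `child_atoms` is also an atom of the parent, which is the sense in which clusters are shared across dependent groups in this simplified setting.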