Near-Optimal Clustering in Mixture of Markov Chains

May 2, 2026

Junghyun Lee

Yassir Jedra

Alexandre Proutière

Se-Young Yun



Abstract

We study the problem of clustering T trajectories of length H, each generated by one of K unknown ergodic Markov chains over a finite state space of size S. The goal is to accurately group trajectories according to their underlying generative model. We begin by deriving an instance-dependent, high-probability lower bound on the clustering error rate, governed by the weighted KL divergence between the transition kernels of the chains. We then present a novel two-stage clustering algorithm. In Stage I, we apply spectral clustering using a new injective Euclidean embedding for ergodic Markov chains, a contribution of independent interest that enables sharp concentration results. Stage II refines the initial clusters via a single step of likelihood-based reassignment. Our method achieves a near-optimal clustering error with high probability, under the conditions $H = \tilde{\Omega}\left(\gamma_{\mathrm{ps}}^{-1}\left(S^2 \vee \pi_{\min}^{-1}\right)\right)$ and $TH = \tilde{\Omega}\left(\gamma_{\mathrm{ps}}^{-1} S^2\right)$, where $\pi_{\min}$ is the minimum stationary probability of a state across the K chains and $\gamma_{\mathrm{ps}}$ is the minimum pseudo-spectral gap. These requirements significantly improve upon, or are at least comparable to, the state-of-the-art guarantee (Kausik et al., 2023). Moreover, our algorithm offers a key practical advantage: unlike existing approaches, it requires no prior knowledge of model-specific quantities (e.g., separation between kernels or visitation probabilities). We conclude by discussing the inherent gap between our upper and lower bounds, providing insights into the unique structure of this clustering problem.
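To make the two-stage structure described in the abstract concrete, here is a minimal Python sketch under strong simplifying assumptions. It is not the paper's algorithm: Stage I below embeds each trajectory by its flattened empirical transition frequencies (a stand-in for the paper's injective embedding, which the abstract does not specify) and runs a rank-K SVD projection followed by a basic k-means; Stage II pools counts per cluster, estimates smoothed transition kernels, and reassigns each trajectory once by maximum log-likelihood. All function names and the smoothing parameter `alpha` are hypothetical choices for illustration.

```python
import numpy as np

def sample_chain(P, H, rng):
    """Sample a length-H trajectory from transition matrix P (uniform start)."""
    S = P.shape[0]
    x = np.empty(H, dtype=int)
    x[0] = rng.integers(S)
    for h in range(1, H):
        x[h] = rng.choice(S, p=P[x[h - 1]])
    return x

def transition_counts(traj, S):
    """S x S matrix of observed transition counts for one trajectory."""
    C = np.zeros((S, S))
    np.add.at(C, (traj[:-1], traj[1:]), 1.0)
    return C

def kmeans(X, K, n_iter=50):
    """Plain Lloyd iterations with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, K):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels

def two_stage_cluster(trajs, S, K, alpha=1.0):
    counts = np.stack([transition_counts(t, S) for t in trajs])  # (T, S, S)

    # Stage I (stand-in embedding, NOT the paper's): empirical transition
    # frequencies, projected onto the top-K singular directions, then k-means.
    X = counts.reshape(len(trajs), -1)
    X = X / X.sum(axis=1, keepdims=True)
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    labels = kmeans(U[:, :K] * s[:K], K)

    # Stage II: one step of likelihood-based reassignment using kernels
    # estimated from pooled counts (additive smoothing alpha is an assumption).
    P_hat = np.stack([counts[labels == k].sum(axis=0) + alpha for k in range(K)])
    P_hat /= P_hat.sum(axis=2, keepdims=True)
    loglik = np.einsum("tij,kij->tk", counts, np.log(P_hat))
    return loglik.argmax(axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P0 = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]])
    P1 = np.array([[0.1, 0.45, 0.45], [0.45, 0.1, 0.45], [0.45, 0.45, 0.1]])
    truth = np.array([0] * 20 + [1] * 20)
    trajs = [sample_chain(P0 if z == 0 else P1, 200, rng) for z in truth]
    print(two_stage_cluster(trajs, S=3, K=2))
```

The toy chains in the demo are deliberately well separated; the sketch says nothing about the regimes of H and T in which the paper's guarantees hold.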

Type

Conference paper

Publication

29th International Conference on Artificial Intelligence and Statistics

Last updated on May 2, 2026

Markov Chain, Clustering, Information Theory, Statistics, Probability Theory

Authors

Junghyun Lee

PhD Candidate in AI

PhD candidate at KAIST AI, jointly advised by Se-Young Yun and Chulhee Yun. I work on interactive machine learning, theoretical aspects of LLMs, learning/optimization theory, and statistical analysis of large networks.

