Cumulative Distribution Regret Minimization with Max-Quantile Threshold in Multi-Armed Bandit
Jaeyoung Cha, Junghyun Lee, Chulhee Yun
May 2026
Abstract
We study a new risk-averse bandit setting motivated by semiconductor manufacturing, where the quality of a recipe is judged not by its mean performance but by its weakest outcomes. We formalize this via cumulative distribution regret with a max-quantile threshold, which measures the cumulative excess defective ratio relative to the arm attaining the best τ-quantile. We develop two UCB-type algorithms, C-UCB and Q-UCB, whose regret bounds depend on distinct problem-dependent gaps arising from CDF and quantile separations.
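As a rough illustration of the regret notion described in the abstract, the sketch below computes a cumulative excess defective ratio for a fixed sequence of pulls. It rests on assumptions that the abstract does not confirm: the threshold is taken to be the largest τ-quantile across arms, the "defective ratio" of an arm is taken to be its CDF evaluated at that threshold, and the arms are Gaussian purely for convenience. The function name `distribution_regret` and all parameter values are hypothetical, and this is not the paper's C-UCB or Q-UCB algorithm.

```python
from statistics import NormalDist


def distribution_regret(arms, tau, pulls):
    """Cumulative excess defective ratio under an ASSUMED formalization.

    arms  : list of NormalDist objects (Gaussian arms are an assumption)
    tau   : quantile level in (0, 1)
    pulls : sequence of arm indices chosen over time
    """
    # tau-quantile of every arm
    quantiles = [arm.inv_cdf(tau) for arm in arms]

    # Assumed threshold: the largest tau-quantile among all arms
    best = max(range(len(arms)), key=lambda k: quantiles[k])
    threshold = quantiles[best]

    # Assumed defective ratio of arm k: P(outcome <= threshold) = F_k(threshold).
    # The per-pull gap is the excess over the best-quantile arm.
    baseline = arms[best].cdf(threshold)
    gaps = [arm.cdf(threshold) - baseline for arm in arms]

    # Cumulative regret is the sum of per-pull gaps
    return threshold, gaps, sum(gaps[k] for k in pulls)


# Hypothetical arms: the third has the highest mean but the heaviest lower tail
arms = [NormalDist(0.0, 1.0), NormalDist(0.5, 1.0), NormalDist(1.0, 3.0)]
threshold, gaps, regret = distribution_regret(arms, tau=0.1, pulls=[0, 1, 1, 2, 1])
print(threshold, gaps, regret)
```

In this toy instance the arm with the highest mean incurs the largest per-pull gap, because its wide spread gives it the lowest τ-quantile. That matches the abstract's point that a recipe is judged by its weakest outcomes rather than by its mean.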
Publication
In Korea Computer Congress

PhD Student
PhD student at GSAI, KAIST, jointly advised by Profs. Se-Young Yun and Chulhee Yun. Research focuses on interactive machine learning, “theoretical perspectives” of LLMs, optimization theory, and statistical analyses of large networks with an emphasis on community detection. Broadly interested in mathematical and theoretical AI, as well as related problems in mathematics and statistics.