Cumulative Distribution Regret Minimization with Max-Quantile Threshold in Multi-Armed Bandit

Jaeyoung Cha, Junghyun Lee, Chuhee Yun

May 2026

Abstract

We study a new risk-averse bandit setting motivated by semiconductor manufacturing, where the quality of a recipe is judged not by its mean performance but by its weakest outcomes. We formalize this via cumulative distribution regret with a max-quantile threshold, which measures the cumulative excess defective ratio relative to the arm attaining the best τ-quantile. We develop two UCB-type algorithms, C-UCB and Q-UCB, whose regret bounds depend on distinct problem-dependent gaps arising from CDF and quantile separations.
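As a rough illustration of the regret notion described above, the sketch below computes a cumulative excess defective ratio for two Gaussian arms. It rests on assumptions not stated in the abstract: the threshold is taken to be the largest τ-quantile across arms, an arm's defective ratio is taken to be its CDF evaluated at that threshold, and regret is measured against the arm with the smallest defective ratio. The function name cd_regret, the Gaussian arms, and their parameters are hypothetical, and the paper's formal definition may differ.

```python
from statistics import NormalDist


def cd_regret(arms, pulls, tau):
    """Cumulative excess defective ratio (an assumed reading of the abstract)."""
    # Assumed threshold: the largest tau-quantile among all arms.
    threshold = max(arm.inv_cdf(tau) for arm in arms)
    # Assumed defective ratio: probability mass at or below the threshold.
    defect = [arm.cdf(threshold) for arm in arms]
    best = min(defect)
    # Sum the per-round excess defective ratio over the pulled arm indices.
    return sum(defect[k] - best for k in pulls)


# Hypothetical arms: arm 0 has the higher mean, arm 1 the better lower tail.
arms = [NormalDist(mu=1.0, sigma=1.0), NormalDist(mu=0.8, sigma=0.2)]

print(cd_regret(arms, [1, 1, 1], tau=0.1))  # always pulling arm 1
print(cd_regret(arms, [0, 0, 0], tau=0.1))  # always pulling arm 0
```

In this toy instance the arm with the higher mean incurs positive regret in every round, because more of its mass falls below the threshold set by the other arm's 0.1-quantile. This is the sense in which the criterion judges an arm by its weakest outcomes rather than by its mean.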

Type

Domestic Conference/Journal

Publication

In Korea Computer Congress

Bandits, Statistics

Junghyun Lee

PhD Student