Cumulative Distribution Regret Minimization with Max-Quantile Threshold in Multi-Armed Bandit
We study a new risk-averse bandit setting motivated by semiconductor manufacturing, where the quality of a recipe is judged not by its mean performance but by its weakest outcomes. …





