Online Learning

Looking Through the Mirror: Minimax-Optimal Regularized Regrets in Online Learning and Bandits

We revisit regularized regret minimization under full-information and bandit feedback, where a learner optimizes an objective of the form $\langle r, \pi \rangle - \eta^{-1} …

Junghyun Lee

• May 21, 2026 • 1 min read

Bandits

A Jointly Efficient and Optimal Algorithm for Heteroskedastic Generalized Linear Bandits with Adversarial Corruptions

We consider the problem of heteroskedastic generalized linear bandits (GLBs) with adversarial corruptions, which subsumes various stochastic contextual bandit settings, including …

sanghwa-kim

• Feb 12, 2026 • 1 min read

Bandits

Improved Regret Bounds of (Multinomial) Logistic Bandits via Regret-to-Confidence-Set Conversion

The logistic bandit is a ubiquitous framework for modeling users' choices, e.g., click vs. no click in an advertisement recommender system. We observe that prior works overlook or …

Junghyun Lee

• Jan 20, 2024 • 1 min read
