MDP

Improved Regret Bounds of (Multinomial) Logistic Bandits via Regret-to-Confidence-Set Conversion

Event: Weekly OptiML Lab Group Meeting. Short summary: In this seminar, I will talk about my own paper “Improved Regret Bounds of (Multinomial) Logistic Bandits via Regret-to-Confidence-Set Conversion” (Lee et al.

Jun 18, 2024

Introduction to Reinforcement Learning with Human Feedback (RLHF): A Theoretically Biased Overview

Event: Weekly OptiML Lab Group Meeting. Short summary: In this talk, I will first (somewhat rigorously) introduce the framework of reinforcement learning with human feedback (RLHF). Then I will go over three recent breakthroughs in the analysis and improvement of RLHF.

Nov 30, 2023

Introduction to Reinforcement Learning with Human Feedback (RLHF): A Theoretically Biased Overview

Event: Weekly OSI Lab Seminar. Short summary: In this talk, I will first (somewhat rigorously) introduce the framework of reinforcement learning with human feedback (RLHF). Then I will go over three recent breakthroughs in the analysis and improvement of RLHF.

Nov 30, 2023

Improved Sample Complexity for Reward-free Reinforcement Learning under Low-rank MDPs

Event: Weekly OSI Lab Seminar. Short summary: In this seminar, I will talk about the paper “Improved Sample Complexity for Reward-free Reinforcement Learning under Low-rank MDPs” (Cheng et al., ICLR 2023).

Mar 31, 2023

Nearly Optimal Latent State Decoding in Block MDPs

Event: Weekly OptiML Lab Group Meeting. Short summary: In this seminar, I will talk about my own paper “Nearly Optimal Latent State Decoding in Block MDPs” (Jedra et al., arXiv 2022).

Oct 7, 2022

Clustering in Block Markov Chains

Event: Weekly OSI Lab Seminar. Short summary: In this seminar, I will talk about the paper “Clustering in Block Markov Chains” (Sanders et al., Ann. Stat. 2020).

Nov 26, 2021

Empirical Analyses of Corruption in the Clustering of Block MDPs

We show that a simple trick of randomly corrupting the trajectories in Block MDPs allows us to use the clustering algorithm of Jedra et al. (2023) for general classes of MDPs.

Junghyun Lee, Se-Young Yun

Nearly Optimal Latent State Decoding in Block MDPs

The first theoretical analysis of model estimation and reward-free RL in block MDPs that does not resort to function approximation frameworks. We provide lower bounds and algorithms with near-optimal upper bounds.

Yassir Jedra, Junghyun Lee, Alexandre Proutière, Se-Young Yun

Preliminary Empirical Analyses of Clustering in Block MDPs

We empirically validate the clustering algorithm proposed by Jedra et al. (2022).

Junghyun Lee, Se-Young Yun