Event Weekly OptiML Lab Group Meeting
Short summary In this talk, I will first (somewhat rigorously) introduce the framework of reinforcement learning with human feedback (RLHF). Then I will go over three recent breakthroughs in the analysis and improvement of RLHF.