Junghyun Lee

PhD Candidate in AI

Kim Jaechul Graduate School of AI, KAIST

PhD candidate at KAIST AI, jointly advised by Se-Young Yun and Chulhee Yun. I work on interactive machine learning, theoretical aspects of LLMs, learning/optimization theory, and statistical analysis of large networks.

LLMs

AdaSTaR: Adaptive Data Sampling for Training Self-Taught Reasoners

Self-Taught Reasoners (STaR), synonymously known as Rejection sampling Fine-Tuning (RFT), is an integral part of the training pipeline of self-improving reasoning Language Models …

woosung-koh

• May 22, 2025 • 1 min read

Diffusion Model

Probability-Flow ODE in Infinite-Dimensional Function Spaces

Recent advances in infinite-dimensional diffusion models have demonstrated their effectiveness and scalability in function generation tasks where the underlying structure is …

kunwoo-na

• Mar 7, 2025 • 1 min read

MARL

FlickerFusion: Intra-trajectory Domain Generalizing Multi-Agent RL

Multi-agent reinforcement learning has demonstrated significant potential in addressing complex cooperative tasks across various real-world applications. However, existing MARL …

woosung-koh

• Oct 11, 2024 • 1 min read

On the Estimation of Linear Softmax Parametrized Markov Chains

In reinforcement learning and deep learning, softmax parameterization is commonly used to represent discrete probability distributions. In this work, we study three possible softmax …

kunwoo-na

• Jun 26, 2024 • 1 min read

Bandits

A Unified Confidence Sequence for Generalized Linear Models, with Applications to Bandits

We present a unified likelihood ratio-based confidence sequence (CS) for any (self-concordant) generalized linear model (GLM) that is guaranteed to be convex and numerically tight. …

Junghyun Lee

• Jun 19, 2024 • 1 min read

Deep Learning Theory

Gradient Descent with Polyak's Momentum Finds Flatter Minima via Large Catapults

Although gradient descent with Polyak's momentum is widely used in modern machine and deep learning, a concrete understanding of its effects on the training trajectory remains …

prin-phunyaphibarn

• Jun 17, 2024 • 1 min read

Active Learning

Querying Easily Flip-flopped Samples for Deep Active Learning

Proposes a new active learning approach built on a new uncertainty measure, the least disagree metric, together with its efficient estimator, which is proven to be …

seong-jin-cho

• May 7, 2024 • 1 min read

Bandits

Improved Regret Bounds of (Multinomial) Logistic Bandits via Regret-to-Confidence-Set Conversion

The logistic bandit is a ubiquitous framework for modeling users' choices, e.g., click vs. no click in an advertisement recommender system. We observe that the prior works overlook or …

Junghyun Lee

• Jan 20, 2024 • 1 min read

Empirical Analyses of Corruption in the Clustering of Block MDPs

We show that a simple trick of randomly corrupting the trajectories in Block MDPs allows us to use the clustering algorithm proposed by Jedra et al. (2023) for general …

Junghyun Lee

• Jan 19, 2024 • 1 min read

On the Estimation of Linear Softmax Parametrized Probability Distributions

Linear softmax parametrization (LSP) of a discrete probability distribution is ubiquitous in many areas, such as deep learning, RL, NLP, and social choice models. Instead of trying …

murad-aghazada

• Dec 20, 2023 • 1 min read

