Paper-Conference

Near-Optimal Clustering in Mixture of Markov Chains

We study the problem of clustering T trajectories of length H, each generated by one of K unknown ergodic Markov chains over a finite state space of size S. The goal is to …

Junghyun Lee

GL-LowPopArt: A Nearly Instance-Wise Minimax-Optimal Estimator for Generalized Low-Rank Trace Regression

We present GL-LowPopArt, a novel Catoni-style estimator for generalized low-rank trace regression. Building on LowPopArt (Jang et al., 2024), it employs a two-stage approach -- …

Junghyun Lee

TESSAR: Geometry-Aware Active Regression via Dynamic Voronoi Tessellation

Active learning improves training efficiency by selectively querying the most informative samples for labeling. While it naturally fits classification tasks, where informative …

seong-jin-cho

AdaSTaR: Adaptive Data Sampling for Training Self-Taught Reasoners

Self-Taught Reasoners (STaR), synonymously known as Rejection sampling Fine-Tuning (RFT), is an integral part of the training pipeline of self-improving reasoning Language Models …

woosung-koh

Probability-Flow ODE in Infinite-Dimensional Function Spaces

Recent advances in infinite-dimensional diffusion models have demonstrated their effectiveness and scalability in function generation tasks where the underlying structure is …

kunwoo-na

FlickerFusion: Intra-trajectory Domain Generalizing Multi-Agent RL

Multi-agent reinforcement learning has demonstrated significant potential in addressing complex cooperative tasks across various real-world applications. However, existing MARL …

woosung-koh

A Unified Confidence Sequence for Generalized Linear Models, with Applications to Bandits

We present a unified likelihood ratio-based confidence sequence (CS) for any (self-concordant) generalized linear model (GLM) that is guaranteed to be convex and numerically tight. …

Junghyun Lee

Gradient Descent with Polyak's Momentum Finds Flatter Minima via Large Catapults

Although gradient descent with Polyak's momentum is widely used in modern machine and deep learning, a concrete understanding of its effects on the training trajectory remains …

prin-phunyaphibarn

Querying Easily Flip-flopped Samples for Deep Active Learning

Proposes a new active learning approach built on a new uncertainty measure called the least disagree metric, as well as its efficient estimator, which is proven to be …

seong-jin-cho

Improved Regret Bounds of (Multinomial) Logistic Bandits via Regret-to-Confidence-Set Conversion

The logistic bandit is a ubiquitous framework for modeling users' choices, e.g., click vs. no click in an advertisement recommender system. We observe that prior works overlook or …

Junghyun Lee