FlickerFusion: Intra-trajectory Domain Generalizing Multi-Agent RL

Oct 11, 2024

Woosung Koh

Wonbeen Oh

Siyeol Kim

Suhin Shin

Hyeongjin Kim

Jaein Jang

Junghyun Lee

Se-Young Yun

Abstract

Multi-agent reinforcement learning (MARL) has demonstrated significant potential in addressing complex cooperative tasks across various real-world applications. However, existing MARL approaches often rely on the restrictive assumption that the number of entities (e.g., agents, obstacles) remains constant between training and inference. This overlooks scenarios where entities are dynamically removed or added during the inference trajectory, a common occurrence in real-world environments such as search and rescue missions and dynamic combat situations. In this paper, we tackle the challenge of intra-trajectory dynamic entity composition under zero-shot out-of-domain (OOD) generalization, where such dynamic changes cannot be anticipated beforehand. Our empirical studies reveal that existing MARL methods suffer significant performance degradation and increased uncertainty in these scenarios. In response, we propose FlickerFusion, a novel OOD generalization method that acts as a universally applicable augmentation technique for MARL backbone methods. Our results show that FlickerFusion not only achieves superior inference rewards but also, unlike existing methods, reduces uncertainty relative to the backbone. For standardized evaluation, we introduce MPEv2, an enhanced version of Multi Particle Environments (MPE), consisting of 12 benchmarks. Benchmarks, implementations, and trained models are organized and open-sourced at flickerfusion305.github.io, accompanied by ample demo video renderings.

Type

Conference paper

Publication

13th International Conference on Learning Representations & NeurIPS 2024 - Workshop on Open-World Agents (OWA-2024)

Last updated on Oct 11, 2024

MARL

Authors

Junghyun Lee

PhD Candidate in AI

PhD candidate at KAIST AI, jointly advised by Se-Young Yun and Chulhee Yun. I work on interactive machine learning, theoretical aspects of LLMs, learning/optimization theory, and statistical analysis of large networks.
