---
title: 'Exploring Beyond Curiosity Rewards: Language-Driven Exploration in RL'
booktitle: Proceedings of the 16th Asian Conference on Machine Learning
year: 2025
volume: 260
series: Proceedings of Machine Learning Research
month: 0
publisher: PMLR
pdf:
url:
openreview: qHv7qTETsw
abstract: 'Sparse rewards pose a significant challenge for many reinforcement learning algorithms, which struggle in the absence of a dense, well-shaped reward function. Drawing inspiration from the curiosity exhibited in animals, intrinsically driven methods overcome this drawback by incentivizing agents to explore novel states. Yet, in the absence of domain-specific priors, sample efficiency is hindered, as most discovered novelty has little relevance to the true task reward. We present iLLM, a curiosity-driven approach that leverages the inductive bias of foundation models (Large Language Models) as a source of information about plausibly useful behaviors. Two tasks are introduced for shaping exploration: 1) action generation and 2) history compression, where the language model is prompted with a description of the state-action trajectory. We further propose a technique for mapping state-action pairs to pretrained token embeddings of the language model in order to alleviate the need for explicit textual descriptions of the environment. By distilling prior knowledge from large language models, iLLM encourages agents to discover diverse and human-meaningful behaviors without requiring direct human intervention. We evaluate the proposed method on BabyAI-Text, MiniHack, Atari games, and Crafter tasks, demonstrating higher sample efficiency compared to prior curiosity-driven approaches.'
layout: inproceedings
issn: 2640-3498
id: bougie25a
tex_title: '{Exploring Beyond Curiosity Rewards}: {L}anguage-Driven Exploration in {RL}'
firstpage: 127
lastpage: 142
page: 127-142
order: 127
cycles: false
bibtex_editor: Nguyen, Vu and Lin, Hsuan-Tien
editor:
- given: Vu
  family: Nguyen
- given: Hsuan-Tien
  family: Lin
bibtex_author: Bougie, Nicolas and Watanabe, Narimasa
author:
- given: Nicolas
  family: Bougie
- given: Narimasa
  family: Watanabe
date: 2025-01-14
address:
container-title: Proceedings of the 16th Asian Conference on Machine Learning
genre: inproceedings
issued:
  date-parts:
  - 2025
  - 1
  - 14
extras: []
---