Original Paper: https://arxiv.org/abs/2303.11366
By: Noah Shinn, Federico Cassano, Edward Berman, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao
Abstract:
Large language models (LLMs) have been increasingly used to interact with external environments (e.g., games, compilers, APIs) as goal-driven agents. However, it remains challenging for these language agents to quickly and efficiently learn from trial-and-error as traditional reinforcement learning methods require extensive training samples and expensive model fine-tuning. We propose Reflexion, a novel framework to reinforce language agents not by updating weights, but instead through linguistic feedback. Concretely, Reflexion agents verbally reflect on task feedback signals, then maintain their own reflective text in an episodic memory buffer to induce better decision-making in subsequent trials. Reflexion is flexible enough to incorporate various types (scalar values or free-form language) and sources (external or internally simulated) of feedback signals, and obtains significant improvements over a baseline agent across diverse tasks (sequential decision-making, coding, language reasoning). For example, Reflexion achieves a 91% pass@1 accuracy on the HumanEval coding benchmark, surpassing the previous state-of-the-art GPT-4 that achieves 80%. We also conduct ablation and analysis studies using different feedback signals, feedback incorporation methods, and agent types, and provide insights into how they affect performance.
Summary Notes
Reflexion: Revolutionizing AI Language Learning
The field of artificial intelligence (AI) is constantly advancing, with language models being a key area of growth.
However, traditional reinforcement learning methods come with their own challenges, such as the need for extensive training samples and expensive model fine-tuning.
This is where Reflexion steps in, offering a groundbreaking way to train language agents more efficiently and with fewer resources.
Introducing Verbal Reinforcement Learning
Reflexion marks a significant shift in AI language learning.
It uses verbal reinforcement learning to teach language agents through linguistic feedback, much as humans learn from their experiences.
Instead of updating model weights, agents reflect verbally on their actions and outcomes, making learning both more sample-efficient and less computationally expensive.
How Reflexion Works
Reflexion's operation is based on three main components that form a continuous improvement loop:
- Actor Model (M_a): Generates actions based on current policies and state observations.
- Evaluator Model (M_e): Assesses the actions, providing performance feedback.
- Self-Reflection Model (M_sr): Creates verbal reflections using the Evaluator's feedback, which then inform future actions.
This process of acting, evaluating, and reflecting enables ongoing enhancements in the agent's performance, mirroring an introspective learning journey.
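The act–evaluate–reflect loop above can be sketched in code. This is a minimal toy illustration, not the paper's implementation: in the real framework the Actor, Evaluator, and Self-Reflection roles are each played by an LLM, whereas here they are simple hypothetical Python functions so the control flow and the episodic memory buffer are concrete and runnable.

```python
def actor(task, memory):
    # M_a: propose an action/answer conditioned on the task and on any
    # verbal reflections stored in episodic memory (toy stand-in for an LLM).
    guess = task["initial_guess"]
    for reflection in memory:
        guess = reflection["suggested_fix"]  # apply remembered corrections
    return guess

def evaluator(task, answer):
    # M_e: score the attempt; here a trivial exact-match check standing in
    # for unit tests, environment rewards, or an LLM judge.
    return 1.0 if answer == task["target"] else 0.0

def self_reflect(task, answer, score):
    # M_sr: convert the feedback signal into a verbal reflection that can
    # guide the next trial (in Reflexion this is free-form LLM text).
    return {
        "text": f"Answer {answer!r} scored {score}; try {task['target']!r} next.",
        "suggested_fix": task["target"],
    }

def reflexion_loop(task, max_trials=3):
    memory = []  # episodic memory buffer of reflections across trials
    for trial in range(max_trials):
        answer = actor(task, memory)
        score = evaluator(task, answer)
        if score == 1.0:
            return answer, trial + 1
        memory.append(self_reflect(task, answer, score))
    return answer, max_trials

task = {"initial_guess": "42", "target": "43"}
answer, trials = reflexion_loop(task)
# The first trial fails, a reflection is stored, and the second trial succeeds.
```

Note that no weights are updated anywhere: all improvement between trials flows through the text stored in `memory`, which is the core idea of the framework.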
Testing and Achievements
Reflexion's effectiveness has been proven across various tasks, from decision-making to programming challenges.
In the AlfWorld decision-making environment, it significantly outperformed the baseline agent, with a 22% absolute improvement in task success rate.
Additionally, it reached 91% pass@1 accuracy on the HumanEval coding benchmark, surpassing the previous state-of-the-art GPT-4 result of 80%.
The Significance of Reflexion
Reflexion opens new doors for AI engineers, especially in companies looking to build more capable language agents without the high cost of traditional fine-tuning. With its verbal reinforcement learning approach and the challenging coding benchmark it introduces (LeetcodeHardGym), Reflexion is pushing AI into new territory.
What's Next for Reflexion
The development of Reflexion is ongoing, with future research focused on exploring more intricate feedback mechanisms, integrating with different learning models, and enhancing the episodic memory component. These efforts aim to broaden Reflexion's application scope and efficiency.
Conclusion: The Future of AI Language Learning
Reflexion stands as a pioneering solution in AI language learning, utilizing verbal reinforcement learning for scalable, efficient, and effective language agent training.
As this innovative approach continues to evolve, the possibilities for AI enhancements seem endless.
For those interested in further details or collaboration, all resources related to Reflexion are available on the Reflexion GitHub repository.
This open-source initiative encourages further research and community collaboration, paving the way for future advancements in AI.