
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents

Athina AI

01 Jan 2024 — 4 min read

Original Paper: https://arxiv.org/abs/2401.00812

By: Ke Yang, Jiateng Liu, John Wu, Chaoqi Yang, Yi R. Fung, Sha Li, Zixuan Huang, Xu Cao, Xingyao Wang, Yiquan Wang, Heng Ji, Chengxiang Zhai

Abstract:

The prominent large language models (LLMs) of today differ from past language models not only in size, but also in the fact that they are trained on a combination of natural language and formal language (code).

As a medium between humans and computers, code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.

In this survey, we present an overview of the various benefits of integrating code into LLMs' training data.

Specifically, beyond enhancing LLMs in code generation, we observe that these unique properties of code help:

(i) unlock the reasoning ability of LLMs, enabling their applications to a range of more complex natural language tasks;

(ii) steer LLMs to produce structured and precise intermediate steps, which can then be connected to external execution ends through function calls; and

(iii) take advantage of code compilation and execution environment, which also provides diverse feedback for model improvement.

In addition, we trace how these profound capabilities of LLMs, brought by code, have led to their emergence as intelligent agents (IAs) in situations where the ability to understand instructions, decompose goals, plan and execute actions, and refine from feedback are crucial to their success on downstream tasks.

Finally, we present several key challenges and future directions of empowering LLMs with code.

Summary Notes

Figure: An illustration of how code empowers large language models (LLMs) and enhances their downstream applications as intelligent agents (IAs). While traditional LLMs excel in conventional natural language tasks like document classification and question answering, further pre-training or fine-tuning LLMs with human-interpretable and machine-executable code serves as an additional power-up — akin to equipping wizards with mana-boosting wands. This significantly boosts their performance as IAs through intricately woven operational steps.

Introduction

In the rapidly advancing world of artificial intelligence, Large Language Models (LLMs) are evolving from mere text generators to sophisticated intelligent agents (IAs) capable of complex reasoning and decision-making.

This transformation is largely driven by the integration of code into their training data. Code, with its structured syntax, logical consistency, and executability, equips LLMs with enhanced capabilities, enabling them to tackle tasks that were once considered beyond their reach.

This blog post summarizes the survey's account of how code empowers LLMs: the training methods involved, the main findings, their implications, and future directions.

Key Methodologies

Code enters LLM training at two stages: pre-training and fine-tuning. LLMs like GPT-3.5 and GPT-4 have been pre-trained on massive datasets comprising both natural language and programming languages.

This approach leverages the inherent qualities of code—explicitness, logical structure, and machine executability—to enhance the models' reasoning and problem-solving abilities.

Pre-training with Code: LLMs are exposed to diverse programming languages, enabling them to understand and generate code. Training optimizes the language modeling loss: the model learns to predict the next token of a code sequence, just as it does for natural-language text.
Fine-tuning with Specific Tasks: For specialized applications, LLMs undergo fine-tuning using smaller datasets that focus on particular programming paradigms or formal languages, enhancing their proficiency in executing targeted tasks.
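
To make the language modeling loss concrete, here is a minimal sketch of how it could be computed for a short code sequence. The function name `next_token_loss` and the probability values are assumptions made up for illustration; they do not come from the survey or from any real model.

```python
import math

def next_token_loss(token_probs):
    """Average negative log-likelihood of the tokens that actually came next.

    token_probs holds the probability the model assigned to each correct
    next token in a training sequence (hypothetical values here).
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

# Made-up probabilities for the four tokens of `return a + b`.
# A real model would produce these from its softmax output.
code_token_probs = [0.5, 0.25, 0.5, 0.25]

loss = next_token_loss(code_token_probs)
print(round(loss, 4))  # 1.0397
```

The same loss applies whether the tokens come from prose or from source code, which is why code can simply be mixed into the pre-training corpus. The loss falls toward zero as the model assigns higher probability to the correct next tokens.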

Main Findings and Results

Enhanced Programming Skills: LLMs trained on code program more proficiently and can generate complex code snippets in multiple languages. This extends their utility to applications such as database management, embedded control, and software development.
Improved Reasoning Abilities: The structured nature of code enhances LLMs' reasoning skills, particularly in tasks requiring chain-of-thought (CoT) processing. LLMs demonstrate improved accuracy in solving mathematical problems and generating logical sequences.
Structured Knowledge Capture: Code-based training enables LLMs to better capture and reason about structured knowledge, outperforming traditional models in tasks involving graphs, tables, and charts.
Seamless Integration with Tools: The code-centric paradigm allows LLMs to dynamically generate tokens that invoke external tools and APIs, enhancing their flexibility and scalability in interacting with diverse functional ends.
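
The tool-integration point can be illustrated with a small sketch. It assumes the model emits a function call as plain text and that a host program parses it and dispatches it to a registered tool. The tool names (`get_weather`, `add`), the `TOOLS` registry, and `run_tool_call` are all hypothetical, and real systems often use structured formats such as JSON instead.

```python
import ast

def get_weather(city):
    # Stand-in for a real API call; returns canned data.
    return f"Sunny in {city}"

def add(a, b):
    return a + b

# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"get_weather": get_weather, "add": add}

def run_tool_call(call_text):
    """Parse one call expression emitted by a model and run the matching tool."""
    node = ast.parse(call_text, mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError("expected a simple function call")
    name = node.func.id
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    if any(kw.arg is None for kw in node.keywords):
        raise ValueError("**kwargs unpacking is not supported")
    # literal_eval accepts only literals, so arguments cannot run arbitrary code.
    args = [ast.literal_eval(arg) for arg in node.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return TOOLS[name](*args, **kwargs)

print(run_tool_call('get_weather(city="Paris")'))
print(run_tool_call("add(2, 3)"))
```

Because the model's output follows a formal syntax, the host can validate it before anything runs. Here, only registered tools with literal arguments are accepted.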

Implications and Applications

These findings pave the way for LLMs to function as intelligent agents in real-world scenarios:

Decision-Making: LLMs can now perceive and process structured environmental data, making informed decisions and planning complex tasks with precision.
Execution and Action Grounding: By generating formalized function calls, LLMs can execute actions in dynamic environments, such as robotics and autonomous systems, without additional grounding modules.
Self-Improvement through Feedback: Embedded in a code execution environment, LLMs receive automated feedback, enabling them to self-correct and enhance their performance over time.
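
The feedback loop described above can be sketched as follows. This is an illustration under stated assumptions, not the survey's method: `fake_model` stands in for an LLM by returning canned attempts, and `run_candidate` and `refine` are hypothetical helper names. In practice, generated code should run in a sandbox.

```python
def run_candidate(source, test_input, expected):
    """Execute candidate code and return (passed, feedback)."""
    namespace = {}
    try:
        exec(source, namespace)  # untrusted code needs sandboxing in practice
        result = namespace["solve"](test_input)
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    if result != expected:
        return False, f"expected {expected!r}, got {result!r}"
    return True, "ok"

def fake_model(feedback_history):
    """Stand-in for an LLM: returns the next canned attempt at a mean function."""
    attempts = [
        "def solve(xs):\n    return sum(xs) / len(x)\n",    # typo: undefined name
        "def solve(xs):\n    return sum(xs) // len(xs)\n",  # wrong operator
        "def solve(xs):\n    return sum(xs) / len(xs)\n",   # correct
    ]
    return attempts[min(len(feedback_history), len(attempts) - 1)]

def refine(max_rounds=5):
    """Generate, execute, and retry until the test passes or rounds run out."""
    history = []
    for _ in range(max_rounds):
        source = fake_model(history)
        passed, feedback = run_candidate(source, [1, 2, 4], 7 / 3)
        history.append(feedback)
        if passed:
            break
    return history

for round_number, feedback in enumerate(refine(), start=1):
    print(round_number, feedback)
```

The execution environment produces two kinds of signal here: a runtime error on the first attempt and a failed check on the second. A real agent would pass that feedback text back into the model's prompt for the next attempt.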

Conclusion

The integration of code into LLM training is not just a technical enhancement; it's a paradigm shift in how these models interact with the world.

As LLMs continue to evolve into intelligent agents, their ability to reason, plan, and act autonomously will redefine the boundaries of artificial intelligence.

While challenges remain, particularly in seamlessly integrating LLMs with diverse tools and environments, the journey of LLMs empowered by code is just beginning.

Looking ahead, these intelligent agents have potential applications in fields such as autonomous driving, smart manufacturing, and scientific research.
