Original Paper: https://arxiv.org/abs/2408.04560
By: Liat Ein-Dor, Orith Toledo-Ronen, Artem Spector, Shai Gretz, Lena Dankin, Alon Halfon, Yoav Katz, Noam Slonim
Abstract:
Prompts are how humans communicate with LLMs. Informative prompts are essential for guiding LLMs to produce the desired output. However, prompt engineering is often tedious and time-consuming, requiring significant expertise, limiting its widespread use. We propose Conversational Prompt Engineering (CPE), a user-friendly tool that helps users create personalized prompts for their specific tasks. CPE uses a chat model to briefly interact with users, helping them articulate their output preferences and integrating these into the prompt. The process includes two main stages: first, the model uses user-provided unlabeled data to generate data-driven questions and utilize user responses to shape the initial instruction. Then, the model shares the outputs generated by the instruction and uses user feedback to further refine the instruction and the outputs. The final result is a few-shot prompt, where the outputs approved by the user serve as few-shot examples. A user study on summarization tasks demonstrates the value of CPE in creating personalized, high-performing prompts. The results suggest that the zero-shot prompt obtained is comparable to its much longer few-shot counterpart, indicating significant savings in scenarios involving repetitive tasks with large text volumes.
Summary Notes
Figure: CPE Workflow from the user’s perspective. Each step can be a multi-turn conversation between the user and CPE.
Introduction
Large Language Models (LLMs) have opened up a wide range of opportunities, from automating customer support to generating comprehensive text summaries. Harnessing this power, however, requires crafting effective prompts, a process known as prompt engineering (PE). Traditional PE is labor-intensive, demanding significant expertise and time. Conversational Prompt Engineering (CPE) is an innovative, user-friendly tool designed to streamline and simplify this process.
Simplifying Prompt Engineering with CPE
CPE leverages a conversational model to interact with users, helping them articulate their specific output preferences and integrating these into the prompt. This two-stage process involves generating data-driven questions and using user responses to shape the initial instruction, followed by refining the instruction and outputs based on user feedback. The final result is a few-shot prompt that incorporates user-approved outputs as examples.
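The end product of this process, the few-shot prompt, pairs the refined instruction with the user-approved input/output examples. A minimal sketch of that assembly step (the exact prompt template below is an assumption for illustration, not taken from the paper):

```python
def build_few_shot_prompt(instruction, approved_examples, new_input):
    """Assemble a few-shot prompt: the refined instruction, then each
    user-approved (text, output) pair as a demonstration, then the new
    input awaiting an output."""
    parts = [instruction]
    for text, output in approved_examples:
        parts.append(f"Text:\n{text}\nSummary:\n{output}")
    parts.append(f"Text:\n{new_input}\nSummary:")
    return "\n\n".join(parts)

prompt = build_few_shot_prompt(
    "Summarize the text in one concise sentence.",
    [("Q3 revenue rose 12% on strong cloud sales.",
      "Revenue grew 12% in Q3, driven by cloud sales.")],
    "The board approved the proposed merger on Tuesday.",
)
```

Because the approved outputs are embedded verbatim, the prompt grows with every example; this is why the paper's finding that the zero-shot variant (instruction only) performs comparably implies real token savings at scale.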
Key Methodologies
CPE operates through a structured chat-based interaction among three actors: the user, the system, and the LLM. The system orchestrates the interaction, guiding the LLM to perform core tasks such as analyzing user data, refining instructions, and enhancing outputs. The process can be broken down into several key stages:
- Initialization: Users select their target model and upload unlabeled data. The system then prepares to analyze this data.
- Initial Discussion and Instruction Creation: CPE engages with the user to discuss various aspects of their task and output preferences, using this interaction to generate an initial instruction.
- Instruction Refinement: Based on user feedback, the instruction is revised to better align with their needs.
- Output Generation: The refined instruction is used to generate outputs, which are then evaluated by the user.
- User Feedback and Output Enhancement: Users provide feedback on the outputs, prompting further refinement until the outputs meet their expectations.
- Convergence on CPE FS Prompt: Once the instruction and outputs are approved, the final few-shot prompt is created and shared with the user.
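The stages above can be sketched as a simple control loop. The `chat` helper below is a stub standing in for the target chat model (so the sketch runs without an API); the message contents, feedback handling, and convergence criterion are all illustrative assumptions, not the paper's actual implementation:

```python
def chat(messages):
    # Stub for the target chat model: a real implementation would call
    # the model the user selected at initialization.
    last = messages[-1]["content"]
    if "Ask questions" in last:
        return "How long should each summary be?"
    return "Summarize the text in one concise sentence."

def run_cpe(unlabeled_texts, get_user_reply, approve):
    """Run the CPE loop: discuss preferences, draft an instruction,
    then generate outputs and refine until the user approves each one."""
    history = [{"role": "system", "content": "You are a prompt engineer."}]

    # Stages 1-2: data-driven questions shape the initial instruction.
    history.append({"role": "user",
                    "content": "Here is sample data. Ask questions about output preferences."})
    question = chat(history)
    history.append({"role": "assistant", "content": question})
    history.append({"role": "user", "content": get_user_reply(question)})
    instruction = chat(history)

    # Stages 3-5: generate outputs, collect feedback, refine instruction.
    approved = []
    for text in unlabeled_texts:
        output = chat(history + [{"role": "user",
                                  "content": f"{instruction}\n\n{text}"}])
        while not approve(text, output):
            history.append({"role": "user",
                            "content": "Feedback: please revise this output."})
            instruction = chat(history)
            output = chat(history + [{"role": "user",
                                      "content": f"{instruction}\n\n{text}"}])
        approved.append((text, output))

    # Stage 6: the instruction plus approved pairs form the few-shot prompt.
    return instruction, approved
```

The key design point the sketch captures is that the same conversation history drives both instruction refinement and output generation, so user feedback on concrete outputs flows back into the instruction itself.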
Main Findings and Results
A user study focusing on summarization tasks demonstrated CPE's effectiveness. Participants engaged in open-ended conversations with CPE to develop prompts tailored to their needs, and the resulting prompts captured user preferences well. Notably, users preferred the summaries generated by the zero-shot CPE prompt in 53% of cases and those from the few-shot prompt in 47%, a near-even split supporting the claim that the much shorter zero-shot prompt performs comparably to its few-shot counterpart.
Implications and Potential Applications
CPE's ability to generate high-quality prompts without requiring labeled data or initial prompts is a game-changer. This capability makes it particularly valuable for tasks that involve repetitive processing of large volumes of text, such as summarizing email threads or generating personalized content for advertising. By reducing the cognitive burden on users and simplifying the prompt creation process, CPE enables more widespread and effective use of LLMs in various enterprise scenarios.
Conclusion
Conversational Prompt Engineering represents a significant advancement in the field of prompt engineering. By leveraging conversational models to interact with users, CPE simplifies and streamlines the prompt creation process, making it more accessible and efficient. The positive results from the user study underscore its potential to revolutionize how engineers interact with and harness the power of LLMs. As we look to the future, the integration of CPE into broader applications, including agentic workflows, holds exciting possibilities for further enhancing productivity and innovation.