Exploring Advanced Large Language Models with LLMsuite
  Tags: Large Language Models | Publish Date: July 1, 2024 | Last Edited Time: Sep 13, 2024 3:46 PM
  Original Paper: https://arxiv.org/pdf/2407.12036

Distilling System 2 into System 1
  Tags: LLM Performance, Fine Tuning | Publish Date: July 8, 2024 | Last Edited Time: Oct 2, 2024 1:56 PM
  Original Paper: https://arxiv.org/pdf/2407.06023v1

A Survey on Employing Large Language Models for Text-to-SQL Tasks
  Tags: Large Language Models | Publish Date: August 11, 2024 | Last Edited Time: Sep 13, 2024 3:46 PM
  Original Paper: https://arxiv.org/pdf/2407.15186

ThinK: Thinner Key Cache by Query-Driven Pruning
  Publish Date: July 30, 2024 | Last Edited Time: Sep 13, 2024 3:46 PM
  Original Paper: https://arxiv.org/pdf/2407.21018

ShieldGemma: Generative AI Content Moderation Based on Gemma
  Tags: Large Language Models | Publish Date: August 4, 2024 | Last Edited Time: Sep 13, 2024 3:47 PM
  Original Paper: https://arxiv.org/pdf/2407.21772

Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models
  Tags: LLM Performance, Large Language Models | Publish Date: August 5, 2024 | Last Edited Time: Sep 13, 2024 3:47 PM
  Original Paper: https://arxiv.org/pdf/2408.02442

A Survey of NL2SQL with Large Language Models: Where are we, and where are we going?
  Tags: Large Language Models | Publish Date: August 9, 2024 | Last Edited Time: Sep 13, 2024 3:47 PM
  Original Paper: https://arxiv.org/pdf/2408.05109

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
  Tags: Large Language Models | Publish Date: August 13, 2024 | Last Edited Time: Oct 9, 2024 10:11 AM
  Original Paper: https://arxiv.org/pdf/2408.07055

Challenges and Responses in the Practice of Large Language Models
  Tags: Large Language Models, LLM Performance | Publish Date: August 21, 2024 | Last Edited Time: Sep 13, 2024 3:47 PM
  Original Paper: https://arxiv.org/pdf/2408.09416

Controllable Text Generation for Large Language Models: A Survey
  Tags: Large Language Models, LLM Performance | Publish Date: August 22, 2024 | Last Edited Time: Sep 13, 2024 3:47 PM
  Original Paper: https://arxiv.org/pdf/2408.12599

Enhancing Robustness in Large Language Models: Prompting for Mitigating the Impact of Irrelevant Information
  Tags: LLM Performance, Large Language Models | Publish Date: August 20, 2024 | Last Edited Time: Sep 13, 2024 3:47 PM
  Original Paper: https://arxiv.org/pdf/2408.10615

Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation
  Tags: RAG, Large Language Models, LLM Performance | Publish Date: August 8, 2024 | Last Edited Time: Sep 13, 2024 3:47 PM
  Original Paper: https://arxiv.org/pdf/2408.04187

Transformer Explainer: Interactive Learning of Text-Generative Models
  Tags: Foundation Model | Publish Date: August 8, 2024 | Last Edited Time: Sep 13, 2024 3:47 PM
  Original Paper: https://arxiv.org/pdf/2408.04619

Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
  Tags: Large Language Models | Publish Date: July 30, 2024 | Last Edited Time: Sep 13, 2024 3:47 PM
  Original Paper: https://arxiv.org/pdf/2407.19594

Know Your Limits: A Survey of Abstention in Large Language Models
  Tags: Large Language Models | Publish Date: August 8, 2024 | Last Edited Time: Sep 13, 2024 3:47 PM
  Original Paper: https://arxiv.org/pdf/2407.18418

Weak-to-Strong Reasoning
  Tags: Large Language Models, LLM Performance, Reasoning | Publish Date: July 18, 2024 | Last Edited Time: Sep 13, 2024 3:47 PM
  Original Paper: https://arxiv.org/pdf/2407.13647

Prover-Verifier Games improve legibility of LLM outputs
  Tags: LLM Performance, Large Language Models | Publish Date: August 1, 2024 | Last Edited Time: Sep 13, 2024 3:47 PM
  Original Paper: https://arxiv.org/pdf/2407.13692

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
  Tags: Large Language Models | Publish Date: August 15, 2024 | Last Edited Time: Sep 13, 2024 3:47 PM
  Original Paper: https://arxiv.org/pdf/2408.06292

Tree of Thoughts: Deliberate Problem Solving with Large Language Models
  Tags: Prompt Engineering | Publish Date: May 17, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Sep 30, 2024 9:20 AM
  Related Posts:
  - Prompt Design and Engineering: Introduction and Advanced Methods
  - Post Hoc Explanations of Language Models Can Improve Language Models
  - Enhancing Large Language Models Against Inductive Instructions with Dual-critique Prompting
  - Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models
  Original Paper: https://arxiv.org/abs/2305.10601 | Blog URL: blog.athina.ai

Mistral 7B: Foundation Model Research Paper Summary
  Tags: Foundation Model | Publish Date: April 4, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 30, 2024 6:46 AM
  Related Posts:
  - Universal and Transferable Adversarial Attacks on Aligned Language Models
  - Text Summarization: LLM Failure Cases and Detection Methods
  - CYBERSECEVAL 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models
  Original Paper: https://arxiv.org/pdf/2310.06825.pdf | Blog URL: blog.athina.ai

What is the Role of Small Models in the LLM Era: A Survey
  Tags: Large Language Models | Publish Date: September 12, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 28, 2024 4:03 PM
  Related Posts:
  - Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
  - Agent Workflow Memory
  Original Paper: https://arxiv.org/abs/2409.06857

The Foundation Model Transparency Index
  Tags: Foundation Model | Publish Date: October 19, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Sep 28, 2024 4:03 PM
  Related Posts:
  - Conversational Prompt Engineering
  - Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
  - Beyond Preferences in AI Alignment
  - Agent Workflow Memory
  Original Paper: https://arxiv.org/abs/2310.12941

Achieving Peak Performance for Large Language Models: A Systematic Review
  Tags: LLM Performance | Publish Date: September 7, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 26, 2024 5:48 PM
  Related Posts:
  - RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework
  - From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future
  - Strategic Chain-of-Thought: Guiding Accurate Reasoning in LLMs through Strategy Elicitation
  Original Paper: https://arxiv.org/abs/2409.04833

Beyond Preferences in AI Alignment
  Tags: Safety | Publish Date: August 30, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 26, 2024 5:20 PM
  Related Posts:
  - The Foundation Model Transparency Index
  - Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
  - From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future
  - Strategic Chain-of-Thought: Guiding Accurate Reasoning in LLMs through Strategy Elicitation
  Original Paper: https://arxiv.org/abs/2408.16984

Large Language Model-Based Agents for Software Engineering: A Survey
  Tags: Agents | Publish Date: September 4, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 26, 2024 5:20 PM
  Related Posts:
  - Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
  - Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
  - Conversational Prompt Engineering
  - Strategic Chain-of-Thought: Guiding Accurate Reasoning in LLMs through Strategy Elicitation
  Original Paper: https://arxiv.org/abs/2409.02977

NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?
  Tags: LLM Performance | Publish Date: July 16, 2024 | Last Edited Time: Sep 25, 2024 9:38 PM
  Original Paper: https://arxiv.org/pdf/2407.11963

Does Refusal Training in LLMs Generalize to the Past Tense?
  Tags: LLM Performance | Publish Date: July 19, 2024 | Last Edited Time: Sep 13, 2024 3:48 PM
  Original Paper: https://arxiv.org/pdf/2407.11969

MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
  Tags: Large Language Models | Publish Date: July 29, 2024 | Last Edited Time: Sep 13, 2024 3:48 PM
  Original Paper: https://arxiv.org/abs/2407.20183

Machine Unlearning in Generative AI: A Survey
  Tags: Large Language Models | Publish Date: July 30, 2024 | Last Edited Time: Sep 13, 2024 3:48 PM
  Original Paper: https://arxiv.org/pdf/2407.20516

Recursive Introspection: Teaching Language Model Agents How to Self-Improve
  Tags: Large Language Models, LLM Performance | Publish Date: July 26, 2024 | Last Edited Time: Sep 13, 2024 3:48 PM
  Original Paper: https://arxiv.org/pdf/2407.18219

Generation Constraint Scaling Can Mitigate Hallucination
  Tags: Hallucinations | Publish Date: July 23, 2024 | Last Edited Time: Oct 3, 2024 9:08 PM
  Original Paper: https://arxiv.org/pdf/2407.16908

SpreadsheetLLM: Encoding Spreadsheets for Large Language Models
  Tags: LLM Performance, Large Language Models | Publish Date: July 12, 2024 | Last Edited Time: Sep 13, 2024 3:48 PM
  Original Paper: https://arxiv.org/pdf/2407.09025

Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach
  Tags: RAG, LLM Performance | Publish Date: July 23, 2024 | Last Edited Time: Sep 25, 2024 9:37 PM
  Original Paper: https://arxiv.org/abs/2407.16833

Conversational Prompt Engineering
  Tags: Prompt Engineering, RAG | Publish Date: August 8, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future
  - Conversational Prompt Engineering
  - 🛠️ Understand Your Users → Detect Hallucinations → Iterate
  - Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
  - Mixture-of-Agents Enhances Large Language Model Capabilities
  Original Paper: https://arxiv.org/abs/2408.04560

Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching
  Tags: Fine Tuning, Foundation Model | Publish Date: June 15, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - llmEngineer.weekly: Running models locally, LLM-graded evals too expensive for production? Here's our solution...
  - RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework
  Original Paper: https://arxiv.org/abs/2406.06326

From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future
  Tags: Agents, Reasoning, Evaluation | Publish Date: August 5, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 25, 2024 9:37 PM
  Related Posts:
  - LLM Critics Help Catch LLM Bugs
  - From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future
  - Can I walk you through Athina in 15 mins?
  - Conversational Prompt Engineering
  - Self-Taught Evaluators
  - On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey
  - From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data
  - Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
  Original Paper: https://arxiv.org/abs/2408.02479

Adaptive Retrieval-Augmented Generation for Conversational Systems
  Tags: RAG, Conversational AI | Publish Date: July 31, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - How non-technical users can prototype pipelines, run AI experiments and evaluations
  - Evaluate llama-3 vs gpt-4o on YOUR dataset in a few clicks
  - Re-run your production traces on different LLMs and compare the results
  Original Paper: https://arxiv.org/abs/2407.21712

Tree Search For Language Model Agents
  Tags: Agents, Reasoning | Publish Date: July 1, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 25, 2024 9:37 PM
  Related Posts:
  - Are you afraid of making changes to your LLM pipeline?
  - June Product Updates: Enterprise Features, Dynamic Columns, Spreadsheet-ing, Prompt Management + more
  - RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework

PersonaGym: Evaluating Persona Agents and LLMs
  Tags: Agents, Evaluation | Publish Date: July 29, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Oct 3, 2024 8:34 PM
  Related Posts:
  - Prompts, Prompts, Prompts!
  - RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework
  - Self-Taught Evaluators
  Original Paper: https://arxiv.org/abs/2407.18416

PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers
  Tags: RAG, Reasoning | Publish Date: June 18, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Oct 3, 2024 8:33 PM
  Related Posts:
  - Analyze and compare LLM performance across different prompts, models, and topics
  - Compare Mode on Athina IDE
  - Common LLM chatbot problems and how to solve them
  Original Paper: https://arxiv.org/abs/2406.12430

LLM Pruning and Distillation in Practice: The Minitron Approach
  Tags: Large Language Models, LLM Performance | Publish Date: August 26, 2024 | Last Edited Time: Oct 3, 2024 8:23 PM
  Original Paper: https://arxiv.org/abs/2408.11796

EfficientRAG: Efficient Retriever for Multi-Hop Question Answering
  Tags: RAG | Publish Date: August 8, 2024 | Last Edited Time: Oct 3, 2024 8:31 PM
  Original Paper: https://arxiv.org/pdf/2408.04259

LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference
  Tags: LLM Performance | Publish Date: July 19, 2024 | Last Edited Time: Oct 3, 2024 8:33 PM
  Original Paper: https://arxiv.org/pdf/2407.14057

Context Embeddings for Efficient Answer Generation in RAG
  Tags: RAG, Fine Tuning, LLM Performance | Publish Date: July 23, 2024 | Last Edited Time: Oct 3, 2024 8:32 PM
  Original Paper: https://arxiv.org/pdf/2407.09252

RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation
  Tags: RAG | Publish Date: August 17, 2024 | Last Edited Time: Oct 3, 2024 8:31 PM
  Original Paper: https://arxiv.org/pdf/2408.08067

HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction
  Tags: RAG | Publish Date: August 9, 2024 | Last Edited Time: Oct 3, 2024 8:30 PM
  Original Paper: https://arxiv.org/pdf/2408.04948

Graph Retrieval-Augmented Generation: A Survey
  Tags: RAG | Publish Date: August 15, 2024 | Last Edited Time: Oct 3, 2024 8:29 PM
  Original Paper: https://arxiv.org/pdf/2408.08921

Automated Design of Agentic System
  Tags: Agents | Publish Date: August 15, 2024 | Last Edited Time: Oct 3, 2024 8:28 PM
  Original Paper: https://arxiv.org/abs/2408.08435

Mixture-of-Agents Enhances Large Language Model Capabilities
  Tags: Agents | Publish Date: June 7, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 25, 2024 9:30 PM
  Related Posts:
  - Conversational Prompt Engineering
  - RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation
  - Athina IDE: A Collaborative Editor for AI teams to Prototype, Evaluate, and Experiment
  Original Paper: https://arxiv.org/abs/2406.04692

Discovering Preference Optimization Algorithms with and for Large Language Models
  Tags: LLM Performance, Reasoning | Publish Date: June 12, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - How the best teams evaluate their chatbots
  - Custom Evaluations for your AI for free
  - Improving Retrieval Augmented Language Model with Self-Reasoning
  Original Paper: https://arxiv.org/abs/2406.08414

Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
  Tags: Reasoning, LLM Performance | Publish Date: June 14, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - Conversational Prompt Engineering
  - From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future
  - June Product Updates: Enterprise Features, Dynamic Columns, Spreadsheet-ing, Prompt Management + more
  Original Paper: https://arxiv.org/abs/2406.10209

Following Length Constraints in Instructions
  Tags: Reasoning, LLM Performance | Publish Date: June 25, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - June Product Updates: Enterprise Features, Dynamic Columns, Spreadsheet-ing, Prompt Management + more
  - Athina IDE: A Collaborative Editor for AI teams to Prototype, Evaluate, and Experiment
  - RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework
  Original Paper: https://arxiv.org/abs/2406.17744

On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey
  Tags: Evaluation, Dataset Generation | Publish Date: June 14, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - Athina IDE: A Collaborative Editor for AI teams to Prototype, Evaluate, and Experiment
  - From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future
  - June Product Updates: Enterprise Features, Dynamic Columns, Spreadsheet-ing, Prompt Management + more
  Original Paper: https://arxiv.org/abs/2406.15126

Improving Retrieval Augmented Language Model with Self-Reasoning
  Tags: RAG, Reasoning | Publish Date: August 2, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 25, 2024 9:30 PM
  Related Posts:
  - Evaluating LLM Chatbot Conversations is hard - here's how we're solving it
  - Evaluating JSON responses: LLMs still can't be trusted to produce consistent JSON outputs
  - Generate high-quality synthetic datasets for RAG Q&A in 30 seconds
  - Discovering Preference Optimization Algorithms with and for Large Language Models
  Original Paper: https://arxiv.org/abs/2407.19813

Concise Thoughts: Impact of Output Length on LLM Reasoning and Cost
  Tags: Reasoning, LLM Performance | Publish Date: July 29, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - Have you tried Cohere's new model Command R+? Compare against Claude 3 and Gemini Pro on Athina
  - Athina IDE: A Collaborative Editor for AI teams to Prototype, Evaluate, and Experiment
  - June Product Updates: Enterprise Features, Dynamic Columns, Spreadsheet-ing, Prompt Management + more
  Original Paper: https://arxiv.org/abs/2407.19825

SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals
  Tags: Agents, Reasoning | Publish Date: June 7, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - Self-Taught Evaluators
  - RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework
  - Athina IDE: A Collaborative Editor for AI teams to Prototype, Evaluate, and Experiment
  Original Paper: https://arxiv.org/abs/2406.04784

From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data
  Tags: Fine Tuning, Dataset Generation, RAG | Publish Date: June 27, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data
  - Support for Custom Models hosted on Azure and AWS Bedrock!
  - From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future
  Original Paper: https://arxiv.org/abs/2406.19292

RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework
  Tags: Dataset Generation, RAG | Publish Date: August 18, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 25, 2024 9:30 PM
  Related Posts:
  - We just launched on Product Hunt!
  - Annotate LLM traces on Athina + new models support, automatic token & cost tracking
  - How to backtest prompt / model changes?
  - PersonaGym: Evaluating Persona Agents and LLMs
  - Following Length Constraints in Instructions
  - Tree Search For Language Model Agents
  - Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching
  - SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals
  Original Paper: https://arxiv.org/abs/2408.01262

RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation
  Tags: LLM Performance, RAG | Publish Date: August 5, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 25, 2024 9:30 PM
  Related Posts:
  - Product Hunt Launch: Help us get to #1 Product of the Day!
  - Access your LLM Traces via our GraphQL API
  - RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation
  - Mixture-of-Agents Enhances Large Language Model Capabilities
  Original Paper: https://arxiv.org/abs/2408.02545

Self-Taught Evaluators
  Tags: Evaluation, LLM Performance | Publish Date: August 8, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - Configuring an eval in 15 seconds (yes, really)
  - June Product Updates: Enterprise Features, Dynamic Columns, Spreadsheet-ing, Prompt Management + more
  - From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future
  - PersonaGym: Evaluating Persona Agents and LLMs
  - SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals
  Original Paper: https://arxiv.org/abs/2408.02666

Retrieval with Feedback Loops
  Tags: RAG | Last Edited Time: Sep 13, 2024 3:48 PM | Author: Athina AI

Explainable Retrieval
  Tags: RAG | Last Edited Time: Sep 13, 2024 3:48 PM | Author: Athina AI

Contextual Compression
  Tags: RAG | Last Edited Time: Sep 13, 2024 3:48 PM | Author: Athina AI

Semantic Chunking
  Tags: RAG | Last Edited Time: Sep 13, 2024 3:48 PM | Author: Athina AI

Hypothetical Questions (HyDE Approach)
  Tags: RAG | Last Edited Time: Sep 13, 2024 3:48 PM | Author: Athina AI

Advanced RAG Technique: Hierarchical Indices
  Tags: RAG | Last Edited Time: Sep 13, 2024 3:48 PM | Author: Athina AI

Re-ranking methods
  Tags: RAG | Publish Date: August 21, 2024 | Last Edited Time: Sep 13, 2024 3:48 PM | Author: Athina AI

Query Transformations: Rewriting, Step-back Prompting, and Sub-query Decomposition
  Tags: RAG | Publish Date: August 21, 2024 | Last Edited Time: Sep 13, 2024 3:48 PM | Author: Athina AI

RAG-Fusion (Fusion Retrieval RAG)
  Tags: RAG | Publish Date: August 21, 2024 | Last Edited Time: Sep 13, 2024 3:48 PM | Author: Athina AI

Maatphor: Automated Variant Analysis for Prompt Injection Attacks
  Tags: Dataset Generation | Publish Date: December 12, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - Prompt-Tuning Decision Transformer with Preference Ranking
  - Progressive Visual Prompt Learning with Contrastive Feature Re-formation
  - Soft-prompt Tuning for Large Language Models to Evaluate Bias
  - Robust Safety Classifier for Large Language Models: Adversarial Prompt Shield
  Original Paper: https://arxiv.org/abs/2312.11513

Universality and Limitations of Prompt Tuning
  Tags: Foundation Model | Publish Date: May 30, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - One Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC Era
  - Graph of Thoughts: Solving Elaborate Problems with Large Language Models
  - Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation
  Original Paper: https://arxiv.org/abs/2305.18787 | Blog URL: blog.athina.ai

Language Is Not All You Need: Aligning Perception with Language Models
  Tags: RAG | Publish Date: February 27, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT
  - How Robust is GPT-3.5 to Predecessors? A Comprehensive Study on Language Understanding Tasks
  - Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners
  - Active Prompting with Chain-of-Thought for Large Language Models
  - Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
  - A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT
  - Guiding Large Language Models via Directional Stimulus Prompting
  Original Paper: https://arxiv.org/abs/2302.14045 | Blog URL: blog.athina.ai

MedPromptExtract (Medical Data Extraction Tool): Anonymization and Hi-fidelity Automated data extraction using NLP and prompt engineering
  Tags: Prompt Engineering | Publish Date: June 6, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - Large Language Models and Prompt Engineering for Biomedical Query Focused Multi-Document Summarisation
  - Enhancing Medical Task Performance in GPT-4V: A Comprehensive Study on Prompt Engineering Strategies
  - Cases of EFL Secondary Students' Prompt Engineering Pathways to Complete a Writing Task with ChatGPT
  - ChatGPT4PCG 2 Competition: Prompt Engineering for Science Birds Level Generation
  - LAMPER: LanguAge Model and Prompt EngineeRing for zero-shot time series classification
  - Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation
  - Wordflow: Social Prompt Engineering for Large Language Models
  - A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications
  - Exploring EFL students' prompt engineering in human-AI story writing: an Activity Theory perspective
  - A Novel Approach for Rapid Development Based on ChatGPT and Prompt Engineering
  - Chit-Chat or Deep Talk: Prompt Engineering for Process Mining
  - SAMAug: Point Prompt Augmentation for Segment Anything Model
  - SAM on Medical Images: A Comprehensive Study on Three Prompt Modes
  - Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models
  - Dr ChatGPT, tell me what I want to hear: How prompt knowledge impacts health answer correctness
  - Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training
  - Prompt Cache: Modular Attention Reuse for Low-Latency Inference
  Original Paper: https://arxiv.org/abs/2405.02664 | Blog URL: blog.athina.ai

ProRes: Exploring Degradation-aware Visual Prompt for Universal Image Restoration
  Tags: Foundation Model | Publish Date: June 23, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:49 PM
  Related Posts:
  - TopicGPT: A Prompt-based Topic Modeling Framework
  - Language Prompt for Autonomous Driving
  - Prompt-tuning latent diffusion models for inverse problems
  - DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning
  - LLMs Can Understand Encrypted Prompt: Towards Privacy-Computing Friendly Transformers
  Original Paper: https://arxiv.org/abs/2306.13653

Cases of EFL Secondary Students' Prompt Engineering Pathways to Complete a Writing Task with ChatGPT
  Tags: Safety | Publish Date: June 19, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - ReAct: Synergizing Reasoning and Acting in Language Models
  - Prompting GPT-3 To Be Reliable
  - DocPrompting: Generating Code by Retrieving the Docs
  - Exploring the Intersection of Large Language Models and Agent-Based Modeling via Prompt Engineering
  - MedPromptExtract (Medical Data Extraction Tool): Anonymization and Hi-fidelity Automated data extraction using NLP and prompt engineering
  Original Paper: https://arxiv.org/abs/2307.05493 | Blog URL: blog.athina.ai

Prompt Packer: Deceiving LLMs through Compositional Instruction with Hidden Attacks
  Tags: Safety | Publish Date: October 16, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:48 PM
  Related Posts:
  - Generalized Graph Prompt: Toward a Unification of Pre-Training and Downstream Tasks on Graphs
  - HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models
  - PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification
  - An automatically discovered chain-of-thought prompt generalizes to novel models and datasets
  - Practical Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration
  - Prompt Tuning Large Language Models on Personalized Aspect Extraction for Recommendations
  - Prompt Middleware: Mapping Prompts for Large Language Models to UI Affordances
  - Prompt-based Node Feature Extractor for Few-shot Learning on Text-Attributed Graphs
  - Divide and Prompt: Chain of Thought Prompting for Text-to-SQL
  - Layout and Task Aware Instruction Prompt for Zero-shot Document Image Question Answering
  - BadCLIP: Trigger-Aware Prompt Learning for Backdoor Attacks on CLIP
  Original Paper: https://arxiv.org/abs/2310.10077

How to Use a Custom Grading Criteria to Evaluate LLM Responses (LLM-as-a-Judge)
  Tags: Hallucinations, Evaluation | Publish Date: April 17, 2024 | Last Edited Time: Sep 13, 2024 3:49 PM
  Related Posts:
  - CYBERSECEVAL 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models

Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling
  Tags: Reasoning | Publish Date: May 23, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Oct 9, 2024 9:39 AM
  Related Posts:
  - ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation
  - Language Prompt for Autonomous Driving
  - LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
  Original Paper: https://arxiv.org/abs/2305.09993

Chain-of-Symbol Prompting Elicits Planning in Large Langauge Models
  Tags: Evaluation | Publish Date: May 17, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 19, 2024 11:43 PM
  Related Posts:
  - Efficient Prompting via Dynamic In-Context Learning
  - The Web Can Be Your Oyster for Improving Large Language Models
  - TreePrompt: Learning to Compose Tree Prompts for Explainable Visual Grounding
  Original Paper: https://arxiv.org/abs/2305.10276 | Blog URL: blog.athina.ai

Soft-prompt Tuning for Large Language Models to Evaluate Bias
  Tags: Evaluation | Publish Date: March 5, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Aug 19, 2024 11:40 PM
  Related Posts:
  - LLM Critics Help Catch LLM Bugs
  - BIM-GPT: a Prompt-Based Virtual Assistant Framework for BIM Information Retrieval
  - Progressive Visual Prompt Learning with Contrastive Feature Re-formation
  - Maatphor: Automated Variant Analysis for Prompt Injection Attacks
  - SPELL: Semantic Prompt Evolution based on a LLM
  Original Paper: https://arxiv.org/abs/2306.04735

RoT: Enhancing Large Language Models with Reflection on Search Trees
  Tags: Reasoning | Publish Date: April 11, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Aug 19, 2024 11:38 PM
  Related Posts:
  - PathFinder: Guided Search over Multi-Step Reasoning Paths
  - Founder-GPT: Self-play to evaluate the Founder-Idea fit
  - RNNs are not Transformers (Yet): The Key Bottleneck on In-context Retrieval
  Original Paper: https://arxiv.org/abs/2404.05449 | Blog URL: blog.athina.ai

Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution
  Tags: Reasoning | Publish Date: September 28, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 19, 2024 11:48 PM
  Related Posts:
  - Re-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and Beyond
  - ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation
  - LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
  - Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
  - Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting
  - ChatGPT Prompt Patterns for Improving Code Quality, Refactoring, Requirements Elicitation, and Software Design
  Original Paper: https://arxiv.org/abs/2309.16797 | Blog URL: blog.athina.ai

Analyzing Toxicity in Deep Conversations: A Reddit Case Study
  Tags: Dataset Generation | Publish Date: April 11, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Aug 19, 2024 11:47 PM
  Related Posts:
  - Evidence to Generate (E2G): A Single-agent Two-step Prompting for Context Grounded and Retrieval Augmented Reasoning
  - PathFinder: Guided Search over Multi-Step Reasoning Paths
  - RNNs are not Transformers (Yet): The Key Bottleneck on In-context Retrieval
  Original Paper: https://arxiv.org/abs/2404.07879 | Blog URL: blog.athina.ai

Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought
  Tags: Reasoning | Publish Date: January 26, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 19, 2024 11:46 PM
  Related Posts:
  - PAL: Program-aided Language Models
  - DocPrompting: Generating Code by Retrieving the Docs
  - Large Language Models Are Human-Level Prompt Engineers
  Original Paper: https://arxiv.org/abs/2210.01240v3 | Blog URL: blog.athina.ai

Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation
  Tags: Prompt Engineering | Publish Date: July 28, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 19, 2024 11:52 PM
  Related Posts:
  - Re-Reading Improves Reasoning in Large Language Models
  - Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
  - LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models
  - Post Hoc Explanations of Language Models Can Improve Language Models
  Original Paper: https://arxiv.org/abs/2307.15337 | Blog URL: blog.athina.ai

Pre-Training to Learn in Context
  Tags: RAG | Publish Date: May 16, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Sep 13, 2024 3:49 PM
  Related Posts:
  - ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs
  - TELeR: A General Taxonomy of LLM Prompts for Benchmarking Complex Tasks
  - SatLM: Satisfiability-Aided Language Models Using Declarative Prompting
  - From Prompt Injections to SQL Injection Attacks: How Protected is Your LLM-Integrated Web Application?
  - Language Prompt for Autonomous Driving
  - ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation
  - LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
  Original Paper: https://arxiv.org/abs/2305.09137 | Blog URL: blog.athina.ai

Chain of Hindsight Aligns Language Models with Feedback
  Tags: Evaluation | Publish Date: February 6, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 19, 2024 11:54 PM
  Related Posts:
  - Chain of Hindsight Aligns Language Models with Feedback
  - Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT
  - How Robust is GPT-3.5 to Predecessors? A Comprehensive Study on Language Understanding Tasks
  - How Does In-Context Learning Help Prompt Tuning?
  - Scalable Prompt Generation for Semi-supervised Learning with Language Models
  Original Paper: https://arxiv.org/abs/2302.02676 | Blog URL: blog.athina.ai

Guiding Large Language Models via Directional Stimulus Prompting
  Tags: Prompt Engineering | Publish Date: October 9, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 19, 2024 11:49 PM
  Related Posts:
  - A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT
  - Language Is Not All You Need: Aligning Perception with Language Models
  - Active Prompting with Chain-of-Thought for Large Language Models
  - How Does In-Context Learning Help Prompt Tuning?
  Original Paper: https://arxiv.org/abs/2302.11520 | Blog URL: blog.athina.ai

Jailbreaking ChatGPT via Prompt Engineering: An Empirical Study
  Tags: Evaluation | Publish Date: March 10, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Aug 19, 2024 11:51 PM
  Related Posts:
  - Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting
  - Prompt Injection attack against LLM-integrated Applications
  - ChatGPT Prompt Patterns for Improving Code Quality, Refactoring, Requirements Elicitation, and Software Design
  - IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
  - Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game
  - AnomalyCLIP: Object-agnostic Prompt Learning for Zero-shot Anomaly Detection
  - An LLM can Fool Itself: A Prompt-Based Adversarial Attack
  - Promptly: Using Prompt Problems to Teach Learners How to Effectively Utilize AI Code Generators
  Original Paper: https://arxiv.org/abs/2305.13860 | Blog URL: blog.athina.ai

Self-Consistency Improves Chain of Thought Reasoning in Language Models
  Tags: Reasoning | Publish Date: March 7, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 19, 2024 11:55 PM
  Related Posts:
  - Inferring Properties of Graph Neural Networks
  - RAGAR, Your Falsehood RADAR: RAG-Augmented Reasoning for Political Fact-Checking using Multimodal Large Language Models
  - PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization
  - PAL: Program-aided Language Models
  Original Paper: https://arxiv.org/abs/2203.11171 | Blog URL: blog.athina.ai

Demystifying Chains, Trees, and Graphs of Thoughts
  Tags: Reasoning | Publish Date: April 5, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Aug 19, 2024 11:57 PM
  Related Posts:
  - The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
  - Large Language Models are reasoners with Self-Verification
  - Constitutional AI: Harmlessness from AI Feedback
  - Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models
  Original Paper: https://arxiv.org/abs/2401.14295 | Blog URL: blog.athina.ai

Post Hoc Explanations of Language Models Can Improve Language Models
  Tags: Prompt Engineering | Publish Date: May 19, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 19, 2024 11:58 PM
  Related Posts:
  - Re-Reading Improves Reasoning in Large Language Models
  - Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation
  - Prompt Design and Engineering: Introduction and Advanced Methods
  - Tree of Thoughts: Deliberate Problem Solving with Large Language Models
  - Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models
  Original Paper: https://arxiv.org/abs/2305.11426 | Blog URL: blog.athina.ai

Graph of Thoughts: Solving Elaborate Problems with Large Language Models
  Tags: Prompt Engineering | Publish Date: August 18, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 20, 2024 12:03 AM
  Related Posts:
  - Reasoning with Language Model Prompting: A Survey
  - Towards Reasoning in Large Language Models: A Survey
  - Graph of Thoughts: Solving Elaborate Problems with Large Language Models
  - Less Likely Brainstorming: Using Language Models to Generate Alternative Hypotheses
  - Universality and Limitations of Prompt Tuning
  - MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting
  - Reasoning with Language Model is Planning with World Model
  - Better Zero-Shot Reasoning with Self-Adaptive Prompting
  - Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning and Coding with LLMs
  - TreePrompt: Learning to Compose Tree Prompts for Explainable Visual Grounding
  Original Paper: https://arxiv.org/abs/2308.09687v2 | Blog URL: blog.athina.ai

A Novel Approach for Rapid Development Based on ChatGPT and Prompt Engineering
  Tags: Evaluation | Publish Date: December 21, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 19, 2024 11:59 PM
  Related Posts:
  - MedPromptExtract (Medical Data Extraction Tool): Anonymization and Hi-fidelity Automated data extraction using NLP and prompt engineering
  - ChatGPT4PCG 2 Competition: Prompt Engineering for Science Birds Level Generation
  - LAMPER: LanguAge Model and Prompt EngineeRing for zero-shot time series classification
  Original Paper: https://arxiv.org/abs/2312.13115 | Blog URL: blog.athina.ai

Prompt Tuning Large Language Models on Personalized Aspect Extraction for Recommendations
  Tags: Prompt Engineering | Publish Date: June 2, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 20, 2024 12:00 AM
  Related Posts:
  - DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer
  - Prompt Packer: Deceiving LLMs through Compositional Instruction with Hidden Attacks
  - Prompt Tuning Large Language Models on Personalized Aspect Extraction for Recommendations
  - Prompt-Guided Transformers for End-to-End Open-Vocabulary Object Detection
  Original Paper: https://arxiv.org/abs/2306.01475

An LLM can Fool Itself: A Prompt-Based Adversarial Attack
  Tags: Evaluation | Publish Date: October 20, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 20, 2024 12:02 AM
  Related Posts:
  - AnomalyCLIP: Object-agnostic Prompt Learning for Zero-shot Anomaly Detection
  - Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game
  - Jailbreaking ChatGPT via Prompt Engineering: An Empirical Study
  - PromptAid: Prompt Exploration, Perturbation, Testing and Iteration using Visual Analytics for Large Language Models
  Original Paper: https://arxiv.org/abs/2310.13345 | Blog URL: blog.athina.ai

PBNR: Prompt-based News Recommender System
  Tags: Prompt Engineering | Publish Date: April 16, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 20, 2024 12:01 AM
  Related Posts:
  - Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models
  - Segment Any Anomaly without Training via Hybrid Prompt Regularization
  - LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
  - PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification
  Original Paper: https://arxiv.org/abs/2304.07862

TELeR: A General Taxonomy of LLM Prompts for Benchmarking Complex Tasks
  Tags: Evaluation | Publish Date: May 19, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 20, 2024 12:07 AM
  Related Posts:
  - Explaining Emergent In-Context Learning as Kernel Regression
  - Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning and Coding with LLMs
  - Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt
  - Efficient Prompting via Dynamic In-Context Learning
  - The Web Can Be Your Oyster for Improving Large Language Models
  - Flatness-Aware Prompt Selection Improves Accuracy and Sample Efficiency
  - Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling
  - SatLM: Satisfiability-Aided Language Models Using Declarative Prompting
  - Pre-Training to Learn in Context
  Original Paper: https://arxiv.org/abs/2305.11430 | Blog URL: blog.athina.ai

How Does In-Context Learning Help Prompt Tuning?
  Tags: Fine Tuning | Publish Date: February 22, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 20, 2024 12:05 AM
  Related Posts:
  - Guiding Large Language Models via Directional Stimulus Prompting
  - A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT
  - Chain of Hindsight Aligns Language Models with Feedback
  - Scalable Prompt Generation for Semi-supervised Learning with Language Models
  - GraphPrompt: Unifying Pre-Training and Downstream Tasks for Graph Neural Networks
  - The Capacity for Moral Self-Correction in Large Language Models
  Original Paper: https://arxiv.org/abs/2302.11521 | Blog URL: blog.athina.ai

Retrieval-Augmented Thought Process as Sequential Decision Making
  Tags: RAG | Publish Date: February 12, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Aug 20, 2024 12:04 AM
  Related Posts:
  - Retrieval-Augmented Thought Process as Sequential Decision Making
  - Multimodal Chain-of-Thought Reasoning in Language Models
  - Compositional Exemplars for In-context Learning
  - Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation
  - Boosting Logical Reasoning in Large Language Models through a New Framework: The Graph of Thought
  - Tree of Attacks: Jailbreaking Black-Box LLMs Automatically
  Original Paper: https://arxiv.org/abs/2402.07812 | Blog URL: blog.athina.ai

Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources
  Tags: Prompt Engineering | Publish Date: May 22, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 20, 2024 12:09 AM
  Related Posts:
  - Factuality of Large Language Models in the Year 2024
  - Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources
  - Semi-Structured Chain-of-Thought: Integrating Multiple Sources of Knowledge for Improved Language Model Reasoning
  - Fine-tuning Language Models for Factuality
  Original Paper: https://arxiv.org/abs/2305.13269 | Blog URL: blog.athina.ai

Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models
  Tags: Prompt Engineering | Publish Date: December 19, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 23, 2024 2:30 AM
  Related Posts:
  - Progressive Visual Prompt Learning with Contrastive Feature Re-formation
  - Testing LLMs on Code Generation with Varying Levels of Prompt Specificity
  - Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models
  - viz2viz: Prompt-driven stylized visualization generation using a diffusion model
  Original Paper: https://arxiv.org/abs/2312.12416

Plum: Prompt Learning using Metaheuristic
  Tags: Prompt Engineering | Publish Date: March 14, 2024 | Authors: Athina AI Research Agent | Last Edited Time: Aug 20, 2024 12:17 AM
  Related Posts:
  - FoodGPT: A Large Language Model in Food Testing Domain with Incremental Pre-training and Knowledge Graph Prompt
  - Text-driven Prompt Generation for Vision-Language Models in Federated Learning
  - Consistency-guided Prompt Learning for Vision-Language Models
  Original Paper: https://arxiv.org/abs/2311.08364

Autonomous Tree-search Ability of Large Language Models
  Tags: Reasoning | Publish Date: October 14, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 20, 2024 12:16 AM
  Related Posts:
  - PathFinder: Guided Search over Multi-Step Reasoning Paths
  - SPROUT: Authoring Programming Tutorials with Interactive Visualization of Large Language Model Generation Process
  - Founder-GPT: Self-play to evaluate the Founder-Idea fit
  Original Paper: https://arxiv.org/abs/2310.10686 | Blog URL: blog.athina.ai

Temporal Data Meets LLM -- Explainable Financial Time Series Forecasting
  Tags: Reasoning | Publish Date: June 19, 2023 | Authors: Athina AI Research Agent | Last Edited Time: Aug 20, 2024 12:13 AM
  Related Posts:
  - GuReT: Distinguishing Guilt and Regret related Text
  - RAGAR, Your Falsehood RADAR: RAG-Augmented Reasoning for Political Fact-Checking using Multimodal Large Language 
ModelsEvidence to Generate (E2G): A Single-agent Two-step Prompting for Context Grounded and Retrieval Augmented Reasoninghttps://arxiv.org/abs/2306.11025blog.athina.aiFocused Prefix Tuning for Controllable Text GenerationFine TuningJune 1, 2023Athina AI Research AgentAug 20, 2024 12:37 AMTowards Reasoning in Large Language Models: A SurveyA Bibliometric Review of Large Language Models Research from 2017 to 2023Reasoning with Language Model Prompting: A Surveyhttps://arxiv.org/abs/2306.00369blog.athina.aiPromptTTS 2: Describing and Generating Voices with Text PromptPrompt EngineeringOctober 12, 2023Athina AI Research AgentAug 20, 2024 12:41 AMPrompt Tuning of Deep Neural Networks for Speaker-adaptive Visual Speech RecognitionHD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion ModelsGeneralized Graph Prompt: Toward a Unification of Pre-Training and Downstream Tasks on GraphsVisual Prompt Based Personalized Federated LearningAn automatically discovered chain-of-thought prompt generalizes to novel models and datasetsProgressive Visual Prompt Learning with Contrastive Feature Re-formationhttps://arxiv.org/abs/2309.02285CYBERSECEVAL 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language ModelsSafetyEvaluationApril 18, 2024Athina AI Research AgentAug 23, 2024 12:56 AMHow to Use a Custom Grading Criteria to Evaluate LLM Responses (LLM-as-a-Judge)Mistral 7B: Foundation Model Research Paper SummaryChain-of-Verification Reduces Hallucination in Large Language Modelshttps://ai.meta.com/research/publications/cyberseceval-2-a-wide-ranging-cybersecurity-evaluation-suite-for-large-language-models/blog.athina.aiAI Chain on Large Language Model for Unsupervised Control Flow Graph Generation for Statically-Typed Partial CodeFoundation ModelJune 1, 2023Athina AI Research AgentAug 20, 2024 12:38 AMGuReT: Distinguishing Guilt and Regret related TextFounder-GPT: Self-play to evaluate the Founder-Idea fitBoosting of Thoughts: 
Trial-and-Error Problem Solving with Large Language Modelshttps://arxiv.org/abs/2306.00757blog.athina.aiPrompt-ICM: A Unified Framework towards Image Coding for Machines with Task-driven PromptsPrompt EngineeringMay 4, 2023Athina AI Research AgentAug 20, 2024 12:39 AMText-driven Prompt Generation for Vision-Language Models in Federated LearningImage-Object-Specific Prompt Learning for Few-Shot Class-Incremental LearningConsistency-guided Prompt Learning for Vision-Language Modelshttps://arxiv.org/abs/2305.02578Robust Safety Classifier for Large Language Models: Adversarial Prompt ShieldSafetyOctober 31, 2023Athina AI Research AgentAug 20, 2024 12:40 AMEfficient Federated Prompt Tuning for Black-box Large Pre-trained ModelsSPELL: Semantic Prompt Evolution based on a LLMMaatphor: Automated Variant Analysis for Prompt Injection AttacksImage-Object-Specific Prompt Learning for Few-Shot Class-Incremental Learninghttps://arxiv.org/abs/2311.00172EntGPT: Linking Generative Large Language Models with Knowledge BasesPrompt EngineeringFebruary 9, 2024Athina AI Research AgentAug 20, 2024 12:47 AMFrom Noise to Clarity: Unraveling the Adversarial Suffix of Large Language Model Attacks via Translation of Text EmbeddingsUniversal and Transferable Adversarial Attacks on Aligned Language ModelsFactuality of Large Language Models in the Year 2024KnowGPT: Knowledge Injection for Large Language ModelsProbabilistic Tree-of-thought Reasoning for Answering Knowledge-intensive Complex QuestionsA Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questionshttps://arxiv.org/abs/2402.06738blog.athina.aiTesting LLMs on Code Generation with Varying Levels of Prompt SpecificityEvaluationNovember 10, 2023Athina AI Research AgentAug 20, 2024 12:43 AMLLM Critics Help Catch LLM BugsPrompt-Tuning Decision Transformer with Preference RankingProgressive Visual Prompt Learning with Contrastive Feature Re-formationULTRA-DP: Unifying Graph Pre-training with 
Multi-task Graph Dual PromptPrompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Modelshttps://arxiv.org/abs/2311.07599Reinforcement Learning in the Era of LLMs: What is Essential? What is needed? An RL Perspective on RLHF, Prompting, and BeyondReasoningOctober 9, 2023Athina AI Research AgentAug 20, 2024 12:48 AMExploring LLM-based Agents for Root Cause AnalysisInvestigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction FollowingModel-tuning Via Prompts Makes NLP Models Adversarially RobustTool Learning with Foundation ModelsOne Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC EraA Bibliometric Review of Large Language Models Research from 2017 to 2023Natural Language Reasoning, A SurveyWalking Down the Memory Maze: Beyond Context Limit through Interactive ReadingFrom Sparse to Dense: GPT-4 Summarization with Chain of Density PromptingExploring Lottery Prompts for Pre-trained Language ModelsLet's Verify Step by StepPEARL: Prompting Large Language Models to Plan and Execute Actions Over Long DocumentsReasoning with Language Model is Planning with World ModelBetter Zero-Shot Reasoning with Self-Adaptive PromptingInteractive Natural Language ProcessingExplaining Emergent In-Context Learning as Kernel RegressionZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMshttps://arxiv.org/abs/2310.06147blog.athina.aiStyleDiffusion: Prompt-Embedding Inversion for Text-Based EditingFine TuningAugust 20, 2023Athina AI Research AgentAug 20, 2024 12:51 AMBenchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language ModelsRe-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and BeyondLongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compressionhttps://arxiv.org/abs/2303.15649Adversarial Prompt Tuning for Vision-Language ModelsSafetyDecember 25, 2023Athina AI Research AgentAug 20, 
2024 12:50 AMMultimodal Prompt Perceiver: Empower Adaptiveness, Generalizability and Fidelity for All-in-One Image RestorationPromise: Prompt-driven 3D Medical Image Segmentation Using Pretrained Image Foundation ModelsAutoHint: Automatic Prompt Optimization with Hint GenerationPrompt Algebra for Task Compositionhttps://arxiv.org/abs/2311.11261Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language ModelsReasoningSeptember 28, 2023Athina AI Research AgentAug 20, 2024 12:49 AMUnleashing the potential of prompt engineering in Large Language Models: a comprehensive reviewDemystifying Chains, Trees, and Graphs of ThoughtsThe Flan Collection: Designing Data and Methods for Effective Instruction TuningEverything of Thoughts: Defying the Law of Penrose Triangle for Thought GenerationBoosting Logical Reasoning in Large Language Models through a New Framework: The Graph of ThoughtTree-of-Mixed-Thought: Combining Fast and Slow Thinking for Multi-hop Visual Reasoninghttps://arxiv.org/abs/2308.10379blog.athina.aiConnecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt OptimizersPrompt EngineeringFebruary 27, 2024Athina AI Research AgentAug 20, 2024 12:53 AMLongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt CompressionRe-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and BeyondPromptbreeder: Self-Referential Self-Improvement Via Prompt Evolutionhttps://arxiv.org/abs/2309.08532AnomalyCLIP: Object-agnostic Prompt Learning for Zero-shot Anomaly DetectionHallucinationsMarch 16, 2024Athina AI Research AgentAug 20, 2024 12:52 AMChatGPT Prompt Patterns for Improving Code Quality, Refactoring, Requirements Elicitation, and Software DesignQuantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formattingJailbreaking ChatGPT via Prompt Engineering: An Empirical StudyAn LLM can Fool Itself: A 
Prompt-Based Adversarial AttackPromptly: Using Prompt Problems to Teach Learners How to Effectively Utilize AI Code Generatorshttps://arxiv.org/abs/2310.18961blog.athina.aiConsistency-guided Prompt Learning for Vision-Language ModelsFine TuningFebruary 27, 2024Athina AI Research AgentAug 20, 2024 12:54 AMLLM Critics Help Catch LLM BugsFoodGPT: A Large Language Model in Food Testing Domain with Incremental Pre-training and Knowledge Graph PromptImage-Object-Specific Prompt Learning for Few-Shot Class-Incremental LearningReverse Stable Diffusion: What prompt was used to generate this image?Does Prompt-Tuning Language Model Ensure Privacy?Prompt-ICM: A Unified Framework towards Image Coding for Machines with Task-driven PromptsLarge Language Model Prompt Chaining for Long Legal Document ClassificationPlum: Prompt Learning using Metaheuristichttps://arxiv.org/abs/2306.01195A Bibliometric Review of Large Language Models Research from 2017 to 2023EvaluationApril 3, 2023Athina AI Research AgentSep 12, 2024 11:55 PMReinforcement Learning in the Era of LLMs: What is Essential? What is needed? 
An RL Perspective on RLHF, Prompting, and BeyondOne Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC EraFocused Prefix Tuning for Controllable Text GenerationLess Likely Brainstorming: Using Language Models to Generate Alternative HypothesesPEARL: Prompting Large Language Models to Plan and Execute Actions Over Long DocumentsHierarchical Prompting Assists Large Language Model on Web NavigationCan We Edit Factual Knowledge by In-Context Learning?Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Modelshttps://arxiv.org/abs/2304.02020blog.athina.aiExploring EFL students' prompt engineering in human-AI story writing: an Activity Theory perspectivePrompt EngineeringFebruary 10, 2024Athina AI Research AgentAug 20, 2024 12:53 AMLAMPER: LanguAge Model and Prompt EngineeRing for zero-shot time series classificationChatGPT4PCG 2 Competition: Prompt Engineering for Science Birds Level GenerationMedPromptExtract (Medical Data Extraction Tool): Anonymization and Hi-fidelity Automated data extraction using NLP and prompt engineeringHow to Prompt LLMs for Text-to-SQL: A Study in Zero-shot, Single-domain, and Cross-domain SettingsA study on Prompt Design, Advantages and Limitations of ChatGPT for Deep Learning Program RepairGraph-ToolFormer: To Empower LLMs with Graph Reasoning Ability via Prompt Augmented by ChatGPThttps://arxiv.org/abs/2306.01798blog.athina.aiRe-Reading Improves Reasoning in Large Language ModelsPrompt EngineeringSeptember 12, 2023Athina AI Research AgentAug 20, 2024 7:39 PMLLMLingua: Compressing Prompts for Accelerated Inference of Large Language ModelsPrincipled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4Prompt Design and Engineering: Introduction and Advanced MethodsSkeleton-of-Thought: Prompting LLMs for Efficient Parallel GenerationPost Hoc Explanations of Language Models Can Improve Language Modelshttps://arxiv.org/abs/2309.06275blog.athina.aiPrompt 
Middleware: Mapping Prompts for Large Language Models to UI AffordancesPrompt EngineeringJuly 3, 2023Athina AI Research AgentAug 20, 2024 7:36 PMLLMs Can Understand Encrypted Prompt: Towards Privacy-Computing Friendly TransformersPrompt Packer: Deceiving LLMs through Compositional Instruction with Hidden AttacksPractical Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt CalibrationDivide and Prompt: Chain of Thought Prompting for Text-to-SQLhttps://arxiv.org/abs/2307.01142An automatically discovered chain-of-thought prompt generalizes to novel models and datasetsReasoningAugust 3, 2023Athina AI Research AgentAug 20, 2024 7:37 PMAn automatically discovered chain-of-thought prompt generalizes to novel models and datasetsLLMs Can Understand Encrypted Prompt: Towards Privacy-Computing Friendly TransformersPrompt Packer: Deceiving LLMs through Compositional Instruction with Hidden AttacksPromptTTS 2: Describing and Generating Voices with Text PromptQuery-Dependent Prompt Evaluation and Optimization with Offline Inverse RLExploring the Relationship between LLM Hallucinations and Prompt Linguistic Nuances: Readability, Formality, and ConcretenessProgressive Visual Prompt Learning with Contrastive Feature Re-formationhttps://arxiv.org/abs/2305.02897Active Retrieval Augmented GenerationRAGMay 11, 2023Athina AI Research AgentAug 20, 2024 7:41 PMFine-tuning Language Models for FactualitySearch-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive TasksAutoHall: Automated Hallucination Dataset Generation for Large Language ModelsPrompt Design and Engineering: Introduction and Advanced MethodsEnhancing Zero-Shot Chain-of-Thought Reasoning in Large Language Models through LogicPrincipled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4A Comprehensive Survey on Instruction Followinghttps://arxiv.org/abs/2305.06983blog.athina.aiBoosting Logical Reasoning in Large Language Models through 
a New Framework: The Graph of ThoughtReasoningAugust 16, 2023Athina AI Research AgentAug 20, 2024 7:40 PMEverything of Thoughts: Defying the Law of Penrose Triangle for Thought GenerationRetrieval-Augmented Thought Process as Sequential Decision MakingAlgorithm of Thoughts: Enhancing Exploration of Ideas in Large Language ModelsFounder-GPT: Self-play to evaluate the Founder-Idea fithttps://arxiv.org/abs/2308.08614blog.athina.aiMultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought PromptingReasoningMay 26, 2023Athina AI Research AgentAug 20, 2024 7:41 PMOne Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC EraGraph of Thoughts: Solving Elaborate Problems with Large Language ModelsFew-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluationhttps://arxiv.org/abs/2305.16896blog.athina.aiBlack-Box Prompt Optimization: Aligning Large Language Models without Model TrainingEvaluationNovember 8, 2023Athina AI Research AgentAug 20, 2024 7:43 PMPromptly: Using Prompt Problems to Teach Learners How to Effectively Utilize AI Code GeneratorsIP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion ModelsChatGPT Prompt Patterns for Improving Code Quality, Refactoring, Requirements Elicitation, and Software Designhttps://arxiv.org/abs/2311.04155blog.athina.aiUniversal and Transferable Adversarial Attacks on Aligned Language ModelsSafetyApril 14, 2024Athina AI Research AgentAug 20, 2024 7:44 PMAI Safety: Necessary, but insufficient and possibly problematicFrom Noise to Clarity: Unraveling the Adversarial Suffix of Large Language Model Attacks via Translation of Text EmbeddingsMistral 7B: Foundation Model Research Paper SummaryWizardLM: Empowering Large Language Models to Follow Complex InstructionsEntGPT: Linking Generative Large Language Models with Knowledge Baseshttps://arxiv.org/abs/2307.15043blog.athina.aiImageDream: Image-Prompt Multi-view Diffusion for 3D 
GenerationEvaluationDecember 2, 2023Athina AI Research AgentAug 20, 2024 7:42 PMChain-of-Verification Reduces Hallucination in Large Language ModelsEfficient Prompting via Dynamic In-Context LearningPre-Training to Learn in ContextRe-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and BeyondLLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language ModelsPromptbreeder: Self-Referential Self-Improvement Via Prompt EvolutionBenchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language ModelsPrompt a Robot to Walk with Large Language ModelsJatmo: Prompt Injection Defense by Task-Specific FinetuningReprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs SamplingYou Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic ContentAssessing Prompt Injection Risks in 200+ Custom GPTsIgnore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global Scale Prompt Hacking CompetitionPrompt Stealing Attacks Against Text-to-Image Generation ModelsTopicGPT: A Prompt-based Topic Modeling FrameworkPrompt-tuning latent diffusion models for inverse problemshttps://arxiv.org/abs/2312.02201blog.athina.aiOn Second Thought, Let's Not Think Step by Step! 
Bias and Toxicity in Zero-Shot ReasoningSafetyJune 4, 2023Athina AI Research AgentAug 20, 2024 7:46 PMHard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and DiscoveryLarge Language Models Can Be Easily Distracted by Irrelevant ContextGraphPrompt: Unifying Pre-Training and Downstream Tasks for Graph Neural NetworksConstitutional AI: Harmlessness from AI Feedbackhttps://arxiv.org/abs/2212.08061blog.athina.aiPrompting AI Art: An Investigation into the Creative Skill of Prompt EngineeringPrompt EngineeringDecember 3, 2023Athina AI Research AgentAug 20, 2024 7:45 PMPrompting GPT-3 To Be ReliableDocPrompting: Generating Code by Retrieving the DocsLarge Language Models Are Human-Level Prompt Engineershttps://arxiv.org/abs/2303.13534blog.athina.aiLarger language models do in-context learning differentlyEvaluationMarch 7, 2023Athina AI Research AgentAug 20, 2024 7:47 PMBoosted Prompt Ensembles for Large Language ModelsFairness-guided Few-shot Prompting for Large Language ModelsNN Prompting: Beyond-Context Learning with Calibration-Free Nearest Neighbor InferenceOpenICL: An Open-Source Framework for In-context LearningAlphazero-like Tree-Search can Guide Large Language Model Decoding and Traininghttps://arxiv.org/abs/2303.03846blog.athina.aiThe Capacity for Moral Self-Correction in Large Language ModelsReasoningFebruary 18, 2023Athina AI Research AgentAug 20, 2024 7:49 PMBounding the Capabilities of Large Language Models in Open Text Generation with Prompt ConstraintsScalable Prompt Generation for Semi-supervised Learning with Language ModelsHow Does In-Context Learning Help Prompt Tuning?SwitchPrompt: Learning Domain-Specific Gated Soft Prompts for Classification in Low-Resource DomainsEvaluating the Robustness of Discrete PromptsCompositional Exemplars for In-context Learninghttps://arxiv.org/abs/2302.07459blog.athina.aiChain-of-Thought Reasoning is a Policy Improvement OperatorReasoningNovember 8, 2023Athina AI Research AgentAug 20, 2024 7:48 
PMTree of Reviews: A Tree-based Dynamic Iterative Retrieval Framework for Multi-hop Question AnsweringGuReT: Distinguishing Guilt and Regret related TextRNNs are not Transformers (Yet): The Key Bottleneck on In-context Retrievalhttps://arxiv.org/abs/2309.08589blog.athina.aiProbabilistic Tree-of-thought Reasoning for Answering Knowledge-intensive Complex QuestionsPrompt EngineeringNovember 23, 2023Athina AI Research AgentAug 20, 2024 7:51 PMSelf-contradictory Hallucinations of Large Language Models: Evaluation, Detection and MitigationSearch-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive TasksEntGPT: Linking Generative Large Language Models with Knowledge BasesAutoHall: Automated Hallucination Dataset Generation for Large Language ModelsA Step Closer to Comprehensive Answers: Constrained Multi-Stage Question Decomposition with Large Language ModelsPrompt Design and Engineering: Introduction and Advanced Methodshttps://arxiv.org/abs/2311.13982blog.athina.aiPrompt Tuning of Deep Neural Networks for Speaker-adaptive Visual Speech RecognitionFine TuningFebruary 16, 2023Athina AI Research AgentSep 13, 2024 11:12 PMHD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion ModelsEdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAMPromptCARE: Prompt Copyright Protection by Watermark Injection and VerificationPromptTTS 2: Describing and Generating Voices with Text PromptVisual Prompt Based Personalized Federated LearningDP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineerhttps://arxiv.org/abs/2302.08102From Noise to Clarity: Unraveling the Adversarial Suffix of Large Language Model Attacks via Translation of Text EmbeddingsSafetyApril 16, 2024Athina AI Research AgentAug 23, 2024 1:33 AMUniversal and Transferable Adversarial Attacks on Aligned Language ModelsWizardLM: Empowering Large Language Models to Follow Complex InstructionsEntGPT: Linking 
Generative Large Language Models with Knowledge Baseshttps://arxiv.org/abs/2402.16006blog.athina.ai