Prompt Cache: Modular Attention Reuse for Low-Latency Inference
Original Paper: https://arxiv.org/abs/2311.04934
By: In Gim, Guojun Chen, Seung-seob Lee, Nikhil Sarda, Anurag Khandelwal, Lin Zhong
Abstract: We present Prompt Cache, an approach for accelerating inference for large language models (LLM) by reusing attention states across different LLM prompts. Many input prompts have overlapping text segments, such as system messages, prompt templates, and documents provided for context.
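
To make the core idea concrete, here is a minimal sketch (not the paper's implementation) of the simpler special case: reusing precomputed attention (KV) states for a text segment shared as a common prefix across prompts, using Hugging Face transformers' `past_key_values`. The model name and prompt strings are placeholders chosen for illustration.

```python
# Minimal sketch, assuming Hugging Face transformers with KV-cache reuse via
# past_key_values. This only covers shared-*prefix* reuse; Prompt Cache itself
# generalizes to schema-defined segments ("prompt modules") that can appear at
# different positions in a prompt. Model name and prompts are placeholders.

import copy
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

# A segment that many prompts share verbatim at the start,
# e.g. a system message, prompt template, or context document.
shared_prefix = "System: You are a helpful assistant.\n"

# Precompute and store the attention (KV) states of the shared segment once.
prefix_inputs = tokenizer(shared_prefix, return_tensors="pt")
with torch.no_grad():
    prefix_cache = model(**prefix_inputs, use_cache=True).past_key_values

def generate_with_cached_prefix(user_text: str, max_new_tokens: int = 32) -> str:
    """Answer a prompt that begins with shared_prefix, reusing its cached KV states."""
    full_inputs = tokenizer(shared_prefix + user_text, return_tensors="pt")
    # Copy the cache so repeated calls do not mutate the precomputed states.
    past = copy.deepcopy(prefix_cache)
    with torch.no_grad():
        out = model.generate(
            **full_inputs,
            past_key_values=past,   # skips recomputing the shared prefix
            max_new_tokens=max_new_tokens,
            do_sample=False,
        )
    # Strip the echoed prompt tokens and return only the newly generated text.
    new_tokens = out[0, full_inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

print(generate_with_cached_prefix("User: Why does reusing attention states reduce latency?\nAssistant:"))
```

The paper's contribution goes beyond this prefix-only case: its schema-defined prompt modules let cached segments be reused even when they appear at different positions across prompts, while keeping positional information consistent.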