Athina AI Hub (Page 5)

research-papers

Concise Thoughts: Impact of Output Length on LLM Reasoning and Cost

Original Paper: https://arxiv.org/abs/2407.19825 By: Sania Nayab, Giulio Rossolini, Giorgio Buttazzo, Nicolamaria Manes, Fabrizio Giacomelli Abstract: Today's large language models (LLMs) can solve challenging question-answering tasks, and prompt engineering techniques, such as chain-of-thought (CoT), have gained attention for enhancing the explanation and correctness of

research-papers

PersonaGym: Evaluating Persona Agents and LLMs

Original Paper: https://arxiv.org/abs/2407.18416 By: Vinay Samuel, Henry Peng Zou, Yue Zhou, Shreyas Chaudhari, Ashwin Kalyan, Tanmay Rajpurohit, Ameet Deshpande, Karthik Narasimhan, Vishvak Murahari Abstract: Persona agents, LLMs designed to act according to assigned personas, show impressive contextual responses across various sectors like education, healthcare, and

research-papers

MindSearch: Mimicking Human Minds Elicits Deep AI Searcher

Original Paper: https://arxiv.org/pdf/2407.20183 By: Zehui Chen, Kuikun Liu, Qiuchen Wang, Jiangning Liu, Wenwei Zhang, Kai Chen, Feng Zhao Abstract: Information seeking and integration is a complex cognitive task that consumes enormous time and effort. Inspired by the remarkable progress of Large Language Models, recent works

research-papers

Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach

Original Paper: https://arxiv.org/abs/2407.16833 By: Zhuowan Li, Cheng Li, Mingyang Zhang, Qiaozhu Mei, Michael Bendersky Abstract: Retrieval Augmented Generation (RAG) has been a powerful tool for Large Language Models (LLMs) to efficiently process overly lengthy contexts. However, recent LLMs like Gemini-1.5 and GPT-4 show exceptional

research-papers

Context Embeddings for Efficient Answer Generation in RAG

Original Paper: https://arxiv.org/abs/2407.09252 By: David Rau, Shuai Wang, Hervé Déjean, Stéphane Clinchant Abstract: Retrieval-Augmented Generation (RAG) allows for overcoming the limited knowledge of LLMs by extending the input with external information. As a consequence, the contextual inputs to the model become much longer which slows

research-papers

Generation Constraint Scaling Can Mitigate Hallucination

Original Paper: https://arxiv.org/abs/2407.16908 By: Georgios Kollias, Payel Das, Subhajit Chaudhury Abstract: Addressing the issue of hallucinations in large language models (LLMs) is a critical challenge. As the cognitive mechanisms of hallucination have been related to memory, here we explore hallucination for LLM that is enabled

research-papers

LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference

Original Paper: https://arxiv.org/abs/2407.14057 By: Qichen Fu, Minsik Cho, Thomas Merth, Sachin Mehta, Mohammad Rastegari, Mahyar Najibi Abstract: The inference of transformer-based large language models consists of two sequential stages: 1) a prefilling stage to compute the KV cache of prompts and generate the first token

research-papers

Weak-to-Strong Reasoning

Original Paper: https://arxiv.org/abs/2407.13647 By: Yuqing Yang, Yan Ma, Pengfei Liu Abstract: When large language models (LLMs) exceed human-level capabilities, it becomes increasingly challenging to provide full-scale and accurate supervisions for these models. Weak-to-strong learning, which leverages a less capable model to unlock the latent abilities

research-papers

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Original Paper: https://arxiv.org/abs/2306.00978 By: Ji Lin, Jiaming Tang, Haotian Tang, Shang Yang, Wei-Ming Chen, Wei-Chen Wang, Guangxuan Xiao, Xingyu Dang, Chuang Gan, Song Han Abstract: Large language models (LLMs) have transformed numerous AI applications. On-device LLM is becoming increasingly important: running LLMs locally on edge

research-papers

NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?

Original Paper: https://arxiv.org/abs/2407.11963 By: Mo Li, Songyang Zhang, Yunxin Liu, Kai Chen Abstract: In evaluating the long-context capabilities of large language models (LLMs), identifying content relevant to a user's query from original long documents is a crucial prerequisite for any LLM to answer

research-papers

Mindful-RAG: A Study of Points of Failure in Retrieval Augmented Generation

Original Paper: https://arxiv.org/abs/2407.12216 By: Garima Agrawal, Tharindu Kumarage, Zeyad Alghamdi, Huan Liu Abstract: Large Language Models (LLMs) are proficient at generating coherent and contextually relevant text but face challenges when addressing knowledge-intensive queries in domain-specific and factual question-answering tasks. Retrieval-augmented generation (RAG) systems mitigate this

research-papers

SpreadsheetLLM: Encoding Spreadsheets for Large Language Models

Original Paper: https://arxiv.org/abs/2407.09025 By: Yuzhang Tian, Jianbo Zhao, Haoyu Dong, Junyu Xiong, Shiyu Xia, Mengyu Zhou, Yun Lin, José Cambronero, Yeye He, Shi Han, Dongmei Zhang Abstract: Spreadsheets, with their extensive two-dimensional grids, various layouts, and diverse formatting options, present notable challenges for large language
