LLM Critics Help Catch LLM Bugs
Original Paper: https://cdn.openai.com/llm-critics-help-catch-llm-bugs-paper.pdf
By: OpenAI

Abstract: Reinforcement learning from human feedback (RLHF) is fundamentally limited by the capacity of humans to correctly evaluate model output. To improve human evaluation ability and overcome that limitation, this work trains "critic" models that help humans more accurately evaluate model-written code.