Athina AI Hub (Page 9)

research-papers

On the Empirical Complexity of Reasoning and Planning in LLMs

Original Paper: https://arxiv.org/abs/2404.11041 By: Liwei Kang, Zirui Zhao, David Hsu, Wee Sun Lee Abstract: Large Language Models (LLMs) work surprisingly well for some complex reasoning problems via chain-of-thought (CoT) or tree-of-thought (ToT), but the underlying reasons remain unclear. We seek to understand the performance of

How to Use a Custom Grading Criteria to Evaluate LLM Responses (LLM-as-a-Judge)

In the rapidly evolving field of language models, ensuring the accuracy and relevance of responses is crucial. This blog post will guide you through setting up custom grading criteria to evaluate responses from large language models (LLMs), using a simple conditional evaluation system. What is it? A custom grading

From Noise to Clarity: Unraveling the Adversarial Suffix of Large Language Model Attacks via Translation of Text Embeddings

Original Paper: https://arxiv.org/abs/2402.16006 By: Hao Wang, Hao Li, Minlie Huang, Lei Sha Abstract: Safety defense methods for large language models (LLMs) remain limited because dangerous prompts are manually curated to cover just a few known attack types, which fails to keep pace with emerging varieties.

Prompt Stealing Attacks Against Text-to-Image Generation Models

Original Paper: https://arxiv.org/abs/2302.09923 By: Xinyue Shen, Yiting Qu, Michael Backes, Yang Zhang Abstract: Text-to-Image generation models have revolutionized the artwork design process and enabled anyone to create high-quality images by entering text descriptions called prompts. Creating a high-quality prompt that consists of a subject and

Ever: Mitigating Hallucination in Large Language Models through Real-Time Verification and Rectification

Original Paper: https://arxiv.org/html/2311.09114v2 By: Haoqiang Kang, Juntong Ni, Huaxiu Yao Abstract: Large Language Models (LLMs) have demonstrated remarkable proficiency in generating fluent text. However, they often encounter the challenge of generating inaccurate or hallucinated content. This issue is common in both non-retrieval-based generation and retrieval-augmented

H2O-Danube-1.8B Technical Report

Original Paper: https://arxiv.org/abs/2401.16818 By: Philipp Singer, Pascal Pfeiffer, Yauhen Babakhin, Maximilian Jeblick, Nischay Dhankhar, Gabor Fodor, Sri Satish Ambati Abstract: We present H2O-Danube, a series of small 1.8B language models consisting of H2O-Danube-1.8B, trained on 1T tokens, and the incrementally improved H2O-Danube2-1.8B

Universal and Transferable Adversarial Attacks on Aligned Language Models

Original Paper: https://arxiv.org/abs/2307.15043 By: Andy Zou, Zifan Wang, Nicholas Carlini, Milad Nasr, J. Zico Kolter, Matt Fredrikson Abstract: Because "out-of-the-box" large language models are capable of generating a great deal of objectionable content, recent work has focused on aligning these models in an

Post-Semantic-Thinking: A Robust Strategy to Distill Reasoning Capacity from Large Language Models

Original Paper: https://arxiv.org/html/2404.09170v1 By: Xiaoshu Chen, Sihang Zhou, Liang Ke, Xinwang Liu Abstract: Chain-of-thought finetuning aims to endow small student models with reasoning capacity to improve their performance on a specific task by allowing them to imitate the reasoning procedure of large language models

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Original Paper: https://arxiv.org/html/2305.18290v2 By: Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn Abstract: While large-scale unsupervised language models (LMs) learn broad world knowledge and some reasoning skills, achieving precise control of their behavior is difficult due to the completely unsupervised

Analyzing Toxicity in Deep Conversations: A Reddit Case Study

Original Paper: https://arxiv.org/abs/2404.07879 By: Vigneshwaran Shankaran, Rajesh Sharma Abstract: Online social media has become increasingly popular in recent years due to its ease of access and ability to connect with others. One of social media's main draws is its anonymity, allowing users to

RoT: Enhancing Large Language Models with Reflection on Search Trees

Original Paper: https://arxiv.org/abs/2404.05449 By: Wenyang Hui, Chengyue Jiang, Yan Wang, Kewei Tu Abstract: Large language models (LLMs) have demonstrated impressive capability in reasoning and planning when integrated with tree-search-based prompting methods. However, since these methods ignore the previous search experiences, they often make the same

AI Safety: Necessary, but insufficient and possibly problematic

Original Paper: https://arxiv.org/html/2403.17419v1 Author: Deepak P., Queen’s University Belfast, UK (deepaksp@acm.org) Artificial Intelligence (AI) is evolving rapidly, bringing the topic of AI safety to the forefront of discussions. While ensuring AI systems are safe and dependable is crucial, there's a
