The Context Relevancy metric evaluates how relevant the retrieved context is to a given input question. The score ranges from 0 to 1, where higher values indicate that more of the retrieved context is relevant to the question.
The calculation is based on the number of relevant statements in the retrieved context compared to the total number of statements.
Formula:
Context Relevancy = Number of Extracted Sentences / Total Number of Sentences in Context
Note that the Number of Extracted Sentences is simply the number of sentences in the context that are relevant to answering the question.
Example:
Questions = What is SpaceX and who founded it?
Answers = It is an American aerospace company founded by Elon Musk.
Contexts = SpaceX is an American aerospace company founded in 2002, Founded by Elon Musk, The full form of SpaceX is Space Exploration Technologies Corporation
Ground Truth = SpaceX is an American aerospace company founded by Elon Musk.
Solution:
Statement 1: SpaceX is an American aerospace company founded in 2002 (relevant)
Statement 2: Founded by Elon Musk (relevant)
Statement 3: The full form of SpaceX is Space Exploration Technologies Corporation (not relevant)
So 2 of the 3 statements are relevant to the question, and 1 is not directly relevant.
Let's calculate the context relevancy for this example:
Number of Extracted Sentences = 2
Total Number of Sentences in Context = 3
Context Relevancy = 2 / 3 ≈ 0.67
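The same arithmetic can be sketched in a few lines of Python. This is only an illustration of the ratio, not how RAGAS or Athina compute the metric internally: the relevance labels below are assigned by hand (in practice an LLM judge decides which sentences are relevant), and the function name context_relevancy is hypothetical.

```python
def context_relevancy(relevance_flags):
    """Return the fraction of context statements marked as relevant."""
    if not relevance_flags:
        # An empty context has no relevant statements; report 0.0 by convention.
        return 0.0
    return sum(relevance_flags) / len(relevance_flags)


# Hand-labelled statements from the SpaceX example (True = relevant).
statements = [
    ("SpaceX is an American aerospace company founded in 2002", True),
    ("Founded by Elon Musk", True),
    ("The full form of SpaceX is Space Exploration Technologies Corporation", False),
]

score = context_relevancy([is_relevant for _, is_relevant in statements])
print(round(score, 2))  # 0.67
```

Returning 0.0 for an empty context is a design choice made here to avoid a division by zero; a real evaluator may handle that case differently.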
Code:
Context Relevancy using RAGAS:
# The Context Relevancy metric has been removed from RAGAS.
Context Relevancy using Athina AI:
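from athina.evals import RagasContextRelevancy
from athina.loaders import Loader

# Note: the evaluator calls an LLM, so an OpenAI API key must be configured before running this.

# Each record holds the user query and the list of retrieved context statements
data = [
    {
        "query": "What is SpaceX and who founded it?",
        "context": [
            "SpaceX is an American aerospace company founded in 2002",
            "Founded by Elon Musk",
            "The full form of SpaceX is Space Exploration Technologies Corporation",
        ],
    },
]

# Load the data from CSV, JSON, Athina or Dictionary
dataset = Loader().load_dict(data)

# Model used as the judge for the evaluation
eval_model = "gpt-3.5-turbo"
RagasContextRelevancy(model=eval_model).run_batch(data=dataset).to_df()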