Answer Semantic Similarity refers to the similarity between the embedding of the RAG response and the embedding of the ground truth answer. The score ranges from 0 to 1, with higher scores showing a better match between the two answers.
This evaluation uses a cross-encoder model to calculate the semantic similarity score.
Example:
Questions = What is SpaceX?
Answers = It is an American aerospace company.
Ground Truth = SpaceX is an American aerospace company.
Solution:
To calculate the answer similarity model follows the following steps:
- Step 1: Use the specified embedding model to convert the ground truth answer into a numerical vector representation that captures its semantic meaning.
- Step 2: Similarly, vectorize the generated answer using the same embedding model.
- Step 3: Calculate the cosine similarity between the two vectors to quantify how closely the generated answer aligns with the ground truth.
Code:
Answer Semantic Similarity using RAGAS:
from datasets import Dataset
from ragas.metrics import answer_similarity
from ragas import evaluate
data_samples = {
'question': ['What is SpaceX?', 'Who found it?','What exactly does SpaceX do?'],
'answer': ['It is an American aerospace company', 'SpaceX founded by Elon Musk','SpaceX produces and operates the Falcon 9 and Falcon rockets'],
'contexts' : [['SpaceX is an American aerospace company founded in 2002'],
['SpaceX, founded by Elon Musk, is worth nearly $210 billion'], ['The full form of SpaceX is Space Exploration Technologies Corporation']],
'ground_truth': ['SpaceX is an American aerospace company', 'Founded by Elon Musk','SpaceX produces and operates the Falcon 9 and Falcon Heavy rockets']
}
dataset = Dataset.from_dict(data_samples)
score = evaluate(dataset,metrics=[answer_similarity])
score.to_pandas()
Answer Semantic Similarity using Athina AI:
from athina.evals import RagasAnswerSemanticSimilarity
data = [
{
"response": "It is an American aerospace company",
"expected_response": "SpaceX is an American aerospace company"
},
{
"response": "SpaceX founded by Elon Musk",
"expected_response": "Founded by Elon Musk"
},
{
"response": "SpaceX produces and operates the Falcon 9 and Falcon rockets",
"expected_response": "SpaceX produces and operates the Falcon 9 and Falcon Heavy rockets"
},
]
dataset = Loader().load_dict(data)
eval_model = "gpt-3.5-turbo"
RagasAnswerSemanticSimilarity(model=eval_model).run_batch(data=dataset).to_df()