Describe the feature
To evaluate the quality of the search, you need to collect or generate a dataset. First, for the retrieval part: each entry consists of a query and its relevant documents (and may also contain negative examples). Beyond retrieval, we can also assess the full RAG pipeline; this requires a dataset of question-answer pairs and an LLM judge.
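As a rough illustration of what such a retrieval-evaluation entry could look like, here is a minimal sketch (the class and field names are assumptions, not tied to any existing library), together with a simple recall@k computation over the top-k retrieved document IDs:

```python
from dataclasses import dataclass, field

@dataclass
class RetrievalEvalEntry:
    """One evaluation entry: a query plus known relevant (and optional negative) docs."""
    query: str
    relevant_doc_ids: set[str]
    negative_doc_ids: set[str] = field(default_factory=set)

def recall_at_k(entry: RetrievalEvalEntry, retrieved_ids: list[str], k: int) -> float:
    """Fraction of the entry's relevant documents found in the top-k results."""
    hits = len(set(retrieved_ids[:k]) & entry.relevant_doc_ids)
    return hits / len(entry.relevant_doc_ids)

entry = RetrievalEvalEntry(
    query="how to configure the index?",
    relevant_doc_ids={"doc1", "doc7"},
    negative_doc_ids={"doc3"},
)
# Top-2 results contain 1 of the 2 relevant docs -> recall@2 = 0.5
print(recall_at_k(entry, ["doc7", "doc2", "doc1"], k=2))  # 0.5
```

For the full-pipeline case, each entry would instead hold a question and a reference answer, with an LLM judge scoring the generated answer against the reference.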
Suggested solution
Wiki: Relevance metrics
Weaviate: article about metrics
HuggingFace: RAG evaluation cookbook
Additional context
No response