Evaluation
Measure and improve LLM pipeline quality with automated evaluation.
📄️ Evaluation
SynapseKit includes built-in evaluation metrics for measuring the quality of RAG and LLM outputs. Inspired by RAGAS-style evaluation, these metrics help you quantify faithfulness, relevancy, and groundedness.
📄️ RAG Evaluator
Sampled, LLM-judge-based evaluation for production RAG pipelines. The evaluator scores a fraction of live queries on recall, precision, relevance, and answer quality, fires alerts when a score drops below its threshold, and tracks the ROI of the evaluation itself.
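To make the sample-score-alert loop concrete, here is a minimal sketch in plain Python. It is not SynapseKit's API: `SampledEvaluator`, `maybe_evaluate`, `stub_judge`, and the metric keys are hypothetical names chosen for illustration, and the judge is a stub standing in for a real LLM call. ROI tracking is left out.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical type: a judge takes (query, answer, contexts) and returns
# a score per metric. In practice this would wrap an LLM call.
Judge = Callable[[str, str, List[str]], Dict[str, float]]


@dataclass
class SampledEvaluator:
    """Illustrative only; not the SynapseKit RAG Evaluator."""

    judge: Judge
    thresholds: Dict[str, float]  # minimum acceptable score per metric
    sample_rate: float = 0.1      # fraction of live queries to score
    rng: random.Random = field(default_factory=random.Random)
    alerts: List[str] = field(default_factory=list)

    def maybe_evaluate(
        self, query: str, answer: str, contexts: List[str]
    ) -> Optional[Dict[str, float]]:
        # Skip most traffic so judge cost stays proportional to sample_rate.
        if self.rng.random() >= self.sample_rate:
            return None
        scores = self.judge(query, answer, contexts)
        # Record an alert for every metric that falls below its threshold.
        for metric, minimum in self.thresholds.items():
            score = scores.get(metric)
            if score is not None and score < minimum:
                self.alerts.append(
                    f"{metric}={score:.2f} below threshold {minimum:.2f}"
                )
        return scores


def stub_judge(query: str, answer: str, contexts: List[str]) -> Dict[str, float]:
    # Stand-in for an LLM judge; returns fixed scores in [0, 1].
    return {"recall": 0.9, "precision": 0.8, "relevance": 0.4, "answer_quality": 0.7}


evaluator = SampledEvaluator(
    judge=stub_judge,
    thresholds={"relevance": 0.5, "recall": 0.5},
    sample_rate=1.0,  # score every query, for demonstration
)
evaluator.maybe_evaluate("What is RAG?", "Retrieval-augmented generation.", ["..."])
print(evaluator.alerts)
```

Sampling before calling the judge is the design choice that matters here: unsampled queries incur no judge cost at all, so the evaluation budget scales with `sample_rate` rather than with traffic.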