The Teckel Judge
At the heart of Teckel AI is our proprietary auditing engine, the Teckel Judge. This advanced evaluation system provides comprehensive analysis of every response your AI generates, delivering both quantitative quality scores and actionable improvement recommendations.
How It Works
When a trace is sent to Teckel AI, it's immediately queued for processing by our two-stage evaluation system. This process is designed to be objective and consistent, applying the same high standards to every audit while providing detailed insights into both what happened and how to improve.
For security, you do not share your full internal documents with us. The Judge only analyzes the specific "chunks" of information that your RAG system retrieved and cited to answer the user's query.
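To make this concrete, here is a rough sketch of what a trace might contain. All field names here are illustrative assumptions, not Teckel's actual schema:

```python
# Hypothetical trace payload -- field names are illustrative only.
trace = {
    "query": "What is our refund window?",
    "response": "Refunds are accepted within 30 days of purchase.",
    "retrieved_chunks": [
        {
            "document_id": "policies/refunds.md",
            "text": "Customers may request a refund within 30 days of purchase.",
            "last_updated": "2024-01-15",
        }
    ],
}

# Only the cited chunks travel with the trace; the rest of the source
# document never leaves your infrastructure.
```

The key point is that the payload carries the query, the response, and the retrieved chunks, and nothing else from your knowledge base.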
Stage 1: Quantitative Analysis
Our proprietary judging system performs a comprehensive multi-dimensional analysis, breaking down each AI response into individual factual claims and evaluating three key dimensions of quality plus document freshness:
- Faithfulness - Measures whether factual claims in the response are supported by the source documents
- Context Precision - Evaluates how relevant the retrieved chunks are to the user's question
- Response Relevancy - Determines how well the response addresses the user's specific question
- Freshness Tracking - Assesses the age and currency of source documents used in the response
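Teckel's exact scoring formulas are proprietary, but metrics of this kind are typically supported-over-total ratios on a 0.0-1.0 scale. A minimal illustrative sketch of faithfulness (the function and its logic are assumptions, not Teckel's implementation):

```python
def faithfulness_score(claims, supported_claims):
    """Illustrative only: fraction of extracted factual claims that have
    supporting evidence in the retrieved chunks (0.0-1.0 scale)."""
    if not claims:
        return 1.0  # nothing factual to verify
    return len(supported_claims) / len(claims)

# Two claims extracted from the response, one backed by a source chunk.
score = faithfulness_score(
    ["Refund window is 30 days", "Refunds require a receipt"],
    ["Refund window is 30 days"],
)
# score == 0.5
```

Context precision and response relevancy can be pictured the same way: ratios of relevant chunks to retrieved chunks, and of addressed question aspects to total aspects, respectively.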
Stage 2: Qualitative Feedback
Following the quantitative analysis, the Teckel Judge provides qualitative feedback focused on documentation improvement opportunities. Rather than just flagging problems, this stage helps you understand exactly how to enhance your knowledge base for better future performance.
Claims-Based Intelligence
A key innovation in Teckel's approach is our claims-based analysis. For every AI response, we:
- Extract individual factual claims from the AI's response
- Map each claim to its supporting evidence in the source chunks
- Identify unsupported claims that lack adequate evidence
- Analyze chunk relevance to determine which retrieved information actually contributed to the answer
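The steps above amount to a claim-to-evidence mapping. A small sketch of one way such a structure could look (names and fields are hypothetical, not Teckel's data model):

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    """One factual claim extracted from the AI's response,
    mapped to the chunk IDs that support it (if any)."""
    text: str
    supporting_chunk_ids: list = field(default_factory=list)

    @property
    def supported(self) -> bool:
        return bool(self.supporting_chunk_ids)

claims = [
    Claim("Refunds are accepted within 30 days", ["chunk-1"]),
    Claim("Refunds are processed in 5 business days"),  # no evidence found
]

# Unsupported claims point directly at knowledge gaps in the documentation.
unsupported = [c.text for c in claims if not c.supported]
```

Chunks that support no claims at all are, conversely, candidates for retrieval-precision problems.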
This granular analysis provides unprecedented visibility into your AI's reasoning process and helps pinpoint exactly where knowledge gaps exist.
Document Health Tracking
Beyond individual trace analysis, Teckel continuously monitors the health of your document base by:
- Aggregating performance metrics across all traces that reference each document
- Calculating document precision based on how often chunks from each document support factual claims
- Tracking freshness scores using the `last_updated` timestamps in your trace data
- Identifying problematic documents that consistently lead to unsupported claims or poor performance
The Complete Audit Result
After evaluation is complete, the Teckel Judge provides:
- Three core quality scores measuring faithfulness, context precision, and response relevancy (0.0-1.0 scale)
- Document freshness assessment indicating whether information may be outdated
- Overall quality score providing a single performance indicator
- Claims breakdown showing which statements are supported and which need attention
- Document quality insights identifying which sources contribute to high-quality responses
- Qualitative feedback with specific recommendations for improving your documentation
- Issue categorization using tags like "missing_details", "needs_examples", or "unclear_terminology"
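Putting these pieces together, an audit result could be pictured as a structure like the one below. This is an illustrative shape only; the actual field names and layout of Teckel's results may differ:

```python
# Hypothetical audit result -- shape and field names are illustrative.
audit_result = {
    "scores": {
        "faithfulness": 0.83,
        "context_precision": 0.75,
        "response_relevancy": 0.90,
        "overall": 0.82,
    },
    "freshness": {"oldest_source_days": 412, "possibly_outdated": True},
    "claims": [
        {
            "text": "Refunds are accepted within 30 days",
            "supported": True,
            "chunk_ids": ["chunk-1"],
        },
        {
            "text": "Refunds are processed in 5 business days",
            "supported": False,
            "chunk_ids": [],
        },
    ],
    "feedback": "Document the refund processing timeline in the refunds policy.",
    "tags": ["missing_details", "needs_examples"],
}
```

Consumed programmatically, a result like this lets you route low-scoring traces to review queues and aggregate tags to prioritize documentation work.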
This comprehensive evaluation focuses primarily on document quality improvement rather than just response scoring, giving you both immediate insights and a clear roadmap for enhancing your knowledge base.