The Teckel Judge

At the heart of Teckel AI is our proprietary auditing engine, the Teckel Judge. It evaluates every response your AI generates, delivering both quantitative quality scores and actionable improvement recommendations.

How It Works

When a trace is sent to Teckel AI, it's immediately queued for processing by our two-stage evaluation system. This process is designed to be objective and consistent, applying the same high standards to every audit while providing detailed insights into both what happened and how to improve.

For security, you do not share your full internal documents with us. The Judge only analyzes the specific "chunks" of information that your RAG system retrieved and cited to answer the user's query.
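For illustration, a trace payload under this model might look like the sketch below. The field names (`query`, `retrieved_chunks`, `last_updated`, and so on) are assumptions for the example, not the actual Teckel API schema:

```python
# Hypothetical trace payload: only the retrieved chunks are shared,
# never the full source documents. Field names are illustrative.
trace = {
    "query": "What is the refund window?",
    "response": "Refunds are accepted within 30 days of purchase.",
    "retrieved_chunks": [
        {
            "document_id": "policies/refunds.md",
            "text": "Customers may request a refund within 30 days.",
            "last_updated": "2024-01-15",
        }
    ],
}

# The Judge only ever sees the text inside `retrieved_chunks`.
chunk_texts = [c["text"] for c in trace["retrieved_chunks"]]
```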

Stage 1: Quantitative Analysis

Our proprietary judging system performs a comprehensive multi-dimensional analysis, breaking down each AI response into individual factual claims and evaluating three key dimensions of quality plus document freshness:

Faithfulness - Measures whether factual claims in the response are supported by the source documents
Context Precision - Evaluates how relevant the retrieved chunks are to the user's question
Response Relevancy - Determines how well the response addresses the user's specific question
Freshness Tracking - Assesses the age and currency of source documents used in the response
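As a toy illustration (not the Judge's actual scoring formulas), the ratio-style dimensions can be sketched as simple fractions:

```python
def faithfulness(supported_claims: int, total_claims: int) -> float:
    """Fraction of factual claims backed by the source chunks."""
    return supported_claims / total_claims if total_claims else 1.0

def context_precision(relevant_chunks: int, retrieved_chunks: int) -> float:
    """Fraction of retrieved chunks relevant to the user's question."""
    return relevant_chunks / retrieved_chunks if retrieved_chunks else 0.0

# Example: 4 of 5 claims supported, 3 of 4 chunks relevant.
faithfulness(4, 5)        # 0.8
context_precision(3, 4)   # 0.75
```

Both land on the same 0.0-1.0 scale used throughout the audit result.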

Stage 2: Qualitative Feedback

Following the quantitative analysis, the Teckel Judge provides qualitative feedback focused on documentation improvement opportunities. Rather than just flagging problems, this stage helps you understand exactly how to enhance your knowledge base for better future performance.

Claims-Based Intelligence

A key innovation in Teckel's approach is our claims-based analysis. For every AI response, we:

  1. Extract individual factual claims from the AI's response
  2. Map each claim to its supporting evidence in the source chunks
  3. Identify unsupported claims that lack adequate evidence
  4. Analyze chunk relevance to determine which retrieved information actually contributed to the answer
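The steps above can be sketched with a naive word-overlap matcher. This is a deliberate simplification for illustration; the real Judge's claim mapping is far more sophisticated:

```python
def map_claims_to_chunks(claims, chunks, threshold=0.5):
    """Naively map each claim to its best-matching chunk by word overlap.

    Returns (supported, unsupported) claim lists. Illustrative only.
    """
    supported, unsupported = [], []
    for claim in claims:
        claim_words = set(claim.lower().split())
        best = max(
            (len(claim_words & set(chunk.lower().split())) / len(claim_words)
             for chunk in chunks),
            default=0.0,
        )
        (supported if best >= threshold else unsupported).append(claim)
    return supported, unsupported

claims = ["Refunds are accepted within 30 days", "Shipping is always free"]
chunks = ["Customers may request a refund within 30 days of purchase."]
ok, missing = map_claims_to_chunks(claims, chunks)
```

Here the second claim has no supporting evidence in the retrieved chunks, so it is flagged as unsupported.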

This granular analysis provides unprecedented visibility into your AI's reasoning process and helps pinpoint exactly where knowledge gaps exist.

Document Health Tracking

Beyond individual trace analysis, Teckel continuously monitors the health of your document base by:

  • Aggregating performance metrics across all traces that reference each document
  • Calculating document precision based on how often chunks from each document support factual claims
  • Tracking freshness scores using the last_updated timestamps in your trace data
  • Identifying problematic documents that consistently lead to unsupported claims or poor performance
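That aggregation might be sketched as follows. The data shapes are assumptions for the example, not Teckel's internal model:

```python
from collections import defaultdict
from datetime import date

def document_precision(traces):
    """Per-document precision: how often a document's chunks supported claims.

    Each trace is a list of (document_id, chunk_supported_claim) pairs.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for trace in traces:
        for doc_id, supported in trace:
            totals[doc_id] += 1
            hits[doc_id] += int(supported)
    return {doc: hits[doc] / totals[doc] for doc in totals}

def freshness_days(last_updated: str, today: date) -> int:
    """Document age in days, from its last_updated timestamp."""
    return (today - date.fromisoformat(last_updated)).days

traces = [
    [("refunds.md", True), ("shipping.md", False)],
    [("refunds.md", True)],
]
precision = document_precision(traces)  # aggregated across all traces
```

A document whose precision stays low across many traces is exactly the kind of "problematic document" this tracking surfaces.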

The Complete Audit Result

After evaluation is complete, the Teckel Judge provides:

  • Three core quality scores measuring faithfulness, context precision, and response relevancy (0.0-1.0 scale)
  • Document freshness assessment indicating whether information may be outdated
  • Overall quality score providing a single performance indicator
  • Claims breakdown showing which statements are supported and which need attention
  • Document quality insights identifying which sources contribute to high-quality responses
  • Qualitative feedback with specific recommendations for improving your documentation
  • Issue categorization using tags like "missing_details", "needs_examples", or "unclear_terminology"
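Put together, an audit result could resemble the sketch below. The field names are illustrative assumptions; only the issue tags are taken from the list above:

```python
# Hypothetical audit result shape, for illustration only.
audit_result = {
    "scores": {
        "faithfulness": 0.8,
        "context_precision": 0.75,
        "response_relevancy": 0.9,
    },
    "overall_quality": 0.82,
    "freshness": {"oldest_source_days": 120, "possibly_outdated": False},
    "claims": [
        {"text": "Refunds are accepted within 30 days", "supported": True},
        {"text": "Shipping is always free", "supported": False},
    ],
    "feedback": "Add a section covering shipping costs to the refund policy.",
    "tags": ["missing_details"],
}

# Surface the claims that need attention.
unsupported = [c["text"] for c in audit_result["claims"] if not c["supported"]]
```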

This comprehensive evaluation focuses primarily on document quality improvement rather than just response scoring, giving you both immediate insights and a clear roadmap for enhancing your knowledge base.