The problem
Interview debriefs over-index on charisma, recency, or whichever interviewer speaks most confidently. The hardest dimensions — agency, judgment under ambiguity, real craft motivation — get evaluated least consistently.
What I built
Gabbar takes a transcript and outputs a structured scorecard: risk classification, scaling assessment (will this person grow with the environment or plateau), and explicit bias flags.
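A minimal sketch of what that scorecard could look like as a data structure. The field names, enum values, and evidence map below are illustrative assumptions, not Gabbar's actual schema:

```python
from dataclasses import dataclass, field
from enum import Enum


class Risk(Enum):
    # Hypothetical risk buckets; the real classification may differ.
    LOW = "low"
    MODERATE = "moderate"
    HIGH = "high"


@dataclass
class Scorecard:
    """Structured output for one candidate transcript (illustrative fields)."""
    candidate_id: str
    risk: Risk                      # risk classification
    scaling_assessment: str         # grow-with-the-environment vs. plateau
    bias_flags: list[str] = field(default_factory=list)            # explicit bias flags
    evidence: dict[str, list[str]] = field(default_factory=dict)   # dimension -> transcript quotes
```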
The evaluation doctrine is seven layers deep, and each layer targets a specific dimension. The system forces the debrief conversation to address what actually matters instead of defaulting to gut feel.
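To make the layered pass concrete, here is a sketch of how a doctrine like this could be applied: each layer pairs a dimension with a rubric and is scored independently against the transcript. The layer names shown are examples drawn from the dimensions mentioned above; the full seven-layer doctrine and the `score_against_rubric` helper are assumptions, not the real implementation:

```python
from dataclasses import dataclass


@dataclass
class Layer:
    dimension: str   # what this layer evaluates
    rubric: str      # what counts as evidence for this dimension


# Example layers only; the actual doctrine runs seven layers deep.
DOCTRINE = [
    Layer("agency", "Did the candidate drive outcomes without being pushed?"),
    Layer("judgment under ambiguity", "How did they decide with incomplete information?"),
    Layer("craft motivation", "Is there evidence of caring about quality beyond the task?"),
]


def score_against_rubric(transcript: str, rubric: str) -> dict:
    """Placeholder: in practice this would be an LLM call constrained to
    quote transcript evidence for whatever score it assigns."""
    return {"score": None, "evidence": []}


def evaluate(transcript: str) -> dict[str, dict]:
    """Run every layer against the transcript; each layer is scored on its own."""
    results = {}
    for layer in DOCTRINE:
        results[layer.dimension] = score_against_rubric(transcript, layer.rubric)
    return results
```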
Design decision
I built explicit bias safeguards into the evaluation loop. The system flags when a score might be driven by pattern-matching on demographics rather than by evidence in the transcript. Most evaluation tools skip this entirely.
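One way to express that safeguard: before a score is accepted, check that it is backed by cited evidence and that its justification does not lean on demographic or pedigree proxies. The proxy-term list and checks below are illustrative assumptions, not Gabbar's actual logic:

```python
# Illustrative proxy terms; a real safeguard would need a far more careful
# notion of what counts as a demographic or pedigree signal.
PROXY_SIGNALS = ("school", "accent", "age", "looks like", "culture fit", "pedigree")


def bias_flags(justification: str, evidence_quotes: list[str]) -> list[str]:
    """Return human-readable flags when a score looks pattern-matched rather than evidence-based."""
    flags = []
    lowered = justification.lower()
    if any(term in lowered for term in PROXY_SIGNALS):
        flags.append("justification references a demographic or pedigree proxy")
    if not evidence_quotes:
        flags.append("score assigned with no transcript evidence cited")
    return flags
```

In this sketch the flags travel with the scorecard rather than blocking it, so the debrief sees the warning alongside the score instead of having the judgment silently overridden.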