Review queue
/recruiting/assessments/review-queue
The Review Queue is the manual scoring surface for free-text responses (typing, narrative SJT, etc.) where the LLM scoring engine flagged its output as low-confidence or the scoring attempt failed altogether.
Filters
- Status: MANUAL_REVIEW (LLM ran but flagged low confidence) or FAILED (scoring attempt errored)
What the table shows
Per row: candidate name, template, scoring status badge, submission timestamp, and an excerpt of the candidate's response.
The review action
The Review button opens a drawer (ReviewDrawer) with the full
question, the candidate's answer, the LLM's tentative score (when present),
and a numeric input for the adjudicator's final score.
Submitting the score:
- Records the human-assigned score against the response
- Removes the row from the queue immediately (no re-fetch)
- Feeds back into the candidate's overall composite score
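The submit behavior above can be sketched as a plain state update. This is a minimal illustration, not the actual ReviewDrawer implementation: the `QueueRow` fields, the `submit_review` function, and the in-memory `scores` store are hypothetical names. How the composite score is recomputed is not described on this page, so the sketch only records the human-assigned score and removes the row locally.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class QueueRow:
    # Hypothetical row shape; field names are assumptions for illustration.
    response_id: str
    candidate_name: str
    status: str  # "MANUAL_REVIEW" or "FAILED"
    llm_score: Optional[float]  # None when no tentative score is present


def submit_review(
    rows: List[QueueRow],
    scores: Dict[str, float],
    response_id: str,
    final_score: float,
) -> List[QueueRow]:
    """Record the adjudicator's score and drop the row from the local queue.

    Returns a new list of rows so the caller can swap it in directly
    instead of re-fetching the queue.
    """
    if not any(row.response_id == response_id for row in rows):
        raise KeyError(f"response {response_id!r} is not in the queue")
    scores[response_id] = final_score  # human-assigned score for the response
    return [row for row in rows if row.response_id != response_id]


rows = [
    QueueRow("r1", "Candidate A", "MANUAL_REVIEW", 0.4),
    QueueRow("r2", "Candidate B", "FAILED", None),
]
scores: Dict[str, float] = {}
rows = submit_review(rows, scores, "r1", 3.0)
print([row.response_id for row in rows])
print(scores)
```

Returning a new list rather than mutating in place mirrors the "no re-fetch" behavior: the queue view updates from local state as soon as the score is submitted.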
When it fires
The LLM scoring worker decides at scoring time whether to flag a response for manual review. Drivers include:
- Low LLM confidence
- Response length below a heuristic threshold for the question
- Response content that failed safety / topical-fit checks
Failed scoring attempts (network error, model timeout) land in the same
queue with status FAILED so adjudicators can step in before the candidate
is left waiting.
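The routing rules in this section can be sketched as a single decision function. This is an assumption-laden illustration rather than the worker's real code: `LlmResult`, `decide_status`, the `SCORED` status name, and both threshold values are hypothetical, and the real length heuristic is set per question rather than as one default.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical thresholds; the real values are not documented on this page.
MIN_CONFIDENCE = 0.7
DEFAULT_MIN_CHARS = 40


@dataclass
class LlmResult:
    score: float
    confidence: float
    passed_content_checks: bool  # outcome of safety / topical-fit checks


def decide_status(
    response_text: str,
    score_fn: Callable[[str], LlmResult],
    min_chars: int = DEFAULT_MIN_CHARS,
) -> Tuple[str, Optional[float]]:
    """Return (status, tentative_score) for one free-text response."""
    try:
        result = score_fn(response_text)
    except (ConnectionError, TimeoutError):
        # Network error or model timeout: queue as FAILED with no score.
        return "FAILED", None
    needs_review = (
        result.confidence < MIN_CONFIDENCE
        or len(response_text) < min_chars
        or not result.passed_content_checks
    )
    # "SCORED" is a placeholder name for responses that skip the queue.
    return ("MANUAL_REVIEW" if needs_review else "SCORED"), result.score


def confident(text: str) -> LlmResult:
    return LlmResult(score=4.0, confidence=0.95, passed_content_checks=True)


def unsure(text: str) -> LlmResult:
    return LlmResult(score=2.0, confidence=0.3, passed_content_checks=True)


def timed_out(text: str) -> LlmResult:
    raise TimeoutError("model timeout")


long_answer = "x" * 100
for fn in (confident, unsure, timed_out):
    print(fn.__name__, decide_status(long_answer, fn))
print("short answer", decide_status("too short", confident))
```

In this sketch a MANUAL_REVIEW row keeps the LLM's tentative score so the drawer can display it, while a FAILED row carries none, which matches the "when present" wording in the review action above.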
Pairs with
- Calibration drift: an audit of how well the LLM and human reviewers agree over time.