Review queue

/recruiting/assessments/review-queue

The Review Queue is the manual scoring surface for free-text responses (typing, narrative SJT, etc.) that the LLM scoring engine either flagged as low-confidence or failed to score at all.

Filters

  • Status: MANUAL_REVIEW (the LLM ran but flagged low confidence) or FAILED (the scoring attempt errored)
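
As a rough illustration of the Status filter, the two statuses can be modeled as a string union and applied to a list of queue rows. The names QueueStatus, QueueItem, and filterByStatus are hypothetical, chosen for this sketch only; the actual client code may look different.

```typescript
// Hypothetical model of the two statuses that can appear in the queue.
type QueueStatus = "MANUAL_REVIEW" | "FAILED";

// Hypothetical minimal row shape; real rows carry more fields.
interface QueueItem {
  id: string;
  status: QueueStatus;
}

// Returns only the rows with the selected status.
// With no status selected, the list is returned unchanged.
function filterByStatus<T extends { status: QueueStatus }>(
  items: T[],
  status?: QueueStatus
): T[] {
  return status === undefined
    ? items
    : items.filter((item) => item.status === status);
}
```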

What the table shows

Per row: candidate name, template, scoring status badge, submission timestamp, and an excerpt of the candidate's response.

The review action

The Review button opens a drawer (ReviewDrawer) with the full question, the candidate's answer, the LLM's tentative score (when present), and a numeric input for the adjudicator's final score.

Submitting the score:

  • Records the human-assigned score against the response
  • Removes the row from the queue immediately, without re-fetching the list
  • Feeds back into the candidate's overall composite score
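
The first two effects of submitting can be sketched as a local state update: record the human score, then drop the row from the in-memory list rather than reloading the queue. This is a minimal sketch under assumptions. QueueRow, QueueState, and applyReview are hypothetical names, and how the score is persisted or folded into the composite score is not shown because the page does not specify it.

```typescript
type QueueStatus = "MANUAL_REVIEW" | "FAILED";

// Hypothetical row shape, based on the fields the drawer displays.
interface QueueRow {
  responseId: string;
  candidateName: string;
  status: QueueStatus;
  llmScore: number | null; // null when no tentative LLM score is present
}

// Hypothetical client-side state for the queue page.
interface QueueState {
  rows: QueueRow[];
  finalScores: Record<string, number>; // responseId -> human-assigned score
}

// Returns a new state with the score recorded and the reviewed row removed.
// The input state is not mutated, and no re-fetch of the queue is needed.
function applyReview(
  state: QueueState,
  responseId: string,
  finalScore: number
): QueueState {
  if (!Number.isFinite(finalScore)) {
    throw new RangeError("final score must be a finite number");
  }
  return {
    rows: state.rows.filter((row) => row.responseId !== responseId),
    finalScores: { ...state.finalScores, [responseId]: finalScore },
  };
}
```

Returning a new state object instead of mutating the existing one keeps the update easy to roll back if the save request later fails.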

When it fires

The LLM scoring worker decides at scoring time whether to flag a response for manual review. Drivers include:

  • Low LLM confidence
  • Response length below a heuristic threshold for the question
  • Response content that failed safety / topical-fit checks

Failed scoring attempts (network error, model timeout) land in the same queue with status FAILED so adjudicators can step in before the candidate is left waiting.
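
The routing described above can be sketched as a single classification step. Everything here is an assumption made for illustration: the names ScoringAttempt, Thresholds, and classifyAttempt, the SCORED outcome for responses that need no review, the 0 to 1 confidence scale, and the character-count length check. The real worker's heuristics and thresholds are not documented on this page.

```typescript
// "SCORED" is a hypothetical label for responses that skip the queue.
type ScoringOutcome = "SCORED" | "MANUAL_REVIEW" | "FAILED";

// Hypothetical result of one scoring attempt.
interface ScoringAttempt {
  ok: boolean; // false on network error or model timeout
  responseText: string;
  confidence?: number; // assumed 0..1 scale
  passedContentChecks?: boolean; // safety / topical-fit result
}

// Hypothetical per-question thresholds.
interface Thresholds {
  minConfidence: number;
  minLength: number; // minimum trimmed character count
}

function classifyAttempt(
  attempt: ScoringAttempt,
  thresholds: Thresholds
): ScoringOutcome {
  // Errored attempts go to the queue as FAILED.
  if (!attempt.ok) return "FAILED";

  // A missing confidence value is treated as low confidence.
  const lowConfidence = (attempt.confidence ?? 0) < thresholds.minConfidence;
  const tooShort = attempt.responseText.trim().length < thresholds.minLength;
  const failedChecks = attempt.passedContentChecks === false;

  // Any single driver is enough to flag the response for a human.
  return lowConfidence || tooShort || failedChecks ? "MANUAL_REVIEW" : "SCORED";
}
```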

Pairs with

  • Calibration drift: an audit of how well the LLM and human reviewers agree over time.