# CATI Quality Control vs. Contact-Center QA — Comparison Sheet

A one-page reference for evaluating any tool that claims to "monitor 100% of your calls."
The two disciplines start from the same recording and do completely different jobs.

Source: VeriCATI — https://vericati.com/blog/contact-center-qa-vs-cati-quality-control

---

## The core divergence

| Dimension | Contact-center QA | CATI quality control |
| --- | --- | --- |
| What's being judged | The agent's performance | The integrity of the survey data |
| Core question | "Did the rep handle this call well?" | "Is the delivered data what the respondent actually said?" |
| Typical metrics | Handle time, script adherence, tone/sentiment, CSAT, FCR | Response accuracy, screener eligibility, skip-logic adherence, verbatim fidelity, falsification signals |
| Ground truth | The script and the SLA | The questionnaire AND the recorded data file |
| Coverage norm | Sample ~2–5% | Sample ~5–15% (capacity-bound) |
| What a miss costs | A worse customer experience | A wave the client can dispute; a contaminated dataset |
| Standard it answers to | Internal QA scorecards | ISO 20252 / IQCS; AAPOR falsification guidance |

---

## The three failure modes CATI QC must catch (and contact-center QA cannot)

1. **Response–data mismatch**: the spoken answer and the recorded value disagree.
   Only catchable by comparing the audio to the data file.
2. **Protocol failure**: screener administered incorrectly, skip logic violated, scale
   misread, leading probe used. Scored against the questionnaire, not a sales script.
3. **Falsification**: the interview was partly or wholly fabricated. Detecting it is a
   survey-research discipline with its own audible signatures and decades of
   methodological literature.
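
The first two failure modes can be sketched as simple comparisons. This is a minimal illustration, not any vendor's implementation: it assumes the answers heard on the recording have already been coded into a `spoken` dict, that the data file row is a `recorded` dict keyed by question ID, and that skip logic can be written as a hypothetical `skip_rules` mapping. All names here (`Finding`, `find_mismatches`, `find_skip_violations`) are made up for the example. Falsification detection is not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    interview_id: str
    question: str
    kind: str    # "mismatch" or "skip_violation"
    detail: str


def find_mismatches(interview_id, spoken, recorded):
    """Flag questions where the answer coded from audio differs from the data file."""
    findings = []
    for question, recorded_value in recorded.items():
        spoken_value = spoken.get(question)
        # Only compare when an answer could actually be coded from the audio.
        if spoken_value is not None and spoken_value != recorded_value:
            findings.append(Finding(
                interview_id, question, "mismatch",
                f"spoken={spoken_value!r}, recorded={recorded_value!r}",
            ))
    return findings


def find_skip_violations(interview_id, recorded, skip_rules):
    """Flag questions answered when they should be skipped, or the reverse.

    skip_rules (hypothetical format) maps a question to (gate_question,
    gate_value): the question should have a recorded value only when the
    gate question's recorded value equals gate_value.
    """
    findings = []
    for question, (gate_question, gate_value) in skip_rules.items():
        should_ask = recorded.get(gate_question) == gate_value
        was_asked = recorded.get(question) is not None
        if should_ask != was_asked:
            detail = "expected but missing" if should_ask else "asked but should be skipped"
            findings.append(Finding(interview_id, question, "skip_violation", detail))
    return findings
```

Both functions only produce findings for review. Neither can run on audio alone: each needs the data file as the second input, and the skip-logic check needs the questionnaire rules as well.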

---

## Questions to ask before you believe a "100% AI monitoring" claim

- Does it ingest the **data file**, or only the audio? (Audio-only cannot catch
  response–data mismatch — the error class that matters most in survey research.)
- Is it scored against **our questionnaire and skip logic**, or a generic script model?
- Can it evaluate **screener eligibility** and **quota-cell** decisions?
- Does it verify **typed verbatims** against what was actually spoken?
- Does it detect **falsification patterns**, or only tone/sentiment?
- Who makes the **verdict** — does a human confirm/dismiss/reclassify, or does the
  tool auto-judge the interviewer? (Falsification flags must never auto-file.)
- What's the **false-positive rate**, and how does calibration work over the first
  few waves?
- What does **setup** actually require — batch upload of recordings + data file, or a
  months-long platform integration?
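
One way to make the verdict and calibration questions concrete is to track, per wave, what share of tool flags a human reviewer dismissed. The sketch below assumes each reviewed flag ends with one of three verdict labels taken from the checklist wording (confirmed, dismissed, reclassified); the function name and label strings are hypothetical. The dismissed share is only a proxy for the false-positive rate among flags, since it says nothing about calls the tool never flagged.

```python
from collections import Counter

# Verdict labels a human reviewer can assign to a flag (assumed names).
VERDICTS = {"confirmed", "dismissed", "reclassified"}


def dismissed_share(verdicts):
    """Return the fraction of reviewed flags that the reviewer dismissed.

    Returns None when no flags were reviewed, so an empty wave is not
    mistaken for a wave with zero false positives.
    """
    counts = Counter(verdicts)
    unknown = set(counts) - VERDICTS
    if unknown:
        raise ValueError(f"unknown verdict labels: {sorted(unknown)}")
    reviewed = sum(counts.values())
    if reviewed == 0:
        return None
    return counts["dismissed"] / reviewed


# Illustrative data only: verdicts from two hypothetical waves.
wave_verdicts = {
    "wave_1": ["confirmed", "dismissed", "dismissed", "reclassified"],
    "wave_2": ["confirmed", "confirmed", "confirmed", "dismissed"],
}
shares = {wave: dismissed_share(v) for wave, v in wave_verdicts.items()}
```

A falling dismissed share across the first few waves is the kind of evidence to ask a vendor for when they describe calibration.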

---

## The bottom line

Transcription is shared infrastructure — it transfers from contact-center tools to
survey QC. Everything above it (what gets checked, what counts as a finding, what a
finding is compared against) is survey-specific. A contact-center QA tool judges whether
the agent performed; CATI QC verifies whether the data is true. Make sure you're buying
the second one.
