Escalation is where agentic systems earn or lose trust. The difference between clean handoffs and buried problems is twenty lines of policy and one observable signal.

Policy beats prompts

Prompts ask nicely. Policy enforces behavior. Production escalation needs thresholds, named owners, and logging — not "please involve a human when unsure."

We write escalation rules as code-adjacent specs: trigger, owner, channel, SLA, and fallback.

Observable signals

Confidence scores alone are unreliable. We pair them with business signals: deal size, sentiment shift, compliance keywords, or repeat failure on the same ticket type.

Every escalation emits an event the manager sees in the same dashboard as the KPI.

Field note

If the manager learns about failures from customers, the escalation design failed.

Testing escalation before launch

We run red-team scripts against escalation paths before go-live: ambiguous inputs, adversarial users, and missing data cases.

Agents that pass happy-path demos but fail red-team stay in shadow mode.

EC
Elena Cardoso
Head of AI Architecture, Arq AI

Twelve years building production systems in revenue ops.