Escalation Policies

Escalation policies define what happens when an alert fires and nobody responds. If an alert is not acknowledged within a configured time window, the policy moves to the next step in the chain — typically a louder or more urgent notification channel. This ensures critical issues reach someone even if the first notification is missed.

How escalation works

An escalation policy is an ordered list of steps. Each step specifies:

  1. A notification channel (or channel group) to send the alert to
  2. A wait time before escalating to the next step

When an alert fires with an escalation policy attached, Rime sends the alert to the first step’s channel and starts a timer. If the alert is still in a firing state (not acknowledged or resolved) when the timer expires, the alert is sent to the next step’s channel, and the next timer starts. This continues until the alert is acknowledged, resolved, or the policy runs out of steps.

If the policy reaches the final step and the alert is still not acknowledged, the last step’s channel continues to receive notifications on subsequent evaluation cycles.
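The step-and-timer behavior described above can be sketched as a small simulation. This is an illustrative model only, not Rime's implementation; the `Step` type and `notifications` function are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Step:
    channel: str
    wait_minutes: int  # wait before escalating to the next step

def notifications(steps, ack_at_minute=None):
    """Return (minute, channel) pairs sent for an alert.

    ack_at_minute: minute at which the alert is acknowledged, or None
    if it is never acknowledged. Acknowledgement stops escalation.
    """
    events = []
    elapsed = 0
    for step in steps:
        if ack_at_minute is not None and elapsed >= ack_at_minute:
            break  # alert was acknowledged before this step triggered
        events.append((elapsed, step.channel))
        elapsed += step.wait_minutes
    return events
```

With a three-step chain of `Step("email", 15)`, `Step("slack", 30)`, `Step("pager", 0)`, an unacknowledged alert notifies email at minute 0, Slack at minute 15, and the pager at minute 45; acknowledging at minute 10 stops the chain after the first email.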

Creating an escalation policy

Navigate to Monitoring > Escalation Policies > Create Policy. You will need to provide:

Policy name

A descriptive name for the policy, such as “Critical production alerts” or “Business-hours escalation”.

Steps

Each step requires:

| Field | Description |
| --- | --- |
| Channel | The notification channel or channel group to notify at this step |
| Wait time | How long to wait (in minutes) before escalating to the next step if the alert is not acknowledged |

You can add as many steps as you need. The minimum wait time is 5 minutes. There is no maximum, but escalation policies are most effective with wait times under 60 minutes per step.

Example escalation chain

A typical three-step escalation for critical alerts:

| Step | Channel | Wait before this step triggers |
| --- | --- | --- |
| 1 | Team email list | — (immediate) |
| 2 | Slack #data-alerts channel | 15 minutes |
| 3 | PagerDuty on-call | 30 minutes |

In this example:

  1. When the alert fires, an email is sent to the team immediately.
  2. If nobody acknowledges the alert within 15 minutes, the alert is also sent to the Slack channel.
  3. If another 30 minutes pass without acknowledgement (45 minutes total), the alert triggers a PagerDuty incident for the on-call engineer.
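The timeline above is just cumulative arithmetic over the per-step waits. A quick check, with the waits taken from the example chain:

```python
# Cumulative trigger times (minutes after the alert fires) for the
# example chain: email immediately, Slack after 15 min, PagerDuty 30
# min later. Purely illustrative arithmetic.
waits = [0, 15, 30]  # wait before each step triggers
trigger_times = []
elapsed = 0
for w in waits:
    elapsed += w
    trigger_times.append(elapsed)
print(trigger_times)  # [0, 15, 45]
```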

Acknowledgement

Acknowledging an alert pauses its escalation. When someone clicks Acknowledge on an alert (either in the dashboard or via a link in the notification), the escalation timer stops and no further steps are triggered.

Acknowledgement is not the same as resolution. An acknowledged alert is still active — it means someone has seen it and is working on it. The alert remains visible in the dashboard with an “Acknowledged” status until it is resolved.

You can optionally add a note when acknowledging, explaining what action you are taking. Notes are visible in the alert’s activity timeline.
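The distinction between acknowledgement and resolution amounts to a small state machine: a firing alert can be acknowledged or resolved, an acknowledged alert is still active and can only be resolved, and a resolved alert is terminal. A minimal sketch of those transitions (illustrative names, not Rime's implementation):

```python
# Valid status transitions for an alert, as described above.
# An acknowledged alert is still active; only resolution closes it.
VALID = {
    "firing": {"acknowledged", "resolved"},
    "acknowledged": {"resolved"},
    "resolved": set(),  # terminal state
}

def transition(status, new_status):
    """Return the new status, or raise if the transition is invalid."""
    if new_status not in VALID[status]:
        raise ValueError(f"cannot move from {status} to {new_status}")
    return new_status
```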

Who can acknowledge

Any user with at least the Editor role in the project can acknowledge alerts. Viewers can see alerts but cannot acknowledge or resolve them.

Auto-resolution

Alerts can resolve automatically when the underlying condition clears. If an alert rule fires because a metric exceeded a threshold, and the metric later drops back below the threshold, the evaluation engine marks the alert as resolved on the next evaluation cycle.

When an alert auto-resolves:

  • The escalation chain stops (no further steps are triggered)
  • A resolution notification is sent to the channels that were already notified
  • The alert moves to “Resolved” status in the dashboard
  • If PagerDuty was notified, a resolve event is sent to close the PagerDuty incident

Auto-resolution is enabled by default. You can disable it per alert rule if you want alerts to remain active until manually resolved — for example, if the metric clearing does not necessarily mean the problem is fixed.
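The evaluation-cycle behavior can be sketched as a single check per cycle: the alert stays active while the metric breaches the threshold, and resolves on the first cycle where the condition clears, unless auto-resolution is disabled. The function below is a hypothetical model, not Rime's evaluation engine.

```python
def evaluate(alert_active, value, threshold, auto_resolve=True):
    """Return True if the alert should be active after this cycle."""
    if value > threshold:
        return True  # condition still breached: alert fires / stays firing
    if alert_active and not auto_resolve:
        return True  # metric cleared, but rule requires manual resolution
    return False     # condition cleared: alert auto-resolves
```

For a metric sampled at 150, 120, then 90 against a threshold of 100, the alert stays active for the first two cycles and resolves on the third; with `auto_resolve=False` it would remain active until resolved manually.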

Manual resolution

You can manually resolve an alert from the dashboard or alert detail page. Manual resolution behaves the same as auto-resolution: it stops escalation, sends resolution notifications, and marks the alert as resolved.

Use manual resolution when:

  • You have fixed the issue but the evaluation engine has not yet detected the fix
  • The alert was triggered by a transient issue that has already passed
  • You want to clear the alert and will monitor the situation yourself

Assigning policies to rules

An escalation policy is linked to alert rules. When creating or editing an alert rule, you can select an escalation policy from the dropdown. If no policy is selected, the rule’s alerts are sent to all configured notification channels simultaneously with no escalation behavior.

Multiple rules can share the same escalation policy. For example, you might use the same “Critical production alerts” policy for both Snowpipe failure rules and pipeline failure rules.
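Conceptually, the rule-to-policy link is a many-to-one mapping. The sketch below models it with plain Python data; the field names are illustrative assumptions, not Rime's actual configuration schema.

```python
# Hypothetical rule definitions sharing one escalation policy.
# A rule with escalation_policy=None notifies all of its channels
# at once, with no escalation behavior.
rules = [
    {"name": "snowpipe-failures", "escalation_policy": "critical-production"},
    {"name": "pipeline-failures", "escalation_policy": "critical-production"},
    {"name": "low-priority-drift", "escalation_policy": None},
]

shared = [r["name"] for r in rules
          if r["escalation_policy"] == "critical-production"]
```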

Policy ordering tips

When designing escalation chains, consider:

  • Start quiet, get louder. Email first, then team chat, then paging. This avoids unnecessary pages for issues someone catches quickly.
  • Match wait times to severity. Critical alerts should escalate faster (5-15 minute steps) than warnings (30-60 minute steps).
  • End with a pager. The final step in a critical escalation should be something that wakes someone up. If nobody responds to email and Slack, PagerDuty ensures the alert is not lost.
  • Use channel groups for the first step. Sending to both email and Slack simultaneously as the first step gives the broadest initial coverage.

Viewing escalation history

Each alert tracks its escalation timeline in the activity log. You can see:

  • When each escalation step was triggered
  • Which channels were notified and whether delivery succeeded
  • When the alert was acknowledged or resolved
  • Any notes added by team members

This history is available on the alert detail page under the Activity tab.

Next steps