Alert Rules

Alert rules define the conditions under which Rime generates an alert. Each rule monitors a specific type of resource, evaluates a condition against a threshold, and fires an alert when the condition is met. Alerts appear on the monitoring dashboard and are sent to any configured notification channels.

Creating an alert rule

Navigate to Monitoring > Alert Rules > Create Rule. Each rule requires the following fields:

Name

A descriptive name for the rule. This appears in the alert list and in notifications, so choose something that makes the alert immediately understandable — for example, “Warehouse credit usage > 100” or “Snowpipe ingestion delay > 5 minutes”.

Resource type

The type of resource this rule monitors. Each resource type exposes different metrics that you can build conditions against.

Resource type | Available metrics
Snowpipe | Ingestion delay, error count, rows loaded, file count
dbt job | Run duration, test failures, model errors, rows affected
Connector run | Duration, rows extracted, error count, tables failed
Warehouse | Credit usage, queue time, query count, active sessions
Pipeline | Run duration, step failures, overall status

Condition

The condition combines a metric, an operator, and a threshold value. The available operators are:

Operator | Meaning | Example
Greater than | Fires when the metric exceeds the threshold | Credit usage > 100
Less than | Fires when the metric drops below the threshold | Rows loaded < 1
Equals | Fires when the metric exactly matches the threshold | Error count = 0 (often more useful inverted as “not equals”)
Not equals | Fires when the metric does not match the threshold | Pipeline status != “succeeded”
Contains | Fires when the metric value contains the given string | Error message contains “timeout”

The “contains” operator is only available for string-type metrics (such as error messages or status values). Numeric metrics use the comparison operators.
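The operator semantics above can be sketched in a few lines. This is an illustrative Python sketch, not Rime’s actual implementation; the operator names and function signature are assumptions.

```python
import operator

# Hypothetical mapping from comparison operators to Python functions.
NUMERIC_OPS = {
    "greater_than": operator.gt,
    "less_than": operator.lt,
    "equals": operator.eq,
    "not_equals": operator.ne,
}

def condition_met(op: str, value, threshold) -> bool:
    """Return True when the metric value satisfies the condition."""
    if op == "contains":
        # Only valid for string-type metrics such as error messages.
        return str(threshold) in str(value)
    return NUMERIC_OPS[op](value, threshold)

condition_met("greater_than", 120, 100)            # credit usage > 100 -> True
condition_met("contains", "read timeout", "timeout")  # -> True
```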

Severity

Each rule is assigned a severity level that determines how it appears in the dashboard and how it flows through escalation policies:

  • Critical — something is broken and needs immediate attention. Examples: pipeline failure, Snowpipe completely stalled, warehouse out of credits.
  • Warning — something is degraded or approaching a limit. Examples: connector run slower than usual, warehouse queue time increasing.
  • Info — a noteworthy event that does not require action. Examples: a dbt model was rebuilt, a large data load completed.

Evaluation frequency

All alert rules are evaluated every 60 seconds. The evaluation engine checks the current metric value against the rule’s condition on each cycle. If the condition is met, an alert is created (or an existing alert is updated). If the condition clears and the rule has auto-resolution enabled, the alert is auto-resolved.

Custom evaluation frequencies are not currently supported. The 60-second cycle balances responsiveness with system load.
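One cycle of that engine can be modeled as a single pass over the rules. The `Rule` shape and function names below are assumptions for illustration; in a real deployment this pass would run once every 60 seconds.

```python
from dataclasses import dataclass
from typing import Callable, Set

EVALUATION_INTERVAL = 60  # seconds; fixed for all rules

@dataclass
class Rule:
    name: str
    condition_met: Callable[[float], bool]  # hypothetical condition callback
    auto_resolve: bool = False

def run_cycle(rules, fetch_metric, active_alerts: Set[str]) -> Set[str]:
    """Evaluate every rule once and return the updated set of firing alerts."""
    for rule in rules:
        if rule.condition_met(fetch_metric(rule)):
            active_alerts.add(rule.name)       # create or refresh the alert
        elif rule.auto_resolve and rule.name in active_alerts:
            active_alerts.discard(rule.name)   # condition cleared: auto-resolve
    return active_alerts
```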

Grouping

When a rule fires for multiple resources of the same type, the alerts are grouped under the rule. For example, if you have a rule “Connector error count > 0” and three connectors fail in the same evaluation cycle, you will see one alert group with three individual alerts rather than three unrelated entries.

Grouping keeps the dashboard and notification channels manageable during widespread incidents. Each alert within a group can be acknowledged or resolved independently.
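The grouping behavior amounts to bucketing the cycle’s alerts by the rule that fired them. A minimal sketch, with assumed field names:

```python
from collections import defaultdict

def group_alerts(alerts):
    """Bucket alerts from one evaluation cycle by their originating rule."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[alert["rule"]].append(alert)
    return dict(groups)

alerts = [
    {"rule": "Connector error count > 0", "resource": "connector-a"},
    {"rule": "Connector error count > 0", "resource": "connector-b"},
    {"rule": "Connector error count > 0", "resource": "connector-c"},
]
group_alerts(alerts)  # one group containing three individual alerts
```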

Deduplication

Rime prevents duplicate alerts for the same problem. Each alert is assigned a fingerprint calculated as the SHA-256 hash of three values:

  • The rule ID
  • The resource type
  • The resource ID

If an alert with the same fingerprint already exists and is still active (firing or acknowledged), the evaluation engine updates the existing alert’s last-seen timestamp rather than creating a new one. This means you will not receive repeated notifications for the same ongoing issue.

A new alert is created only when:

  • No active alert exists with the same fingerprint, or
  • The previous alert with that fingerprint has been resolved
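The fingerprint computation is straightforward to sketch. The joining and encoding scheme below is an assumption for illustration; the documented part is that the SHA-256 hash covers exactly the rule ID, resource type, and resource ID.

```python
import hashlib

def alert_fingerprint(rule_id: str, resource_type: str, resource_id: str) -> str:
    """SHA-256 over the three identifying values (join/encoding assumed)."""
    payload = "\n".join([rule_id, resource_type, resource_id]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

# Identical inputs always yield the same fingerprint, so an already-active
# alert is updated in place rather than duplicated.
alert_fingerprint("rule-42", "connector", "conn-7")
```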

Enabling and disabling rules

You can disable a rule without deleting it. A disabled rule is not evaluated and will not fire new alerts. Existing alerts from the rule remain in their current state (they are not automatically resolved when the rule is disabled).

To disable a rule, toggle the Enabled switch on the rule detail page. Re-enabling the rule resumes evaluation on the next 60-second cycle.

Disabling a rule is useful during planned maintenance or when you are investigating a known issue and do not want alert noise.

Editing and deleting rules

You can edit any field of an existing rule. Changes take effect on the next evaluation cycle. Editing a rule does not resolve or clear existing alerts that were already fired by it.

Deleting a rule permanently removes it. Active alerts from the deleted rule are automatically resolved with a note indicating the rule was removed.

Built-in rules

New projects come with a set of default alert rules that cover common failure scenarios:

  • Pipeline run failed (critical)
  • Connector run failed (warning)
  • dbt test failure (warning)
  • Snowpipe ingestion delay > 15 minutes (warning)
  • Warehouse credit usage > 80% of quota (info)

You can modify or delete these defaults. They are provided as a starting point; you should adjust thresholds and severities to match your operational requirements.
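For reference, the defaults above map out as a simple name-to-severity table. This is an illustrative data sketch, not Rime’s actual rule schema:

```python
# Hypothetical representation of the default rule set (names and
# severities from the list above; the structure itself is assumed).
DEFAULT_RULES = {
    "Pipeline run failed": "critical",
    "Connector run failed": "warning",
    "dbt test failure": "warning",
    "Snowpipe ingestion delay > 15 minutes": "warning",
    "Warehouse credit usage > 80% of quota": "info",
}
```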

Tier limits

Alert rules are available on all tiers. The difference between tiers is which notification channels can receive the alerts, not how many rules you can create.

Next steps