Core Concepts

This page introduces the key concepts you will encounter throughout Rime. Understanding these building blocks helps you navigate the platform and plan your data stack configuration.

Projects

A project is the top-level organisational unit. Everything in Rime belongs to a project: connectors, infrastructure resources, transformation models, pipelines, and governance settings. Most organisations start with a single project for their production data stack and may add others for development or staging environments.

Your licensing tier determines how many projects you can create.

Connectors

A connector is a configured connection to a data source. Rime includes built-in connectors for:

Databases — PostgreSQL, MySQL, Microsoft SQL Server, MongoDB, Oracle
SaaS applications — Salesforce, Xero, Shopify, Google Sheets
DevOps tools — Jira, GitHub, GitLab, Azure DevOps
REST APIs — a generic connector for any HTTP API
Files — CSV and JSON uploads

Each connector stores encrypted credentials, a list of selected tables or endpoints, and an optional sync schedule. When a connector runs, it extracts data, converts it to Parquet format, and uploads it to S3, where an event notification triggers Snowpipe to load it into Snowflake.
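As a rough illustration, the three things a connector stores could be modelled as below. The class name, field names, and the use of a cron string for the schedule are assumptions made for this sketch, not Rime's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConnectorConfig:
    """Hypothetical shape of a connector; all names are illustrative only."""
    name: str
    source_type: str                      # e.g. "postgresql" or "salesforce"
    encrypted_credentials: bytes          # ciphertext only, never plaintext
    selected_objects: list[str] = field(default_factory=list)  # tables or endpoints
    sync_schedule: Optional[str] = None   # assumed to be a cron expression

    def is_scheduled(self) -> bool:
        # A connector without a schedule has no automatic sync of its own.
        return self.sync_schedule is not None


orders = ConnectorConfig(
    name="orders-db",
    source_type="postgresql",
    encrypted_credentials=b"<ciphertext>",
    selected_objects=["public.orders", "public.customers"],
    sync_schedule="0 2 * * *",
)
```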

Infrastructure resources

Rime manages cloud infrastructure on your behalf. You configure resources through the UI and Rime provisions them using Terraform internally. Supported resource types include:

Snowflake: databases, schemas, warehouses, roles, grants, pipes, stages
AWS: S3 buckets, SNS topics, IAM roles and policies

You never write or see Terraform code. Rime presents a change preview (resources to create, modify, or destroy) before applying anything, similar to a pull request review for infrastructure.
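To make the change preview concrete, here is a minimal sketch of how a list of planned changes might be summarised before approval. `PlannedChange`, `summarise_preview`, and the resource identifiers are hypothetical; Rime's internal representation is not documented here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannedChange:
    resource: str   # made-up identifier, e.g. "snowflake_warehouse.reporting"
    action: str     # "create", "modify", or "destroy"


def summarise_preview(changes: list[PlannedChange]) -> str:
    """Return a one-line summary of a change preview."""
    counts = Counter(change.action for change in changes)
    return (
        f"{counts['create']} to create, "
        f"{counts['modify']} to modify, "
        f"{counts['destroy']} to destroy"
    )


preview = [
    PlannedChange("snowflake_database.analytics", "create"),
    PlannedChange("snowflake_warehouse.reporting", "modify"),
    PlannedChange("aws_s3_bucket.landing", "create"),
]
print(summarise_preview(preview))  # 2 to create, 1 to modify, 0 to destroy
```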

Extraction

Extraction is the process of pulling data from source systems into Snowflake. Rime’s extraction pipeline works in stages:

Connect — the connector authenticates with the source system
Discover — Rime scans available tables, columns, and data types
Extract — data is read as Apache Arrow record batches
Write — Arrow data is written to Parquet files with Snappy compression
Upload — Parquet files are uploaded to S3
Ingest — S3 event notifications trigger Snowpipe, which loads data into Snowflake raw tables

Each extraction run is tracked with row counts, timing, and error details.
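The sketch below shows one way a run through the stages above could be tracked with row counts, timing, and error details. The stage names follow the list; `RunRecord`, the handler functions, and the choice to stop at the first failed stage are assumptions for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional

STAGES = ["connect", "discover", "extract", "write", "upload", "ingest"]


@dataclass
class RunRecord:
    rows: int = 0
    timings: dict[str, float] = field(default_factory=dict)  # seconds per stage
    failed_stage: Optional[str] = None
    error: Optional[str] = None


def run_extraction(handlers: dict[str, Callable[[], int]]) -> RunRecord:
    """Run each stage in order, stopping at the first failure."""
    record = RunRecord()
    for stage in STAGES:
        started = time.perf_counter()
        try:
            rows = handlers[stage]()
        except Exception as exc:
            record.failed_stage = stage
            record.error = str(exc)
            break
        finally:
            # Timing is recorded for failed stages too.
            record.timings[stage] = time.perf_counter() - started
        if stage == "extract":
            record.rows = rows
    return record


def failing_upload() -> int:
    raise ConnectionError("S3 unavailable")


# Stand-in handlers: every stage succeeds except the upload.
handlers: dict[str, Callable[[], int]] = {stage: (lambda: 0) for stage in STAGES}
handlers["extract"] = lambda: 1200
handlers["upload"] = failing_upload
result = run_extraction(handlers)
```

Here `result.rows` is 1200, `result.failed_stage` is `"upload"`, and the ingest stage never runs.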

Transformation

Transformation shapes raw data into analytics-ready models. Rime supports two methodologies:

Kimball — the traditional dimensional modelling approach. Raw data flows through staging tables into dimension and fact tables, then into data marts. This is the default for most customers.
Data Vault — a methodology designed for large-scale, auditable data warehouses. Raw data is modelled into hubs, links, and satellites, then transformed through a business vault into marts. Available at Business tier and above.

You configure transformations through the UI by selecting source tables, choosing a methodology, and defining keys and grain. Rime generates and executes dbt models internally. You do not write SQL to define transformations (custom SQL is available separately as a pipeline step).
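The choices made in the UI can be pictured as a small configuration object plus a tier check for Data Vault. This is a sketch only: `TransformationConfig` and `methodology_allowed` are hypothetical names, and treating Kimball as available on every tier is an assumption.

```python
from dataclasses import dataclass

# Tiers from lowest to highest, in the order the licensing section lists them.
TIERS = ["Free/Trial", "Small Business", "Business", "Business Critical"]


@dataclass
class TransformationConfig:
    source_tables: list[str]
    methodology: str      # "kimball" or "data_vault"
    keys: list[str]
    grain: str


def methodology_allowed(methodology: str, tier: str) -> bool:
    """Data Vault needs Business tier or above; Kimball is assumed to be always allowed."""
    if methodology == "data_vault":
        return TIERS.index(tier) >= TIERS.index("Business")
    return methodology == "kimball"


config = TransformationConfig(
    source_tables=["raw.orders"],
    methodology="data_vault",
    keys=["order_id"],
    grain="one row per order",
)
print(methodology_allowed(config.methodology, "Small Business"))     # False
print(methodology_allowed(config.methodology, "Business Critical"))  # True
```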

Lineage

Lineage tracks the relationships between data objects — which source tables feed which models, and how data flows from raw ingestion through to final marts. Rime provides an interactive lineage graph that visualises these dependencies, making it easy to understand the impact of changes.
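Impact analysis on a lineage graph amounts to walking it downstream from the object being changed. A minimal sketch, using made-up table and model names and a plain dictionary in place of Rime's lineage store:

```python
from collections import deque

# Hypothetical lineage edges: each object maps to the objects it feeds.
FEEDS = {
    "raw.orders": ["stg_orders"],
    "raw.customers": ["stg_customers"],
    "stg_orders": ["fct_sales"],
    "stg_customers": ["dim_customer"],
    "dim_customer": ["mart_sales"],
    "fct_sales": ["mart_sales"],
}


def downstream(start: str) -> list[str]:
    """Return every object affected by a change to `start`, breadth-first."""
    seen: set[str] = set()
    order: list[str] = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in FEEDS.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


print(downstream("raw.orders"))  # ['stg_orders', 'fct_sales', 'mart_sales']
```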

Pipelines

A pipeline is a directed acyclic graph (DAG) of steps that execute in sequence or in parallel. Step types include:

Extract — run a connector sync
Provision — apply infrastructure changes
Transform — execute dbt models
Validate — run data quality tests
SQL — execute custom SQL against Snowflake
Webhook — call an external HTTP endpoint

Pipelines support cron scheduling with timezone awareness (including DST handling), retry policies, and versioning with rollback.
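To show how a DAG determines which steps run in sequence and which can run in parallel, the sketch below groups a hypothetical pipeline into waves using Python's standard `graphlib` module. The step names and the wave-based grouping are illustrative assumptions, not a description of Rime's scheduler.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the steps it depends on.
DEPENDS_ON = {
    "provision": set(),
    "extract_orders": {"provision"},
    "extract_customers": {"provision"},
    "transform": {"extract_orders", "extract_customers"},
    "validate": {"transform"},
    "notify_webhook": {"validate"},
}


def execution_waves(graph: dict[str, set[str]]) -> list[list[str]]:
    """Group steps into waves; steps in the same wave can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable order
        waves.append(ready)
        sorter.done(*ready)
    return waves


for wave in execution_waves(DEPENDS_ON):
    print(wave)
```

In this example the two extract steps land in the same wave, so they can run in parallel once provisioning has finished, while the transform, validate, and webhook steps each wait for the wave before them.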

Monitoring

Rime collects metrics from Snowflake, dbt, S3, and your infrastructure, then evaluates alert rules against those metrics. You configure:

Alert rules — conditions and thresholds that trigger alerts (e.g., “alert if a Snowpipe fails” or “alert if data volume deviates more than 50% from the 30-day average”)
Notification channels — where alerts are sent (email, Slack, PagerDuty, webhooks)
Escalation policies — chains of notification channels with increasing urgency if an alert is not acknowledged
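The volume example above ("deviates more than 50% from the 30-day average") can be expressed as a simple check. This is a sketch of the arithmetic only; the function name and the handling of an empty or zero baseline are assumptions.

```python
from statistics import mean


def volume_deviates(today_rows: int, history: list[int], threshold: float = 0.5) -> bool:
    """True if today's row count differs from the historical mean by more than `threshold`."""
    if not history:
        return False  # no baseline yet, so nothing to compare against
    baseline = mean(history)
    if baseline == 0:
        return today_rows != 0
    return abs(today_rows - baseline) / baseline > threshold


last_30_days = [10_000] * 30
print(volume_deviates(4_000, last_30_days))   # True: 60% below the average
print(volume_deviates(12_000, last_30_days))  # False: 20% above the average
```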

Governance

Rime takes a masked-by-default approach to data governance. Every column in every table is masked until explicitly classified and unmasked for specific roles. Key governance features include:

Data classification — assign privacy levels (Public, Internal, Confidential, Restricted) and PII types to columns
PII detection — automatic scanning for personally identifiable information using patterns for New Zealand and Australian data formats (IRD numbers, NHI numbers, phone numbers, bank accounts)
Masking policies — Snowflake tag-based masking policies that control which roles can see unmasked data
Compliance reporting — coverage reports showing classification completeness, access matrices, and audit summaries
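The masked-by-default rule can be modelled in a few lines: a value is visible only when the column has been classified and the reader's role has been explicitly unmasked. This is a Python model of the rule, not Snowflake masking policy syntax, and `ColumnPolicy`, `read_value`, and the role names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ColumnPolicy:
    classification: Optional[str] = None   # None means "not yet classified"
    unmasked_roles: set[str] = field(default_factory=set)


def read_value(value: str, policy: ColumnPolicy, role: str) -> str:
    """Return the real value only for classified columns and explicitly unmasked roles."""
    if policy.classification is not None and role in policy.unmasked_roles:
        return value
    return "****"


email = ColumnPolicy(classification="Confidential", unmasked_roles={"SUPPORT_LEAD"})
print(read_value("a@example.com", email, "SUPPORT_LEAD"))           # a@example.com
print(read_value("a@example.com", email, "ANALYST"))                # ****
print(read_value("a@example.com", ColumnPolicy(), "SUPPORT_LEAD"))  # ****
```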

Tenants

Rime is a multi-tenant platform. Each customer organisation is a tenant with its own isolated database. Your data, configuration, and credentials are never shared with other tenants. See Tenant Isolation for details.

Licensing tiers

Rime offers four tiers — Free/Trial, Small Business, Business, and Business Critical — with increasing limits on projects, connectors, users, and features. See Licensing for a full comparison.