Data Vault
Data Vault is a modelling methodology designed for large-scale data warehouses where auditability, history tracking, and flexible schema evolution are priorities. It is available on Business tier and above. For most organisations, Kimball is the simpler and recommended starting point.
When to use Data Vault
Data Vault is a good fit when:
- Full audit history is required — every change to every attribute is tracked with load timestamps. Nothing is overwritten.
- Sources change frequently — adding new source systems or columns does not require restructuring existing models. Hubs and links are additive.
- Regulatory compliance demands traceability — government agencies and regulated industries need to prove exactly when data was loaded and what it looked like at any point in time.
- Multiple source systems describe the same entities — Data Vault handles merging data from different sources for the same business key naturally through its hub/satellite pattern.
If your use case is straightforward analytics with relatively stable sources, Kimball’s dimensional models will be simpler to build and query. See the comparison at the end of this page.
Data Vault layers
Data Vault organises data into five layers:
Raw tables -> Hubs + Links + Satellites -> Business Vault -> Marts
Hubs
A hub represents a core business concept identified by its business key. Each hub table contains:
| Column | Purpose |
|---|---|
| hub_hash_key | A hash of the business key (generated by Rime) |
| business_key | The natural key from the source system |
| load_timestamp | When this business key was first seen |
| record_source | Which source system provided this key |
Hubs are insert-only. Once a business key is recorded, its hub row never changes. Examples: hub_customer (keyed on customer_id), hub_product (keyed on product_sku), hub_order (keyed on order_number).
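The insert-only behaviour can be illustrated with a minimal Python sketch. Rime generates hash keys itself and this page does not describe its algorithm, so the SHA-256 hash, the trim-and-upper-case normalisation, and the function names `hash_key` and `load_hub` are assumptions made purely for illustration.

```python
import hashlib
from datetime import datetime, timezone


def hash_key(business_key: str) -> str:
    """Hash a business key into a fixed-length hex string."""
    # Assumed normalisation: trim whitespace and upper-case before hashing,
    # so ' c-001 ' and 'C-001' resolve to the same hub row.
    normalised = business_key.strip().upper()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def load_hub(hub: dict, staged_keys: list, record_source: str,
             load_timestamp: datetime) -> int:
    """Insert rows for unseen business keys; existing rows are never changed."""
    inserted = 0
    for business_key in staged_keys:
        key = hash_key(business_key)
        if key in hub:
            continue  # already recorded: keep the original timestamp and source
        hub[key] = {
            "hub_hash_key": key,
            "business_key": business_key,
            "load_timestamp": load_timestamp,
            "record_source": record_source,
        }
        inserted += 1
    return inserted


# Two loads from two hypothetical sources; C-002 appears in both.
hub_customer = {}
first_load = datetime(2024, 1, 1, tzinfo=timezone.utc)
second_load = datetime(2024, 1, 2, tzinfo=timezone.utc)
load_hub(hub_customer, ["C-001", "C-002"], "crm", first_load)
load_hub(hub_customer, ["C-002", "C-003"], "billing", second_load)
```

After both loads the hub holds three rows, and the row for C-002 still carries the timestamp and source of the load that first saw it.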
Configuring a hub
- Go to the Hubs tab and click Add Hub
- Enter a name (e.g., hub_customer)
- Select the staging table that contains this business entity
- Choose the business key column or columns
- If multiple source systems provide the same entity, add additional source mappings
- Click Save
Links
A link captures a relationship between two or more hubs. It represents a business event or association — an order links a customer to a product, an employment record links an employee to a department.
Each link table contains:
| Column | Purpose |
|---|---|
| link_hash_key | A hash of the combined hub keys |
| hub_hash_key_1 | Foreign key to the first hub |
| hub_hash_key_2 | Foreign key to the second hub |
| load_timestamp | When this relationship was first seen |
| record_source | Which source system provided this relationship |
Links can reference two or more hubs. Like hubs, links are insert-only.
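As a rough sketch of how a link row could be assembled from the columns above: the hashing scheme, the `||` delimiter, and the helper names `hub_hash_key` and `build_link_row` are assumptions, not Rime's actual implementation. Numbering the foreign key columns beyond two hubs is also an assumed extension of the table above.

```python
import hashlib
from datetime import datetime, timezone


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def hub_hash_key(business_key: str) -> str:
    """Assumed hub hashing: normalise the business key, then hash it."""
    return _sha256(business_key.strip().upper())


def build_link_row(business_keys: list, record_source: str,
                   load_timestamp: datetime) -> dict:
    """Build one link row from the business keys of the hubs it connects."""
    if len(business_keys) < 2:
        raise ValueError("a link must reference at least two hubs")
    hub_keys = [hub_hash_key(key) for key in business_keys]
    # The link key hashes the hub keys in a fixed order, so the same
    # relationship always produces the same link_hash_key.
    row = {"link_hash_key": _sha256("||".join(hub_keys))}
    for position, key in enumerate(hub_keys, start=1):
        row[f"hub_hash_key_{position}"] = key
    row["load_timestamp"] = load_timestamp
    row["record_source"] = record_source
    return row


# A hypothetical order-to-product relationship.
loaded_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
link_row = build_link_row(["ORD-1001", "SKU-42"], "shop", loaded_at)
```

Because the link key is derived only from the hub keys, reloading the same relationship yields the same key, which lets an insert-only load skip it.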
Configuring a link
- Go to the Links tab and click Add Link
- Enter a name (e.g., link_order_product)
- Select the staging table that contains this relationship
- Choose the hub references — which hubs does this link connect?
- Map the foreign key columns from the staging table to each referenced hub
- Click Save
Satellites
Satellites store the descriptive attributes and change history for hubs and links. Every time an attribute changes, a new satellite row is inserted with the updated values and a new load timestamp. Previous rows are preserved.
Each satellite table contains:
| Column | Purpose |
|---|---|
| parent_hash_key | The hub or link hash key this satellite describes |
| load_timestamp | When this version of the attributes was loaded |
| load_end_timestamp | When this version was superseded (null if current) |
| record_source | Which source system provided these attributes |
| hash_diff | A hash of all attribute values for change detection |
| Attribute columns | The descriptive data (name, address, status, etc.) |
Rime handles change detection automatically. If the attribute values have not changed since the last load, no new satellite row is created.
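The change detection described above can be sketched in Python as follows. This is an illustration under assumptions, not Rime's implementation: the SHA-256 hash, the way attributes are serialised, and the names `hash_diff` and `load_satellite` are all hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def hash_diff(attributes: dict) -> str:
    """Hash all attribute values; sorted names keep the result order-independent."""
    payload = "||".join(f"{name}={attributes[name]}" for name in sorted(attributes))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def load_satellite(rows: list, parent_hash_key: str, attributes: dict,
                   record_source: str, load_timestamp: datetime) -> bool:
    """Append a new version only when the attributes changed. Returns True on insert."""
    new_diff = hash_diff(attributes)
    current = next(
        (row for row in rows
         if row["parent_hash_key"] == parent_hash_key
         and row["load_end_timestamp"] is None),
        None,
    )
    if current is not None:
        if current["hash_diff"] == new_diff:
            return False  # nothing changed since the last load
        # Close the superseded version; its attribute values stay untouched.
        current["load_end_timestamp"] = load_timestamp
    rows.append({
        "parent_hash_key": parent_hash_key,
        "load_timestamp": load_timestamp,
        "load_end_timestamp": None,
        "record_source": record_source,
        "hash_diff": new_diff,
        **attributes,
    })
    return True


# Three loads for one hypothetical customer; only the third changes an attribute.
sat_customer_details = []
day = [datetime(2024, 1, d, tzinfo=timezone.utc) for d in (1, 2, 3)]
results = [
    load_satellite(sat_customer_details, "abc", {"name": "Ada", "status": "trial"}, "crm", day[0]),
    load_satellite(sat_customer_details, "abc", {"status": "trial", "name": "Ada"}, "crm", day[1]),
    load_satellite(sat_customer_details, "abc", {"name": "Ada", "status": "active"}, "crm", day[2]),
]
```

The second load inserts nothing because the hash matches the current row. The third load closes the first version and appends a new current one, leaving two rows of history.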
Configuring a satellite
- Go to the Satellites tab and click Add Satellite
- Enter a name (e.g., sat_customer_details)
- Select the parent — the hub or link this satellite describes
- Choose the attributes — the descriptive columns to track
- Click Save
You can have multiple satellites per hub or link. This is useful when different attributes change at different rates or come from different sources. For example, sat_customer_contact (email, phone — changes occasionally) and sat_customer_activity (last_login, session_count — changes frequently).
Business vault
The business vault contains derived or calculated data that does not exist in any source system. It follows the same hub/link/satellite structure but represents business logic rather than raw data. Examples:
- A satellite that calculates customer lifetime value from order history
- A link that derives product affinity relationships from co-purchase patterns
- A hub that generates a unified customer identity from multiple source systems
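To make the first example concrete, here is a minimal Python sketch of a derived satellite. It assumes lifetime value is simply the sum of order totals per customer. The input shape, the `business_vault` record source, and the function name `customer_lifetime_value` are hypothetical, and real business rules would be defined in the transformation logic you configure.

```python
from collections import defaultdict
from datetime import datetime, timezone
from decimal import Decimal


def customer_lifetime_value(orders: list, load_timestamp: datetime) -> list:
    """Derive one satellite row per customer from order history."""
    totals = defaultdict(Decimal)  # Decimal() starts each customer at 0
    for order in orders:
        totals[order["customer_hash_key"]] += order["order_total"]
    return [
        {
            "parent_hash_key": customer_key,
            "load_timestamp": load_timestamp,
            "record_source": "business_vault",  # derived, not from a source system
            "lifetime_value": total,
        }
        for customer_key, total in sorted(totals.items())
    ]


# Hypothetical order history keyed by customer hub hash key.
orders = [
    {"customer_hash_key": "a1", "order_total": Decimal("19.99")},
    {"customer_hash_key": "b2", "order_total": Decimal("100.00")},
    {"customer_hash_key": "a1", "order_total": Decimal("5.01")},
]
clv_rows = customer_lifetime_value(orders, datetime(2024, 1, 1, tzinfo=timezone.utc))
```

The output rows have the same shape as a raw satellite, which is what lets marts treat derived and raw data the same way.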
Configuring a business vault model
- Go to the Business Vault tab and click Add Model
- Choose the model type: hub, link, or satellite
- Define the inputs: which existing hubs, links, or satellites feed this model
- Configure the transformation logic: aggregations, calculations, or filters
- Click Save
Business vault models are optional. Simple Data Vault implementations may skip this layer and build marts directly from raw vault objects.
Marts
The mart layer translates the hub/link/satellite structure into business-friendly tables that analysts can query without understanding Data Vault concepts. Marts typically look like dimensional models — they join hubs, links, and satellites into flat or star-schema tables.
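Conceptually, a simple mart joins each hub row to the current version of its satellite attributes. The Python sketch below shows that flattening step under assumptions: the function name `build_customer_mart`, the sample rows, and the shortened hash keys are hypothetical, and Rime builds the equivalent tables from the configuration described below.

```python
def build_customer_mart(hub_rows: list, satellite_rows: list, attributes: list) -> list:
    """Flatten a hub and its current satellite rows into one row per business key."""
    # Current versions are the rows that have not been superseded.
    current = {
        row["parent_hash_key"]: row
        for row in satellite_rows
        if row["load_end_timestamp"] is None
    }
    mart = []
    for hub in hub_rows:
        satellite = current.get(hub["hub_hash_key"], {})
        flat = {"customer_id": hub["business_key"]}
        for name in attributes:
            flat[name] = satellite.get(name)  # None when no satellite row exists
        mart.append(flat)
    return mart


# Hypothetical vault contents: C-001 has two versions, C-002 has no satellite row yet.
hub_customer = [
    {"hub_hash_key": "h1", "business_key": "C-001"},
    {"hub_hash_key": "h2", "business_key": "C-002"},
]
sat_customer_details = [
    {"parent_hash_key": "h1", "load_end_timestamp": "2024-01-03", "name": "Ada", "status": "trial"},
    {"parent_hash_key": "h1", "load_end_timestamp": None, "name": "Ada", "status": "active"},
]
mart_customers = build_customer_mart(hub_customer, sat_customer_details, ["name", "status"])
```

Analysts querying the result see one row per customer with plain column names, and no hash keys or load timestamps.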
Configuring a mart
- Go to the Marts tab and click Add Mart
- Enter a name (e.g., mart_customer_orders)
- Select the hubs and links to include
- Choose which satellite attributes to surface
- Define aggregations if needed (sum, count, average)
- Set a time grain if applicable
- Click Save
Data Vault vs Kimball
| Aspect | Kimball | Data Vault |
|---|---|---|
| History tracking | Optional (SCD Type 2) | Built-in for all attributes |
| Schema changes | May require restructuring | Additive (new hubs/satellites) |
| Query complexity | Simple star-schema joins | More joins (hub + link + satellite) |
| Auditability | Partial | Full (load timestamps, sources) |
| Learning curve | Lower | Higher |
| Best for | Analytics, reporting | Enterprise data warehouses, regulated industries |
| Rime tier | All tiers | Business and above |
Many organisations use both: Data Vault as the core warehouse for history and auditability, with Kimball-style marts on top for analyst consumption. Rime supports this pattern — your mart layer can follow dimensional modelling conventions regardless of which methodology feeds it.
Licensing
Data Vault is available on Business tier and Business Critical tier. If you are on Free/Trial or Small Business tier, the Data Vault option will not appear in the methodology selection when creating a transformation project.
Next steps
- Review the Kimball Methodology if you are evaluating which approach to use
- Learn how to visualise data flow in the Lineage graph
- Automate Data Vault loads in Building Pipelines