Uniqueness Checks

Uniqueness checks verify that columns requiring unique values, such as primary keys, unique identifiers, and transaction IDs, contain no duplicates.

Follow these best practices for uniqueness checks:

  • Apply them to all primary key columns to verify that database constraints hold.

  • Use percentage-based thresholds for columns where some duplication is acceptable.

  • Combine with completeness checks for columns that should be both unique and non-null.

  • Consider case sensitivity for text-based unique identifiers.
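The last two practices can be illustrated with a small Python sketch. It is not the product's implementation: the helper names `count_duplicates` and `null_count` are hypothetical, and the choice to skip nulls when counting duplicates is an assumption made so that completeness and uniqueness are measured separately.

```python
from collections import Counter

def null_count(values):
    """Completeness side: number of missing (None) values."""
    return sum(1 for v in values if v is None)

def count_duplicates(values, case_sensitive=True):
    """Uniqueness side: values that repeat an earlier value.

    Assumption: nulls are skipped here and reported by null_count instead.
    """
    present = [v for v in values if v is not None]
    if not case_sensitive:
        # casefold() normalizes case so "Ana@..." and "ana@..." compare equal.
        present = [v.casefold() for v in present]
    # Each distinct value contributes (occurrences - 1) duplicates.
    return sum(n - 1 for n in Counter(present).values())

emails = ["ana@example.com", "Ana@Example.com", None, "bo@example.com"]

print(count_duplicates(emails))                        # 0
print(count_duplicates(emails, case_sensitive=False))  # 1
print(null_count(emails))                              # 1
```

A case-sensitive comparison reports no duplicates for this sample, while the case-insensitive one catches the repeated address, which is why the comparison mode is worth deciding explicitly for text identifiers.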

Available Metrics

  • Duplicate Count: Absolute number of duplicate values.

  • Duplicate Percentage: Proportion of duplicates relative to total records.
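As a rough illustration, the two metrics could be computed as follows in Python. This sketch rests on two assumptions the text above does not settle: a "duplicate" is taken to be every occurrence of a value after its first, and the percentage is expressed on a 0 to 100 scale. The function names mirror the example checks but are not the product's API.

```python
from collections import Counter

def duplicate_count(values):
    """Number of values that repeat an earlier value (assumed definition)."""
    counts = Counter(values)
    # Each distinct value contributes (occurrences - 1) duplicates.
    return sum(n - 1 for n in counts.values())

def duplicate_percent(values):
    """Duplicates as a share of total records, on an assumed 0-100 scale."""
    values = list(values)
    if not values:
        return 0.0  # avoid dividing by zero on an empty column
    return 100.0 * duplicate_count(values) / len(values)

customer_ids = [101, 102, 103, 102, 104, 102]

print(duplicate_count(customer_ids))    # 2
print(duplicate_percent(customer_ids))  # about 33.3
```

Under a different definition (for example, counting every row that belongs to a duplicated group), the same column would yield 3 rather than 2, so it is worth confirming which definition a given tool uses before setting thresholds.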

Configuration Examples

Check Description: Ensure no duplicate customer IDs

Configuration Steps:

  1. Select customer_id column.

  2. Choose Duplicate Count check.

  3. Select = operator.

  4. Set threshold to 0.

Example Check: duplicate_count(customer_id) = 0

Check Description: Validate low percentage of duplicate email addresses

Configuration Steps:

  1. Select email column.

  2. Choose Duplicate Percentage check.

  3. Select <= operator.

  4. Set threshold to 0.1.

Example Check: duplicate_percent(email) <= 0.1
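The four configuration steps (column, check, operator, threshold) map naturally onto a small evaluation routine. The Python sketch below is illustrative only: `run_check`, `METRICS`, and `OPERATORS` are hypothetical names, duplicates are assumed to be occurrences after the first, and the 0.1 threshold is assumed to be in percentage points because the unit is not stated here.

```python
import operator
from collections import Counter

# Comparison operators a check may select (hypothetical mapping).
OPERATORS = {"=": operator.eq, "<=": operator.le, "<": operator.lt,
             ">=": operator.ge, ">": operator.gt}

def duplicate_count(values):
    # Each distinct value contributes (occurrences - 1) duplicates.
    return sum(n - 1 for n in Counter(values).values())

def duplicate_percent(values):
    # Assumed 0-100 scale; an empty column reports 0.0.
    return 100.0 * duplicate_count(values) / len(values) if values else 0.0

METRICS = {"duplicate_count": duplicate_count,
           "duplicate_percent": duplicate_percent}

def run_check(rows, column, metric, op, threshold):
    """Return (passed, observed_value) for one configured check."""
    values = [row[column] for row in rows]
    observed = METRICS[metric](values)
    return OPERATORS[op](observed, threshold), observed

rows = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": "b@example.com"},
    {"customer_id": 3, "email": "a@example.com"},
]

# duplicate_count(customer_id) = 0
print(run_check(rows, "customer_id", "duplicate_count", "=", 0))  # (True, 0)

# duplicate_percent(email) <= 0.1  (fails: roughly 33.3 is above 0.1)
print(run_check(rows, "email", "duplicate_percent", "<=", 0.1))
```

Returning the observed value alongside the pass/fail result makes a failed check easier to diagnose, since the reader can see how far the metric is from its threshold.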