
Set Up Field Health Monitors

Follow these steps to configure monitors for specific field tests. Field health monitors are available in On-Demand and Scheduled modes.


Last updated 2 months ago

NOTE: On-Demand monitors are not available for the "Cardinality" test.

Enabling Field Health Monitoring

To enable field health monitoring:

  1. Navigate to the Config landing page.

  2. Select the “Field Health” card.

You can also activate field health monitoring within the "Asset Details" section of the Data Catalog Module. This can be done via the "Monitors" tab.

For more detailed information, refer to Catalog: Add/Modify Monitor.

Once selected, you’ll be redirected to the “Create a New Monitor” form.

  • The “Create a New Monitor” form consists of two steps:

    Setup

    Configure

The form fields will become available as you select the mandatory options.

Step 1: Set-up

  • Choose the test type from the dropdown.

    • The following test types are available:

      1. Null%: Measures the percentage of null values in a column to identify data gaps or incomplete records.

      2. Not Null: Verifies that all values in a column are non-null, ensuring that critical fields contain valid data.

      3. Unique%: Calculates the percentage of unique values in a column, helping detect redundancy or duplication.

      4. Unique: Confirms whether all values in a column are unique, ensuring data consistency where required.

      5. Average: Computes the average of numerical values in a column, useful for analyzing trends or outliers.

      6. Min: Identifies the minimum value in a column to detect anomalies or validate data boundaries.

      7. Max: Identifies the maximum value in a column to verify data limits or detect errors.

      8. Cardinality: Measures the number of distinct values in a column, useful for understanding data variability or relationships.

      9. String Length: Validates the length of strings in a column to ensure compliance with formatting rules or constraints.

      10. Is Email: Checks if the values in a column conform to a valid email format, ensuring proper data integrity for email fields.

      11. Is UUID: Verifies whether the values in a column match a valid UUID (Universally Unique Identifier) format.

      12. Matches Regex: Validates values in a column against a user-defined regular expression to ensure they match specific patterns or rules.
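To make the definitions above concrete, here is a minimal Python sketch of how a few of these metrics could be computed over an in-memory column. It is an illustration only, not Decube's implementation: the function name `field_health_metrics` is hypothetical, and the choice of total row count as the denominator for Unique% is an assumption.

```python
from statistics import mean


def field_health_metrics(values):
    """Compute a few column-level metrics. None stands in for SQL NULL."""
    total = len(values)
    non_null = [v for v in values if v is not None]
    distinct = set(non_null)
    return {
        # Null%: share of rows whose value is NULL.
        "null_pct": 100.0 * (total - len(non_null)) / total if total else 0.0,
        # Unique%: distinct non-null values over all rows (assumed denominator).
        "unique_pct": 100.0 * len(distinct) / total if total else 0.0,
        # Cardinality: number of distinct non-null values.
        "cardinality": len(distinct),
        # Not Null: every row has a value.
        "not_null": len(non_null) == total,
        # Unique: no non-null value appears twice.
        "unique": len(distinct) == len(non_null),
        # Average, Min, Max: only meaningful for numeric columns.
        "average": mean(non_null) if non_null else None,
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }


metrics = field_health_metrics([10, 20, None, 20])
print(metrics["null_pct"], metrics["cardinality"])
```

In this sample, one of four rows is null and two distinct values remain, so a Null% monitor and a Cardinality monitor would track those two numbers over time.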

  • Important note for Synapse/SQL Server users:

When setting the threshold for the Matches Regex test:

• Synapse/SQL Server requires string-matching patterns instead of standard regex syntax.

• Ensure that the value entered in the “Set Threshold” field is a string pattern supported by Synapse/SQL Server.
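The difference between the two pattern styles can be sketched in Python. The example below assumes the string patterns in question are SQL Server `LIKE`-style patterns (`%` for any run of characters, `_` for one character, `[...]` for a character set); the helper `like_to_regex` is hypothetical and only shows how such a pattern relates to a regular expression. It ignores `ESCAPE` clauses and collation-dependent case sensitivity.

```python
import re


def like_to_regex(pattern: str) -> str:
    """Translate a LIKE-style pattern into an equivalent regular expression."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%":
            out.append(".*")  # % matches any run of characters
        elif ch == "_":
            out.append(".")  # _ matches exactly one character
        elif ch == "[" and pattern.find("]", i + 1) != -1:
            end = pattern.find("]", i + 1)
            # Character sets such as [0-9] or [^a-z] read the same in both styles.
            out.append("[" + pattern[i + 1:end] + "]")
            i = end
        else:
            out.append(re.escape(ch))  # everything else is a literal
        i += 1
    return "".join(out)


regex = like_to_regex("AB-[0-9][0-9]%")
print(bool(re.fullmatch(regex, "AB-42-x")))
print(bool(re.fullmatch(regex, "AB-4")))
```

A regex such as `^AB-\d{2}.*$` has no direct equivalent in this pattern style, so it would need to be rewritten as `AB-[0-9][0-9]%` before being entered as the threshold.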

If the preset field tests are not sufficient for the test you need, you can also create a test with a custom SQL script (see Custom SQL Monitor).

  • Select the data source. Filtering by schema is optional.

  • Select the dataset and column you want to monitor.

  • Choose Monitor mode: Scheduled or On-Demand

  • Enable “Grouped By” (if applicable) by toggling the switch.

  • Select the column for grouping and click Validate.

  • A success message (“Column is valid to be grouped by”) confirms validation.

  • Click “Proceed to Monitor Setup” to move to the next step.
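As an illustration of what grouping means for a field test, the sketch below computes Null% separately for each value of a grouping column. This is an assumption about the general idea rather than a description of Decube's internals; the function name `null_pct_by_group` and the sample column names are hypothetical.

```python
from collections import defaultdict


def null_pct_by_group(rows, group_col, value_col):
    """Return {group value: percentage of NULLs in value_col} for a list of row dicts."""
    totals = defaultdict(int)
    nulls = defaultdict(int)
    for row in rows:
        key = row[group_col]
        totals[key] += 1
        if row[value_col] is None:
            nulls[key] += 1
    return {key: 100.0 * nulls[key] / totals[key] for key in totals}


rows = [
    {"country": "MY", "email": "a@example.com"},
    {"country": "MY", "email": None},
    {"country": "SG", "email": "b@example.com"},
]
result = null_pct_by_group(rows, "country", "email")
print(result)
```

Here a problem confined to one group (half of the "MY" rows have no email) stays visible instead of being averaged away across the whole table.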

Step 2: Configure: Scheduled Monitor

  • Once you proceed to setup, you’ll reach the “Configure” page, where you can review your previous selections.

In the "Configure" popup, complete the required fields to save your preferences and set up the field health monitor. You can create multiple tests for each test type per column/table, and you can add a test name to differentiate the monitors you create. The fields include:

  • Monitor Name

  • Monitor Description (optional)

  • Row Creation: select one of the following options:

    • Timestamp (select a timestamp column from the dropdown)

    • SQL Expression (the expression must be validated when this option is chosen)

    • All Records

  • Enable Smart Training (optional): train your monitor on historical data to reduce the training period

  • Enable Auto Threshold (if Enable Machine Learning is activated)

  • Set Threshold (if Enable Machine Learning is deactivated)

When using SQL Expression, you must validate your query. Write the query in the dialect of your linked data source, for example:

Google BigQuery - CAST(your_string_column AS DATETIME)

PostgreSQL - your_timestamp_column::timestamp

When working with Google BigQuery, refer to the BigQuery documentation for further details.
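One common way to check an expression before relying on it is to ask the database to compile it without returning rows. The sketch below shows that idea with SQLite from the Python standard library standing in for the real data source; how Decube validates expressions is not documented here, and the function name `is_valid_expression` and the `events` table are hypothetical.

```python
import sqlite3


def is_valid_expression(conn, table, expression):
    """Return True if the expression compiles against the table in this dialect."""
    try:
        # LIMIT 0 makes the engine parse and plan the query without returning rows.
        # Interpolating strings like this is only acceptable for trusted input.
        conn.execute(f"SELECT {expression} FROM {table} LIMIT 0")
        return True
    except sqlite3.Error:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, created_at TEXT)")

ok = is_valid_expression(conn, "events", "datetime(created_at)")
bad = is_valid_expression(conn, "events", "datetime(missing_column)")
print(ok, bad)
```

An expression that is valid in one dialect, such as the PostgreSQL `::timestamp` cast, can fail this kind of check in another, which is why the dialect has to match the linked source.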

Smart Training requires Row Creation to be selected.

To activate Smart Training in Row Creation:

  • Select a timestamp column first.

  • If SQL Expression is chosen for row creation, the SQL expression must be validated.

  • To set custom alerts, first turn on the "Notify default channel" toggle. You can then specify the alert channels you want, such as email or Slack channels.

    • Select the desired alert channels in the dropdown.

    • Enter the email address or channel name in the field.

  • Finally, specify the Incident Level.

  • Click Submit to create the monitor.

  • Once the monitor is created, you will be redirected to the All Monitors tab.

Step 2: Configure: On-demand Monitor

Note: Key differences from scheduled monitors:

i. The "Frequency" field does not apply to On-Demand monitors and is omitted.

ii. The “Enable Smart Training” and "Auto Threshold" options are omitted when setting up an On-Demand monitor.

iii. Grouped By is not available in On-Demand monitor mode.

  • Select On-Demand as the Monitor Mode and click “Proceed to Monitor Setup”.

  • In the "Configure" popup, complete the required fields to save your preferences and set up the field health monitor. You can create multiple tests for each test type per column/table, and you can add a test name to differentiate the monitors you create. The fields include:

  • Monitor Name

  • Monitor Description (optional)

  • Row Creation: select one of the following options:

    • Timestamp (select a timestamp column from the dropdown)

    • SQL Expression (the expression must be validated when this option is chosen)

    • All Records

  • Lookback Period

  • Threshold

  • Incident Levels

Custom Notifications: Custom alerts can be configured as in scheduled monitors.
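To show how the Lookback Period and Threshold fields could interact, the sketch below limits a Null% check to rows inside a time window and compares the result to a maximum allowed value. This is an assumption about the semantics, not a confirmed description of Decube's behavior; the function names and sample columns are hypothetical.

```python
from datetime import datetime, timedelta


def null_pct_within_lookback(rows, ts_col, value_col, now, lookback):
    """Null% of value_col over rows whose timestamp falls inside the lookback window."""
    cutoff = now - lookback
    window = [r for r in rows if r[ts_col] >= cutoff]
    if not window:
        return None  # nothing to evaluate in this window
    nulls = sum(1 for r in window if r[value_col] is None)
    return 100.0 * nulls / len(window)


def breaches_threshold(metric, max_allowed):
    """True when a metric was computed and exceeds the allowed maximum."""
    return metric is not None and metric > max_allowed


now = datetime(2024, 1, 10)
rows = [
    {"created_at": datetime(2024, 1, 9), "email": None},
    {"created_at": datetime(2024, 1, 8), "email": "a@example.com"},
    {"created_at": datetime(2023, 12, 1), "email": None},  # outside a 7-day window
]
pct = null_pct_within_lookback(rows, "created_at", "email", now, timedelta(days=7))
print(pct, breaches_threshold(pct, 10.0))
```

With a 7-day lookback, only the two recent rows are scanned, so the older null row does not affect the result.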

Finalizing On-Demand Monitor Setup

  • After completing the required fields, finish the setup with one of the following buttons:

    • Save: Creates the monitor without running it. Use this option to set up an On-Demand monitor without running a scan immediately after creation.

    • Save and run: Creates the monitor and runs it immediately. To run the monitor again later, navigate to All Monitors.

    • After selecting either option, you will be redirected to the All Monitors tab.

Modify Monitoring

To modify an existing monitor:

  1. Click the ellipsis (︙) and select View Monitor.

  2. Click Run once to run the monitor manually.

For a detailed understanding of monitor modes, check out Available Monitor Modes.

Frequency: learn more in Custom Scheduling For Monitors.

Quality Dimension (optional): for more details, refer to the supported test types.

To modify a monitor, go to All Monitors.