Adding a new recon

Evaluate data differences between two datasets with the reconciliation feature's simple interface.


Last updated 22 days ago

The Data Recon feature will reach end-of-life on 15 May 2025. It will be deprecated and permanently removed from the platform after this date.

Please note that Data Recon is an opt-in module, because completing a recon requires us to egress some of your data. Please refer to our Data Policy to understand how we handle your data before proceeding.

The Data Recon module is where you can validate data differences between two selected datasets. The Data Recon main page lists all recon jobs that have been run in your organization.
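Conceptually, a recon matches rows between the two datasets on a shared primary key and flags rows that differ or exist on only one side. The sketch below illustrates that idea in Python with SQLite, using made-up table and column names; it is an illustration only, not decube's actual implementation.

```python
import sqlite3

# Two toy datasets sharing the primary key "id" (hypothetical names).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prod    (id INTEGER PRIMARY KEY, amount INTEGER);
    CREATE TABLE staging (id INTEGER PRIMARY KEY, amount INTEGER);
    INSERT INTO prod    VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO staging VALUES (1, 100), (2, 250), (4, 400);
""")

# Rows that differ, or that exist on only one side, matched on the key.
# (LEFT JOIN + UNION ALL emulates a full outer join for wide SQLite support.)
mismatches = conn.execute("""
    SELECT p.id, s.id, p.amount, s.amount
    FROM prod p LEFT JOIN staging s ON p.id = s.id
    WHERE s.id IS NULL OR p.amount <> s.amount
    UNION ALL
    SELECT NULL, s.id, NULL, s.amount
    FROM staging s LEFT JOIN prod p ON s.id = p.id
    WHERE p.id IS NULL
""").fetchall()

# Three mismatched rows: a value difference on id 2, id 3 missing from
# staging, and id 4 missing from prod.
for row in mismatches:
    print(row)
```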

To add a new recon job, click Add New Recon at the top right.

How to add a new data recon

To run a new recon, you will first need to add a recon configuration.

  1. Select a supported data source. You can also compare tables from two different sources by toggling Use separate data sources on the top right to enable the second data source selection.

  2. Once you have made the selection(s), search for each table by clicking on the field and typing in the name of the table.

  3. Once the tables are selected, select the primary keys that are shared between both tables.

  4. Optionally, select the datetime fields for the selected tables. This opens up the Recon duration selection, where you can tell our scanner how far back into your dataset to scan.

When a datetime field is not selected, we use a sampling mode that scans a sample of rows instead of performing a full scan, which keeps large datasets tractable. Regardless of configuration, we stop looking for reconciliation errors after finding 100,000 mismatched rows.

  5. You can also add conditions to your recon configuration. Use these to exclude rows that are out of scope for the comparison, or to limit the time period to be checked. We query your data using the conditions you set in a WHERE clause.

The WHERE keyword does not need to be included. Please ensure each condition is written in the SQL dialect of its respective source.

  6. Map the columns that you'd like to compare between both tables.

  7. Lastly, define a schedule to run your data recon, if required. You can set the following schedules: 1 Hour, 12 Hours, 1 Day, 7 Days, or 30 Days.

A new recon job will start and automatically appear on the Data Recon main page based on the recon configuration.

To stop further scheduled runs of a configuration, disable the toggle on any of the completed runs in Data Recon Details.
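To illustrate the condition field described above: a condition is the body of a WHERE clause, typed without the WHERE keyword. The sketch below shows the effect in Python with SQLite, using a hypothetical orders table; the actual query decube issues depends on your source and its SQL dialect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, created_at TEXT);
    INSERT INTO orders VALUES
        (1, 'COMPLETED', '2024-03-01'),
        (2, 'PENDING',   '2024-03-02'),
        (3, 'COMPLETED', '2023-12-31');
""")

# The condition as typed in the UI: no leading WHERE keyword.
condition = "status = 'COMPLETED' AND created_at >= '2024-01-01'"

# The platform wraps the condition in a WHERE clause when querying
# (string interpolation here is for illustration only).
rows = conn.execute(f"SELECT id FROM orders WHERE {condition}").fetchall()
print(rows)  # only order 1 is in scope for the comparison
```

Rows that fail the condition (a pending order, and one outside the time period) are simply excluded from the comparison on that side.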
