Asset Report: Data Quality Scorecard

This report shows the data quality scoring for the supported Field Health monitors that have been configured.

Example output of generated report

The report output shows results for each monitor that was enabled, based on the selected time period. For example, if you had enabled a Not Null monitor on a column, it will show the DQ score for that monitor over the selected time period.

The output shown in the UI is limited to a preview of 25 rows. If the generated report is longer than 25 rows, you will need to download it as a CSV to see the entire output.

The output CSV will include the following columns:

  • report_generation_date: The date the report was run on the platform.

  • data_owner: The currently assigned owner in the Data Catalog.

  • qual_id: The fully qualified name of the backend object inside the Catalog.

  • source, database, collection, dataset, column: Name of the Catalog object and where it originates from.

  • Tags: Any tags that are added to the object. Multiple tags are separated by ";".

  • Dimension: The dimension that is associated with the monitor type.

Dimensions include: Validity, Completeness, Accuracy, Uniqueness, Timeliness, Consistency, Granularity, Others.

  • Monitor_name & monitor_description: The name and description added by the user to the monitor (currently supported on Data Contracts only).

  • Monitor_id: Unique identifier for the added monitor.

  • Monitor_mode: Shows whether the monitor was run on a schedule or on-demand to collect the metrics.

  • Filter_mode: The monitor's scanning configuration, either incremental scanning (timestamp, sql expression) or the entire table (all records). This is important for accurately calculating the DQ score per monitor, since the score to use depends on the filter_mode (see more info below).

  • Agg_error_row_count: The aggregated count of rows with erroneous records (for incremental scans).

  • Agg_total_row_count: The aggregated total count of rows that were scanned over the selected time period (for incremental scans).

  • Agg_dq_score: The DQ score based on the ratio of agg_error_row_count to agg_total_row_count.

  • Latest_error_row_count: The count of rows with erroneous records (for all-records scans).

  • Latest_total_row_count: The total count of rows that were scanned in the latest scan (for all-records scans).

  • Latest_dq_score: The DQ score based on the ratio of latest_error_row_count to latest_total_row_count.

Where the configured filter_mode is timestamp or sql expression, it is recommended to take the agg_dq_score as the DQ score for the monitor, as this is the average DQ score across all scans that were done incrementally over the time period.

Where filter_mode is all records, meaning the entire table is scanned, the latest_dq_score should be taken instead, as it reflects the metrics from the last scan performed.
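As a practical illustration, the minimal sketch below reads a downloaded scorecard CSV and selects the recommended score per monitor according to its filter_mode. The file name dq_scorecard.csv and the lower-case column headers are assumptions for the example; check the headers in your own export.

```python
import csv

# Minimal sketch: pick the recommended DQ score per monitor from a downloaded
# scorecard CSV. File name and lower-case header names are assumptions;
# adjust them to match your actual export.
def effective_dq_score(row: dict) -> float | None:
    filter_mode = (row.get("filter_mode") or "").strip().lower()
    if filter_mode in ("timestamp", "sql expression"):
        score = row.get("agg_dq_score")      # average across incremental scans
    else:                                    # e.g. "all records" (full-table scan)
        score = row.get("latest_dq_score")   # metrics from the last scan only
    return float(score) if score not in (None, "") else None

with open("dq_scorecard.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        print(row.get("monitor_id"), row.get("column"), effective_dq_score(row))
```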

How to generate the report

By default, users will need to select the Data Source, start date and end date to generate the report. Without adding any filters, the output .csv will include all the monitors that have been enabled in the data source.

However, users can additionally add filters to the report to narrow down the output (a sketch for applying similar filtering to a downloaded CSV follows this list). These filters include:

  • Add Schemas or Add Tables: This limits the output of the CSV to the selected schemas or tables only.

  • Quality dimensions: Limit the output to only the test types with the selected Quality Dimensions.

  • Add an asset owner: Limit the output to only objects where this user is designated as the asset owner.

  • Filter by tags: Limit the output of the CSV to only objects with the selected tags.

  • Filter by classifications: Limit the output of the CSV to only objects that have classifications added to them.
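
If you prefer to narrow down a report you have already downloaded, the sketch below applies comparable owner, tag, and dimension filtering to the CSV locally. The column headers, file name, and example values are assumptions for illustration, not the product's API.

```python
import csv

# Minimal sketch: narrow a downloaded scorecard CSV by owner, tag and
# quality dimension. Header names, file name and example values are
# assumptions for illustration only.
def matches(row: dict, owner: str | None = None,
            tag: str | None = None, dimension: str | None = None) -> bool:
    if owner and row.get("data_owner") != owner:
        return False
    if tag and tag not in (row.get("tags") or "").split(";"):
        return False
    if dimension and row.get("dimension") != dimension:
        return False
    return True

with open("dq_scorecard.csv", newline="", encoding="utf-8") as f:
    rows = [r for r in csv.DictReader(f)
            if matches(r, owner="jane.doe@example.com", dimension="Completeness")]

print(f"{len(rows)} monitors match the selected filters")
```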

Supported Monitors with default Quality Dimension

Test Type          | Dimension    | Current Feature Availability
Is Unique          | Uniqueness   | Config & Data Mesh > Data Contract
Regex              | Accuracy     | Config & Data Mesh > Data Contract
Is Email           | Validity     | Config & Data Mesh > Data Contract
Is UUID            | Validity     | Config & Data Mesh > Data Contract
Not Null           | Completeness | Config & Data Mesh > Data Contract
Value is           | Validity     | Data Mesh > Data Contract
Value in           | Accuracy     | Data Mesh > Data Contract
Date in the past   | Validity     | Data Mesh > Data Contract
Date in the future | Validity     | Data Mesh > Data Contract

Monitors that have been enabled on your account but are not listed here will not be included in the report for DQ scoring.
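
For reference, the default dimensions from the table above can be expressed as a simple lookup, for example to check which of your configured monitors will contribute to DQ scoring. This is only a convenience sketch; the table itself is the authoritative mapping.

```python
# Default Quality Dimension per supported test type, taken from the table above.
DEFAULT_DIMENSION = {
    "Is Unique": "Uniqueness",
    "Regex": "Accuracy",
    "Is Email": "Validity",
    "Is UUID": "Validity",
    "Not Null": "Completeness",
    "Value is": "Validity",
    "Value in": "Accuracy",
    "Date in the past": "Validity",
    "Date in the future": "Validity",
}

def is_scored(test_type: str) -> bool:
    # Monitors whose test type is not listed above are excluded from DQ scoring.
    return test_type in DEFAULT_DIMENSION
```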

Test_type: The type of test configured for the monitor; see the supported test types listed in the table above.

Example of output CSV