Quality
The Quality tab provides a detailed view of data quality across various dimensions, supported by different test types. Here’s what you can expect:
Quality Dimensions and Test Types
Our platform currently supports eight quality dimensions. The first four are associated by default with specific test types:
Accuracy: Measures how close the data values are to the true values. Tests include “Regex” and “Value in.”
Completeness: Measures the extent to which all required data elements are present. Tests include “Not Null.”
Uniqueness: Checks each data record to ensure it is unique within the dataset. Tests include “Is Unique.”
Validity: Ensures data conforms to acceptable standards, such as ranges and formats. Tests include “Is Email” and “Is UUID.”
Timeliness: Measures how up-to-date the data is.
Consistency: Measures reliability and uniformity of data within datasets.
Granularity: Measures level of detail or the degree of aggregation present.
Others: Covers any tests that do not fall under the categories above.
You can customize which dimension each monitor is associated with, as long as the monitor is a supported type that outputs a Data Quality score.
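As a rough illustration, the default associations listed above can be written as a simple lookup table. This is only a sketch: the dictionary name, the helper function, and the choice to fall back to "Others" for unlisted test types are assumptions made for the example, not the platform's actual configuration format.

```python
# Default dimension-to-test associations, taken from the list above.
# The dictionary layout and helper name are illustrative assumptions.
DEFAULT_TESTS_BY_DIMENSION = {
    "Accuracy": ["Regex", "Value in"],
    "Completeness": ["Not Null"],
    "Uniqueness": ["Is Unique"],
    "Validity": ["Is Email", "Is UUID"],
    "Timeliness": [],
    "Consistency": [],
    "Granularity": [],
    "Others": [],
}


def dimension_for_test(test_type, mapping=DEFAULT_TESTS_BY_DIMENSION):
    """Return the dimension a test type is associated with.

    Falls back to "Others" when the test type is not listed (an assumption
    based on the description of the Others dimension).
    """
    for dimension, tests in mapping.items():
        if test_type in tests:
            return dimension
    return "Others"


# A customized mapping: move "Regex" from Accuracy to Validity.
custom_mapping = {
    **DEFAULT_TESTS_BY_DIMENSION,
    "Accuracy": ["Value in"],
    "Validity": ["Is Email", "Is UUID", "Regex"],
}
```

With this sketch, `dimension_for_test("Regex")` resolves to Accuracy under the defaults and to Validity under `custom_mapping`.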

The Data Quality (DQ) score is calculated daily using the formula:
DQ Score = (1 - Error Rows / Total Rows) × 100
Where:
Error Rows: Total number of rows that failed the data quality check
Total Rows: Total number of rows scanned by the monitor
For example, if a “Not Null” test finds 10 null rows out of 1000 total rows, the score would be:
(1 - 10 / 1000) × 100 = 99%
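The calculation can be sketched in a few lines of Python. This assumes the score is the percentage of scanned rows that passed the check, which matches the definitions of Error Rows and Total Rows given here; the function name `dq_score` and its input validation are illustrative only.

```python
def dq_score(error_rows: int, total_rows: int) -> float:
    """Percentage of scanned rows that passed the data quality check.

    Assumed formula: (1 - error_rows / total_rows) * 100, written so that
    the division happens last to avoid floating-point noise.
    """
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    if not 0 <= error_rows <= total_rows:
        raise ValueError("error_rows must be between 0 and total_rows")
    return 100 * (total_rows - error_rows) / total_rows


# The "Not Null" example: 10 null rows out of 1000 scanned rows.
print(dq_score(10, 1000))  # 99.0
```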
Per Dimension (Health Score)
Each DQ dimension (e.g., Accuracy, Completeness, Validity) groups multiple monitors.
To calculate the score for a dimension:
Sum up error rows across all monitors under the dimension.
Sum up total scanned rows across those monitors.
Apply the same DQ score formula to the summed values:
Dimension Score = (1 - Total Error Rows / Total Scanned Rows) × 100
This gives a weighted average, ensuring larger scans influence the score more than small ones.
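The steps above can be sketched as follows. The function name and the representation of each monitor as an `(error_rows, total_rows)` pair are assumptions for illustration. Returning `None` when nothing was scanned is also an assumption, chosen so that such dimensions can be left out of later averages.

```python
from typing import Iterable, Optional, Tuple


def dimension_score(monitors: Iterable[Tuple[int, int]]) -> Optional[float]:
    """Score for one dimension from (error_rows, total_rows) pairs.

    Sums error rows and scanned rows across all monitors first, then applies
    the DQ score formula once, so larger scans carry more weight.
    """
    total_errors = 0
    total_scanned = 0
    for error_rows, total_rows in monitors:
        total_errors += error_rows
        total_scanned += total_rows
    if total_scanned == 0:
        return None  # no scanned rows: no score for this dimension
    return 100 * (total_scanned - total_errors) / total_scanned


# One large scan (10 errors in 1000 rows) and one small scan (5 errors in 10 rows).
score = dimension_score([(10, 1000), (5, 10)])
print(round(score, 2))  # 98.51
```

A plain average of the two monitor scores (99% and 50%) would give 74.5%, so the row-weighted result stays much closer to the score of the larger scan.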
Overall Data Health Score
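The final DQ Health Score (shown at the top of the dashboard) is the average of the dimension scores:
DQ Health Score = Sum of Dimension Scores / Number of Dimensions with Scanned Rows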
Dimensions without any scanned rows are excluded from the average.
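A minimal sketch of this averaging step, assuming a plain (unweighted) mean across dimensions and using `None` to mark a dimension with no scanned rows. The function name and the dictionary input are hypothetical.

```python
from typing import Mapping, Optional


def overall_health_score(
    dimension_scores: Mapping[str, Optional[float]],
) -> Optional[float]:
    """Average the dimension scores, skipping dimensions with no scanned rows.

    A value of None marks a dimension without scanned rows (an assumed
    convention for this example).
    """
    scored = [s for s in dimension_scores.values() if s is not None]
    if not scored:
        return None  # nothing was scanned in any dimension
    return sum(scored) / len(scored)


scores = {"Accuracy": 99.0, "Completeness": 97.0, "Timeliness": None}
print(overall_health_score(scores))  # 98.0
```

Timeliness has no scanned rows here, so the result is the mean of the two remaining dimensions rather than of all three.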

The Data Health Score represents the average score for all dimensions over the selected time period. The scores are color-coded for easy interpretation:
• Green (> 98%): Excellent health
• Yellow (95% - 98%): At risk
• Red (< 95%): Poor health
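The color thresholds can be expressed as a small helper. This is a sketch only: the function name is hypothetical, and treating scores of exactly 95% and 98% as Yellow is an assumption read from the ranges listed above.

```python
def health_color(score: float) -> str:
    """Map a health score (in percent) to its dashboard color."""
    if score > 98:
        return "Green"   # excellent health
    if score >= 95:
        return "Yellow"  # at risk (95% to 98%, boundaries included)
    return "Red"         # poor health


for value in (99.0, 98.0, 94.9):
    print(value, health_color(value))
```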
Custom Date Range and Filters
The custom date range supports up to six months, allowing for in-depth analysis across a full quarter or longer. The Quality dashboard also includes various filters to help you narrow down your data view, such as:
• Domains
• Data sources
• Data Owners
• Monitor mode (Scheduled, On-demand)
• Row creation preferences (filter for 'All Records' scan only)
• Tags
• Classifications

Source/Domain Summary
The Source/Domain Summary in the Quality tab provides results based on selected domains and shows scores for key quality metrics. This helps you gain a deeper understanding of your data’s health across different data sources and domains, making it easier to pinpoint areas for improvement.
