# Incidents Overview

## Transform Reactive Incident Response into Proactive Data Quality Monitoring

**Decube's Data Quality module** empowers your team to shift from reactive incident response to proactive data trust. Our ML-powered monitoring system detects anomalies before they impact downstream systems, AI/ML models, and business decisions.

### Why Data Quality Monitoring Matters

* **🚨 Early Detection**: Catch data issues before they cascade downstream
* **🤖 AI-Ready Data**: Ensure clean inputs for accurate ML models and AI systems
* **📊 Business Confidence**: Make decisions based on trusted, validated data
* **⚡ Faster Resolution**: Automated alerts enable immediate response to quality issues

***

## Getting Started with Data Quality

### 🚀 Quick Setup (15 minutes)

1. [**Enable Asset Monitoring**](https://docs.decube.io/data-quality/enable-asset-monitoring) - Start monitoring your critical tables
2. [**Set Up Alert Notifications**](https://docs.decube.io/alert-notifications/notification-alerts) - Get notified when issues occur
3. [**Review Incidents**](#understanding-incidents) - Learn to manage quality issues

***

## Monitor Types & Capabilities

### 🔍 Available Monitor Types

**Core Monitoring:**

* [**Freshness**](https://docs.decube.io/data-quality/how-to-set-up-monitors/set-up-freshness-and-volume-monitors) - Detect when data stops updating
* [**Volume**](https://docs.decube.io/data-quality/how-to-set-up-monitors/set-up-freshness-and-volume-monitors) - Monitor row count changes and anomalies
* [**Field Health**](https://docs.decube.io/data-quality/how-to-set-up-monitors/set-up-field-tests) - Validate data quality at the column level
* [**Schema Drift**](https://docs.decube.io/data-quality/how-to-set-up-monitors/set-up-schema-drift-monitors) - Alert on table structure changes

**Advanced Monitoring:**

* [**Custom SQL**](https://docs.decube.io/data-quality/how-to-set-up-monitors/custom-sql-monitors) - Write custom validation logic for specific business rules
* [**Job Failure**](https://docs.decube.io/data-quality/how-to-set-up-monitors/set-up-data-job-job-failure-monitors) - Monitor ETL pipeline job execution
* [**Grouped-By**](https://docs.decube.io/data-quality/how-to-set-up-monitors/set-up-grouped-by-monitors) - Segment monitoring by dimension values

### 🎯 Monitor Modes

* **Scheduled**: Continuous monitoring with configurable frequency
* **On-Demand**: Manual execution for ad-hoc validation

<figure><img src="https://1779874722-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTw0qpCVzfrIXqS4FEg4T%2Fuploads%2Fgit-blob-38918ecce4e5766b9f8e91f810e95a719f2aebbf%2Fimage.png?alt=media" alt=""><figcaption><p>Incident Overview</p></figcaption></figure>

## Understanding Incidents

**Data Quality incidents** are automatically triggered when monitors detect anomalies or threshold violations. Each incident provides detailed context to help you understand and resolve data quality issues quickly.

### Incident Types

Our monitoring system categorizes incidents into six main types:

| **Type**         | **Purpose**                  | **Use Case**                                               |
| ---------------- | ---------------------------- | ---------------------------------------------------------- |
| **Freshness**    | Data update delays           | Critical for real-time dashboards and daily reports        |
| **Volume**       | Unexpected row count changes | Detect missing data loads or data pipeline issues          |
| **Field Health** | Column-level data quality    | Validate nulls, uniqueness, ranges, and patterns           |
| **Schema Drift** | Table structure changes      | Prevent downstream application failures                    |
| **Custom SQL**   | Business rule violations     | Monitor complex business logic and data relationships      |
| **Job Failure**  | ETL pipeline failures        | Ensure data transformation processes complete successfully |

<figure><img src="https://1779874722-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTw0qpCVzfrIXqS4FEg4T%2Fuploads%2Fgit-blob-94ca16942af580f5c5c9323422d330a546b3169c%2Fimage.png?alt=media" alt=""><figcaption><p>Incident Overview</p></figcaption></figure>

### Incident Details

{% embed url="<https://www.loom.com/share/a1d5912d5d48409cabda52dd65483b46?sid=bc50de8d-b7d1-43ec-b65f-ce4c0054b967>" %}

### Incident Management Workflow

### Incident Overview

The incident dashboard provides a consolidated view of all data quality issues across your organization.

<figure><img src="https://1779874722-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTw0qpCVzfrIXqS4FEg4T%2Fuploads%2Fgit-blob-38918ecce4e5766b9f8e91f810e95a719f2aebbf%2Fimage.png?alt=media" alt=""><figcaption><p>Incident Overview Dashboard</p></figcaption></figure>

### Incident Details & Investigation

Selecting any incident in the Data Quality module opens the **Incident Details** page, which gives you deeper context on the chosen incident and its historical trend.

**Key Features:**

* **📋 Assignee Management**: Assign incidents to team members for accountability
* **📈 Historical Trends**: View patterns and frequency of similar incidents
* **🔍 Root Cause Analysis**: Access detailed metrics and context
* **📝 Audit Trail**: Track all actions and changes in the incident lifecycle

<figure><img src="https://1779874722-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTw0qpCVzfrIXqS4FEg4T%2Fuploads%2Fgit-blob-e6a367c964a7cd4d6a168bf79c3fa0797fbade36%2Fimage.png?alt=media" alt=""><figcaption><p>Example of a Custom SQL incident — Add Assignee functionality, with actions tracked in Audit History</p></figcaption></figure>

On the Incident Details page, you can add an Assignee to the selected incident. This action, like other actions on the incident, is logged and can be reviewed in the **Audit History** section at the bottom right.

### Incident Status Management

When a monitor raises an incident, the incident starts in the `open` status. You can then choose to either:

* **`Close`** an incident when resolved
* **`Mute`** it for a specified time period to prevent alert fatigue

Muting ensures you don't get duplicate alerts when another incident is triggered on the same table/column. Incidents are automatically unmuted after the time period you have set.
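
The status rules above can be pictured as a small state machine. The following is a minimal sketch, not Decube's implementation: the `Incident` class, its field names, and the assumption that an unmuted incident returns to `open` are all hypothetical and used only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    """Hypothetical in-memory model of an incident (not Decube's schema)."""
    status: str = "open"                   # "open", "muted", or "closed"
    muted_until: Optional[datetime] = None

    def mute(self, duration: timedelta, now: datetime) -> None:
        """Mute for a fixed period; closed incidents cannot change status."""
        if self.status == "closed":
            raise ValueError("closed incidents cannot change status")
        self.status = "muted"
        self.muted_until = now + duration

    def close(self) -> None:
        """Mark the incident as resolved."""
        self.status = "closed"
        self.muted_until = None

    def refresh(self, now: datetime) -> None:
        """Unmute automatically once the mute period has elapsed.

        Assumption: an unmuted incident goes back to "open".
        """
        if (
            self.status == "muted"
            and self.muted_until is not None
            and now >= self.muted_until
        ):
            self.status = "open"
            self.muted_until = None

    def should_alert(self, now: datetime) -> bool:
        """Only open incidents produce alerts; muted and closed ones stay silent."""
        self.refresh(now)
        return self.status == "open"


start = datetime(2024, 1, 1, 9, 0)
incident = Incident()
incident.mute(timedelta(days=1), start)
print(incident.should_alert(start + timedelta(hours=12)))  # False: still muted
print(incident.should_alert(start + timedelta(days=1)))    # True: mute expired
```

In this sketch the expiry check runs lazily whenever an alert is about to be sent, which avoids needing a background timer; how Decube schedules the automatic unmute is not specified here.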

<figure><img src="https://1779874722-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTw0qpCVzfrIXqS4FEg4T%2Fuploads%2Fgit-blob-4593ad1ad6ac97adaeafdf41e15565a3f14c0d1f%2Fimage.png?alt=media" alt=""><figcaption><p>Example of a Custom SQL incident that is currently open, with its status shown in the right panel</p></figcaption></figure>

{% hint style="info" %}
To view open, closed, or muted incidents, click 'Apply Filters' on the Incidents Overview page and select the appropriate checkboxes under 'Incident Status'.
{% endhint %}

<figure><img src="https://1779874722-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTw0qpCVzfrIXqS4FEg4T%2Fuploads%2Fgit-blob-4a28ac9466b68fe426dc46f4d06183c09236e08f%2Fimage.png?alt=media" alt=""><figcaption><p>Filter incidents by status</p></figcaption></figure>

#### Bulk Update Incident Status <a href="#bulk-update-incident-status" id="bulk-update-incident-status"></a>

You can now manage multiple incidents at once from the 'Incident Overview' page without opening each one individually.

<figure><img src="https://1779874722-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTw0qpCVzfrIXqS4FEg4T%2Fuploads%2Fgit-blob-9a21cab76e8ae66761cd6d8231f40d8296ae133f%2Fimage%20(280).png?alt=media" alt=""><figcaption></figcaption></figure>

* A checkbox column appears at the left of the incidents table.
* Check individual rows to select specific incidents, or use the header checkbox to select all incidents currently loaded on the page.
* Selected rows are highlighted for easy visual tracking.
* Filtering or searching within the page preserves your current selection unless you navigate out of the 'Incidents Overview' page.
* The top-left of the table always shows how many incidents are currently selected, e.g. `3 incidents selected`.
* Use the **Clear selection** button to deselect everything and start fresh.

{% hint style="info" %}
**Note:** Incidents from assets you don't have edit access to will display a lock icon instead of a checkbox and cannot be selected for bulk actions.
{% endhint %}

**Performing a bulk action**

Once at least one editable incident is selected, the **Perform bulk action** button activates. Click it to open the bulk update modal, where you can choose a target status to apply across all selected incidents:

* Unmute incident
* Close incident
* Mute for 1 day
* Mute for 1 week
* Mute for 1 month

Before you confirm, Decube previews exactly what will happen to each group of incidents, including which will be updated and which will be skipped:

* Incidents already in the target state are skipped automatically. The exception is a "Muted to Muted" transition, which resets the mute to the newly selected duration.
* Closed incidents cannot be re-opened or muted via bulk action, because no status changes can be made to closed incidents.
* The **Confirm changes** button only activates when at least one incident will actually be changed.
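
The preview rules above amount to a simple classification of the selected incidents. The sketch below is illustrative only and is not Decube's code: the action keys, the `preview_bulk_action` and `can_confirm` helpers, and the assumption that "Unmute incident" targets the `open` status are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from each bulk action to the status it produces.
# Assumption: "Unmute incident" returns an incident to "open".
TARGET_STATUS: Dict[str, str] = {
    "unmute": "open",
    "close": "closed",
    "mute_1_day": "muted",
    "mute_1_week": "muted",
    "mute_1_month": "muted",
}


def preview_bulk_action(
    statuses: Dict[str, str], action: str
) -> Tuple[List[str], List[str]]:
    """Split incident IDs into (will_update, will_skip) for one bulk action."""
    target = TARGET_STATUS[action]
    will_update: List[str] = []
    will_skip: List[str] = []
    for incident_id, status in statuses.items():
        if status == "closed":
            # Closed incidents never change status.
            will_skip.append(incident_id)
        elif status == target and target != "muted":
            # Already in the target state, so nothing to do.
            will_skip.append(incident_id)
        else:
            # Real transitions, plus muted -> muted (resets the duration).
            will_update.append(incident_id)
    return will_update, will_skip


def can_confirm(will_update: List[str]) -> bool:
    """Confirming is allowed only when at least one incident would change."""
    return len(will_update) > 0


selected = {"INC-1": "open", "INC-2": "muted", "INC-3": "closed"}
for action in ("mute_1_week", "unmute"):
    updates, skips = preview_bulk_action(selected, action)
    print(action, "update:", updates, "skip:", skips, "confirm:", can_confirm(updates))
```

With this sample selection, muting for a week would update the open and the muted incident and skip the closed one, while unmuting would update only the muted incident.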

<figure><img src="https://1779874722-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTw0qpCVzfrIXqS4FEg4T%2Fuploads%2FSkjDZJ3FZAmnttaFSBcq%2FFrame%201.png?alt=media&#x26;token=05ed0326-d489-455d-8884-17320568a161" alt=""><figcaption></figcaption></figure>

{% hint style="warning" icon="triangle-exclamation" %}
**Warning:** Bulk updating incident statuses cannot be undone once confirmed.
{% endhint %}

**Traceability**

Each updated incident's audit history logs the action with a `via bulk action` label.

**Limits**

* A maximum of 1,000 incidents can be selected at one time.
* Selection resets when you navigate away from the Incident Overview page.

### Historical Analysis

The **History** tab lists past monitor runs, including the metrics recorded by successful scans. It is a quick way to identify the scans that failed, along with the values that caused the failures.

**Use Cases:**

* **📊 Pattern Recognition**: Identify recurring issues and trends
* **⚖️ Threshold Validation**: Verify if alert thresholds are appropriate
* **🔄 Root Cause Analysis**: Understand what changed to cause failures

<figure><img src="https://1779874722-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTw0qpCVzfrIXqS4FEg4T%2Fuploads%2Fgit-blob-dae0643d9c2687f1f29cb74fddcf766388b990b1%2Fimage.png?alt=media" alt=""><figcaption><p>Example of a History section for a Field Health incident.</p></figcaption></figure>

### Impact Assessment

Using the downstream lineage, Decube generates a list of impacted assets: the downstream tables, jobs, and dashboards that may be affected by the incident. You can export this list as a CSV and send it to the respective owners designated in the Catalog.

**Key Benefits:**

* **🎯 Targeted Communication**: Know exactly who to notify about data issues
* **📈 Business Impact**: Understand which reports and dashboards may be affected
* **⚡ Faster Resolution**: Prioritize fixes based on downstream impact
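
Conceptually, building the impact list described above is a traversal of the lineage graph starting at the affected asset. The following is a minimal sketch under stated assumptions, not Decube's implementation: the lineage dictionary, the asset names, the owner addresses, and the CSV columns are all made up for illustration, and the real export may contain different fields.

```python
import csv
import io
from collections import deque
from typing import Dict, List


def downstream_assets(lineage: Dict[str, List[str]], source: str) -> List[str]:
    """Breadth-first walk returning every asset downstream of `source`."""
    seen = {source}
    ordered: List[str] = []
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:       # guard against cycles and duplicates
                seen.add(child)
                ordered.append(child)
                queue.append(child)
    return ordered


def impact_csv(assets: List[str], owners: Dict[str, str]) -> str:
    """Render the impacted assets and their owners as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["asset", "owner"])  # hypothetical column names
    for asset in assets:
        writer.writerow([asset, owners.get(asset, "")])
    return buffer.getvalue()


# Hypothetical lineage: each key feeds the assets listed in its value.
lineage = {
    "raw.orders": ["stg.orders"],
    "stg.orders": ["mart.revenue", "job.daily_export"],
    "mart.revenue": ["dash.revenue"],
}
owners = {"mart.revenue": "finance@example.com", "dash.revenue": "bi@example.com"}

impacted = downstream_assets(lineage, "raw.orders")
print(impact_csv(impacted, owners))
```

Breadth-first order lists the closest downstream assets first, which is one reasonable way to prioritize who to notify.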

<figure><img src="https://1779874722-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTw0qpCVzfrIXqS4FEg4T%2Fuploads%2Fgit-blob-06a670e98385f0e909aadcaaac3f43b1edb14b99%2Fimage.png?alt=media" alt=""><figcaption><p>Example of the list of assets impacted by the Custom SQL incident</p></figcaption></figure>

<figure><img src="https://1779874722-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTw0qpCVzfrIXqS4FEg4T%2Fuploads%2Fgit-blob-b56ff4cd9268bf17f89280ad2df0199d16684247%2Fimage.png?alt=media" alt=""><figcaption><p>Continued list of assets impacted by the Custom SQL incident</p></figcaption></figure>

***

## Advanced Features & Configuration

### 🛠️ Setup & Configuration

**Essential Setup Pages:**

{% content-ref url="enable-asset-monitoring" %}
[enable-asset-monitoring](https://docs.decube.io/data-quality/enable-asset-monitoring)
{% endcontent-ref %}

{% content-ref url="config-settings" %}
[config-settings](https://docs.decube.io/data-quality/config-settings)
{% endcontent-ref %}

**Monitor Configuration:**

{% content-ref url="monitor-configuration-settings/custom-scheduling-for-monitors" %}
[custom-scheduling-for-monitors](https://docs.decube.io/data-quality/monitor-configuration-settings/custom-scheduling-for-monitors)
{% endcontent-ref %}

### 📊 Reporting & Analytics

{% content-ref url="../reports/asset-report-data-quality-scorecard" %}
[asset-report-data-quality-scorecard](https://docs.decube.io/reports/asset-report-data-quality-scorecard)
{% endcontent-ref %}

{% content-ref url="../dashboard/health-score" %}
[health-score](https://docs.decube.io/dashboard/health-score)
{% endcontent-ref %}

### 🔧 Advanced Topics

{% content-ref url="incident-model-feedback" %}
[incident-model-feedback](https://docs.decube.io/data-quality/incident-model-feedback)
{% endcontent-ref %}

## Best Practices & Tips

### 🎯 Getting the Most from Data Quality Monitoring

**Start Small, Scale Gradually:**

1. **Begin with critical tables** - Focus on business-critical data assets first
2. **Use default settings** - Start with out-of-the-box configurations
3. **Monitor feedback** - Adjust thresholds based on false positive rates
4. **Expand coverage** - Gradually add more tables and custom rules

**Alerting Strategy:**

* **⚠️ High Priority**: Real-time alerts for business-critical data
* **📧 Medium Priority**: Daily digest for operational monitoring
* **📋 Low Priority**: Weekly reports for trend analysis

**Team Collaboration:**

* **👤 Assign owners** to critical data assets and monitors
* **📝 Document context** in incident descriptions and monitor names
* **🔄 Regular reviews** of monitor effectiveness and threshold accuracy

### 💡 Pro Tips

* **Use Custom SQL monitors** for complex business rule validation
* **Set up Grouped-By monitoring** for dimension-based quality checks
* **Leverage Smart Training** to reduce false positives
* **Export incident impact lists** to communicate with stakeholders

***

**Need Help?** Contact our support team at <support@decube.io> for help with data quality monitoring setup and optimization.
