# How Anomaly Detection Works

Understanding how Decube's anomaly detection operates helps you configure monitors that behave predictably and interpret incidents accurately.

## What a monitor is

A monitor represents one test applied to one asset. Each monitor produces its own independent incident stream — if you apply three different tests to the same table column, you have three monitors, each of which can open and close incidents independently.

## How the ML model learns

For monitors that use **Smart Training**, Decube runs an ML model against the asset's historical data to learn the normal range for a given metric. The model builds a confidence interval — an expected upper and lower bound — for each scan point. When a new scan falls outside that interval, Decube opens an incident.

The confidence interval widens or narrows based on the **Sensitivity** setting you choose. See [Sensitivity](#sensitivity) below.
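The incident check described above can be sketched in a few lines. This is a minimal illustration, not Decube's implementation: the `ConfidenceInterval` class and `is_anomalous` function are hypothetical names, and treating values exactly on a bound as normal is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfidenceInterval:
    """Expected lower and upper bound for one scan point (hypothetical type)."""
    lower: float
    upper: float


def is_anomalous(value: float, interval: ConfidenceInterval) -> bool:
    """Return True when a scanned value falls outside the expected bounds.

    Assumption: a value exactly on a bound counts as normal.
    """
    return value < interval.lower or value > interval.upper


if __name__ == "__main__":
    expected = ConfidenceInterval(lower=950.0, upper=1050.0)
    print(is_anomalous(1000.0, expected))  # inside the interval
    print(is_anomalous(1200.0, expected))  # outside the interval
```

A scan value of 1000 sits inside the example interval and would not open an incident, while 1200 would.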

### Historical lookback by scan frequency

When a new monitor is created (or retrained), the model collects historical data to train on. The amount of history collected depends on the scan frequency you configure:

| Scan frequency | Historical lookback |
| -------------- | ------------------- |
| Hourly         | 7 days              |
| Every 6 hours  | 30 days             |
| Every 12 hours | 60 days             |
| Daily          | 192 days            |
| Weekly         | 395 days            |

Monitors scan on a schedule after training is complete. During the training period, the monitor is visible in **All Monitors** but does not produce incidents.
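To get a feel for how much training data each lookback window can hold, the sketch below divides each lookback by its scan interval. It assumes one data point per scan interval, which this page does not confirm; the `LOOKBACK` mapping and `max_training_points` function are illustrative names only.

```python
from datetime import timedelta

# Scan frequency -> (scan interval, historical lookback), taken from the table above.
LOOKBACK = {
    "hourly": (timedelta(hours=1), timedelta(days=7)),
    "every_6_hours": (timedelta(hours=6), timedelta(days=30)),
    "every_12_hours": (timedelta(hours=12), timedelta(days=60)),
    "daily": (timedelta(days=1), timedelta(days=192)),
    "weekly": (timedelta(weeks=1), timedelta(days=395)),
}


def max_training_points(frequency: str) -> int:
    """Upper bound on scan points in the lookback window.

    Assumption: one data point per scan interval.
    """
    interval, lookback = LOOKBACK[frequency]
    return lookback // interval  # timedelta // timedelta yields an int


if __name__ == "__main__":
    for name in LOOKBACK:
        print(f"{name}: up to {max_training_points(name)} points")
```

Under that assumption, a weekly monitor has far fewer training points than an hourly one even though its lookback is much longer.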

### Sparse data and silent skipping

The ML model requires a minimum amount of valid signal before it can produce a reliable confidence interval. If a scan finds fewer than **5 valid data points in the last 30 observations**, the model does not run the test on the collected metrics. It waits until there is enough valid data to establish a threshold.

This is intentional behaviour: firing an alert on insufficient data would produce unreliable signals. However, it means a monitor on a low-volume or infrequently-updated table may appear inactive. If your monitors are not producing incidents on a table you expect to have anomalies, check whether the table has enough scan history to meet the threshold.
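The sparse-data rule can be expressed as a simple gate. The sketch below is an approximation: what Decube counts as a "valid" data point is not specified on this page, so treating `None` and NaN as invalid is an assumption, and `has_enough_signal` is a hypothetical helper name.

```python
import math
from typing import Optional, Sequence

MIN_VALID_POINTS = 5   # threshold from the rule above
WINDOW_SIZE = 30       # number of most recent observations considered


def has_enough_signal(observations: Sequence[Optional[float]]) -> bool:
    """Return True when the last 30 observations hold at least 5 valid points.

    Assumption: an observation is invalid if it is None or NaN.
    """
    recent = observations[-WINDOW_SIZE:]
    valid = [v for v in recent if v is not None and not math.isnan(v)]
    return len(valid) >= MIN_VALID_POINTS


if __name__ == "__main__":
    sparse = [None] * 26 + [10.0, 12.0, 11.0, 9.0]
    print(has_enough_signal(sparse))           # only 4 valid points
    print(has_enough_signal(sparse + [10.5]))  # 5 valid points in the window
```

Only the most recent 30 observations count, so older valid points do not help a table that has since gone quiet.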

{% hint style="warning" %}
Monitors on sparse tables can train and appear healthy while silently skipping every scan. If you need coverage on a low-volume table, consider On-Demand mode with a manual threshold instead of Smart Training.
{% endhint %}

## Sensitivity

The Sensitivity setting controls how wide or narrow the model's confidence interval is, which in turn controls how easy it is for a data point to fall outside it and trigger an incident.

The scale runs from **–5 to +5**:

| Value | Effect                                                         |
| ----- | -------------------------------------------------------------- |
| –5    | Widest confidence interval — least sensitive, fewest incidents |
| 0     | Default — balanced sensitivity                                 |
| +5    | Narrowest confidence interval — most sensitive, most incidents |

You set Sensitivity via the feedback mechanism on an incident.

{% content-ref url="<https://github.com/DecubeIO/decube-docs/blob/public/incident-model-feedback.md>" %}
<https://github.com/DecubeIO/decube-docs/blob/public/incident-model-feedback.md>
{% endcontent-ref %}
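The direction of the Sensitivity scale can be illustrated with a toy calculation. The scaling formula below is purely hypothetical, since this page does not document how Decube maps Sensitivity to interval width. It only demonstrates that lower values widen the interval and higher values narrow it.

```python
from typing import Tuple


def scaled_interval(center: float, half_width: float, sensitivity: int) -> Tuple[float, float]:
    """Return (lower, upper) bounds after applying a sensitivity setting.

    Hypothetical scaling: each step changes the half-width by 10 percent.
    Decube's real formula is not documented here.
    """
    if not -5 <= sensitivity <= 5:
        raise ValueError("sensitivity must be between -5 and +5")
    factor = (10 - sensitivity) / 10  # -5 -> 1.5x wider, +5 -> 0.5x narrower
    return (center - half_width * factor, center + half_width * factor)


if __name__ == "__main__":
    for s in (-5, 0, 5):
        lower, upper = scaled_interval(center=100.0, half_width=10.0, sensitivity=s)
        print(f"sensitivity {s:+d}: [{lower}, {upper}]")
```

With this toy formula, a value of 112 would be flagged at the default setting but accepted at –5.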

## Monitors that do not use the ML model

Not all monitor types use the ML model. Monitors with a fixed threshold configuration (Absolute, Percentage, Positive Range, or Any Range) compare each scan result directly against the bounds you define — no training period, no confidence interval.

For these monitors, the sparse-data rule and training timeout do not apply.
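By contrast with the ML path, a fixed-threshold check needs no history at all. The sketch below shows a generic bounds comparison. It does not model the specific semantics of the Absolute, Percentage, Positive Range, or Any Range types, and `breaches_fixed_threshold` is a hypothetical name.

```python
from typing import Optional


def breaches_fixed_threshold(
    value: float,
    lower: Optional[float] = None,
    upper: Optional[float] = None,
) -> bool:
    """Compare a scan result directly against user-defined bounds.

    Either bound may be omitted. Assumption: a value equal to a bound passes.
    """
    if lower is not None and value < lower:
        return True
    if upper is not None and value > upper:
        return True
    return False


if __name__ == "__main__":
    print(breaches_fixed_threshold(120.0, lower=0.0, upper=100.0))  # above upper bound
    print(breaches_fixed_threshold(50.0, lower=0.0, upper=100.0))   # within bounds
```

Because the bounds are supplied up front, the check can run on the very first scan.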

{% content-ref url="/pages/JO3VkDrrA78BnGdP7coz" %}
[Setting Up Your Data Quality Thresholds](/data-quality/monitor-configuration-settings/setting-up-your-data-quality-thresholds.md)
{% endcontent-ref %}
