# Set Up Freshness Monitors

A Freshness monitor learns when your data normally arrives and alerts you when it doesn't show up as expected. Unlike a simple staleness check, the scheduled Freshness monitor builds a probability model from historical arrival patterns — so it knows not to alert on weekends if your data never arrives on weekends.

## Freshness vs Volume: which to use

| Goal                                              | Use           |
| ------------------------------------------------- | ------------- |
| Detect that data **arrived** (or didn't)          | **Freshness** |
| Detect that the **right amount** of data arrived  | **Volume**    |
| Table receives data on a known schedule           | **Freshness** |
| Table grows by a predictable row count per period | **Volume**    |

If you need both signals, create one monitor of each type on the same table.

## How the scheduled Freshness monitor works

The scheduled Freshness monitor uses an ML model trained on your table's historical write timestamps. The model learns the expected arrival windows for your data and flags a run as anomalous when the probability of data having arrived drops below 50%.

This means:

* If your pipeline never runs on weekends, the model accounts for this and does not raise weekend incidents.
* If your data normally arrives between 08:00 and 10:00, a scan at 07:00 that finds no new data will not trigger an alert — the model knows it is too early.

Each incident for a Freshness monitor includes the **`time_since_last_write`** metric, which shows exactly how long ago the table last received a write. Use this to triage whether a delay is minor or critical.
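The behaviour described in this section can be illustrated with a small, purely hypothetical sketch. It is not Decube's model: it assumes an empirical estimate in which the expected arrival probability is the share of past same-weekday days on which a write had landed by the scan's time of day, and it reads the 50% threshold as "flag missing data once at least half of comparable days would already have data". The function names, the `observed_days` argument, and the returned dictionary keys other than `time_since_last_write` are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def arrival_probability(writes, observed_days, scan_time):
    """Share of past same-weekday days with a write by scan_time's time of day."""
    comparable = [d for d in observed_days if d.weekday() == scan_time.weekday()]
    if not comparable:
        return 0.0  # no history for this weekday, so nothing is expected
    hits = 0
    for day in comparable:
        day_start = datetime.combine(day, datetime.min.time())
        cutoff = datetime.combine(day, scan_time.time())
        if any(day_start <= w <= cutoff for w in writes):
            hits += 1
    return hits / len(comparable)


def check_freshness(writes, observed_days, scan_time, threshold=0.5):
    """Flag a scan only when data is missing AND history says it should be here."""
    # Assumption: "arrived" means a write landed on the scan's calendar day.
    today_start = datetime.combine(scan_time.date(), datetime.min.time())
    arrived_today = any(today_start <= w <= scan_time for w in writes)
    expected = arrival_probability(writes, observed_days, scan_time)
    past = [w for w in writes if w <= scan_time]
    return {
        "anomalous": (not arrived_today) and expected >= threshold,
        "expected_arrival_probability": expected,
        "time_since_last_write": scan_time - max(past) if past else None,
    }


# Four weeks of history: one write at 09:00 on weekdays, none on weekends.
first_day = datetime(2024, 1, 1).date()  # a Monday
days = [first_day + timedelta(days=i) for i in range(28)]
writes = [datetime(d.year, d.month, d.day, 9, 0) for d in days if d.weekday() < 5]

early = check_freshness(writes, days, datetime(2024, 1, 29, 7, 0))  # Monday, too early
late = check_freshness(writes, days, datetime(2024, 1, 29, 11, 0))  # Monday, overdue
print(early["anomalous"], late["anomalous"], late["time_since_last_write"])
```

In this toy history, a Saturday scan is never flagged because no Saturday in the training window ever received a write, which mirrors the weekend behaviour described above.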

## How the on-demand Freshness monitor works

The on-demand Freshness monitor performs a simpler presence check: each time it runs, it checks whether any new rows exist within the lookback period you configure. It does not use the ML model or learn arrival patterns. The result is a pass or fail based purely on whether rows are present.
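As a rough illustration of that presence check, the sketch below passes when at least one row timestamp falls inside the lookback window that ends at the run time. The function name `has_new_rows` and its parameters are hypothetical; the actual query Decube issues against your source is not shown in this page.

```python
from datetime import datetime, timedelta


def has_new_rows(row_timestamps, run_time, lookback):
    """Return True (pass) if any row timestamp lies in [run_time - lookback, run_time]."""
    window_start = run_time - lookback
    return any(window_start <= ts <= run_time for ts in row_timestamps)


# One row written at 08:00, monitor run at 12:00 on the same day.
rows = [datetime(2024, 3, 1, 8, 0)]
run_time = datetime(2024, 3, 1, 12, 0)

print(has_new_rows(rows, run_time, timedelta(hours=6)))  # row is inside the window
print(has_new_rows(rows, run_time, timedelta(hours=2)))  # row is older than the window
```

The only inputs are the timestamps, the run time, and the lookback period, so the outcome does not depend on weekday or time-of-day patterns.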

## Before you begin

* You need at least one data source connected and a table available under that source.
* To use Smart Training, the table must have a timestamp column (or you must provide an SQL expression that produces one). Smart Training is not available in On-Demand mode.

***

## Step 1: Set up

1. In the **Data Quality** module, go to the **Config** tab and select **Create**.
2. Select the **Freshness** monitor card.
3. In the **Create a New Monitor** form, select your **Source** (Schema is optional) and **Dataset**.
4. Choose **Monitor mode**: **Scheduled** or **On-Demand**.
5. Optionally enable **Grouped By** — select a column to group on and click **Validate** to confirm the column is valid.
6. Click **Proceed to Monitor Setup**.

{% hint style="info" %}
Grouped By creates one sub-monitor per distinct value in the group column. The [Grouped-By Monitors](/data-quality/how-to-set-up-monitors/set-up-grouped-by-monitors.md) page explains the 100-distinct-values limit and other constraints.
{% endhint %}
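To get a feel for what Grouped By implies, the following sketch tracks the most recent write separately for each distinct value of a group column and rejects columns with more than 100 distinct values. It is an illustration only: the row shape, the function name `last_write_per_group`, and the choice to raise an error when the limit is exceeded are assumptions, not a description of how Decube enforces the limit.

```python
from datetime import datetime

MAX_GROUPS = 100  # the distinct-values limit mentioned in the hint above


def last_write_per_group(rows, group_key, ts_key):
    """Return {group value: latest timestamp}, one entry per would-be sub-monitor."""
    latest = {}
    for row in rows:
        group, ts = row[group_key], row[ts_key]
        if group not in latest or ts > latest[group]:
            latest[group] = ts
    if len(latest) > MAX_GROUPS:
        # Assumption: treat an over-limit column as invalid for grouping.
        raise ValueError(f"{len(latest)} distinct values exceeds {MAX_GROUPS}")
    return latest


rows = [
    {"region": "eu", "updated_at": datetime(2024, 3, 1, 8, 0)},
    {"region": "us", "updated_at": datetime(2024, 3, 1, 6, 0)},
    {"region": "eu", "updated_at": datetime(2024, 3, 1, 9, 30)},
]
print(last_write_per_group(rows, "region", "updated_at"))
```

Each key in the result corresponds to one sub-monitor, so a stale `us` group can be detected even while `eu` keeps receiving data.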

***

## Step 2: Configure — Scheduled monitor

Complete the required fields in the **Configure** form:

| Field                   | Description                                                                                                                                                                                   |
| ----------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Monitor Name**        | A descriptive name for this monitor. You can create multiple Freshness monitors on the same table.                                                                                            |
| **Monitor Description** | Optional.                                                                                                                                                                                     |
| **Row Creation**        | How Decube identifies new rows: **Timestamp** (select a timestamp column), **SQL Expression** (provide an expression that produces a timestamp), or **All Records**.                          |
| **Smart Training**      | Toggle on to train the model on historical data. Requires Timestamp or SQL Expression row creation. Enabling Smart Training also makes the **Lookback Period** selectable.                    |
| **Frequency**           | How often the monitor scans. See [Custom Scheduling for Monitors](https://github.com/DecubeIO/decube-docs/blob/public/data-quality/how-to-set-up-monitors/custom-scheduling-for-monitors.md). |
| **Incident Level**      | Severity assigned to incidents this monitor opens.                                                                                                                                            |

{% hint style="warning" %}
**All Records** mode does not support Smart Training. The monitor runs without a probability model and compares total row counts directly.
{% endhint %}

### SQL Expression

Use an SQL Expression when your table stores timestamps in a non-standard format (string, Unix epoch, or split date/time columns). The expression must produce a valid timestamp in your data source's SQL dialect.

| Format                       | BigQuery example                                           | PostgreSQL example                             |
| ---------------------------- | ---------------------------------------------------------- | ---------------------------------------------- |
| String → timestamp           | `CAST(your_col AS DATETIME)`                               | `your_col::timestamp`                          |
| Unix seconds → timestamp     | `TIMESTAMP_SECONDS(your_col)`                              | `TO_TIMESTAMP(your_col)`                       |
| Separate date + time columns | `PARSE_DATETIME('%F %T', CONCAT(date_col, ' ', time_col))` | `(date_col \|\| ' ' \|\| time_col)::timestamp` |

You must validate the expression before you can save the monitor.
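If it helps to sanity-check sample values before writing the expression, the three conversions in the table can be mirrored in Python. This is only a local aid under stated assumptions: the helper names are hypothetical, the string inputs are assumed to be in ISO-like `YYYY-MM-DD HH:MM:SS` form, and the Unix value is interpreted as UTC. The expression you enter in Decube must still be written in your data source's SQL dialect.

```python
from datetime import datetime, timezone


def from_string(value):
    """String -> timestamp, e.g. '2024-03-01 08:30:00'."""
    return datetime.fromisoformat(value)


def from_unix_seconds(value):
    """Unix seconds -> timezone-aware UTC timestamp."""
    return datetime.fromtimestamp(value, tz=timezone.utc)


def from_date_and_time(date_str, time_str):
    """Separate date and time strings -> timestamp.

    '%Y-%m-%d %H:%M:%S' is the expanded form of the '%F %T' pattern
    used in the BigQuery example.
    """
    return datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H:%M:%S")


print(from_string("2024-03-01 08:30:00"))
print(from_unix_seconds(1709281800))
print(from_date_and_time("2024-03-01", "08:30:00"))
```

All three sample inputs describe the same instant, which makes it easy to confirm that a column holds the format you think it does.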

### Notifications

Turn on **Notify default channel** to route incidents to a specific email or Slack channel. Click **Submit** to create the monitor.

***

## Step 2: Configure — On-Demand monitor

On-Demand monitors do not use Smart Training, Auto Threshold, frequency scheduling, or Grouped By.

| Field                   | Description                                                                                |
| ----------------------- | ------------------------------------------------------------------------------------------ |
| **Monitor Name**        | A descriptive name.                                                                        |
| **Monitor Description** | Optional.                                                                                  |
| **Row Creation**        | **Timestamp** or **SQL Expression** only — All Records is not available in On-Demand mode. |
| **Lookback Period**     | The time window to check for new rows when the monitor runs.                               |
| **Incident Level**      | Severity assigned to incidents this monitor opens.                                         |

To finish:

* Click **Save** to create the monitor without running it immediately.
* Click **Save and Run** to create and run the monitor straight away.

After creation, you can run the monitor again from **All Monitors** by clicking the ellipsis (︙) and selecting **View Monitor**, then **Run once**.

***

## Related pages

{% content-ref url="/pages/K9nwiEHp0zGa6BN2j0kl" %}
[Set Up Volume Monitors](/data-quality/how-to-set-up-monitors/set-up-volume-monitors.md)
{% endcontent-ref %}

{% content-ref url="/pages/12tZSWlWRX4XEXBTD4L6" %}
[How Anomaly Detection Works](/data-quality/anomaly-detection-explained.md)
{% endcontent-ref %}

{% content-ref url="/pages/KgXixMx9V2joZYvvAjL9" %}
[Retraining Monitors](/data-quality/monitor-configuration-settings/retraining-monitors.md)
{% endcontent-ref %}

