# Set Up Custom SQL Monitors

Custom SQL monitors let you define validation logic that the built-in test types cannot express — cross-table checks, complex business rules, aggregation-based assertions. You write a SQL query that returns the failing rows; Decube wraps it in a `SELECT COUNT(*) FROM (<your_sql>) AS base` and opens an incident when the row count is greater than zero.

## How Custom SQL works

**Custom SQL attaches to a Source, not a table.** In the setup form you select a data source connection (BigQuery project, Snowflake account, Postgres database, etc.) rather than a specific table. Your SQL query itself determines which tables are scanned.

**Your query must return rows, not a scalar count.** Because Decube wraps your SQL in an outer `COUNT(*)`, writing `SELECT COUNT(*) FROM table WHERE ...` is a common mistake. If there are no failing records, that query returns one row containing the value `0` — and the outer wrapper counts that as 1 row, triggering an incident on every scan. Write your SQL to return the actual failing records instead:

```sql
-- ✓ Correct — returns failing rows
SELECT order_id, amount
FROM orders
WHERE amount < 0

-- ✗ Incorrect — returns a scalar; will always trigger
SELECT COUNT(*) FROM orders WHERE amount < 0
```

**Custom SQL never uses Smart Training.** The Auto threshold mode and the ML confidence interval are not available for Custom SQL monitors. All thresholds must be set manually (the default is `row_count > 0`).

**Changing the SQL query does not trigger a retrain.** Since Custom SQL does not use Smart Training, modifying the SQL expression only requires re-validation — it does not delete historical data.

**Query timeout is 55 seconds.** If your query does not return within 55 seconds, the scan fails with a timeout error. For complex queries on large tables, optimise with filters and avoid full-table scans.

***

## Before you begin

* You need a connected data source with appropriate read permissions on the tables your SQL will reference.
* Your SQL must be written in the dialect of the selected data source (BigQuery standard SQL, Snowflake SQL, PostgreSQL, etc.).
* Validate your query in the form before saving — validation is required.

***

## Step 1: Set up

1. In the **Data Quality** module, go to the **Config** tab and select **Create**.
2. Select the **Custom SQL** monitor card.
3. Select a **Data Source**.
4. Choose **Monitor mode**: Scheduled or On-Demand.
5. Optionally enable **Grouped By** — see [Custom SQL Group By](/data-quality/how-to-set-up-monitors/set-up-grouped-by-monitors.md#configure-on-demand-schedule-monitor-grouped-by-for-custom-sql) for how group mapping works with Custom SQL.
6. Click **Proceed to Monitor Setup**.

***

## Step 2: Configure — Scheduled monitor

| Field                   | Description                                                                                                                                                       |
| ----------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Monitor Name**        | A descriptive name. You can create multiple Custom SQL monitors on the same source.                                                                               |
| **Monitor Description** | Optional.                                                                                                                                                         |
| **Frequency**           | How often the monitor runs. See [Custom Scheduling for Monitors](/data-quality/monitor-configuration-settings/custom-scheduling-for-monitors.md).                 |
| **Custom SQL query**    | Enter your SQL, then click **Validate**. A "Query successfully validated" message confirms it. If you edit the query after validating, re-validate before saving. |
| **Incident Level**      | Severity assigned to incidents this monitor opens.                                                                                                                |

Turn on **Notify default channel** to route incidents to an email address or Slack channel, then click **Submit**.

***

## Step 2: Configure — On-Demand monitor

On-Demand monitors do not use frequency scheduling.

| Field                   | Description                                        |
| ----------------------- | -------------------------------------------------- |
| **Monitor Name**        | A descriptive name.                                |
| **Monitor Description** | Optional.                                          |
| **Custom SQL query**    | Enter your SQL and click **Validate**.             |
| **Incident Level**      | Severity assigned to incidents this monitor opens. |

To finish:

* Click **Save** to create the monitor without running it immediately.
* Click **Save and Run** to create and run the monitor straight away.

To run the monitor again, go to **All Monitors**, click the ellipsis (︙) on the monitor, select **View Monitor**, then **Run once**.

***

## Including Custom SQL in the DQ Scorecard <a href="#dq-scorecard" id="dq-scorecard"></a>

{% hint style="info" %}
By default, Custom SQL monitors are not included in the [DQ Scorecard](/dashboard/health-score.md) or [DQ Reports](/reports/asset-report-data-quality-scorecard.md). The scorecard requires both an error row count and a total row count to calculate a percentage score. Custom SQL monitors only produce a row count of failures — the total row count must be provided separately.
{% endhint %}

To include a Custom SQL monitor in the DQ score:

1. In the monitor configuration, enable **Include monitor into Data Quality Scorecard**.
2. Provide a **Total Row Count Query** — a SQL query that returns the total number of rows in the search space your main query checks against.

Both queries run on every scan. The score for that monitor is calculated as:

```
DQ Score = 1 - (error_row_count / total_row_count)
```

{% embed url="<https://www.loom.com/share/fd585bcd7602408c965a2a6a4cbf8a2a?sid=4d031395-953c-4caf-8c46-1f4e41bed154>" %}

### Example

**Scenario:** Detect rows with invalid `event_type` values in `analytics.events` and include the monitor in the DQ Scorecard.

{% code title="Custom SQL query (error rows)" overflow="wrap" %}

```sql
SELECT COUNT(*) AS value
FROM analytics.events
WHERE event_type NOT IN ('click', 'view', 'purchase')
  AND event_time >= '2025-09-01'::timestamp
  AND event_time < '2025-09-02'::timestamp
```

{% endcode %}

{% code title="Total Row Count Query" overflow="wrap" %}

```sql
SELECT COUNT(*) AS value
FROM analytics.events
WHERE event_time >= '2025-09-01'::timestamp
  AND event_time < '2025-09-02'::timestamp
```

{% endcode %}

{% hint style="warning" %}
If `total_row_count` is ever less than `error_row_count` (for example, due to a logic error in the total query), that observation is treated as invalid and omitted from the DQ score calculation. Take care to ensure both queries operate over the same scope.
{% endhint %}

### Viewing scan history with row counts

When a Total Row Count Query is enabled, the monitor's history in **Config > All Monitors > Monitor Info** shows an additional `row_count` column alongside the error count for each scan.

***

## Tips and Tricks

### Label invalidation

Alert when records within a recent window have a specific status value:

```sql
SELECT *
FROM public.sale
WHERE created_at BETWEEN NOW() - '30mins'::INTERVAL AND NOW()
  AND status IN ('pending', 'rejected')
```

### Value sum exceeds threshold

Alert when the sum of a value over a time window crosses a threshold:

```sql
SELECT 1
FROM public.sale
WHERE created_at BETWEEN NOW() - '30mins'::INTERVAL AND NOW()
HAVING SUM(price * quantity) > 10000
```

### Cross-table discrepancy

Alert when two tables fall out of sync:

```sql
WITH sale AS (
    SELECT SUM(price * quantity) AS amount
    FROM public.sale
    WHERE created_at BETWEEN NOW() - '30mins'::INTERVAL AND NOW()
),
in_flow AS (
    SELECT SUM(amount) AS amount
    FROM public.in_flow
    WHERE created_at BETWEEN NOW() - '30mins'::INTERVAL AND NOW()
)
SELECT (SELECT amount FROM sale) - (SELECT amount FROM in_flow) AS discrepancy
FROM (SELECT 1) AS dummy
WHERE (SELECT amount FROM sale) - (SELECT amount FROM in_flow) > 10000
```

***

{% content-ref url="/pages/PZ2RI1yYFjLNn2k7GoMG" %}
[Monitor Configuration Reference](/data-quality/monitor-configuration-settings/configuration-reference.md)
{% endcontent-ref %}


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.decube.io/data-quality/how-to-set-up-monitors/custom-sql-monitors.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
