# Redshift

## Supported Capabilities

{% tabs %}
{% tab title="Supported Capabilities" %}
**General**

* **Metadata** — metadata extraction and display of asset information (tables, columns, schemas). Types collected: Schema, Table, Column, View, Data Job, Data Run, Data Task
* **Sync Objects Descriptions** — syncs object descriptions from Redshift to the Catalog
* **Profiling** — data profiling on the Profiler tab
* **Preview** — sample data preview
* **Data Quality** — data quality monitoring and observability
* **Configurable Collection** — selective ingestion of schemas/workspaces in Data Source Management
* **View Table** — view tables, which are virtual tables based on SQL queries
* **Stored Procedure** — stored procedures (precompiled SQL; listed as Data Jobs in Metadata)

**Data Quality Monitors**

* Freshness
* Volume
* Field Health
* Custom SQL
* Schema Drift
* Job Failure

**Lineage**

* **View Table Lineage** — tracks virtual tables (views) and their data dependencies
* **SQL Query Lineage** — maps data movement through SQL queries (SELECT, JOIN, INSERT, etc.)
* **Foreign Key Lineage** — tracks relationships between tables via primary and foreign keys
* **Stored Procedure Lineage** — tracks data flow through stored procedures as they execute
  {% endtab %}

{% tab title="Not Supported" %}
**General**

* External Table

**Lineage**

* External Table Lineage
  {% endtab %}
  {% endtabs %}

## Connection Requirements

### Allowing Access

To allow our connector to access your Redshift instance, you will need to either:

1. Allow public access
2. Connect through a SSH Bastion

### Allowing Public Access

You can still limit who can connect to your Redshift instance through security-group inbound rules when you enable public access.

Go to Actions > Modify publicly accessible setting

<figure><img src="/files/cw6aBZfKgdyWoYDs1xAU" alt=""><figcaption><p>Ref 1.1</p></figcaption></figure>

Check `Turn on Publicly accessible` and select an Elastic IP address

<div align="center"><figure><img src="/files/mtvxVoOcBhSlwuPGO7V7" alt=""><figcaption><p>Ref 1.2</p></figcaption></figure></div>

Navigate to the `Properties` tab

<figure><img src="/files/YQyhpSPrEpDyW86DQXYk" alt=""><figcaption><p>Ref 1.3</p></figcaption></figure>

Scroll down to the `Network and security settings` and click through to your security group

<figure><img src="/files/xSy14bSiXsmWgDYLIA1T" alt=""><figcaption><p>Ref 1.4</p></figcaption></figure>

Navigate to the `Inbound rules` tab and click `Edit inbound rules`

<figure><img src="/files/rR5b88NN5EMo0UGnHhji" alt=""><figcaption><p>Ref 1.5</p></figcaption></figure>

Click `Add rule` and in `Type` choose Redshift and in the `Source` section, add all of Decube's collector IPs.

{% hint style="info" %}
You will need to post-fix the IP with /32 to limit it to only that IP. I.e. `xxx.xxx.xxx.xxx/32`
{% endhint %}

<figure><img src="/files/mlsOPXdG37usIC3i5yLU" alt=""><figcaption><p>Ref 1.5</p></figcaption></figure>

{% hint style="warning" %}
Be careful with modifying inbound rule policies. It can affect connectivity within your own VPC if you remove existing rules.
{% endhint %}

### SSH Bastion

You can also use a SSH Bastion if enabling public access is not an option. Setting up a Bastion host is out of the scope of this guide but you can refer to [SSH Tunneling](https://docs.decube.io/how-to-connect-data-sources/enabling-vpc-access/ssh-tunneling) guide for more information.

Once you have setup a Bastion host, modify your Redshift security group inbound rule (refer to Ref 1.5) to allow source connection from your Bastion host's private IP address instead.

### Connection Details

Connecting to decube is as easy as providing us with credentials to your Redshift database. At a minimum, we require

* `username`
* `password`
* `host address`
* `host port`
* `database name`

<figure><img src="/files/mlAJtayTd8DTGMMypkPw" alt=""><figcaption><p>Amazon Redshift</p></figcaption></figure>

The `source name` will be for you to differentiate and recognize particular sources within the decube application.

We strongly encourage you to create a decube **read-only** user for this credential purpose, which you can follow [these steps](#custom-user-for-decube).

### **Security Concerns**

### Custom User for decube

We highly recommend that you create a Read-Only user for decube. We have prepared a script that you may run on your Redshift database to create this user.

1. Create a New User for decube

```
CREATE USER decube PASSWORD 'a_new_password';
```

2\. Add access to SYSLOG to build lineage and ingest Stored Procedures.

```
ALTER USER decube WITH SYSLOG ACCESS UNRESTRICTED;
```

3. Add access to information\_schema.

```
GRANT USAGE ON SCHEMA information_schema TO decube;
GRANT SELECT ON ALL TABLES IN SCHEMA information_schema TO decube;
```

4\. You may need to run this per schema that you have based on the default behavior of the schema.

```
GRANT USAGE ON SCHEMA <schema_name> TO decube;
GRANT SELECT ON ALL TABLES IN SCHEMA <schema_name> TO decube;
```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.decube.io/warehouses/redshift.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
