Redshift
Adding Redshift to your decube connections helps your team find relevant datasets, understand their quality through incident monitoring, and apply governance policies through our data catalog.
Supported Capabilities
General
Metadata — metadata extraction and display of asset information (tables, columns, schemas). Types collected: Schema, Table, Column, View, Data Job, Data Run, Data Task
Sync Objects Descriptions — syncs object descriptions from Redshift to the Catalog
Profiling — data profiling on the Profiler tab
Preview — sample data preview
Data Quality — data quality monitoring and observability
Configurable Collection — selective ingestion of schemas/workspaces in Data Source Management
View Table — view tables, which are virtual tables based on SQL queries
Stored Procedure — stored procedures (precompiled SQL; listed as Data Jobs in Metadata)
Data Quality Monitors
Freshness
Volume
Field Health
Custom SQL
Schema Drift
Job Failure
Lineage
View Table Lineage — tracks virtual tables (views) and their data dependencies
SQL Query Lineage — maps data movement through SQL queries (SELECT, JOIN, INSERT, etc.)
Foreign Key Lineage — tracks relationships between tables via primary and foreign keys
Stored Procedure Lineage — tracks data flow through stored procedures as they execute
General
External Table
Lineage
External Table Lineage
Connection Requirements
Allowing Access
To allow our connector to access your Redshift instance, you will need to either:
Allow public access
Connect through an SSH Bastion
Allowing Public Access
You can still limit who can connect to your Redshift instance through security-group inbound rules when you enable public access.
Go to Actions > Modify publicly accessible setting

Check Turn on Publicly accessible and select an Elastic IP address

Navigate to the Properties tab

Scroll down to the Network and security settings and click through to your security group

Navigate to the Inbound rules tab and click Edit inbound rules

Click Add rule, choose Redshift as the Type, and in the Source section add each of decube's collector IPs.
You will need to suffix each IP with /32 to limit the rule to that single address, e.g. xxx.xxx.xxx.xxx/32

Be careful when modifying inbound rules. Removing existing rules can affect connectivity within your own VPC.
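The /32 suffix in the inbound rule step can be sketched in Python as follows. This is only an illustration, not decube tooling: the function names are hypothetical, the IP addresses are placeholders from the RFC 5737 documentation range rather than real collector IPs, and port 5439 is assumed because it is Redshift's default. The resulting dictionary follows the shape of one `IpPermissions` entry accepted by boto3's `authorize_security_group_ingress`, in case you prefer scripting the rule over using the console.

```python
import ipaddress

# Redshift's default port; replace it if your cluster listens elsewhere.
REDSHIFT_DEFAULT_PORT = 5439


def to_single_host_cidr(ip: str) -> str:
    """Append /32 to a valid IPv4 address; raises ValueError for anything else."""
    return f"{ipaddress.IPv4Address(ip)}/32"


def build_inbound_rule(collector_ips, port=REDSHIFT_DEFAULT_PORT):
    """Build one TCP rule that lists every collector IP as its own /32 range."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [
            {"CidrIp": to_single_host_cidr(ip), "Description": "decube collector"}
            for ip in collector_ips
        ],
    }


# Placeholder addresses (RFC 5737 documentation range), NOT real collector IPs.
rule = build_inbound_rule(["203.0.113.10", "203.0.113.11"])
print([entry["CidrIp"] for entry in rule["IpRanges"]])
```

Validating each address before adding the suffix guards against accidentally pasting a wider range, which would open the cluster to more hosts than intended.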
SSH Bastion
You can also use an SSH Bastion if enabling public access is not an option. Setting up a Bastion host is outside the scope of this guide, but you can refer to the SSH Tunneling guide for more information.
Once you have set up a Bastion host, modify your Redshift security group inbound rule (refer to Ref 1.5) to allow connections from your Bastion host's private IP address instead.
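Before relying on the Bastion route, it can help to confirm that the Bastion host can reach the cluster. The sketch below only assembles a standard OpenSSH local port forward command. Every host name, the user name, and the local port are hypothetical placeholders, and port 5439 is assumed as the Redshift default. It is a way to test the network path yourself, not a description of how decube connects internally.

```python
import shlex


def build_tunnel_command(bastion_host, bastion_user, redshift_host,
                         redshift_port=5439, local_port=15439):
    """Return the argv for an SSH local port forward through the Bastion host."""
    forward = f"{local_port}:{redshift_host}:{redshift_port}"
    # -N: do not run a remote command; -L: forward a local port to the target
    return ["ssh", "-N", "-L", forward, f"{bastion_user}@{bastion_host}"]


cmd = build_tunnel_command(
    bastion_host="bastion.example.com",            # hypothetical Bastion address
    bastion_user="ec2-user",                       # hypothetical login user
    redshift_host="my-cluster.example.internal",   # hypothetical cluster endpoint
)
print(shlex.join(cmd))
```

While the printed command is running, a SQL client pointed at localhost on the chosen local port should reach Redshift if the security group rule for the Bastion host's private IP is in place.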
Connection Details
Connecting to decube is as easy as providing us with credentials for your Redshift database. At a minimum, we require:
username
password
host address
host port
database name

The source name lets you differentiate and recognize particular sources within the decube application.
We strongly encourage you to create a read-only decube user for these credentials by following the steps under Custom User for decube below.
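As a rough checklist, the required fields can be gathered and sanity-checked before you enter them in the connection form. The class and method names below are hypothetical and are not part of any decube API, and all values are placeholders. Port 5439 is assumed as a default only because it is Redshift's default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedshiftConnectionDetails:
    """The minimum fields listed above, plus the source name."""
    source_name: str
    username: str
    password: str
    host: str
    database: str
    port: int = 5439  # Redshift default; change it if your cluster differs

    def problems(self):
        """Return a list of human-readable issues; empty if nothing is wrong."""
        issues = []
        for name in ("source_name", "username", "password", "host", "database"):
            if not getattr(self, name).strip():
                issues.append(f"{name} is empty")
        if not 1 <= self.port <= 65535:
            issues.append(f"port {self.port} is outside 1-65535")
        return issues


details = RedshiftConnectionDetails(
    source_name="analytics-redshift",          # placeholder
    username="decube_readonly",                # placeholder read-only user
    password="<password>",                     # placeholder
    host="my-cluster.example.internal",        # placeholder endpoint
    database="analytics",                      # placeholder
)
print(details.problems())
```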
Security Concerns
Custom User for decube
We highly recommend that you create a Read-Only user for decube. We have prepared a script that you may run on your Redshift database to create this user.
1. Create a new user for decube.
2. Add access to SYSLOG to build lineage and ingest stored procedures.
3. Add access to information_schema.
4. You may need to run this for each schema you have, depending on the default behavior of the schema.
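The four steps above could translate into Redshift statements along the following lines. This is a sketch under stated assumptions, not decube's prepared script: the user name, schema names, and helper functions are hypothetical, the password is a placeholder to replace, and the exact grants your environment needs may differ. The statements themselves (`CREATE USER`, `ALTER USER ... SYSLOG ACCESS UNRESTRICTED`, `GRANT`, `ALTER DEFAULT PRIVILEGES`) are standard Redshift SQL.

```python
import re

# Accept only plain identifiers so names can be embedded in SQL safely.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _checked(name):
    """Return `name` unchanged if it is a plain identifier, else raise ValueError."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def read_only_user_sql(username, schemas):
    """Generate statements for steps 1-4 (assumed grants, not the official script)."""
    user = _checked(username)
    statements = [
        # Step 1: create the user; replace the placeholder password.
        f"CREATE USER {user} PASSWORD '<choose-a-strong-password>';",
        # Step 2: allow the user to read system log tables for lineage.
        f"ALTER USER {user} SYSLOG ACCESS UNRESTRICTED;",
        # Step 3: allow metadata reads from information_schema.
        f"GRANT USAGE ON SCHEMA information_schema TO {user};",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA information_schema TO {user};",
    ]
    # Step 4: repeat the read-only grants for each schema to be ingested.
    for schema in schemas:
        s = _checked(schema)
        statements += [
            f"GRANT USAGE ON SCHEMA {s} TO {user};",
            f"GRANT SELECT ON ALL TABLES IN SCHEMA {s} TO {user};",
            f"ALTER DEFAULT PRIVILEGES IN SCHEMA {s} GRANT SELECT ON TABLES TO {user};",
        ]
    return statements


for statement in read_only_user_sql("decube_readonly", ["public", "sales"]):
    print(statement)
```

Note that `ALTER DEFAULT PRIVILEGES` only covers objects created later by the user who runs it, unless a `FOR USER` clause is added, which is one reason the per-schema grants may need to be rerun.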