# Overview

Decube is designed from the ground up to ensure that enterprise-grade security, compliance, and scalability are built into every layer of our platform. This document outlines Decube’s infrastructure design, security protocols, and compliance posture.

For deployment-specific details, refer to our [Architecture](https://docs.decube.io/security-and-infrastructure/deployment-methods) documentation.

## How we handle your data

### Control & Data Plane Separation

Decube follows a strict separation of concerns between the Control Plane and the Data Plane:

* **Control Plane (Decube-managed)**: Handles authentication, user management, licensing, configuration, scheduling, and alerting.
* **Data Plane (deployment-dependent)**: Executes metadata collection, monitors data quality, and stores metadata in either a shared or an isolated environment, depending on the deployment model.

Details for each model are available in:

* [Multi-Tenant SaaS](https://docs.decube.io/security-and-infrastructure/deployment-methods/saas-multi-tenant)
* [Single-Tenant SaaS](https://docs.decube.io/security-and-infrastructure/deployment-methods/saas-single-tenant)
* [BYOC](https://docs.decube.io/security-and-infrastructure/deployment-methods/bring-your-own-cloud-byoc)

### Collection

* Decube's data collectors extract only metadata, query logs, and aggregated statistics into the Decube cloud service.
* Data extracted from these scans is used solely to assess the reliability of your data and to provide the statistics and incident alerts you have opted in to.
* Decube uses encrypted connections (HTTPS and TLS) to protect the contents of data in transit.
* Decube's architecture also supports a setup for enterprise customers in which you host the data collectors within your own cloud infrastructure, so you never have to expose your data sources to Decube's cloud service.

<figure><img src="https://1779874722-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FTw0qpCVzfrIXqS4FEg4T%2Fuploads%2Fgit-blob-6422a927aa3841f98de2b2901d7061e60b6abbfc%2Fimage.png?alt=media" alt=""><figcaption></figcaption></figure>

### Compliance

* Decube is currently SOC 2 certified. Reach out to us if you would like a copy of the SOC 2 report.
* Decube will sign NDAs and/or DPAs where appropriate.
* Decube acknowledges that personal data may be collected and processed while it collects metadata, query logs, and metrics for monitoring and cataloging. Any such data passed into Decube is used solely for monitoring and cataloging.
* All SaaS applications used internally at Decube for operational purposes are vetted with due diligence so that confidential company and personnel data is protected.

### Organizational Security and Privacy Practices

Decube's team follows industry best practices to protect the security of the application and the data privacy of its customers.

* Decube engages a third party to perform an annual penetration test over the application layers of the platform.
* Processing of collected data is conducted on secure servers hosted on Amazon Web Services.
* Decube employees complete privacy and security training during onboarding and are required to take an examination afterward. All Decube personnel must acknowledge electronically that they have attended the training and understand the security policy.
* Access to all critical systems and production environments is protected using strong passwords and multi-factor authentication. SSO is also used to centralize access control for certain applications. Access rights are reviewed before being granted and periodically thereafter.

### Data collected by Decube

The following information may be processed and stored by Decube on its cloud services:

| Collected Data        | Details                                                                                                                                                 | Purpose of collection                                                                                                                                              |
| --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| Metadata              | Asset names such as tables and columns, field types, status of transformation jobs, and other such metadata. | To populate the data catalog with information about the assets available (tables, columns, jobs, etc.) within the data warehouses, databases, and other data sources. |
| Metrics               | Row counts, last-updated timestamps, and other similar metrics. | Enable tracking of metrics such as freshness and volume. |
| Aggregated Statistics | Measurements of the data in selected tables, collected only when the user opts in. Statistics may include null percentages, distinctness, and other similar metrics. | Enable tracking of data health and setup of monitors by the user via preset or custom SQL. |
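
To make the "Aggregated Statistics" row concrete, here is a minimal sketch of the kind of aggregate query a monitor might run. It is not Decube's implementation: the function name `aggregate_stats`, the `orders` table, and the definition of distinctness (distinct values divided by non-null values) are assumptions for illustration. The point is that only summary values are returned, never the underlying rows.

```python
import sqlite3

def aggregate_stats(conn, table, column):
    """Return only aggregate values for one column; no raw rows are returned.

    Hypothetical helper for illustration, not part of any Decube API.
    """
    # Identifiers cannot be bound as SQL parameters, so quote them defensively.
    t = '"' + table.replace('"', '""') + '"'
    c = '"' + column.replace('"', '""') + '"'
    row_count, null_count, distinct_count = conn.execute(
        f"SELECT COUNT(*), SUM({c} IS NULL), COUNT(DISTINCT {c}) FROM {t}"
    ).fetchone()
    null_count = null_count or 0  # SUM over an empty table yields NULL
    non_null = row_count - null_count
    return {
        "row_count": row_count,
        "null_pct": 100.0 * null_count / row_count if row_count else 0.0,
        # Assumed definition: distinct values / non-null values.
        "distinctness": distinct_count / non_null if non_null else 0.0,
    }

# Example data in an in-memory database (entirely made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "MY"), (2, "SG"), (3, None), (4, "MY")],
)
stats = aggregate_stats(conn, "orders", "country")
print(stats)
```

For the four sample rows, the result reports a row count of 4, a null percentage of 25.0, and a distinctness of 2/3.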

#### Multi-AZ Compute

Decube’s cloud infrastructure spans multiple AWS Availability Zones to ensure high availability and business continuity.

### Metadata Collection Approach

Decube collects only the metadata required for observability, governance, and quality monitoring:

#### Metadata Types Collected

| **Aspect**                  | **Description**                                                  | **Sample**                                                               | **Encrypted**         | **Retention** | **Location**                |
| --------------------------- | ---------------------------------------------------------------- | ------------------------------------------------------------------------ | --------------------- | ------------- | --------------------------- |
| **Data Source Schema**      | Schema metadata about data sources, broken into logical groups   | Table name, Column names, Column data type, Constraints, Dashboard names | At-Rest Encryption    | Indefinite    | Metadata DB & Elasticsearch |
| **Monitoring Metrics**      | Aggregated metrics from data quality monitoring                  | Row count, Nulls, Unique values                                          | At-Rest Encryption    | Indefinite    | Metadata DB                 |
| **Change Requests**         | User changes on metadata                                         | Asset description, Tags, Classification                                  | At-Rest Encryption    | Indefinite    | Metadata DB                 |
| **Profiling Info**          | Statistical values (may contain raw data)                        | Null %, Min/Max/Average values                                           | Dual Layer Encryption | 30 Days       | Blob Storage                |
| **User Attachments**        | Uploaded files                                                   | PDF, CSV, Text                                                           | Dual Layer Encryption | Indefinite    | Blob Storage                |
| **Query History**           | Historic queries on platforms                                    | Synapse, Databricks logs                                                 | At-Rest Encryption    | 60 Days       | Metadata DB                 |
| **Data Source Credentials** | Connection credentials                                           | Username, Host, Password                                                 | Dual Layer Encryption | Indefinite    | Metadata DB                 |
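
The retention column in the table above can be read as a simple lookup. The sketch below is an illustration only: the `RETENTION_DAYS` mapping and `is_expired` function are hypothetical names, and treating a record as expired strictly after the window has elapsed is an assumption, since the table does not specify boundary behavior.

```python
from datetime import datetime, timedelta, timezone

# Retention windows copied from the table above; None stands for "Indefinite".
RETENTION_DAYS = {
    "Data Source Schema": None,
    "Monitoring Metrics": None,
    "Change Requests": None,
    "Profiling Info": 30,
    "User Attachments": None,
    "Query History": 60,
    "Data Source Credentials": None,
}

def is_expired(aspect, stored_at, now):
    """Return True if a record of this aspect is past its retention window.

    Hypothetical helper; the strict ">" comparison is an assumption.
    """
    days = RETENTION_DAYS[aspect]
    if days is None:
        return False  # indefinite: not removed on a time basis
    return now - stored_at > timedelta(days=days)

now = datetime(2024, 3, 1, tzinfo=timezone.utc)
profiling_old = is_expired("Profiling Info", now - timedelta(days=31), now)
query_recent = is_expired("Query History", now - timedelta(days=31), now)
attachment_old = is_expired("User Attachments", now - timedelta(days=900), now)
print(profiling_old, query_recent, attachment_old)  # True False False
```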

## Data Access & Encryption

All connections to customer systems use read-only credentials and are encrypted with TLS.

Decube supports dual-layer encryption:

* Content Encryption using AES-256-GCM
* At-Rest Encryption for data stored in our services

**Encryption Details**

* **Dual-Layer Encryption for Maximum Security**
  * Decube employs a robust **dual-layer encryption** approach to ensure the confidentiality and integrity of sensitive data across all deployment models.
* **Content Encryption - AES-256-GCM**
  * All sensitive data (such as query history, profiling metrics, and credentials) is encrypted with the Advanced Encryption Standard (AES) using 256-bit keys in Galois/Counter Mode (GCM). This protects both the confidentiality and the integrity of the content against unauthorized access or tampering.
* **At-Rest Encryption - AES-256-GCM**
  * In addition to content-level encryption, all stored data is encrypted at rest. This second layer secures data while it is stored in databases or object storage, protecting against unauthorized access at the storage level.
* **Indefinite Retention**
  * Data is stored until the underlying data source is deleted from Decube’s platform. Attachments are stored until no reference to the attached file exists.

### Scalability & Reliability - Distributed Computing

#### Scalable Data Engine Worker Pool

Decube’s engine dynamically scales based on workload, provisioning additional compute workers under load while minimizing resource usage during idle periods. The distributed architecture also ensures fault tolerance against node-level failures.
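
As a rough illustration of workload-based scaling, the sketch below sizes a worker pool from the current backlog. It is not Decube's scaling logic: the function `desired_workers` and all of its parameters (`jobs_per_worker`, `min_workers`, `max_workers`) are hypothetical values chosen for the example.

```python
import math

def desired_workers(queued_jobs, jobs_per_worker=10, min_workers=1, max_workers=20):
    """Size the pool to the backlog, clamped between a floor and a ceiling.

    All thresholds are illustrative assumptions, not Decube settings.
    """
    needed = math.ceil(queued_jobs / jobs_per_worker)
    return max(min_workers, min(max_workers, needed))

# Idle, moderate, and heavy backlogs.
for backlog in (0, 35, 1000):
    print(backlog, desired_workers(backlog))
```

With these assumed parameters, an idle queue keeps 1 worker, a backlog of 35 jobs asks for 4, and a backlog of 1000 is capped at 20.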

#### High Performance & Reliable Job Scheduling

Our custom-built job scheduler, written in Rust, delivers:

* **At-most-once scheduling** to prevent duplicate task execution.
* **At-least-once execution** to maintain coverage for critical workloads.
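
The two guarantees above can be sketched in a few lines. This is a toy, in-memory illustration in Python, not Decube's Rust scheduler: the `Scheduler` class, the `drain` function, and the `(job_id, run_at)` key are all assumptions made for the example. Scheduling is deduplicated by key, and a task is removed from the queue only after its handler succeeds.

```python
from collections import deque

class Scheduler:
    """Toy in-memory scheduler; names and structure are illustrative only."""

    def __init__(self):
        self._seen = set()    # (job_id, run_at) keys already scheduled
        self.queue = deque()  # tasks waiting for a worker

    def schedule(self, job_id, run_at):
        """At-most-once scheduling: a given (job_id, run_at) is enqueued once."""
        key = (job_id, run_at)
        if key in self._seen:
            return False
        self._seen.add(key)
        self.queue.append(key)
        return True

def drain(scheduler, handler):
    """At-least-once execution: a task leaves the queue only after success.

    A real system would add backoff and a dead-letter path; this sketch
    retries immediately and would loop forever on a permanently failing task.
    """
    completed, attempts = [], {}
    while scheduler.queue:
        key = scheduler.queue[0]  # peek; do not remove before success
        attempts[key] = attempts.get(key, 0) + 1
        try:
            handler(key, attempts[key])
        except RuntimeError:
            continue  # task stays queued and is retried
        scheduler.queue.popleft()  # acknowledge only after success
        completed.append(key)
    return completed, attempts

def flaky(key, attempt):
    """Simulated handler that fails on its first attempt."""
    if attempt == 1:
        raise RuntimeError("simulated transient failure")

s = Scheduler()
first = s.schedule("freshness-check", "2024-03-01T00:00Z")
second = s.schedule("freshness-check", "2024-03-01T00:00Z")  # duplicate tick
completed, attempts = drain(s, flaky)
print(first, second, completed, attempts)
```

Because a task may run more than once under at-least-once execution (here the handler is invoked twice), handlers in such a design are usually written to be idempotent.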

To understand how this is handled in each deployment model, refer to the Decube architecture documentation.
