Overview of Asset Types

This document gives an overview of the asset types used in the Catalog on the Decube platform, such as Category, Chart, and Collection.

Last updated 4 months ago

In the top panel of the Catalog, clicking a pill displays the list of assets that belong to that pill's asset type. For example, selecting the “Collection” pill displays a list of all Schemas and Folders.

Asset Types (General)

This section categorizes and defines the asset types available on the Decube platform. These assets help organize data and enhance usability across the platform.

  • Chart: Represents a single visualization of data, typically found in Business Intelligence (BI) or visualization tools. A single Chart can be part of multiple Dashboards. Charts can have a description, owner, lineage, and documentation attached to them. Examples include a Tableau or Looker Chart.

  • Dashboard: A collection of Charts used for data visualization. Dashboards can have descriptions, lineage, and documentation attached to them. Examples include Power BI and Looker dashboards.

  • Collection: An asset that is primarily a container for other assets. These assets typically do not hold material information about data, but may contain other Collection subtypes. Examples: Schema and Folder.

Note: The Schema tab in Asset Details is not the same thing as a Schema asset. For tables, the Schema tab displays columns, column types, descriptions, tags, classifications, and glossary-linked terms. A Schema in a data warehouse, by contrast, is a collection of tables.

  • Data Job: An executable job that processes data assets, where "processing" means consuming data, producing data, or both. Data Jobs can have descriptions and data owners attached to them. Examples can be found in tools such as Airflow, dbt, and Fivetran. Data Job subtypes include DataJobRun and DataTaskRun.

  • Data Task: A Data Task refers to a discrete, executable operation within a data pipeline or workflow. It encompasses individual data processing steps, such as transformation, validation, quality assurance, enrichment, or data transfer, that collectively contribute to achieving a broader data processing objective.

  • Dataset: A Dataset is a structured collection of data, typically organized in a tabular format (rows and columns) or other structured forms such as key-value pairs. Examples include database tables, materialized or virtual views, message streams (e.g., Kafka topics), or data bundles stored in object storage systems (e.g., S3, GCS). Dataset subtypes include:

    • Table: Includes physical tables and views.

    • Virtual Table: Logical representations of data derived from other sources or transformations.

  • Source: A Source is a foundational data entity that serves as the origin for all other downstream data assets within its scope. Examples include transactional databases (e.g., PostgreSQL, MySQL), analytical platforms (e.g., BigQuery, Redshift, Snowflake), and business intelligence tools (e.g., Tableau). Source subtypes include:

    • Database: Relational or non-relational systems storing data in structured formats.

    • Business Intelligence (BI): Tools or platforms used for data visualization and analysis.

  • Property: An asset that is the smallest logical representation of data, for example a Column or a JSON attribute. A Property may contain other Properties as nested Properties, for example an attribute that points to a map data structure. Property subtypes include:

    • Column: A column within a physical table or view.

    • VirtualColumn: A column from a Virtual Table.
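
The type and subtype relationships listed above can be summarized as a lookup table. The following is a minimal Python sketch, not Decube's implementation: the `ASSET_SUBTYPES` mapping, the `assets_for_pill` helper, and the sample asset names are hypothetical, and only the type and subtype names come from this page. It illustrates how selecting a pill such as "Collection" could narrow a list of assets to Schemas and Folders.

```python
# Hypothetical sketch: asset types and their subtypes as described on this page.
# Types listed without subtypes map to an empty tuple.
ASSET_SUBTYPES = {
    "Chart": (),
    "Dashboard": (),
    "Collection": ("Schema", "Folder"),
    "Data Job": ("DataJobRun", "DataTaskRun"),
    "Data Task": (),
    "Dataset": ("Table", "Virtual Table"),
    "Source": ("Database", "Business Intelligence"),
    "Property": ("Column", "VirtualColumn"),
}


def assets_for_pill(pill, assets):
    """Return the assets whose kind is the selected pill or one of its subtypes."""
    allowed = set(ASSET_SUBTYPES[pill]) | {pill}
    return [asset for asset in assets if asset["kind"] in allowed]


# Illustrative assets only; the names are made up for this example.
sample_assets = [
    {"name": "public", "kind": "Schema"},
    {"name": "reports", "kind": "Folder"},
    {"name": "orders", "kind": "Table"},
    {"name": "order_id", "kind": "Column"},
]

collection_names = [a["name"] for a in assets_for_pill("Collection", sample_assets)]
print(collection_names)
```

With the sample data, the "Collection" pill keeps the `public` schema and the `reports` folder and drops the table and the column.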

Simplified diagram for hierarchy of assets

Asset Types in Data Mesh

  • Data Domain: A Data Domain is a curated, top-level folder or category in which related assets are explicitly grouped. It organizes assets such as datasets, dashboards, and reports under a common theme, making it easier to manage them, apply governance, and control access. You can add subdomains under a domain.

  • Data Subdomain: A Data Subdomain represents a finer-grained classification within a broader Data Domain. It is used to segment and manage data assets based on specific business processes, organizational units, or data structures, enabling more targeted and modular data governance practices.

  • Entities: An Entity is a Catalog object that has been identified as part of a domain and added to that Data Domain. Entities are used to define Data Assets in Data Products.

  • Data Product: A Data Product is a curated data asset designed for consumption by other teams or domains within an organization. It includes datasets, dashboards, or reports, and follows governance standards with defined data contracts, ensuring reliability, quality, and accessibility. Data Products enable teams to manage and share data assets efficiently across various business functions. A Data Product can be created under a Domain or a Subdomain in the Data Mesh.

  • Data Asset: A collection of data. Tables, schemas, columns, reports, files, or views that have been added to a Data Product are all modeled as "Datasets". Common types of data assets include datasets, dashboards, and sources such as databases. These assets represent key entities sourced from customer databases, BI tools, and similar systems. Examples include Postgres tables and S3 files.

  • Data Asset Property: A Data Asset Property refers to a specific attribute of a data asset, such as its data type, structure, or metadata (e.g., column types in tables). For example, Transaction History for Credit Card is a data asset derived from the transaction table, and the columns id and user_id are its data asset properties.
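
The nesting described above (Domain, Subdomain, Data Product, Data Asset, Data Asset Property) can be sketched as a small tree. This is an illustrative Python model only, not Decube's data model or API: the class names, the `outline` helper, and the "Finance", "Cards", and "Card Payments" names are assumptions made for the example, while the "Transaction History for Credit Card" asset and its `id` and `user_id` columns come from the text. Entities are left out to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataAsset:
    name: str
    # Data Asset Properties, e.g. the columns of the underlying table.
    properties: List[str] = field(default_factory=list)


@dataclass
class DataProduct:
    name: str
    assets: List[DataAsset] = field(default_factory=list)


@dataclass
class Domain:
    # A Domain nested inside another Domain plays the role of a Subdomain.
    name: str
    subdomains: List["Domain"] = field(default_factory=list)
    products: List[DataProduct] = field(default_factory=list)


def outline(domain, depth=0):
    """Return indented lines describing a domain and everything under it."""
    pad = "  " * depth
    label = "Domain" if depth == 0 else "Subdomain"
    lines = [f"{pad}{label}: {domain.name}"]
    for product in domain.products:
        lines.append(f"{pad}  Data Product: {product.name}")
        for asset in product.assets:
            lines.append(f"{pad}    Data Asset: {asset.name}")
            for prop in asset.properties:
                lines.append(f"{pad}      Property: {prop}")
    for sub in domain.subdomains:
        lines.extend(outline(sub, depth + 1))
    return lines


# "Finance", "Cards", and "Card Payments" are hypothetical names.
asset = DataAsset("Transaction History for Credit Card", ["id", "user_id"])
product = DataProduct("Card Payments", [asset])
cards = Domain("Cards", products=[product])
finance = Domain("Finance", subdomains=[cards])

lines = outline(finance)
print("\n".join(lines))
```

Because a Data Product can sit under either a Domain or a Subdomain, the sketch lets every `Domain` hold both `products` and `subdomains`.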

Simplified diagram for hierarchy of the Data Mesh

Asset Types in Glossary

  • Glossary: A Glossary (or Business Glossary) is a centralized repository of standardized terms and definitions. It ensures consistent communication across teams by aligning business and technical vocabularies. Assets can be linked to Glossary terms for better context.

  • Term: A Term (or Business Term) is an individual entry in the Glossary. It defines a specific business concept, data entity, or key metric, ensuring clarity and uniformity. Assets can also be linked to Terms.

  • Category: A Category is a hierarchical structure for grouping related Glossary terms. Categories improve navigation and organization by linking terms to specific business domains or processes.
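
As a rough illustration of how Glossaries, Categories, and Terms relate, the sketch below models them in Python and looks up the Terms linked to an asset. It is an assumption-laden example rather than Decube's schema: the class and field names, the nested `subcategories` list, the `terms_linked_to` helper, and all sample names and definitions are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Term:
    name: str
    definition: str
    # Names of Catalog assets linked to this term (hypothetical field).
    linked_assets: List[str] = field(default_factory=list)


@dataclass
class Category:
    name: str
    terms: List[Term] = field(default_factory=list)
    # Assumes categories can nest, since a Category is described as hierarchical.
    subcategories: List["Category"] = field(default_factory=list)


@dataclass
class Glossary:
    name: str
    categories: List[Category] = field(default_factory=list)


def terms_linked_to(glossary, asset_name):
    """Return the sorted names of all terms linked to the given asset."""
    found = []
    stack = list(glossary.categories)
    while stack:
        category = stack.pop()
        found.extend(t.name for t in category.terms if asset_name in t.linked_assets)
        stack.extend(category.subcategories)
    return sorted(found)


# Sample glossary; every name and definition here is made up.
disputes = Category(
    "Disputes",
    terms=[Term("Chargeback", "A reversed card payment.", ["transactions"])],
)
payments = Category(
    "Payments",
    terms=[Term("Settlement", "Transfer of funds to the merchant.", ["settlements"])],
    subcategories=[disputes],
)
glossary = Glossary("Business Glossary", [payments])

linked = terms_linked_to(glossary, "transactions")
print(linked)
```

The lookup walks every Category, including nested ones, so a Term is found regardless of how deep it sits in the hierarchy.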

Figures (not reproduced here): Asset Type Pills; Overview of how pill selection works; Overview of Data Asset Property.