Overview of Asset Types
This document provides a comprehensive overview of the various asset types used on the Decube platform in the Catalog such as Category, Chart, Collection, etc.
In the top panel of the Catalog, clicking a pill displays the list of items associated with that pill. For example, selecting the “Collection” pill displays a list of all Schemas and Folders.
This section categorizes and defines the asset types available on the Decube platform. These assets help organize data and enhance usability across the platform.
Chart: Represents a single visualization of data, typically found in Business Intelligence (BI) or visualization tools. A single Chart can be part of multiple Dashboards. Charts can have a description, owner, lineage, and documentation attached to them. Examples include a Tableau or Looker chart.
Dashboard: A Dashboard is a collection of Charts used for data visualization. Dashboards can have descriptions, lineage, and documentation attached to them. Examples include Power BI and Looker dashboards.
Collection: An asset that is primarily a container for other assets. Collections typically do not hold material information about data, but may contain other Collection subtypes, for example Schema and Folder.
Note (Schema tab in Asset Details): For tables, the Schema tab displays columns, column types, descriptions, tags, classifications, and glossary-linked terms. This is different from a Schema in a data warehouse, which is a collection of tables.
Data Job: An executable job that processes data assets, where "processing" means consuming data, producing data, or both. Data Jobs can have descriptions and data owners attached to them. Examples can be found in tools like Airflow, dbt, and Fivetran. Data Job subtypes include DataJobRun and DataTaskRun.
Data Task: A Data Task refers to a discrete, executable operation within a data pipeline or workflow. It encompasses individual data processing steps, such as transformation, validation, quality assurance, enrichment, or data transfer, that collectively contribute to achieving a broader data processing objective.
Dataset: A Dataset is a structured collection of data, typically organized in a tabular format (rows and columns) or other structured forms such as key-value pairs. Examples include database tables, materialized or virtual views, message streams (e.g., Kafka topics), or data bundles stored in object storage systems (e.g., S3, GCS). Dataset subtypes include:
Table: Includes physical tables and views.
Virtual Table: Logical representations of data derived from other sources or transformations.
Source: A Source is a foundational data entity that serves as the origin for all other downstream data assets within its scope. Examples include transactional databases (e.g., PostgreSQL, MySQL), analytical platforms (e.g., BigQuery, Redshift, Snowflake), and business intelligence tools (e.g., Tableau). Source subtypes include:
Database: Relational or non-relational systems storing data in structured formats.
Business Intelligence (BI): Tools or platforms used for data visualization and analysis.
Property: An asset that is the smallest logical representation of data, for example a Column or a JSON attribute. A Property may contain other Properties as nested Properties, e.g., an attribute that points to a map data structure. Property subtypes include:
Column: A column within a physical table or view.
VirtualColumn: A column from a Virtual Table.
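The type and subtype relationships described above, together with the pill behavior in the Catalog, can be sketched as a simple lookup. This is a minimal illustration, not Decube's actual schema or API: the dictionary layout, the `assets_for_pill` helper, and the sample assets are all assumptions made for this example, and only the type and subtype names come from this document.

```python
# Hypothetical sketch: the mapping mirrors the subtypes named in this document.
# It is not an export of Decube's real data model.
ASSET_SUBTYPES = {
    "Collection": ["Schema", "Folder"],
    "Data Job": ["DataJobRun", "DataTaskRun"],
    "Dataset": ["Table", "Virtual Table"],
    "Source": ["Database", "Business Intelligence (BI)"],
    "Property": ["Column", "VirtualColumn"],
}


def assets_for_pill(pill, assets):
    """Return the assets whose subtype belongs to the selected pill."""
    allowed = set(ASSET_SUBTYPES.get(pill, []))
    return [asset for asset in assets if asset["subtype"] in allowed]


# Made-up sample assets, for illustration only.
sample_assets = [
    {"name": "public", "subtype": "Schema"},
    {"name": "reports", "subtype": "Folder"},
    {"name": "orders", "subtype": "Table"},
    {"name": "order_id", "subtype": "Column"},
]

# Selecting the "Collection" pill keeps only Schemas and Folders.
collection_items = assets_for_pill("Collection", sample_assets)
print([asset["name"] for asset in collection_items])
```

A pill with no listed subtypes (such as Chart in this sketch) returns an empty list, because the lookup only knows the subtypes named above.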
Data Domain: Data Domains are curated, top-level folders or categories where related assets can be explicitly grouped. A Data Domain organizes assets like datasets, dashboards, and reports under a common theme, making it easier to manage them, apply governance, and control access. You can add subdomains under a domain.
Data Subdomain: A Data Subdomain represents a finer-grained classification within a broader Data Domain. It is used to segment and manage data assets based on specific business processes, organizational units, or data structures, enabling more targeted and modular data governance practices.
Entities: An Entity is a Catalog object that has been identified as part of a domain and added to the Data Domain. Entities are used to define Data Assets in Data Products.
Data Product: A Data Product is a curated data asset designed for consumption by other teams or domains within an organization. It includes datasets, dashboards, or reports, and follows governance standards with defined data contracts, ensuring reliability, quality, and accessibility. Data Products enable teams to manage and share data assets efficiently across various business functions. A Data Product can be created under a Domain or Subdomain in Data Mesh.
Data Asset: A Data Asset is a collection of data. Tables, schemas, columns, reports, files, or views that have been added to a Data Product are all modeled as "Datasets". Common types of data assets include datasets, dashboards, and sources such as databases. These assets represent key entities sourced from customer databases, BI tools, and similar systems. Examples include Postgres tables and S3 files.
Data Asset Property: A Data Asset Property is a specific attribute of a data asset, such as its data type, structure, or metadata (e.g., column types in tables). For example, if Transaction History for Credit Card is a data asset derived from the transaction table, the columns id and user_id are its data asset properties.
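The nesting described above (Data Domain, Data Subdomain, Data Product, Data Asset, Data Asset Property) can be sketched with plain dataclasses. This is only an illustration under stated assumptions: the class names, fields, and the `property_paths` helper are hypothetical and do not represent Decube's API, and the "Finance", "Cards", and "Card Transactions" names are invented. Only the Transaction History for Credit Card asset and its id and user_id columns come from the example in this document.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataAssetProperty:
    name: str


@dataclass
class DataAsset:
    name: str
    derived_from: str  # name of the underlying table
    properties: List[DataAssetProperty] = field(default_factory=list)


@dataclass
class DataProduct:
    name: str
    assets: List[DataAsset] = field(default_factory=list)


@dataclass
class DataDomain:
    # A subdomain is modeled here as a nested DataDomain (an assumption).
    name: str
    subdomains: List["DataDomain"] = field(default_factory=list)
    products: List[DataProduct] = field(default_factory=list)


def property_paths(domain, prefix=""):
    """List every data asset property under a domain as a slash-separated path."""
    path = f"{prefix}/{domain.name}" if prefix else domain.name
    paths = []
    for product in domain.products:
        for asset in product.assets:
            for prop in asset.properties:
                paths.append(f"{path}/{product.name}/{asset.name}/{prop.name}")
    for sub in domain.subdomains:
        paths.extend(property_paths(sub, path))
    return paths


# Hypothetical domain, subdomain, and product names wrapped around the
# document's own example asset.
asset = DataAsset(
    name="Transaction History for Credit Card",
    derived_from="transaction",
    properties=[DataAssetProperty("id"), DataAssetProperty("user_id")],
)
cards = DataDomain(name="Cards", products=[DataProduct("Card Transactions", [asset])])
finance = DataDomain(name="Finance", subdomains=[cards])

for line in property_paths(finance):
    print(line)
```

Modeling a subdomain as a nested domain keeps the traversal to a single recursive function; a real implementation might use distinct types for the two.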
Glossary: A Glossary (or Business Glossary) is a centralized repository of standardized terms and definitions. It ensures consistent communication across teams by aligning business and technical vocabularies. Assets can be linked to Glossary terms for better context.
Term: A Term (or Business Term) is an individual entry in the Glossary. It defines specific business concepts, data entities, or key metrics, ensuring clarity and uniformity. Assets can also be linked to the terms.
Category: A Category is a hierarchical structure for grouping related Glossary terms. Categories improve navigation and organization by linking terms to specific business domains or processes.
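The relationship between Glossary, Category, Term, and linked assets can be sketched in the same way. The classes below are hypothetical and exist only for this example. Nested subcategories are an assumption based on the description of a Category as hierarchical, and the term names, definitions, and asset names are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Term:
    name: str
    definition: str
    linked_assets: List[str] = field(default_factory=list)


@dataclass
class Category:
    name: str
    terms: List[Term] = field(default_factory=list)
    # Assumption: categories can nest, since they are described as hierarchical.
    subcategories: List["Category"] = field(default_factory=list)


def terms_for_asset(category, asset_name):
    """Collect the names of terms linked to an asset, searching nested categories."""
    found = [t.name for t in category.terms if asset_name in t.linked_assets]
    for sub in category.subcategories:
        found.extend(terms_for_asset(sub, asset_name))
    return found


# Invented glossary content, for illustration only.
payments = Category(
    name="Payments",
    terms=[Term("Chargeback", "A reversed card payment.", ["orders", "transactions"])],
)
finance_glossary = Category(
    name="Finance",
    terms=[Term("Revenue", "Income from sales.", ["orders"])],
    subcategories=[payments],
)

print(terms_for_asset(finance_glossary, "orders"))
```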