CSV Template Structure (Edit existing items)

This document provides a comprehensive guide on how to structure your CSV files to update existing objects in Decube using the Export/Import feature.

1. Purpose

This guide is intended for data stewards and administrators responsible for bulk metadata maintenance in Decube. It covers the supported object types, identifier requirements, editable attributes, and field-level constraints, and includes an example for each template.

2. Supported Object Types

The "Edit Existing Items" workflow supports updates to the following object types:

  • Catalog Objects: Datasets (tables, columns), Non-Datasets (data jobs, charts, dashboards)

  • Glossary Objects: Glossary, Category, and Term

  • Classification Policies

3. Identifier Fields

Each object type has a fixed set of non-editable identifier fields that Decube uses to locate the object. These fields must exactly match the values in the platform for updates to be applied. Do not modify identifier values in your CSV file.

  • Catalog (Dataset): Source, Schema, Table, Column, Type

  • Catalog (Non-Dataset): Source, Name, Type

  • Glossary: Glossary, Category, Term, Type

  • Classification Policy: Name (the policy name), Policy tag

Note: Any change to identifier fields will result in failure to match and update the object.
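
As a pre-import check, the identifier rule above can be sketched in Python by comparing the edited file against the original export. This is a minimal sketch under stated assumptions, not part of Decube: the function names and the `IDENTIFIER_FIELDS` mapping are hypothetical, and the column names are taken from the templates described in this guide.

```python
import csv
import io

# Identifier columns per template, as listed in this guide.
# The dictionary keys are hypothetical labels used only in this sketch.
IDENTIFIER_FIELDS = {
    "dataset": ["Source", "Schema", "Table", "Column", "Type"],
    "non_dataset": ["Source", "Name", "Type"],
    "glossary": ["Glossary", "Category", "Term", "Type"],
}


def identifier_key(row, id_fields):
    """Build the identifier tuple for one CSV row (missing cells become '')."""
    return tuple((row.get(field) or "") for field in id_fields)


def unmatched_rows(exported_text, edited_text, id_fields):
    """Return identifier tuples in the edited CSV that are absent from the export.

    A non-empty result suggests an identifier was changed, or a row was added
    for an object that was not part of the export.
    """
    exported = {
        identifier_key(row, id_fields)
        for row in csv.DictReader(io.StringIO(exported_text))
    }
    return [
        identifier_key(row, id_fields)
        for row in csv.DictReader(io.StringIO(edited_text))
        if identifier_key(row, id_fields) not in exported
    ]
```

The comparison is exact and case-sensitive, so a row edited from `sales` to `Sales` is reported as unmatched.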

4. Important Requirements for Editing Existing Items

Please observe the following requirements to ensure a successful update:

  • Do not modify identifier fields: Identifiers are used to locate existing objects and must remain unchanged. Any modification will result in failure to match and update the object.

  • Edit only allowed attributes: Do not change object names or hierarchy-related fields. Only update permitted attributes such as Description, Tags, Data Owners, Business Owners, etc.

  • Single data source per export: You may only export and edit items from one data source at a time. Mixing sources in a single CSV is not supported and will cause errors.

  • Glossary updates: Glossary, Category, and Term updates can be performed in a single file. Ensure the Type field is set correctly for each row. Classifications and Related Terms can only be applied to Terms. Maintain the correct hierarchy using Parent_1 and Parent_2 fields.

  • Column limits:

    • Tags: Maximum 3 per object

    • Data Owners: Maximum 3 per object

    • Description: Maximum 8,000 characters

    • Name: Maximum 100 characters

  • File size:

    • Maximum file size: 10MB

    • Maximum number of rows: 10,000

  • CSV format integrity:

    • Do not rename columns.

    • Do not add unsupported columns.

    • Ensure headers match the template exactly.

  • Common failure triggers:

    • Incorrect object type (e.g., using “Dataset” instead of “Table”)

    • Empty or malformed identifier fields

    • Including rows for deleted or non-existent objects

  • Irreversible changes: Once imported, updates are applied immediately and cannot be undone. Always validate your file carefully before importing.

Note: An empty value in any editable attribute overwrites and clears the existing value. To keep a value, leave the exported content of that cell unchanged.
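
The size and column limits above lend themselves to a quick local check before importing. The following Python sketch is illustrative only: `check_limits` is a hypothetical helper, it assumes multi-valued cells (Tags, Data Owners) are comma-separated, that the file is UTF-8 encoded, and that 10MB means 10 × 1024 × 1024 bytes. None of these details are confirmed by the platform documentation.

```python
import csv
import io

MAX_ROWS = 10_000
MAX_BYTES = 10 * 1024 * 1024      # assumption: 10MB counted in binary megabytes
MAX_DESCRIPTION = 8_000
MAX_LIST_ITEMS = {"Tags": 3, "Data Owners": 3}


def check_limits(csv_text):
    """Return a list of human-readable problems; an empty list means no limit was hit."""
    problems = []
    if len(csv_text.encode("utf-8")) > MAX_BYTES:
        problems.append("file is larger than 10MB")

    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if len(rows) > MAX_ROWS:
        problems.append(f"{len(rows)} rows exceeds the limit of {MAX_ROWS}")

    for number, row in enumerate(rows, start=1):
        # Count non-empty values in each comma-separated cell.
        for field, limit in MAX_LIST_ITEMS.items():
            values = [v for v in (row.get(field) or "").split(",") if v.strip()]
            if len(values) > limit:
                problems.append(
                    f"data row {number}: {field} has {len(values)} values (max {limit})"
                )
        if len(row.get("Description") or "") > MAX_DESCRIPTION:
            problems.append(
                f"data row {number}: Description is longer than {MAX_DESCRIPTION} characters"
            )
    return problems
```

A check like this only catches limit violations; it does not confirm that identifiers, owners, or policies exist in the platform.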

Editable Attributes and Constraints

Below is a breakdown of editable fields, format requirements, character limits, and constraints by object type.

1. Catalog (Dataset: Table, Column)

| Field | Required | Description | Constraints | Example | Field Type |
|---|---|---|---|---|---|
| Source | Yes | Source system name. Required for editing table and column. | Must exist in system | RedshiftPRD | Identifier/not editable |
| Schema | Conditional (see description) | Schema name. Required if editing column; optional if editing table. | Must exist in source | public | Identifier/not editable |
| Table | Conditional (see description) | Table name. Required for editing table and column. | Must exist in schema | sales | Identifier/not editable |
| Column | Conditional (see description) | Column name (only for column updates). | Optional | created_at | Identifier/not editable |
| Type | Yes | Object type: Table or Column | One of Table, Column | Table | Identifier/not editable |
| Data Owners | No | Designated data owners | Max 3, email format | - | Editable Attribute |
| Business Owners | No | Designated business owners | Email format | - | Editable Attribute |
| Description | No | Description text | Max 8000 characters | Some long description | Editable Attribute |
| Tags | No | Tags | Comma-separated, max 3 | Sales,Marketing | Editable Attribute |
| Classifications | No | Policy tags applied | Must match existing policies | PII,GDPR | Editable Attribute |
| Linked Terms | No | Related Glossary Terms | Format: glossary.category.term | Glossary_1.Term_1 | Editable Attribute |

| Source | Schema | Table | Column | Type | Data Owners | Business Owners | Description | Tags | Classifications | Linked Terms |
|---|---|---|---|---|---|---|---|---|---|---|
| RedshiftPRD | public | sales | | Table | | | Sales table | | | Glossary_1.Term_1 |
| RedshiftPRD | public | sales | created_at | Column | | | Creation timestamp | Sales,Marketing | PII,GDPR | Glossary_1.Term_2 |

Example: Valid CSV input for Catalog Dataset objects
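
Because cells such as Tags and Classifications can themselves contain commas, they need to be quoted in the raw CSV file. As an illustration, the sketch below uses Python's standard `csv` module to serialize a row with the Dataset template headers. `build_dataset_csv` is a hypothetical helper, and the sketch assumes the import accepts standard CSV quoting. In practice you would start from the exported file rather than build rows from scratch, since empty cells clear existing values.

```python
import csv
import io

# Header order of the Dataset template as shown in the example above.
FIELDS = [
    "Source", "Schema", "Table", "Column", "Type",
    "Data Owners", "Business Owners", "Description",
    "Tags", "Classifications", "Linked Terms",
]


def build_dataset_csv(rows):
    """Serialize dictionaries to CSV text using the template headers.

    Keys missing from a row are written as empty cells. Keys that are not in
    FIELDS raise ValueError, which guards against unsupported columns.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


example = build_dataset_csv([{
    "Source": "RedshiftPRD",
    "Schema": "public",
    "Table": "sales",
    "Column": "created_at",
    "Type": "Column",
    "Description": "Creation timestamp",
    "Tags": "Sales,Marketing",        # written as "Sales,Marketing" with quotes
    "Classifications": "PII,GDPR",
    "Linked Terms": "Glossary_1.Term_2",
}])
```

The writer quotes only the cells that contain a comma, so identifier values are emitted unchanged.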

2. Catalog (Non-Dataset: Data Job, Chart, Dashboard)

| Field | Required | Description | Constraints | Example | Field Type |
|---|---|---|---|---|---|
| Source | Yes | Source system name | Must exist in system | TableauPRD | Identifier/not editable |
| Name | Yes | Name of the object | Unique within source | sales_dashboard | Identifier/not editable |
| Type | Yes | Object type: DataJob, Chart, Dashboard | One of DataJob, Chart, Dashboard | Chart | Identifier/not editable |
| Data Owners | No | Data owners | Max 3, email format | - | Editable Attribute |
| Business Owners | No | Business owners | Email format | - | Editable Attribute |
| Description | No | Description text | Max 8000 characters | Dashboard for monthly sales | Editable Attribute |
| Linked Terms | No | Related Glossary Terms | Format: glossary.category.term | Glossary_1.Term_2 | Editable Attribute |

| Source | Name | Type | Data Owners | Business Owners | Description | Linked Terms |
|---|---|---|---|---|---|---|
| TableauPRD | sales_dashboard | Dashboard | | | Dashboard for monthly sales | Glossary_1.Term_2 |
| TableauPRD | etl_job_1 | DataJob | | | ETL job for sales data | Glossary_1.Term_3 |

Example: Valid CSV input for Catalog Non-Dataset objects

3. Glossary, Category and Term

The same template is used to update Glossary, Category, and Term. Identify the type of item via the Type column and fill relevant identifier fields accordingly.

| Field | Required | Description | Constraints | Example | Field Type |
|---|---|---|---|---|---|
| Glossary | Yes | Glossary name | Must exist | Glossary_1 | Identifier/not editable |
| Category | Conditional (see description) | Category name. Required when updating a category or a term under a category; not required when updating a glossary. | Must exist, max 100 characters | Category_1 | Identifier/not editable |
| Term | Conditional (see description) | Term name. Required when updating a term; not required when updating a glossary. | Must exist, max 100 characters | Term_1 | Identifier/not editable |
| Type | Yes | Type of object | One of Glossary, Category, Term | Term | Identifier/not editable |
| Data Owners | No | Designated data owners | Max 3, email format | - | Editable Attribute |
| Business Owners | No | Designated business owners | Email format | - | Editable Attribute |
| Description | Yes | Glossary-, category-, or term-level description | Max 8000 characters | Meaning of sales process term | Editable Attribute |
| Classifications | No | Policy tags (only applicable to Terms) | Must match existing policies | PII,GDPR | Editable Attribute |
| Related Terms | No | Related terms (only applicable to Terms) | Must exist, format: glossary.category.term | Glossary_1.Category_1.Term_2 | Editable Attribute |

| Glossary | Category | Term | Type | Data Owners | Business Owners | Description | Classifications | Related Terms |
|---|---|---|---|---|---|---|---|---|
| Glossary_1 | Category_1 | | Category | | | Category for sales-related terms | | |
| Glossary_1 | Category_1 | Term_1 | Term | | | Definition of sales process | PII | Glossary_1.Category_1.Term_2 |
| Glossary_1 | | | Glossary | | | Glossary for business terms | | |

Example: Valid CSV input for Glossary, Category, and Term objects

Note:

  • Classifications can only be applied to Terms.

  • Related Terms only apply to Terms.

  • Ensure the Type field is correctly set for each row.
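
The row-level rules in this note can be expressed as a small check. The Python sketch below is one possible reading of the table above, not the platform's own validation: `glossary_row_problems` is a hypothetical function, and it enforces only that Glossary is always present, that Category rows name a category, that Term rows name a term, and that Classifications and Related Terms appear only on Term rows.

```python
VALID_TYPES = {"Glossary", "Category", "Term"}
TERM_ONLY_FIELDS = ("Classifications", "Related Terms")


def _cell(row, field):
    """Return a stripped cell value, treating missing cells as empty."""
    return (row.get(field) or "").strip()


def glossary_row_problems(row):
    """Return a list of problems for one glossary-template row (a dict)."""
    row_type = _cell(row, "Type")
    if row_type not in VALID_TYPES:
        return [f"Type must be one of {sorted(VALID_TYPES)}, got {row_type!r}"]

    problems = []
    if not _cell(row, "Glossary"):
        problems.append("Glossary is required")
    if row_type == "Category" and not _cell(row, "Category"):
        problems.append("Category is required for Category rows")
    if row_type == "Term" and not _cell(row, "Term"):
        problems.append("Term is required for Term rows")

    # Classifications and Related Terms are only meaningful on Term rows.
    if row_type != "Term":
        for field in TERM_ONLY_FIELDS:
            if _cell(row, field):
                problems.append(f"{field} applies only to Term rows")
    return problems
```

Rows can be fed to this function from `csv.DictReader`, which yields one dictionary per data row keyed by the header names.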

4. Classification Policy

| Field | Required | Description | Constraints | Example | Field Type |
|---|---|---|---|---|---|
| Name | Yes | Policy name | Must exist | Data Privacy Policy | Identifier/not editable |
| Policy tag | Yes | Also known as classification_policy_name | Max 5 characters, unique | PII | Identifier/not editable |
| Description | No | Description | - | Protects personally identifiable info | Editable Attribute |
| Purpose | No | Purpose | - | Legal compliance | Editable Attribute |
| Stewards | No | Email(s) of policy steward | Email format | - | Editable Attribute |

| Name | Policy tag | Description | Purpose | Stewards |
|---|---|---|---|---|
| Data Privacy Policy | PII | Protects personally identifiable info | Legal compliance | |
| Data Retention | RET | Policy for data retention | Regulatory | |

Example: Valid CSV input for Classification Policy objects
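
The constraints in the Classification Policy table can also be checked locally. The Python sketch below is an assumption-laden illustration: `policy_row_problems` is a hypothetical helper, multiple stewards are assumed to be comma-separated, and the email pattern is a deliberately loose shape check rather than a full address validator. It cannot tell whether a steward is a registered Decube user.

```python
import re

MAX_POLICY_TAG_LENGTH = 5
# Loose shape check: something@something.something, no spaces or commas.
EMAIL_SHAPE = re.compile(r"^[^@\s,]+@[^@\s,]+\.[^@\s,]+$")


def policy_row_problems(row):
    """Return a list of problems for one classification-policy row (a dict)."""
    problems = []
    if not (row.get("Name") or "").strip():
        problems.append("Name is required")

    tag = (row.get("Policy tag") or "").strip()
    if not tag:
        problems.append("Policy tag is required")
    elif len(tag) > MAX_POLICY_TAG_LENGTH:
        problems.append(
            f"Policy tag {tag!r} is longer than {MAX_POLICY_TAG_LENGTH} characters"
        )

    # Assumption: several stewards are separated by commas in one cell.
    for email in (row.get("Stewards") or "").split(","):
        email = email.strip()
        if email and not EMAIL_SHAPE.match(email):
            problems.append(f"Steward {email!r} does not look like an email address")
    return problems
```

For example, a row with Policy tag `RETENTION` is reported because the tag exceeds five characters, while `RET` passes.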

5. Additional Notes and Best Practices

  • Data Owners, Business Owners, and Stewards: All users listed in these fields must be registered users within the Decube platform. Unregistered emails will result in import errors.

  • Classifications and Related Terms: These attributes can only be applied to Terms. Ensure the Type field is set to Term when using these fields.

  • Validation: Always validate your CSV file for correct headers, required fields, and data formats before importing. Use the platform's validation tools if available.

  • Troubleshooting:

    • If your import fails, review the error report for specific row-level issues.

    • Common issues include unregistered user emails, incorrect object types, and malformed identifiers.

  • Support: For further assistance, consult the Decube documentation or contact support.

This format ensures consistent structure and validation for editing metadata in bulk via CSV Export/Import. Make sure identifier fields are correct and that each row adheres to constraints to avoid import failures.
