Apache Spark

View lineage from Apache Spark jobs in the Decube Catalog.

Supported Capabilities

Capability
Supported

Metadata Extraction

Metadata Types Collected: Schema, Virtual Table, Virtual Column, Data Job, Data Run, Data Task

Data Profiling

Data Preview

Data Quality

Configurable Collection

External Table

View Table

Stored Procedure

Data Quality Support

Capability
Supported

Freshness

Volume

Field Health

Custom SQL

Schema Drift

Job Failure

Supported Lineage Mapping

Apache Spark can map lineage relationships to upstream and downstream objects from the following connectors:

  • Upstream Connectors: postgresql, adls

  • Downstream Connectors: postgresql, adls
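
As an illustration of the kind of job this mapping covers, the sketch below reads a table from PostgreSQL and writes it to ADLS with PySpark. It is a minimal example under stated assumptions, not Decube's integration code: the host, database, table, container, and storage account names are hypothetical, the helper functions are introduced here for illustration only, and the steps needed for Decube to capture lineage from the job are not shown.

```python
def postgres_jdbc_url(host: str, port: int, database: str) -> str:
    """Build a PostgreSQL JDBC URL in the standard jdbc:postgresql:// form."""
    return f"jdbc:postgresql://{host}:{port}/{database}"


def adls_path(container: str, account: str, path: str) -> str:
    """Build an ADLS Gen2 URI using the abfss:// scheme."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def run_job(jdbc_url: str, source_table: str, target_path: str,
            user: str, password: str) -> None:
    """Copy one PostgreSQL table (upstream) to Parquet files on ADLS (downstream)."""
    # Imported here so the helpers above can be used without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("orders_to_adls").getOrCreate()

    # Upstream: read the source table over JDBC.
    # The PostgreSQL JDBC driver must be on the Spark classpath.
    orders = (
        spark.read.format("jdbc")
        .option("url", jdbc_url)
        .option("dbtable", source_table)
        .option("user", user)
        .option("password", password)
        .option("driver", "org.postgresql.Driver")
        .load()
    )

    # Downstream: write the result to ADLS as Parquet.
    # Storage account credentials are assumed to be configured on the cluster.
    orders.write.mode("overwrite").parquet(target_path)
    spark.stop()


# Hypothetical endpoints, for illustration only.
SOURCE_URL = postgres_jdbc_url("db.example.internal", 5432, "sales")
TARGET_PATH = adls_path("lake", "exampleaccount", "/curated/orders")
```

Calling `run_job(SOURCE_URL, "public.orders", TARGET_PATH, user, password)` on a cluster with access to both systems would produce a job with one PostgreSQL input and one ADLS output, which is the upstream and downstream pairing listed above.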

Connection Requirements

See the configuration instructions and minimum requirements for each data source below:
