Apache Spark
See lineage from Apache Spark jobs in the Decube Catalog.
Supported Capabilities
| Capability | Supported |
| --- | --- |
| Metadata Extraction | ✅ |
| Metadata Types Collected | Schema, Virtual Table, Virtual Column, Data Job, Data Run, Data Task |
| Data Profiling | ❌ |
| Data Preview | ❌ |
| Data Quality | ❌ |
| Configurable Collection | ❌ |
| External Table | ❌ |
| View Table | ❌ |
| Stored Procedure | ❌ |
Data Quality Support
| Capability | Supported |
| --- | --- |
| Freshness | ❌ |
| Volume | ❌ |
| Field Health | ❌ |
| Custom SQL | ❌ |
| Schema Drift | ❌ |
| Job Failure | ✅ |
Supported Lineage Mapping
Apache Spark can map lineage relationships to upstream and downstream objects from the following connectors:
Upstream Connectors: postgresql, adls
Downstream Connectors: postgresql, adls
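To make the mapping concrete, here is a minimal sketch of the kind of Spark job this applies to: it reads a table from PostgreSQL (upstream) and writes Parquet files to ADLS (downstream). Everything named here is an assumption for illustration, including the host, database, table, storage account, container, and helper functions. It does not show how Decube collects lineage from the job, and it assumes the PostgreSQL JDBC driver and ADLS credentials are already configured on the cluster.

```python
def postgres_jdbc_url(host: str, port: int, database: str) -> str:
    """Build a PostgreSQL JDBC URL for the upstream source."""
    return f"jdbc:postgresql://{host}:{port}/{database}"


def adls_path(container: str, account: str, path: str) -> str:
    """Build an abfss:// URI for the downstream ADLS Gen2 location."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def run_job(jdbc_url: str, source_table: str, target_path: str,
            user: str, password: str) -> None:
    """Copy one PostgreSQL table to ADLS as Parquet."""
    # Imported here so the helpers above can be used without PySpark installed.
    from pyspark.sql import SparkSession

    # Hypothetical application name.
    spark = SparkSession.builder.appName("orders_to_adls").getOrCreate()

    # Upstream: read the source table over JDBC.
    df = (
        spark.read.format("jdbc")
        .option("url", jdbc_url)
        .option("dbtable", source_table)
        .option("user", user)
        .option("password", password)
        .option("driver", "org.postgresql.Driver")
        .load()
    )

    # Downstream: write the result to ADLS.
    df.write.mode("overwrite").parquet(target_path)
    spark.stop()


# Example call (hypothetical values; needs a live database and storage account):
# run_job(
#     postgres_jdbc_url("db.example.internal", 5432, "sales"),
#     "public.orders",
#     adls_path("curated", "examplestorage", "orders/"),
#     user="spark_reader",
#     password="...",
# )
```

In a job like this, the PostgreSQL table would be the upstream object and the ADLS path the downstream object of the Spark job.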
Connection Requirements
Please see the instructions and minimum requirements for configuration in each data source below: