Apache Spark

View lineage from Spark jobs in the Decube Catalog.

Supported Capabilities

General

  • Metadata — metadata extraction and display of asset information (tables, columns, schemas). Types collected: Schema, Virtual Table, Virtual Column, Data Job, Data Run, Data Task

Data Quality Monitors

  • Job Failure

Apache Spark can map lineage relationships to upstream and downstream objects from the following connectors:

  • Upstream Connectors: postgresql, adls

  • Downstream Connectors: postgresql, adls

Connection Requirements

See the configuration instructions and minimum requirements for each data source below:
