Recommended

DataHub

DataHub is an open-source data catalog for the modern data stack, providing unified metadata management.

OpenMetadata

A unified metadata platform for data discovery, observability and governance with in-depth lineage.

Apache Atlas

An extensible set of core foundational governance services for the Hadoop data ecosystem.

Apache Spark

A unified analytics engine for large-scale data processing with rich APIs and tools.

Apache Flink

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink has been designed to run in all common cluster environments, perform computations at in-memory speed and at any scale.