AI and Data Tool Map

This page is the entry point to the full BigDataFlowing tool collection, covering AI engineering, data governance, analytics, data processing, and data storage. Category pages support deeper browsing, and each tool page collects the tool's website, docs, GitHub links, features, and related tools.

107 tools collected

Airbyte

Open-source ELT data integration and connector platform

Data Processing

Alation

AI-powered data catalog

Data Governance

Amazon S3

Cloud object storage service

Data Storage

Amundsen

Open-source data catalog for discovery and metadata search

Data Governance

AnythingLLM

Desktop and server tool for team knowledge bases, RAG, and AI agents

AI Engineering

Apache Airflow

Workflow orchestration system

Data Processing

Apache Atlas

Metadata management and governance framework

Data Governance

Apache Beam

Unified batch and stream processing model

Data Processing

Apache Cassandra

Highly available distributed NoSQL database

Data Storage

Apache DolphinScheduler

Distributed visual workflow scheduler for big data platforms

Data Processing

Apache Doris

Open-source MPP database for real-time analytics

Data Storage

Apache Flink

Unified stream and batch processing engine

Data Processing

Apache Griffin

Data quality solution

Data Governance

Apache Hadoop

Open-source distributed big data storage and computing framework

Data Storage

Apache Hudi

Open-source table format for incremental processing on data lakes

Data Storage

Apache Iceberg

Open table format standard

Data Storage

Apache Kafka

Distributed streaming platform

Data Processing

Apache Kylin

Open-source multidimensional analytics engine for large datasets

Data Analytics & Visualization

Apache NiFi

Visual dataflow automation and integration platform

Data Processing

Apache Oozie

Hadoop workflow scheduler

Data Processing

Apache Paimon

Open-source table store for unified streaming and batch lakehouses

Data Storage

Apache SeaTunnel

Open-source platform for large-scale data synchronization and integration

Data Processing

Apache Spark

Unified data processing engine

Data Processing

Apache Superset

Open-source data visualization platform

Data Analytics & Visualization

Arize Phoenix

Open-source observability platform for LLM, RAG, and ML systems

AI Engineering

Atlan

Data catalog and governance collaboration platform for modern data teams

Data Governance

AutoGen

Microsoft open-source framework for building collaborative multi-agent systems

AI Engineering

AWS Glue

Serverless ETL service

Data Processing

Azkaban

Distributed workflow manager

Data Processing

BentoML

Platform for AI model serving and inference deployment

AI Engineering

Bigeye

Enterprise data quality monitoring and observability platform

Data Governance

BigQuery

Google Cloud managed serverless enterprise data warehouse

Data Storage

Ceph

Unified distributed storage system

Data Storage

Chroma

Embedding database for AI-native applications

AI Engineering

ClickHouse

High-performance columnar OLAP database

Data Storage

Collibra

Enterprise data governance platform

Data Governance

CrewAI

Multi-agent orchestration framework for role-based task collaboration

AI Engineering

Cube

Semantic layer and metrics API platform for data applications

Data Analytics & Visualization

Dagster

Software-defined orchestration platform centered on data assets

Data Processing

Datafold

Platform for data change validation, diffing, and quality governance

Data Governance

DataHub

Modern metadata platform

Data Governance

dbt Core

SQL transformation and modeling framework for analytics engineering

Data Processing

Debezium

Open-source platform for database change data capture

Data Processing

DeepEval

Open-source testing and evaluation framework for LLM applications

AI Engineering

Deequ

Spark-based data quality library

Data Governance

Delta Lake

Open-source storage format and transaction layer for lakehouses

Data Storage

Dify

Open-source LLM app development platform for workflows, agents, RAG, and enterprise AI orchestration

AI Engineering

Dremio

SQL query and semantic layer platform for Apache Iceberg lakehouses

Data Processing

DSPy

Framework for programming and optimizing language model pipelines

AI Engineering

DuckDB

Columnar database for local analytics and embedded use cases

Data Storage

ECharts

Open-source JavaScript visualization library

Data Analytics & Visualization

Egeria

Enterprise framework for open metadata and governance interoperability

Data Governance

Evidence

Open-source framework for building data apps and reports with SQL and Markdown

Data Analytics & Visualization

FastGPT

Open-source platform for knowledge-base Q&A, RAG, and visual AI workflows

AI Engineering

Flink CDC

Apache Flink-based framework for real-time CDC data integration

Data Processing

Flowise

Low-code LLM app builder for visual chat, RAG, and agent workflows

AI Engineering

Grafana

Open-source visualization platform for metrics, logs, and observability

Data Analytics & Visualization

Great Expectations

Data quality and testing framework

Data Governance

Haystack

Open-source AI framework for RAG, search, and question answering

AI Engineering

HBase

Distributed column-oriented database

Data Storage

HDFS

Distributed file system

Data Storage

Hermes Agent

Self-improving AI agent from Nous Research focused on memory, skills, and long-running collaboration

AI Engineering

Hex

Collaborative data analysis, notebook, and data app platform

Data Analytics & Visualization

Kestra

Event-driven open-source orchestration and automation platform

Data Processing

Kubeflow

Kubernetes-native platform for machine learning workflows

AI Engineering

LangChain

Open-source framework for building LLM applications and agent workflows

AI Engineering

Langfuse

Open-source LLM observability, tracing, and evaluation platform

AI Engineering

LangGraph

Graph-based orchestration framework for controllable agent workflows

AI Engineering

Lightdash

Open-source BI platform built on dbt semantic models

Data Analytics & Visualization

LiteLLM

Unified LLM API gateway for routing, observability, and cost control

AI Engineering

LlamaIndex

Data framework for LLM applications focused on RAG, data connectors, and knowledge indexing

AI Engineering

Looker

Cloud-native BI platform

Data Analytics & Visualization

Marquez

Open-source metadata and lineage service built around OpenLineage

Data Governance

Metabase

Open-source BI and data exploration

Data Analytics & Visualization

Milvus

Cloud-native open-source vector database

AI Engineering

MinIO

High-performance object storage system

Data Storage

MLflow

Lifecycle management platform for machine learning and generative AI

AI Engineering

Monte Carlo

Commercial platform for enterprise data reliability and observability

Data Governance

n8n

Extensible automation and AI workflow orchestration platform

AI Engineering

Observable

Development platform for interactive data visualization and data apps

Data Analytics & Visualization

Ollama

Local LLM runtime for running and managing open models on developer machines

AI Engineering

Open WebUI

Self-hosted workspace for local and private LLM chat

AI Engineering

OpenClaw

Open-source AI agent for personal and team automation with tool execution, messaging channels, and skills

AI Engineering

OpenLineage

Open standard and ecosystem for data lineage collection

Data Governance

OpenLLMetry

OpenTelemetry-based observability for LLM applications

AI Engineering

OpenMetadata

Unified metadata platform

Data Governance

pgvector

Open-source vector similarity search extension for PostgreSQL

AI Engineering

Pinecone

Managed vector database for production AI applications

AI Engineering

PostgreSQL

Mature open-source relational database and enterprise data foundation

Data Storage

Power BI

Microsoft business intelligence tool

Data Analytics & Visualization

Prefect

Orchestration platform for Python dataflows and automation tasks

Data Processing

Presto

Distributed SQL query engine

Data Processing

Qdrant

High-performance vector database for AI applications

AI Engineering

Qualitis

Data quality rule engine

Data Governance

Ragas

Open-source evaluation framework for RAG applications

AI Engineering

RAGFlow

Open-source RAG engine focused on deep document understanding

AI Engineering

Redash

Open-source data collaboration platform

Data Analytics & Visualization

Secoda

Platform for data discovery, documentation, and data team knowledge management

Data Governance

Semantic Kernel

Microsoft open-source SDK for enterprise AI orchestration

AI Engineering

Snowflake

Cloud-native data warehouse

Data Storage

Soda

Tooling platform for data quality checks and data reliability

Data Governance

StarRocks

Open-source OLAP database for fast analytics and lakehouse querying

Data Storage

Tableau

Interactive data visualization tool

Data Analytics & Visualization

Tencent Cloud Data Warehouse

Cloud-native data warehouse service

Data Storage

Trino

Distributed SQL query engine for lakehouse and multi-source analytics

Data Processing

vLLM

High-throughput LLM inference and serving framework

AI Engineering

Weaviate

Vector database with semantic and hybrid search capabilities

AI Engineering