Airbyte
Open-source ELT data integration and connector platform
This is the full BigDataFlowing tool index, covering AI engineering, data governance, analytics, data processing, and data storage. Category pages support deeper browsing, while tool pages collect websites, docs, GitHub links, features, and related tools.
tools
Open-source ELT data integration and connector platform
AI-powered data catalog
Cloud object storage service
Open-source data catalog for discovery and metadata search
Desktop and server tool for team knowledge bases, RAG, and AI agents
Workflow orchestration system
Metadata management and governance framework
Unified batch and stream processing model
Highly available distributed NoSQL database
Distributed visual workflow scheduler for big data platforms
Open-source MPP database for real-time analytics
Unified stream and batch processing engine
Data quality solution
Open-source distributed big data storage and computing framework
Open-source table format for incremental processing on data lakes
Open table format standard
Distributed streaming platform
Open-source multidimensional analytics engine for large datasets
Visual dataflow automation and integration platform
Hadoop workflow scheduler
Open-source table store for unified streaming and batch lakehouses
Open-source platform for large-scale data synchronization and integration
Unified data processing engine
Open-source data visualization platform
Open-source observability platform for LLM, RAG, and ML systems
Data catalog and governance collaboration platform for modern data teams
Microsoft open-source framework for building collaborative multi-agent systems
Serverless ETL service
Distributed workflow manager
Platform for AI model serving and inference deployment
Enterprise data quality monitoring and observability platform
Google Cloud managed serverless enterprise data warehouse
Unified distributed storage system
Embedding database for AI-native applications
High-performance columnar OLAP database
Enterprise data governance platform
Multi-agent orchestration framework for role-based task collaboration
Semantic layer and metrics API platform for data applications
Software-defined orchestration platform centered on data assets
Platform for data change validation, diffing, and quality governance
Modern metadata platform
SQL transformation and modeling framework for analytics engineering
Open-source platform for database change data capture
Open-source testing and evaluation framework for LLM applications
Spark-based data quality library
Open-source storage format and transaction layer for lakehouses
Open-source LLM app development platform for workflows, agents, RAG, and enterprise AI orchestration
SQL query and semantic layer platform for Apache Iceberg lakehouses
Framework for programming and optimizing language model pipelines
Columnar database for local analytics and embedded use cases
Open-source JavaScript visualization library
Enterprise framework for open metadata and governance interoperability
Open-source framework for building data apps and reports with SQL and Markdown
Open-source platform for knowledge-base Q&A, RAG, and visual AI workflows
Apache Flink-based framework for real-time CDC data integration
Low-code LLM app builder for visual chat, RAG, and agent workflows
Open-source visualization platform for metrics, logs, and observability
Data quality and testing framework
Open-source AI framework for RAG, search, and question answering
Distributed column-oriented database
Distributed file system
Self-improving AI agent from Nous Research focused on memory, skills, and long-running collaboration
Collaborative data analysis, notebook, and data app platform
Event-driven open-source orchestration and automation platform
Kubernetes-native platform for machine learning workflows
Open-source framework for building LLM applications and agent workflows
Open-source LLM observability, tracing, and evaluation platform
Graph-based orchestration framework for controllable agent workflows
Open-source BI platform built on dbt semantic models
Unified LLM API gateway for routing, observability, and cost control
Data framework for LLM applications focused on RAG, data connectors, and knowledge indexing
Cloud-native BI platform
Open-source metadata and lineage service built around OpenLineage
Open-source BI and data exploration
Cloud-native open-source vector database
High-performance object storage system
Lifecycle management platform for machine learning and generative AI
Commercial platform for enterprise data reliability and observability
Extensible automation and AI workflow orchestration platform
Development platform for interactive data visualization and data apps
Local LLM runtime for running and managing open models on developer machines
Self-hosted workspace for local and private LLM chat
Open-source AI agent for personal and team automation with tool execution, messaging channels, and skills
Open standard and ecosystem for data lineage collection
OpenTelemetry-based observability for LLM applications
Unified metadata platform
Open-source vector similarity search extension for PostgreSQL
Managed vector database for production AI applications
Mature open-source relational database and enterprise data foundation
Microsoft business intelligence tool
Orchestration platform for Python dataflows and automation tasks
Distributed SQL query engine
High-performance vector database for AI applications
Data quality rule engine
Open-source evaluation framework for RAG applications
Open-source RAG engine focused on deep document understanding
Open-source data collaboration platform
Platform for data discovery, documentation, and data team knowledge management
Microsoft open-source SDK for enterprise AI orchestration
Cloud-native data warehouse
Tooling platform for data quality checks and data reliability
Open-source OLAP database for fast analytics and lakehouse querying
Interactive data visualization tool
Cloud-native data warehouse service
Distributed SQL query engine for lakehouse and multi-source analytics
High-throughput LLM inference and serving framework
Vector database with semantic and hybrid search capabilities