Overview
Soda is a data quality and reliability platform for modern data stacks. Its main components are Soda Core, SodaCL, Soda Agent, and Soda Cloud, which together provide checks-as-code, scans, monitoring, anomaly detection, alerts, and collaboration. Teams embed quality rules into Airflow, dbt, Spark, Snowflake, and Databricks pipelines so that issues such as missing values, duplicates, schema drift, freshness delays, and business rule violations are caught before data reaches analytics or AI workloads.
Features
- Define quality checks with SodaCL in YAML
- Run checks-as-code in data pipelines and CI/CD workflows
- Monitor row counts, completeness, uniqueness, value ranges, schema drift, and freshness
- Coordinate quality issues through Soda Cloud, Slack, and alert rules
- Integrate with Airflow, dbt, Spark, Snowflake, and Databricks
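
To make the check types listed above concrete, here is a minimal plain-Python sketch of what row count, completeness, uniqueness, and freshness checks compute. It does not use Soda's API. The sample rows, column names, and helper functions are hypothetical, and the dictionary keys only approximate how such checks are typically written in SodaCL.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical sample rows standing in for a warehouse table.
ROWS = [
    {"id": 1, "email": "a@example.com",
     "updated_at": datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)},
    {"id": 2, "email": None,
     "updated_at": datetime(2024, 1, 2, 10, 0, tzinfo=timezone.utc)},
    {"id": 2, "email": "c@example.com",
     "updated_at": datetime(2024, 1, 2, 11, 0, tzinfo=timezone.utc)},
]


def missing_count(rows, column):
    """Completeness: number of rows where the column is None."""
    return sum(1 for row in rows if row.get(column) is None)


def duplicate_count(rows, column):
    """Uniqueness: number of distinct values that occur more than once."""
    seen, dupes = set(), set()
    for row in rows:
        value = row.get(column)
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return len(dupes)


def freshness(rows, column, now):
    """Freshness: time elapsed since the most recent timestamp."""
    return now - max(row[column] for row in rows)


def run_checks(rows, now):
    """Evaluate each rule and return a pass/fail flag per check."""
    return {
        "row_count > 0": len(rows) > 0,
        "missing_count(email) = 0": missing_count(rows, "email") == 0,
        "duplicate_count(id) = 0": duplicate_count(rows, "id") == 0,
        "freshness(updated_at) < 1d":
            freshness(rows, "updated_at", now) < timedelta(days=1),
    }


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
    for name, passed in run_checks(ROWS, now).items():
        print(f"{'PASS' if passed else 'FAIL'}  {name}")
```

In an actual Soda setup the rules would be declared in a SodaCL YAML file and evaluated by a scan against the data source. The sketch only shows the logic behind each rule.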
