Data Platform
Lakehouse, BI/analytics, and ML pipelines with governance, quality, and lineage.
[Figure: Data Platform Architecture. Detailed view showing components, connections, and data flow. Legend: Core Components, Supporting Services, Data Flow, Security Boundary, Enables Architectural Patterns.]
What it is
An enterprise data platform ingests, stores, models, and serves analytical and operational data, enabling BI and ML with strong governance and observability.
Related patterns
- Data Mesh (decentralized ownership, federated governance)
- Pipes and Filters (pipeline composition)
- Event-Driven Architecture (streaming ingestion)
Responsibilities
- Batch and streaming ingestion (CDC, connectors)
- Curated storage (lakehouse/warehouse) and modeling
- Metadata, catalog, lineage, and governance
- Pipeline orchestration and quality monitoring
- Serving: BI, ML features, and APIs
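To make the CDC ingestion responsibility more concrete, here is a minimal sketch of applying change events to a keyed table. The `ChangeEvent` shape, the `seq` ordering field, and the `apply_changes` helper are all assumptions made for illustration. Real CDC connectors emit richer records, and a lakehouse would normally apply them with a merge operation rather than an in-memory dict.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical CDC record: one operation on one row, ordered by `seq`."""
    seq: int                               # position in the source change log
    op: str                                # "insert", "update", or "delete"
    key: str                               # primary key of the affected row
    row: Optional[Dict[str, Any]] = None   # new row image; None for deletes


def apply_changes(table: Dict[str, Dict[str, Any]],
                  events: Iterable[ChangeEvent]) -> Dict[str, Dict[str, Any]]:
    """Apply change events in log order and return an updated copy of the table."""
    result = {k: dict(v) for k, v in table.items()}
    for event in sorted(events, key=lambda e: e.seq):
        if event.op in ("insert", "update"):
            if event.row is None:
                raise ValueError(f"event {event.seq} has no row image")
            result[event.key] = dict(event.row)   # upsert the new row image
        elif event.op == "delete":
            result.pop(event.key, None)           # deleting a missing key is a no-op
        else:
            raise ValueError(f"unknown op {event.op!r}")
    return result


# Events may arrive out of order; `seq` restores the source ordering.
events = [
    ChangeEvent(2, "update", "c1", {"name": "Ada", "tier": "gold"}),
    ChangeEvent(1, "insert", "c1", {"name": "Ada", "tier": "free"}),
    ChangeEvent(3, "insert", "c2", {"name": "Lin", "tier": "free"}),
    ChangeEvent(4, "delete", "c2"),
]
customers = apply_changes({}, events)
```

Treating inserts and updates as the same upsert, and ignoring deletes of missing keys, keeps the apply step safe to re-run over the same batch of events.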
Core capabilities
- Medallion/layered data architecture
- Transformations (dbt/Spark/Flink) and orchestration
- Lineage, data contracts, and quality checks
- Feature store and reproducible ML pipelines
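The layered architecture, data contracts, and quality checks listed above can be sketched together in a few lines. This is an illustrative example only: the `ORDER_CONTRACT` schema, the `to_silver` and `to_gold` functions, and the sample records are hypothetical. In practice these steps would typically run in dbt, Spark, or Flink rather than plain Python.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]

# Hypothetical data contract for an "orders" feed: required field -> expected type.
ORDER_CONTRACT = {"order_id": str, "country": str, "amount": float}


def to_silver(bronze: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Validate raw (bronze) records against the contract; return (valid, rejected)."""
    valid: List[Record] = []
    rejected: List[Record] = []
    seen = set()
    for rec in bronze:
        # Contract check: every required field is present with the expected type.
        ok = all(isinstance(rec.get(f), t) for f, t in ORDER_CONTRACT.items())
        # Quality checks: non-negative amount and no duplicate order IDs.
        if ok and rec["amount"] >= 0 and rec["order_id"] not in seen:
            seen.add(rec["order_id"])
            valid.append({f: rec[f] for f in ORDER_CONTRACT})  # keep contract fields only
        else:
            rejected.append(rec)
    return valid, rejected


def to_gold(silver: List[Record]) -> Dict[str, float]:
    """Aggregate cleaned (silver) orders into revenue per country for serving."""
    revenue: Dict[str, float] = {}
    for rec in silver:
        revenue[rec["country"]] = revenue.get(rec["country"], 0.0) + rec["amount"]
    return revenue


bronze = [
    {"order_id": "o1", "country": "DE", "amount": 10.0},
    {"order_id": "o2", "country": "DE", "amount": 5.5},
    {"order_id": "o2", "country": "DE", "amount": 5.5},      # duplicate
    {"order_id": "o3", "country": "FR", "amount": "n/a"},    # wrong type
    {"order_id": "o4", "country": "FR", "amount": 7.0, "note": "extra"},
]
silver, rejected = to_silver(bronze)
gold = to_gold(silver)
```

Returning rejected records instead of silently dropping them lets the count of rejections feed a quality-violation metric.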
Architecture patterns
- Lambda/Kappa processing
- Data mesh with domain ownership
- CDC into lakes and warehouses
- Materialized views and serving layers
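As a rough illustration of how a materialized view relates to Kappa-style processing, the sketch below keeps a precomputed aggregate up to date from an append-only log and rebuilds it by replaying that log. The `PageViewsView` class and the `(page, views)` event shape are assumptions chosen for brevity. A production system would read from a durable log such as a message broker and persist the view's offset.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical event shape: (page, views) pairs appended to an immutable log.
Event = Tuple[str, int]


class PageViewsView:
    """A materialized view: per-page totals maintained incrementally."""

    def __init__(self) -> None:
        self.totals: Dict[str, int] = defaultdict(int)
        self.offset = 0  # number of log entries already applied

    def apply(self, event: Event) -> None:
        page, views = event
        self.totals[page] += views
        self.offset += 1

    def catch_up(self, log: List[Event]) -> None:
        """Apply only the log entries this view has not processed yet."""
        for event in log[self.offset:]:
            self.apply(event)


log: List[Event] = [("home", 3), ("docs", 1), ("home", 2)]
view = PageViewsView()
view.catch_up(log)

log.append(("docs", 4))
view.catch_up(log)           # processes only the newly appended entry

rebuilt = PageViewsView()    # recovery path: replay the full log from the start
rebuilt.catch_up(log)
```

Because the same `apply` logic serves both the incremental path and the full replay, there is a single code path to maintain, which is the main appeal of Kappa over Lambda.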
Tech examples
- Databricks/Delta Lake
- Snowflake
- BigQuery
- Apache Hudi/Iceberg
- dbt
- Airflow/Prefect
KPIs/SLIs
- Data freshness and completeness
- Pipeline success rate and duration
- Quality rule violations
- Lineage coverage
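The indicators above can be computed in many ways. The following sketch shows one possible set of definitions. The function names, the `PipelineRun` record, and the exact formulas are assumptions, and each organization should agree on its own definitions before setting targets.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional


@dataclass
class PipelineRun:
    """Hypothetical record of one pipeline execution."""
    succeeded: bool
    duration_s: float


def freshness(last_loaded_at: datetime, now: datetime) -> timedelta:
    """Age of the newest data in a table."""
    return now - last_loaded_at


def completeness(values: List[Optional[object]]) -> float:
    """Share of non-null values in a column (1.0 for an empty column)."""
    if not values:
        return 1.0
    return sum(v is not None for v in values) / len(values)


def success_rate(runs: List[PipelineRun]) -> float:
    """Fraction of runs that succeeded."""
    if not runs:
        raise ValueError("no runs to evaluate")
    return sum(r.succeeded for r in runs) / len(runs)


def lineage_coverage(datasets: Iterable[str], with_lineage: Iterable[str]) -> float:
    """Fraction of known datasets that have lineage recorded."""
    known = set(datasets)
    if not known:
        raise ValueError("no datasets to evaluate")
    return len(known & set(with_lineage)) / len(known)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_load = datetime(2024, 1, 1, 9, 30, tzinfo=timezone.utc)
runs = [PipelineRun(True, 120.0), PipelineRun(True, 95.0),
        PipelineRun(False, 30.0), PipelineRun(True, 110.0)]

age = freshness(last_load, now)
filled = completeness(["a", None, "b", "c"])
rate = success_rate(runs)
coverage = lineage_coverage(["orders", "customers", "events", "payments"],
                            ["orders", "customers"])
```

Using timezone-aware timestamps for freshness avoids off-by-hours errors when pipelines and monitors run in different regions.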