Data Platform

Lakehouse, BI/analytics, and ML pipelines with governance, quality, and lineage.

Data Platform Architecture

Detailed view showing components, connections, and data flow

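[Diagram: sources (Apps, APIs, Streams, Files) feed the Data Lakehouse (Bronze • Silver • Gold), which serves BI, ML, APIs, and Reports; Lineage • Quality • Governance span the platform. Legend: Core Components, Supporting Services, Data Flow, Security Boundary.]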

Enables Architectural Patterns

What it is

An enterprise data platform that ingests, stores, models, and serves analytical and operational data, enabling BI and ML with strong governance and observability.

Related patterns

  • Data Mesh (decentralized ownership, federated governance)
  • Pipes and Filters (pipeline composition)
  • Event-Driven Architecture (streaming ingestion)
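
As an illustration of the Pipes and Filters pattern listed above, the following minimal sketch composes small generator-based filters into one pipeline. The names `drop_nulls`, `to_cents`, and `pipeline`, and the record fields, are hypothetical and chosen only for this example; a production platform would more likely express the same idea in dbt, Spark, or Flink.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator

Record = dict
Filter = Callable[[Iterable[Record]], Iterator[Record]]


def drop_nulls(records: Iterable[Record]) -> Iterator[Record]:
    """Filter: discard records that have no 'amount' value."""
    for r in records:
        if r.get("amount") is not None:
            yield r


def to_cents(records: Iterable[Record]) -> Iterator[Record]:
    """Filter: add an integer 'amount_cents' field derived from 'amount'."""
    for r in records:
        yield {**r, "amount_cents": round(r["amount"] * 100)}


def pipeline(*filters: Filter) -> Filter:
    """Compose filters left to right into a single filter."""
    def run(records: Iterable[Record]) -> Iterator[Record]:
        return reduce(lambda stream, f: f(stream), filters, iter(records))
    return run


# Hypothetical input: one valid record and one with a missing amount.
clean = pipeline(drop_nulls, to_cents)
rows = list(clean([{"id": 1, "amount": 2.5}, {"id": 2, "amount": None}]))
```

Because each filter only consumes and yields records, filters can be reordered, reused, or tested in isolation, which is the main appeal of the pattern.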

Responsibilities

  • Batch and streaming ingestion (CDC, connectors)
  • Curated storage (lakehouse/warehouse) and modeling
  • Metadata, catalog, lineage, and governance
  • Pipeline orchestration and quality monitoring
  • Serving: BI, ML features, and APIs
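
To make the CDC ingestion responsibility more concrete, here is a minimal sketch of applying an ordered stream of change events to a keyed table. The event shape (`seq`, `op`, `key`, `row`) and the function `apply_cdc` are assumptions made for this example; real CDC connectors emit their own formats.

```python
from typing import Any


def apply_cdc(
    table: dict[int, dict[str, Any]],
    events: list[dict[str, Any]],
) -> dict[int, dict[str, Any]]:
    """Apply change events in sequence order and return the new table state."""
    state = dict(table)  # shallow copy so the input table is not modified
    for ev in sorted(events, key=lambda e: e["seq"]):
        key = ev["key"]
        if ev["op"] in ("insert", "update"):
            # Upsert: merge changed columns over any existing row.
            state[key] = {**state.get(key, {}), **ev["row"]}
        elif ev["op"] == "delete":
            state.pop(key, None)
        else:
            raise ValueError(f"unknown op: {ev['op']}")
    return state


# Hypothetical events, deliberately delivered out of order.
events = [
    {"seq": 2, "op": "update", "key": 1, "row": {"email": "a@new.example"}},
    {"seq": 1, "op": "insert", "key": 1,
     "row": {"name": "Ada", "email": "a@old.example"}},
    {"seq": 3, "op": "insert", "key": 2,
     "row": {"name": "Bo", "email": "b@example.com"}},
    {"seq": 4, "op": "delete", "key": 2},
]
customers = apply_cdc({}, events)
```

Sorting by a sequence number before applying is one simple way to tolerate out-of-order delivery; it assumes the source provides a reliable ordering field.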

Core capabilities

  • Medallion/layered data architecture
  • Transformations (dbt/Spark/Flink) and orchestration
  • Lineage, data contracts, and quality checks
  • Feature store and reproducible ML pipelines
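
The layered (medallion) architecture, data contracts, and quality checks can be sketched together in a few lines. This is an illustrative toy under stated assumptions, not a reference implementation: the contract format, the `order_id`/`country`/`amount` fields, and the functions `violations`, `to_silver`, and `to_gold` are all hypothetical.

```python
# Hypothetical data contract: required field name -> expected Python type.
ORDER_CONTRACT = {"order_id": int, "country": str, "amount": float}


def violations(row: dict, contract: dict) -> list[str]:
    """Return the contract violations found in one row (empty if valid)."""
    problems = []
    for name, expected in contract.items():
        if name not in row:
            problems.append(f"missing:{name}")
        elif not isinstance(row[name], expected):
            problems.append(f"type:{name}")
    return problems


def to_silver(bronze: list[dict], contract: dict):
    """Bronze -> Silver: enforce the contract and drop duplicate orders."""
    silver, rejected, seen = [], [], set()
    for row in bronze:
        problems = violations(row, contract)
        if problems:
            rejected.append((row, problems))
        elif row["order_id"] not in seen:
            seen.add(row["order_id"])
            silver.append(row)
    return silver, rejected


def to_gold(silver: list[dict]) -> dict[str, float]:
    """Silver -> Gold: aggregate revenue per country for BI consumers."""
    totals: dict[str, float] = {}
    for row in silver:
        totals[row["country"]] = totals.get(row["country"], 0.0) + row["amount"]
    return totals


bronze = [
    {"order_id": 1, "country": "DE", "amount": 10.0},
    {"order_id": 1, "country": "DE", "amount": 10.0},  # duplicate delivery
    {"order_id": 2, "country": "FR", "amount": "12"},  # wrong type
    {"order_id": 3, "country": "DE", "amount": 5.5},
    {"order_id": 4, "amount": 1.0},                    # missing country
]
silver, rejected = to_silver(bronze, ORDER_CONTRACT)
gold = to_gold(silver)
```

Keeping rejected rows together with their violation codes, rather than silently dropping them, gives quality monitoring something to count and alert on.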

Architecture patterns

  • Lambda/Kappa processing
  • Data mesh with domain ownership
  • CDC into lakes and warehouses
  • Materialized views and serving layers
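
A small sketch can show how a materialized view relates to Kappa-style processing: the view is maintained incrementally from an append-only log, and it can be rebuilt at any time by replaying that log from the start. The class `RevenueByCountry` and the event fields are hypothetical; an in-memory list stands in for a durable log such as a Kafka topic.

```python
from collections import defaultdict


class RevenueByCountry:
    """A materialized view: per-country revenue derived from an event log."""

    def __init__(self) -> None:
        self.totals: dict[str, int] = defaultdict(int)
        self.offset = 0  # number of log entries already applied

    def apply(self, event: dict) -> None:
        self.totals[event["country"]] += event["amount_cents"]
        self.offset += 1

    def catch_up(self, log: list[dict]) -> None:
        """Apply only the log entries that have not been seen yet."""
        for event in log[self.offset:]:
            self.apply(event)


log = [
    {"country": "DE", "amount_cents": 1000},
    {"country": "FR", "amount_cents": 250},
]

live = RevenueByCountry()
live.catch_up(log)                       # initial load
log.append({"country": "DE", "amount_cents": 500})
live.catch_up(log)                       # incremental: applies one new event

rebuilt = RevenueByCountry()
rebuilt.catch_up(log)                    # full replay from offset 0
```

The incrementally maintained view and the replayed one end up in the same state, which is the property that lets a Kappa design use a single code path for both live processing and reprocessing.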

Tech examples

  • Databricks/Delta Lake
  • Snowflake
  • BigQuery
  • Apache Hudi/Iceberg
  • dbt
  • Airflow/Prefect

KPIs/SLIs

  • Data freshness and completeness
  • Pipeline success rate and duration
  • Quality rule violations
  • Lineage coverage
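
The indicators above can be computed in many ways; the following sketch shows one plausible definition for each, assuming hypothetical inputs (a last-load timestamp, a list of rows, a list of pipeline runs with `status` and `seconds`, and sets of dataset names). Real platforms typically derive these from orchestrator and catalog metadata.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean


def freshness(last_loaded_at: datetime, now: datetime) -> timedelta:
    """Age of the most recently loaded data."""
    return now - last_loaded_at


def completeness(rows: list[dict], required: list[str]) -> float:
    """Share of rows in which every required field is present and non-null."""
    if not rows:
        return 0.0
    complete = sum(all(r.get(f) is not None for f in required) for r in rows)
    return complete / len(rows)


def success_rate(runs: list[dict]) -> float:
    """Share of pipeline runs that ended with status 'success'."""
    return sum(r["status"] == "success" for r in runs) / len(runs)


def lineage_coverage(datasets: set[str], with_lineage: set[str]) -> float:
    """Share of known datasets that have recorded lineage."""
    return len(datasets & with_lineage) / len(datasets)


# Hypothetical sample inputs.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
age = freshness(datetime(2024, 1, 1, 11, 15, tzinfo=timezone.utc), now)

rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": "c@example.com"},
    {"id": 4},
]
complete_share = completeness(rows, ["id", "email"])

runs = [
    {"status": "success", "seconds": 60},
    {"status": "success", "seconds": 90},
    {"status": "failed", "seconds": 30},
    {"status": "success", "seconds": 120},
]
run_success = success_rate(runs)
avg_seconds = mean(r["seconds"] for r in runs)

coverage = lineage_coverage(
    {"orders", "customers", "payments", "refunds"},
    {"orders", "customers", "payments"},
)
```

Each function returns a plain number or `timedelta`, so the results can be compared against whatever thresholds a team chooses as its service-level objectives.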