Projects Built for Scale
End-to-end data engineering systems designed for production — not just demos.
Featured Projects
End-to-end data engineering systems built for scale, reliability, and real business impact.
FinStream
Data Platform
Medallion Lakehouse processing 200+ financial datasets per day at Nasdaq. Replaced ad-hoc raw table queries with a governed Bronze→Silver→Gold pipeline — reducing report discrepancies to zero and cutting analyst query time from 45s to 3s.
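As a rough illustration of the Bronze→Silver→Gold flow, here is a minimal plain-Python sketch. It is not the FinStream code: the field names (`trade_id`, `symbol`, `price`), the cleaning rules, and the average-price Gold metric are all assumptions chosen for the example.

```python
def to_silver(bronze):
    """Bronze -> Silver: drop incomplete rows, normalize fields, deduplicate.

    Assumption: rows are dicts and the last row seen per trade_id wins.
    """
    latest = {}
    for row in bronze:
        if row.get("trade_id") is None or row.get("price") is None:
            continue  # incomplete rows never reach Silver
        latest[row["trade_id"]] = {
            "trade_id": row["trade_id"],
            "symbol": row["symbol"].strip().upper(),
            "price": float(row["price"]),
        }
    return list(latest.values())


def to_gold(silver):
    """Silver -> Gold: one business-level metric (average price per symbol)."""
    totals = {}
    for row in silver:
        count, total = totals.get(row["symbol"], (0, 0.0))
        totals[row["symbol"]] = (count + 1, total + row["price"])
    return {symbol: total / count for symbol, (count, total) in totals.items()}
```

Analysts would query only the Gold output, so every report is computed from the same cleaned and deduplicated rows.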
MarketPulse
Streaming Pipeline
Real-time intraday anomaly detection on market tick data. Ingests 500K msgs/sec from Kafka via Spark Structured Streaming, applies stateful window aggregations, and fires alerts within 800ms of event occurrence.
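The idea behind stateful window aggregation can be sketched in plain Python, without Kafka or Spark. This is an illustration only: the tick layout `(timestamp_seconds, symbol, volume)`, the tumbling 60-second window, and the "more than `factor` times the average of earlier windows" rule are assumptions, not the MarketPulse detection logic.

```python
from collections import defaultdict


def window_sums(ticks, window_seconds):
    """Sum volume per (symbol, tumbling-window start)."""
    sums = defaultdict(float)
    for ts, symbol, volume in ticks:
        window_start = ts - (ts % window_seconds)
        sums[(symbol, window_start)] += volume
    return sums


def detect_anomalies(ticks, window_seconds=60, factor=3.0, min_history=3):
    """Flag windows whose volume exceeds factor x the mean of earlier windows."""
    sums = window_sums(ticks, window_seconds)
    history = defaultdict(list)  # per-symbol state: totals of earlier windows
    alerts = []
    # Walk the windows in time order so the state only contains the past.
    for symbol, window_start in sorted(sums, key=lambda key: (key[1], key[0])):
        total = sums[(symbol, window_start)]
        past = history[symbol]
        if len(past) >= min_history:
            baseline = sum(past) / len(past)
            if total > factor * baseline:
                alerts.append((symbol, window_start, total))
        past.append(total)
    return alerts
```

A streaming engine would maintain the same per-key state incrementally and use watermarks to handle late events, which this batch-style sketch leaves out.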
DataGuard
Quality Framework
Reusable data quality platform enforcing schema contracts at ingestion. Zero-trust quarantine pattern prevents corrupt data from ever touching production tables.
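A minimal sketch of the quarantine pattern, assuming a contract expressed as a column-to-type mapping. The `CONTRACT` columns and the `violations` and `ingest` helpers are hypothetical names for illustration, not the DataGuard API.

```python
# Hypothetical contract: column name -> required Python type.
CONTRACT = {"trade_id": str, "symbol": str, "price": float, "quantity": int}


def violations(record, contract):
    """Return a list of human-readable contract violations for one record."""
    problems = []
    for column, expected in contract.items():
        if column not in record:
            problems.append(f"missing column: {column}")
        elif not isinstance(record[column], expected):
            actual = type(record[column]).__name__
            problems.append(f"{column}: expected {expected.__name__}, got {actual}")
    for column in sorted(record.keys() - contract.keys()):
        problems.append(f"unexpected column: {column}")
    return problems


def ingest(records, contract):
    """Split a batch into accepted rows and quarantined rows with reasons."""
    accepted, quarantined = [], []
    for record in records:
        problems = violations(record, contract)
        if problems:
            # Nothing is trusted by default: any violation diverts the row.
            quarantined.append({"record": record, "problems": problems})
        else:
            accepted.append(record)
    return accepted, quarantined
```

Only the `accepted` list would be written to production tables. Quarantined rows keep their failure reasons so they can be inspected and replayed later.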
ChurnSight
ML Pipeline
End-to-end churn prediction platform: automated PySpark feature engineering → Databricks Feature Store → MLflow registry → FastAPI serving in <50ms.
TradeVault
Analytics DWH
Trade lifecycle analytics DWH joining 6 source systems, including OMS, LMS, clearing feeds, and FX rates, into a unified star schema. Automated reconciliation replaced a 2-day manual process with a zero-touch daily report.
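The core of an automated reconciliation can be sketched as a keyed comparison between two sources. The example below assumes each source is reduced to a `trade_id -> amount` mapping and uses a hypothetical `tolerance` parameter. The actual TradeVault checks are not described here, so treat this purely as an illustration.

```python
def reconcile(oms, clearing, tolerance=0.01):
    """Compare two trade_id -> amount mappings and report every break."""
    report = {
        "missing_in_clearing": [],
        "missing_in_oms": [],
        "amount_mismatch": [],
    }
    for trade_id in sorted(oms.keys() | clearing.keys()):
        if trade_id not in clearing:
            report["missing_in_clearing"].append(trade_id)
        elif trade_id not in oms:
            report["missing_in_oms"].append(trade_id)
        elif abs(oms[trade_id] - clearing[trade_id]) > tolerance:
            report["amount_mismatch"].append(trade_id)
    return report
```

Run daily over the joined sources, a report with three empty lists means the systems agree. Any non-empty list points directly at the trades that need attention.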