The Engineer Behind the Data
I don't just move data from A to B — I design the roads, traffic systems, and quality checkpoints in between.
From Curious Coder to Production Systems Builder
I started my journey in data engineering driven by a simple frustration: slow dashboards and broken reports that blocked business decisions. I saw teams waiting 45 minutes for data that should arrive in seconds. That's when I understood — data pipelines are infrastructure, not afterthoughts.
At Nasdaq, I get to solve that problem at industrial scale. I build Medallion Lakehouse architectures, real-time streaming pipelines, and distributed compute systems that process 200+ financial datasets per day — with the kind of accuracy that financial data demands.
Over 4+ years, I've learned that the hardest engineering problems aren't the algorithms — they're the people problems: aligning schemas across teams, making pipelines observable when they silently fail, and building systems that junior engineers can debug at 2am.
I care deeply about reliability, observability, and simplicity. A pipeline that engineers can't understand is a pipeline that will fail in production.
How I Think
The mental models I apply to every system I design — because engineering without principles is just guessing.
Scalability First
Every design decision starts with "what happens at 100x this volume?" Horizontal scaling, partitioning, and idempotency aren't features — they're requirements.
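One concrete face of "partitioning and idempotency as requirements" is deterministic partition routing: the same record key must land in the same partition on every run. A minimal sketch (the `partition_for` name is illustrative, not from any specific system):

```python
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    """Deterministically map a record key to a partition.

    Uses sha256 rather than Python's built-in hash(), because hash() is
    salted per process and would make replays non-deterministic.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions
```

Because the mapping is a pure function of the key, a replayed job routes every record exactly where the original run did.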
Observability > Debugging
If I can't see what a pipeline is doing, it doesn't exist yet. I instrument everything: record counts, latencies, error rates, schema drift alerts.
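What "instrument everything" can look like in miniature, with hypothetical names (`run_instrumented`, `StepMetrics`); a real pipeline would ship these counters to a metrics backend instead of a log line:

```python
import logging
import time
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

@dataclass
class StepMetrics:
    records_in: int = 0
    records_out: int = 0
    errors: int = 0
    latency_s: float = 0.0

def run_instrumented(records, transform):
    """Apply `transform` to each record, counting ins, outs, and errors."""
    m = StepMetrics()
    start = time.monotonic()
    out = []
    for rec in records:
        m.records_in += 1
        try:
            out.append(transform(rec))
            m.records_out += 1
        except Exception:
            m.errors += 1  # a real step would also capture the bad record
    m.latency_s = time.monotonic() - start
    log.info("in=%d out=%d errors=%d latency=%.3fs",
             m.records_in, m.records_out, m.errors, m.latency_s)
    return out, m
```

The point is that `records_in != records_out` becomes a visible, alertable signal rather than a silent drop.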
Trade-off Clarity
The CAP theorem is real. Batch vs. streaming. Consistency vs. availability. I make these trade-offs explicit in design docs instead of discovering them in incidents.
Root Cause, Not Symptoms
Production incidents teach more than any course. I build blameless postmortems and fix the systemic cause, not just restart the failing pod.
Lessons from Production
Never ship a pipeline without data quality checks
Shipped a pipeline that silently dropped 18% of records for 3 weeks. Now every pipeline has count assertions, null checks, and schema validation built in.
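A simplified sketch of those built-in checks (the `validate_batch` name and schema shape are illustrative):

```python
def validate_batch(rows, expected_schema, min_rows=1):
    """Fail fast on bad data instead of silently dropping records.

    `expected_schema` maps column name -> expected Python type.
    """
    # Count assertion: an empty or near-empty batch is usually a bug upstream.
    assert len(rows) >= min_rows, f"expected >= {min_rows} rows, got {len(rows)}"
    for i, row in enumerate(rows):
        # Schema validation: every expected column must be present.
        missing = expected_schema.keys() - row.keys()
        assert not missing, f"row {i} missing columns: {missing}"
        for col, typ in expected_schema.items():
            value = row[col]
            # Null check, then type check.
            assert value is not None, f"row {i}: null in required column {col!r}"
            assert isinstance(value, typ), f"row {i}: {col!r} is not {typ.__name__}"
    return True
```

A failing batch now raises loudly at the checkpoint, instead of 18% of records vanishing for three weeks.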
Idempotent, replayable jobs
Every job must be safely re-runnable. Partition-based overwrite patterns, upsert logic, and checkpoint mechanisms save you at 3am.
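A toy sketch of the partition-based overwrite pattern, using a plain dict as a stand-in for a warehouse table (all names hypothetical):

```python
def upsert_partition(table, partition_date, new_rows):
    """Idempotent write: replace the whole partition, so re-running the
    job for the same date always converges to the same state.

    `table` is a dict keyed by (partition_date, id), standing in for a
    real partitioned table.
    """
    # Drop everything previously written for this partition...
    for key in [k for k in table if k[0] == partition_date]:
        del table[key]
    # ...then write the fresh rows.
    for row in new_rows:
        table[(partition_date, row["id"])] = row
    return table
```

Running the job twice with the same input leaves the table unchanged, which is exactly the property a 3am replay relies on.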
Schema evolution is a strategy, not a patch
Upstream schema changes silently break downstream consumers. Now I own forward/backward compatibility contracts for every producer-consumer pair.
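A sketch of what a backward-compatibility check in such a contract might look like; the schema shape is illustrative, and production systems typically delegate this to a schema registry (e.g. Avro compatibility rules) rather than hand-rolled code:

```python
def is_backward_compatible(old_schema, new_schema):
    """Check that consumers of `old_schema` can still read `new_schema` data.

    Schemas are dicts of field -> {"type": str, "required": bool}
    (an illustrative shape, not a real registry format).
    """
    for name, spec in old_schema.items():
        if name not in new_schema:
            return False  # a field consumers rely on was removed
        if new_schema[name]["type"] != spec["type"]:
            return False  # retyping a field breaks existing readers
    for name, spec in new_schema.items():
        if name not in old_schema and spec.get("required", False):
            return False  # new fields must be optional or defaulted
    return True
```

Running this check in CI for every producer schema change turns "silent breakage" into a failed build.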
What I Can Do For You
Whether you need a data platform built from scratch, a crumbling pipeline rescued, or a team mentored, I bring senior, production-grade engineering judgment.