Enterprise-Grade Data Engineering.
Calm, Reliable, & Scalable.

I take data environments from zero to production. Whether building from scratch or untangling code, I deliver calm, audit-ready pipelines on a modern stack.

Architecture

Private-VPC Data Plane

This reference architecture demonstrates a fully private data plane. It ingests high-velocity market data into a hardened Virtual Private Cloud (VPC), orchestrates transformations via Airflow (MWAA), and delivers audit-ready datasets in Snowflake. Zero public exposure, 100% automated.

Architecture and security posture

The diagram below illustrates the complete lineage: Private VPC endpoints, KMS encryption, and least-privilege IAM roles orchestrating the flow from API to Analytics.

End-to-End Lineage: Orchestrating the flow from raw market APIs into S3, transforming via dbt Core, and serving analytics-ready tables in Snowflake.

Data & Analytics

Flagship Pipeline Output

2025 Year in Review Snapshot

The dashboards below are not live feeds. They render a static dataset covering Jan 1 - Dec 19, 2025. This demonstrates the pipeline's ability to aggregate high-velocity historical data into finalized "Gold" reporting layers.

Pipeline Scale

Processing metrics for the 2025 dataset (Jan 1 - Dec 19, 2025)

Total Volume Analyzed: 1.99 billion contracts
Trading Days: 235 days
Aggregated Rows: 2,960 rows
Tickers Tracked: 15 tickers

SQL Logic Snapshot

Mag 7 Momentum Snapshot

Bullish/Bearish Flow Signals vs. Price

Macro Gravity Snapshot

SPY Correlation & Put/Call Regimes

Chaos Engines: Speculation Map Snapshot

Risk Appetite (DTE vs. Moneyness vs. Volume)

This scatter plot visualizes the "Gamma Casino." The X-axis tracks days to expiration (0 means the contract expires today), while the Y-axis measures aggression (how far "Out of the Money" a bet is). Bubble size reflects volume, so the massive bubbles hugging the left axis reveal the market's addiction to short-term speculation over long-term investment.
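As a rough sketch of how the chart's coordinates could be derived from a contract row: the field names (`option_type`, `strike`, `spot`, `expiry`, `volume`) and the function names are hypothetical, and the out-of-the-money formula is a common convention rather than the pipeline's confirmed definition.

```python
from datetime import date

def otm_distance(option_type, strike, spot):
    """Fractional distance out of the money; negative means in the money."""
    if option_type == "call":
        return (strike - spot) / spot
    if option_type == "put":
        return (spot - strike) / spot
    raise ValueError(f"unknown option type: {option_type!r}")

def scatter_point(contract, as_of):
    """Map one contract row (hypothetical schema) to x, y, and bubble size."""
    return {
        "x_dte": (contract["expiry"] - as_of).days,  # days to expiration
        "y_otm": otm_distance(
            contract["option_type"], contract["strike"], contract["spot"]
        ),
        "size": contract["volume"],
    }

# Illustrative row only, not pipeline data.
example = {
    "option_type": "call",
    "strike": 110.0,
    "spot": 100.0,
    "expiry": date(2025, 3, 7),
    "volume": 25000,
}
point = scatter_point(example, as_of=date(2025, 3, 3))
```

A same-day expiry yields `x_dte == 0`, which is what places a bubble on the left axis.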

Top 25 Options Contracts Snapshot

Highest volume contracts across Mag 7, Macro, and Chaos tickers


Pipeline Discoveries: 2025 Year in Review

These insights are automated outputs from the pipeline's "Silver" and "Gold" dbt layers, demonstrating how raw market data is transformed into narrative signals:

Logic: Cross-asset correlation & Regime detection (Z-Score > 2.5)
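The regime-detection rule above can be sketched as a trailing z-score check. This is an illustration under assumptions, not the pipeline's actual dbt logic: the function name `flag_regimes`, the 20-observation window, and the sample series are hypothetical; only the 2.5 threshold comes from the text.

```python
from statistics import mean, stdev

def flag_regimes(values, window=20, threshold=2.5):
    """Label each value by its z-score against the preceding `window` values.

    Returns "n/a" until a full window of history exists, then
    "high", "low", or "neutral".
    """
    labels = []
    for i, value in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < window:
            labels.append("n/a")
            continue
        sigma = stdev(history)
        if sigma == 0:
            # A flat history has no meaningful z-score.
            labels.append("neutral")
            continue
        z = (value - mean(history)) / sigma
        if z > threshold:
            labels.append("high")
        elif z < -threshold:
            labels.append("low")
        else:
            labels.append("neutral")
    return labels

# Illustrative put/call ratios: a quiet stretch, one spike, then a normal day.
series = [1.0, 1.1] * 10 + [2.0, 1.05]
labels = flag_regimes(series)
```

Using only preceding observations keeps the flag free of look-ahead, which matters if the same signal is later used for backtesting.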

1. Mag 7 Fracture: The Basket Trade is Dead

  • Insight: Returns varied widely across the cohort. GOOGL (+62%) and NVDA (+31%) did the heavy lifting, while AMZN (+3%) effectively flatlined. The pipeline's momentum logic flagged META's lagging momentum early, marking "Bearish Divergence" on 112 separate trading days.

2. The "Wall of Worry" Rally

  • Insight: The sentiment model flagged 105 days of "Fear" (High Put/Call Ratios) for SPY, versus only 1 day of "Greed." Despite this persistent hedging activity, the S&P 500 rallied +16.4%. The data portrays a market that climbed higher even as investors remained defensive.

3. Structural Shift: The 7-Day Horizon

  • Insight: Long-dated investment is being replaced by short-dated speculation. 61.3% of top-tier options volume expired in less than 7 days. This "Gamma dominance" requires pipelines capable of ingesting and aggregating millions of ephemeral contract rows daily, which is the workload this platform is built to handle.
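A share-of-volume figure like the one above could be computed along these lines. This is a minimal sketch: the row schema (`trade_date`, `expiry`, `volume`) and the sample rows are hypothetical and are not pipeline data.

```python
from datetime import date

def short_dated_share(contracts, max_dte=7):
    """Fraction of total volume in contracts expiring in fewer than max_dte days."""
    total = 0
    short = 0
    for c in contracts:
        dte = (c["expiry"] - c["trade_date"]).days
        total += c["volume"]
        if dte < max_dte:
            short += c["volume"]
    return short / total if total else 0.0

# Illustrative rows only.
sample = [
    {"trade_date": date(2025, 3, 3), "expiry": date(2025, 3, 3), "volume": 600},   # 0 DTE
    {"trade_date": date(2025, 3, 3), "expiry": date(2025, 3, 7), "volume": 150},   # 4 DTE
    {"trade_date": date(2025, 3, 3), "expiry": date(2025, 3, 10), "volume": 50},   # 7 DTE, excluded
    {"trade_date": date(2025, 3, 3), "expiry": date(2025, 6, 20), "volume": 200},  # long-dated
]
share = short_dated_share(sample)
```

The strict `<` comparison mirrors the "less than 7 days" wording; a contract with exactly 7 days left is counted as longer-dated.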

Disclaimer: None of this is to be taken as financial advice. All data and signals presented are strictly for technical demonstration of data pipeline capabilities.

Architecture

Infrastructure as Code & Compliance

The platform runs on a hardened Airflow deployment on Amazon Managed Workflows for Apache Airflow (MWAA). It reaches ECR, CloudWatch Logs, KMS, Secrets Manager, and S3 exclusively through VPC endpoints, so data never traverses the public internet.
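One common way to enforce the "VPC endpoints only" rule on the S3 side is a bucket policy that denies any request not arriving through a specific endpoint, using the `aws:SourceVpce` condition key. The sketch below only builds the policy document; the bucket name and endpoint ID are placeholders, and whether this platform uses this exact policy is an assumption.

```python
import json

def vpce_only_bucket_policy(bucket_name, vpc_endpoint_id):
    """Build an S3 bucket policy denying requests that bypass the given VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
                # Deny applies whenever the request did not come via this endpoint.
                "Condition": {
                    "StringNotEquals": {"aws:SourceVpce": vpc_endpoint_id}
                },
            }
        ],
    }

# Placeholder identifiers for illustration.
policy = vpce_only_bucket_policy("example-market-data-raw", "vpce-0123456789abcdef0")
policy_json = json.dumps(policy, indent=2)
```

A blanket deny like this also blocks console and out-of-VPC administrative access to the bucket, so in practice it is usually paired with a deliberate break-glass path.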

Data & Analytics

AI-Ready Foundation

Reliable ML models start with clean data. This pipeline's rigorous "Silver" and "Gold" modeling layers in dbt provide structured, history-preserving datasets that are ready for predictive modeling and strategy backtesting without further cleanup.

Interested in something similar?

If these examples look close to what you need in your own stack, the next step is a short discovery call.