Enterprise-Grade Data Engineering. Calm, Reliable, & Scalable.
I take data environments from zero to production. Whether building from scratch or untangling code, I deliver calm, audit-ready pipelines on a modern stack.
Architecture
Private-VPC Data Plane
This reference architecture demonstrates a fully private data plane. It ingests high-velocity market data into a hardened Virtual Private Cloud (VPC), orchestrates transformations via Airflow (MWAA), and delivers audit-ready datasets in Snowflake. Zero public exposure, 100% automated.
Architecture and security posture
The diagram below illustrates the complete lineage: Private VPC endpoints, KMS encryption, and least-privilege IAM roles orchestrating the flow from API to Analytics.
End-to-End Lineage: Orchestrating the flow from raw market APIs into S3, transforming via dbt Core, and serving analytics-ready tables in Snowflake.
Data & Analytics
Flagship Pipeline Output
2025 Year in Review Snapshot
The dashboards below are not live feeds. They render a static dataset covering Jan 1 - Dec 19, 2025. This demonstrates the pipeline's ability to aggregate high-velocity historical data into finalized "Gold" reporting layers.
Pipeline Scale
Processing metrics for the 2025 dataset (Jan 1 - Dec 19)
Total Volume Analyzed
1.99 billion contracts
Trading Days
235 days
Aggregated Rows
2,960 rows
Tickers Tracked
15 tickers
Unified Lineage: All downstream marts originate from a single intermediate join (int_massive__stocks_options_joined), consolidating 6.6 million rows of harmonized data into a consistent source of truth.
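To illustrate the idea of a single intermediate join, here is a minimal Python sketch. The real model is a dbt model whose SQL is not shown on this page, so the column names (ticker, underlying, trade_date, close) and the inner-join semantics are assumptions made for illustration only.

```python
def join_stocks_options(stock_rows, option_rows):
    """Inner-join option rows to the stock bar for the same ticker and date.

    Column names are hypothetical; the actual dbt model may differ.
    """
    # Index stock bars by (ticker, trade_date) for constant-time lookups.
    stocks_by_key = {(s["ticker"], s["trade_date"]): s for s in stock_rows}

    joined = []
    for opt in option_rows:
        stock = stocks_by_key.get((opt["underlying"], opt["trade_date"]))
        if stock is None:
            continue  # an inner join drops options with no matching stock bar
        joined.append({**opt, "underlying_close": stock["close"]})
    return joined


if __name__ == "__main__":
    stocks = [{"ticker": "SPY", "trade_date": "2025-01-02", "close": 100.0}]
    options = [
        {"underlying": "SPY", "trade_date": "2025-01-02", "strike": 105.0, "volume": 10},
        {"underlying": "QQQ", "trade_date": "2025-01-02", "strike": 50.0, "volume": 5},
    ]
    print(join_stocks_options(stocks, options))
```

Because every downstream mart reads from this one joined shape, a fix to the join logic propagates everywhere at once.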
SQL Logic Snapshot
Mag 7 Momentum Snapshot
Bullish/Bearish Flow Signals vs. Price
Macro Gravity Snapshot
SPY Correlation & Put/Call Regimes
Chaos Engines: Speculation Map Snapshot
Risk Appetite (DTE vs. Moneyness vs. Volume)
This scatter plot visualizes the "Gamma Casino." The X-Axis tracks time (0 is today), while the Y-Axis measures aggression (how far "Out of the Money" a bet is).
Massive bubbles hugging the left axis reveal the market's addiction to short-term speculation over long-term investment.
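The two axes can be sketched in a few lines of Python. This is an illustrative calculation, not the pipeline's actual dbt logic: the function names are hypothetical, and measuring "Out of the Money" as a percentage of the underlying price is an assumption.

```python
from datetime import date


def days_to_expiry(as_of: date, expiry: date) -> int:
    """X-axis: calendar days until the contract expires (0 means it expires today)."""
    return (expiry - as_of).days


def otm_pct(option_type: str, strike: float, underlying_price: float) -> float:
    """Y-axis: how far out of the money a contract is, as a percent of spot.

    Positive values are out of the money; negative values are in the money.
    """
    kind = option_type.lower()
    if kind == "call":
        return (strike - underlying_price) / underlying_price * 100.0
    if kind == "put":
        return (underlying_price - strike) / underlying_price * 100.0
    raise ValueError(f"unknown option type: {option_type!r}")


if __name__ == "__main__":
    print(days_to_expiry(date(2025, 12, 15), date(2025, 12, 19)))
    print(otm_pct("call", strike=110.0, underlying_price=100.0))
```

A contract with a small days-to-expiry value and a large positive out-of-the-money percentage would sit in the upper-left corner of the plot.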
Top 25 Options Contracts Snapshot
Highest volume contracts across Mag 7, Macro, and Chaos tickers
Pipeline Discoveries: 2025 Year in Review
These insights are automated outputs from the pipeline's silver/gold dbt layers, demonstrating how raw market data transforms into narrative signals:
Insight: Returns varied wildly across the Mag 7 cohort. GOOGL (+62%) and NVDA (+31%) did the heavy lifting, while AMZN (+3%) effectively flatlined.
The pipeline's momentum logic correctly identified META's lagging strength early, flagging "Bearish Divergence" on 112 separate trading days.
2. The "Wall of Worry" Rally
Insight: The sentiment model flagged 105 days of "Fear" (High Put/Call Ratios) for SPY, versus only 1 day of "Greed."
Despite this persistent hedging activity, the S&P 500 rallied +16.4%. The data portrays a market that climbed higher even as investors remained defensive.
3. Structural Shift: The 7-Day Horizon
Insight: Long-dated investment is being replaced by short-dated speculation. 61.3% of top-tier options volume expired in less than 7 days.
This "Gamma dominance" demands pipelines that can ingest and aggregate millions of ephemeral contract rows daily, and the 2025 dataset shows this platform handles that load.
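Two of the signals above, the Put/Call sentiment regime and the share of volume expiring inside 7 days, can be sketched as follows. This is a simplified Python illustration, not the pipeline's dbt code: the thresholds (1.0 and 0.7), the "Neutral" label, and the field names dte and volume are all assumptions.

```python
def classify_regime(put_volume, call_volume, fear_above=1.0, greed_below=0.7):
    """Label a trading day from its Put/Call volume ratio.

    The thresholds are hypothetical placeholders, not the pipeline's values.
    """
    if call_volume <= 0:
        raise ValueError("call_volume must be positive")
    ratio = put_volume / call_volume
    if ratio > fear_above:
        return "Fear"  # heavy put buying relative to calls
    if ratio < greed_below:
        return "Greed"  # call buying dominates
    return "Neutral"


def short_dated_share(contracts, max_dte=7):
    """Percent of total volume in contracts expiring in fewer than max_dte days."""
    total = sum(c["volume"] for c in contracts)
    if total == 0:
        return 0.0
    short = sum(c["volume"] for c in contracts if c["dte"] < max_dte)
    return short / total * 100.0


if __name__ == "__main__":
    print(classify_regime(put_volume=120, call_volume=100))
    print(short_dated_share([{"dte": 0, "volume": 60}, {"dte": 30, "volume": 40}]))
```

Counting the days labeled "Fear" or "Greed" across a year gives the kind of tally reported in item 2, and running the share calculation over the top contracts gives the kind of figure reported in item 3.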
Disclaimer: None of this is to be taken as financial advice. All data and signals presented are strictly for technical demonstration of data pipeline capabilities.
Architecture
Infrastructure as Code & Compliance
The platform runs on a hardened Airflow deployment inside AWS MWAA. It connects securely to ECR, CloudWatch Logs, KMS, Secrets Manager, and S3 exclusively through VPC endpoints, ensuring data never traverses the public internet.
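One way to keep that guarantee auditable is a small coverage check over the VPC's endpoints. The Python sketch below is an assumption about how such a check might look, not the platform's actual tooling. The service list mirrors only the services named above (ECR uses two endpoints, ecr.api and ecr.dkr, and CloudWatch Logs is logs), the region is a placeholder, and a real MWAA environment may need additional endpoints.

```python
# Services named in the text, as AWS endpoint service suffixes.
REQUIRED_SERVICES = ("ecr.api", "ecr.dkr", "logs", "kms", "secretsmanager", "s3")


def endpoint_service_names(region):
    """Build the full endpoint service names for a region."""
    return [f"com.amazonaws.{region}.{svc}" for svc in REQUIRED_SERVICES]


def missing_endpoints(region, existing_service_names):
    """Return required endpoint services that the VPC does not have yet.

    existing_service_names would come from the EC2 DescribeVpcEndpoints API
    in practice; here it is passed in so the check stays self-contained.
    """
    existing = set(existing_service_names)
    return [name for name in endpoint_service_names(region) if name not in existing]


if __name__ == "__main__":
    found = ["com.amazonaws.us-east-1.s3", "com.amazonaws.us-east-1.kms"]
    for name in missing_endpoints("us-east-1", found):
        print("missing endpoint:", name)
```

Running a check like this in CI, alongside the Infrastructure as Code plan, would flag a missing endpoint before traffic has any reason to look for a public route.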
Data & Analytics
AI-Ready Foundation
Clean ML models start with clean data. This pipeline's rigorous "Silver" and "Gold" modeling layers in dbt provide the structured, history-preserving datasets that predictive modeling and strategy backtesting require, ready to use immediately.
Interested in something similar?
If these examples look close to what you need in your own stack, the next step is a short discovery call.