Personal Work

Projects

Side projects and personal engineering work built outside of professional engagements.

QGI — Quantitative Geopolitical Intelligence

Active Development

A serverless AWS data intelligence engine that surfaces hidden economic and geopolitical relationships between nations.

QGI ingests decades of World Bank indicator data — GDP, military spending, trade, population — and computes Single Dyadic Correlating Indices (SDCIs): time-shifted cross-correlations between every country pair for every indicator. SDCIs are then synthesised into Patterns (multi-faceted high-confidence relationships), Influence Cascades (directed propagation paths), and Relationship Clusters (non-obvious geopolitical blocs). The end goal is QGI Macroscope — a strategic intelligence dashboard for analysts.
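As a rough illustration of what one SDCI might involve, the sketch below computes a time-shifted cross-correlation between two yearly series with an FFT. It assumes NumPy, z-score normalisation, a symmetric lag window, and selection by largest absolute correlation. None of these choices is stated above, and the function name is hypothetical.

```python
import numpy as np


def strongest_lagged_correlation(x, y, max_lag):
    """Return (lag, correlation) for the strongest correlation within +/- max_lag.

    A positive lag k means y follows x by k steps (y[t + k] tracks x[t]).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D series of equal length")
    n = x.size
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must be in [0, n)")
    if x.std() == 0 or y.std() == 0:
        raise ValueError("constant series have no defined correlation")

    # Standardise so that the lag-0 value equals the Pearson correlation.
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()

    # Zero-pad to 2n so the circular FFT product yields a linear correlation.
    size = 2 * n
    spectrum = np.conj(np.fft.rfft(x, size)) * np.fft.rfft(y, size)
    cc = np.fft.irfft(spectrum, size) / n

    # cc[k] holds lag k; negative lags wrap around to the end of the array.
    lags = np.arange(-max_lag, max_lag + 1)
    values = cc[lags]
    best = int(np.argmax(np.abs(values)))
    return int(lags[best]), float(values[best])
```

Comparing a series with a copy of itself delayed by three steps should peak at lag 3. Dividing by n gives the biased estimate, which deliberately shrinks correlations at long lags where few observations overlap.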

Step Functions Orchestrator — State Machine Flow

0 · Hygiene → 1 · Ingest → 2 · Compute → 3 · Discovery → 4 · Synthesis → 5 · Ops
195+ Countries × 20 Indicators = All Pairs FFT Cross-Correlated · Graviton Spot · Cost-Optimised · Serverless Orchestration
▶ Start
Lambda  ·  Phase 0 — Hygiene
Janitor
Removes stale intermediate data before each run.
Phase 1 — Ingest
Lambda
Sheet Fetcher
Pulls ~20 World Bank indicators → gzipped CSV → S3.
Lambda
Dispatcher A
Fans out to Pipeline A workers, per indicator.
Wait  ·  21 min (hard)
Workers Download & Hash
Pipeline A workers download and hash indicator data from S3.
Lambda
Reporter
Tallies which indicators were updated; emits updated_ids.
Updates Found?
↓ yes → Phase 2 · none → skip to Phase 5
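The hash-and-tally step behind this branch could look roughly like the sketch below. It assumes SHA-256 content hashes and a stored map of previous hashes per indicator. Only the `updated_ids` output name comes from the flow above; the function names and the `updates_found` flag are hypothetical.

```python
import hashlib


def content_hash(payload: bytes) -> str:
    """Hash the raw indicator file so unchanged data can be detected cheaply."""
    return hashlib.sha256(payload).hexdigest()


def find_updated_ids(previous_hashes, current_payloads):
    """Compare fresh payloads with stored hashes and list the indicators that changed.

    previous_hashes:  {indicator_id: hex digest from the last run}
    current_payloads: {indicator_id: bytes downloaded in this run}
    """
    updated = []
    for indicator_id in sorted(current_payloads):
        new_hash = content_hash(current_payloads[indicator_id])
        # A missing previous hash counts as an update (first time seen).
        if previous_hashes.get(indicator_id) != new_hash:
            updated.append(indicator_id)
    return {"updated_ids": updated, "updates_found": bool(updated)}
```

A Choice state can then branch on whether the list is empty, which is how a run with no new data can skip directly to the ops report.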
Phase 2 — Compute · SDCI Engine
Lambda  ·  AWS Batch
Batch Orchestrator (Trigger B)
Submits FFT cross-correlation jobs across all country pairs to Graviton Spot instances.
↻ Polling Loop — every 30 min until Batch completes
Wait  ·  30 min (poll)
Poll Interval
Lambda
Batch Checker
Counts still-running Batch jobs.
Batch Done?
↑ jobs_running > 0 → repeat · ↓ all done → proceed
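A checker of this kind might be sketched as follows. It assumes that "still running" means any non-terminal AWS Batch status and that the client is a boto3 Batch client (for example `boto3.client("batch")`), passed in so the function can be exercised without AWS access. The function name and queue name are hypothetical; `jobs_running` is the key the loop above tests.

```python
# Non-terminal AWS Batch job statuses; SUCCEEDED and FAILED are terminal.
ACTIVE_STATUSES = ("SUBMITTED", "PENDING", "RUNNABLE", "STARTING", "RUNNING")


def count_active_jobs(batch_client, job_queue):
    """Count jobs in the queue that have not yet reached a terminal status."""
    total = 0
    for status in ACTIVE_STATUSES:
        kwargs = {"jobQueue": job_queue, "jobStatus": status}
        while True:
            page = batch_client.list_jobs(**kwargs)
            total += len(page["jobSummaryList"])
            token = page.get("nextToken")
            if not token:
                break
            # Follow pagination until the service stops returning a token.
            kwargs["nextToken"] = token
    return {"jobs_running": total}
```

Counting only `RUNNING` jobs would let the loop exit while jobs are still queued behind Spot capacity, which is why the sketch includes the earlier statuses as well.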
Phase 3 — Discovery · Glue Crawler
AWS Glue SDK
Start Crawler
Launches qgi-scdis-partition-finder to auto-discover new Parquet partitions written by Phase 2.
↻ Polling Loop — every 5 min until crawler is READY
Wait  ·  5 min (poll)
Poll Interval
AWS Glue SDK
Check Crawler
Reads crawler state from Glue API.
Crawler Ready?
↑ State ≠ READY → repeat · ↓ READY → proceed
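In Amazon States Language, a start, wait, check, and branch loop like this one can be expressed with the AWS SDK service integrations. The fragment below builds such a definition as a Python dict. The state names and the `$.crawler` result path are hypothetical, and the actual state machine may be structured differently.

```python
import json

CRAWLER_NAME = "qgi-scdis-partition-finder"  # crawler name as given in the flow above


def crawler_poll_states(next_state, wait_seconds=300):
    """Build the States entries for a start / wait / check / branch crawler loop."""
    return {
        "StartCrawler": {
            "Type": "Task",
            "Resource": "arn:aws:states:::aws-sdk:glue:startCrawler",
            "Parameters": {"Name": CRAWLER_NAME},
            "ResultPath": None,  # serialised as null: discard the result, keep the input
            "Next": "WaitForCrawler",
        },
        "WaitForCrawler": {
            "Type": "Wait",
            "Seconds": wait_seconds,
            "Next": "CheckCrawler",
        },
        "CheckCrawler": {
            "Type": "Task",
            "Resource": "arn:aws:states:::aws-sdk:glue:getCrawler",
            "Parameters": {"Name": CRAWLER_NAME},
            "ResultPath": "$.crawler",
            "Next": "CrawlerReady",
        },
        "CrawlerReady": {
            "Type": "Choice",
            "Choices": [
                {
                    "Variable": "$.crawler.Crawler.State",
                    "StringEquals": "READY",
                    "Next": next_state,
                }
            ],
            # Any other state (RUNNING, STOPPING) loops back to the wait.
            "Default": "WaitForCrawler",
        },
    }


if __name__ == "__main__":
    print(json.dumps(crawler_poll_states("DispatcherC"), indent=2))
```

Because the Wait state is handled by Step Functions itself, nothing runs between polls.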
Phase 4 — Synthesis · Pattern Builder
Lambda  ·  SQS
Dispatcher C
Fans out Pattern Builder workers via SQS to aggregate SDCIs into the Patterns dataset.
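An SQS fan-out of this kind could be sketched as below. It assumes a boto3 SQS client passed in by the caller and one JSON message per unit of work. The message shape, function names, and queue URL are hypothetical. The batch size of 10 is the SQS limit for `send_message_batch`.

```python
import json


def build_sqs_batches(work_items, batch_size=10):
    """Split work items into SQS batch entries (at most 10 entries per request)."""
    batches = []
    for start in range(0, len(work_items), batch_size):
        chunk = work_items[start:start + batch_size]
        batches.append(
            [
                # Ids only need to be unique within a single batch request.
                {"Id": str(start + offset), "MessageBody": json.dumps(item)}
                for offset, item in enumerate(chunk)
            ]
        )
    return batches


def dispatch(sqs_client, queue_url, work_items):
    """Send every work item to the queue and report how many entries failed."""
    failed = []
    for entries in build_sqs_batches(work_items):
        response = sqs_client.send_message_batch(QueueUrl=queue_url, Entries=entries)
        failed.extend(response.get("Failed", []))
    return {"sent": len(work_items) - len(failed), "failed": len(failed)}
```

Entries reported under `Failed` are not retried here; a production dispatcher would likely retry them or surface them in the ops report.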
Wait  ·  60 min (hard)
Workers Calculate Correlations
Pipeline C workers compute and write the Patterns dataset to S3.
Amazon Athena
Repair Table
MSCK REPAIR TABLE patterns — registers new Parquet partitions for querying.
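If the repair is issued through the Athena API, the request might be assembled as in this sketch. The database name and results location are hypothetical placeholders; the keys match boto3's `start_query_execution` parameters.

```python
def repair_table_request(table, database, output_location):
    """Build keyword arguments for an Athena MSCK REPAIR TABLE query."""
    # Guard against injecting arbitrary SQL through the table name.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    return {
        "QueryString": f"MSCK REPAIR TABLE {table}",
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


# Usage, assuming an existing boto3 Athena client:
#   athena_client.start_query_execution(**repair_table_request(
#       "patterns", "hypothetical_db", "s3://hypothetical-bucket/athena-results/"))
```

MSCK REPAIR TABLE only discovers partitions stored under Hive-style `key=value` paths, so this step relies on the workers writing Parquet in that layout.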
Lambda  ·  Phase 5 — Ops Report
DLQ Scanner
Surfaces failed messages and unprocessed items, then produces the final health report. Also the landing point when Phase 1 finds no updates.
✓ Workflow Complete
⏱ Min runtime ~2h+ (Batch-dependent)  ·  Step Functions waits asynchronously, so no compute sits idle during waits
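The ~2h floor can be checked by adding up the waits in the flow above. The sketch assumes each polling loop waits before its first check (as the diagram order suggests) and ignores Lambda execution time; the function name is hypothetical.

```python
HARD_WAITS_MIN = {"phase_1_workers": 21, "phase_4_workers": 60}
POLL_INTERVALS_MIN = {"batch": 30, "crawler": 5}


def min_runtime_minutes(batch_polls=1, crawler_polls=1):
    """Total wait time for a full run, given how many times each loop polls."""
    return (
        sum(HARD_WAITS_MIN.values())
        + POLL_INTERVALS_MIN["batch"] * batch_polls
        + POLL_INTERVALS_MIN["crawler"] * crawler_polls
    )


print(min_runtime_minutes())               # 116 minutes when both loops exit on the first check
print(min_runtime_minutes(batch_polls=3))  # 176 minutes if Batch needs three poll intervals
```

Each extra Batch poll adds 30 minutes, which is why the total depends mostly on how long the Spot jobs take.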
Python · AWS Step Functions · AWS Lambda · AWS Batch · Graviton Spot · AWS Glue · Amazon Athena · Amazon S3 · SQS · EventBridge · Docker · Parquet · FFT Cross-Correlation