To build resilient platforms with real constraints

> GitOps that actually merges

> Terraform with DRY, not 3000 lines of variables

> CI/CD pipelines with sanity checks and rollbacks

> Kafka retries that don’t silently drop

> PostgreSQL that survives autovacuum hell

> Kernel params that matter: `dirty_ratio`, `swappiness`, `fs.inotify.max_user_watches`

> Cron replacement with failure alerts

> IAM policies that don’t blow up staging at 3 AM

> AWS with Nitro, not EC2 defaults

> Kubernetes with sane CNI and predictable upgrades

> Observability with real SLOs and distributed tracing

> Go for tools, C++ for perf, Python for glue

> PostgreSQL tuned per workload — not per tutorial

> NixOS for immutability and reproducibility

> Infrastructure where root cause actually debugs

> Security with just enough policy — not just audit checkboxes

> Design for rollback

> Never debug blind

> Metrics are for humans, logs are for grep

> No “eventually consistent” without a deadline

> More layers = more tracing

> Immutable where it matters, stateful where it must

> If it can’t be tested, it will be tested in prod

> Don’t trust managed services without knowing their failure mode