Notes
Notes on alerts, privacy, and on-call.
2026-05-23 · engineering / rag / llm / incident-management / postgres
When to skip the LLM: reusing root cause from similar resolved incidents
RAG over your own resolved incidents to ground (and sometimes entirely replace) the LLM root-cause call, the strict gate that decides when reuse is safe, and the shadow-validation that keeps it honest.
2026-05-13 · security / engineering / cloudflare-workers / regex
How we built a no-ReDoS customer regex tokenizer in Cloudflare Workers
A pure-JS RE2 port, a 60s per-isolate cache with stampede guard, a Web Worker test gate, and the three production bugs we caught in the first 24 hours.
2026-05-13 · engineering / postgres / soc2 / compliance / pgaudit
30-minute SOC 2 audit log: pgaudit + a deny-all RLS table
Satisfy SOC 2 audit-log requirements with one Postgres extension, one table with deny-all RLS, and one security-definer write function. No third-party logging vendor needed — and the auditor will be happier because the logs live in the same database, with the same RLS, as the data they describe.
2026-05-13 · engineering / pgvector / postgres / alert-correlation
pgvector for alert correlation: how a 1536-dim embedding ate our incident clustering
How to ship vector-based alert correlation in a few hundred lines of SQL plus one embedding API call per event — the query, the tuning knobs, the failure mode, and the line where pgvector stops being enough.
2026-05-09 · aiops / alert-correlation / incident-management / postgres
Two SQL primitives for when alert clustering gets it wrong
Why every alert clustering system needs a manual override, the two Postgres functions that implement split and merge with a full audit trail, and the race condition we found when shipping it.
2026-05-08 · llm / anthropic / prompt-caching / cost-engineering
Anthropic prompt caching cut our RCA cost by 90%
What actually goes in the cached segment, the two-segment trick that lets per-tenant context cache too, and what caching changes on Haiku 4.5.
2026-05-08 · on-call / alert-fatigue / observability / correlation
From 1,000 alerts to 10 incidents
Turning a thousand noisy webhooks into ten real incidents, without throwing away the signal that lives in the noise. Alert correlation, the four hard parts.
2026-05-08 · hipaa / security / observability / compliance
A HIPAA checklist for alert pipelines (8 controls)
Where PHI ends up in monitoring alerts, what HIPAA's Technical Safeguards actually require, and an 8-item checklist for keeping the alert path compliant.
2026-05-06 · pii / observability / tokenization / compliance
How to keep PII out of your alert pipeline
The four hard parts of edge tokenization for observability, and why the obvious shortcuts (strip-on-write, drop-on-detect, redact-in-prompt) all leak.
2026-05-05 · pii / regex / engineering / tokenization
6 regexes for detecting PII in event payloads
The regex set we run in production for tokenizing inbound alerts, with per-pattern false-negative cases and a structural fallback for what regex misses.
build d9b5312 · updated 2026-06-09 · no trackers · no analytics · no third-party scripts