On-Call Rotation
Sprint I.7 — skeleton. Placeholder until PagerDuty is provisioned.
Who is on-call
Until a second engineer joins the team, Raj is both the primary on-call and the only escalation point. After that, the rotation is weekly, handed over on Mondays at 09:00 Asia/Kolkata.
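Once the rotation starts, the weekly schedule could be computed rather than tracked by hand. The following is a minimal sketch, not existing code: `weekStart`, `onCallFor`, the roster names, and the `epoch` parameter are all hypothetical. It assumes Asia/Kolkata stays at a fixed UTC+05:30 offset with no daylight saving.

```typescript
// Hypothetical sketch: none of these names exist in the repo.
const IST_OFFSET_MS = (5 * 60 + 30) * 60 * 1000; // Asia/Kolkata = UTC+05:30, no DST
const HANDOFF_MS = 9 * 60 * 60 * 1000; // handoff at 09:00 local time
const DAY_MS = 24 * 60 * 60 * 1000;
const WEEK_MS = 7 * DAY_MS;

// Returns the most recent Monday 09:00 Asia/Kolkata at or before `at`.
function weekStart(at: Date): Date {
  // Shift the clock so that Monday 09:00 IST reads as Monday 00:00 in UTC fields.
  const shifted = new Date(at.getTime() + IST_OFFSET_MS - HANDOFF_MS);
  const daysSinceMonday = (shifted.getUTCDay() + 6) % 7; // Mon = 0 ... Sun = 6
  const midnight = Date.UTC(
    shifted.getUTCFullYear(),
    shifted.getUTCMonth(),
    shifted.getUTCDate(),
  );
  // Undo the shift to get back to a real instant.
  return new Date(midnight - daysSinceMonday * DAY_MS - IST_OFFSET_MS + HANDOFF_MS);
}

// Picks the on-call engineer for `at`, cycling through `roster` one week at a time.
// `epoch` is any instant inside the week in which roster[0] is on call.
function onCallFor(at: Date, roster: string[], epoch: Date): string {
  if (roster.length === 0) throw new Error("roster is empty");
  const weeks = Math.round(
    (weekStart(at).getTime() - weekStart(epoch).getTime()) / WEEK_MS,
  );
  const index = ((weeks % roster.length) + roster.length) % roster.length;
  return roster[index];
}
```

With a single-entry roster this always returns Raj, which matches the current state; adding a second name is the only change needed when the rotation begins.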
Responsibilities
- Acknowledge pages within 5 minutes during on-call hours.
- Own the triage → mitigate → resolve flow in incident-response.md.
- Update `apps/web/src/lib/incidents.ts` for user-visible incidents.
- File the postmortem within 72 hours.
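The two time limits above can be written down as simple deadline arithmetic. This is an illustrative sketch only: the function names are hypothetical, and it assumes the 72-hour postmortem window starts when the incident is resolved, which this document does not specify.

```typescript
// Hypothetical helpers; not part of the repo.
const ACK_WINDOW_MS = 5 * 60 * 1000; // acknowledge within 5 minutes
const POSTMORTEM_WINDOW_MS = 72 * 60 * 60 * 1000; // postmortem within 72 hours

// Latest instant at which a page can still be acknowledged on time.
function ackDeadline(pagedAt: Date): Date {
  return new Date(pagedAt.getTime() + ACK_WINDOW_MS);
}

// True if the acknowledgement landed at or before the deadline.
function ackedInTime(pagedAt: Date, ackedAt: Date): boolean {
  return ackedAt.getTime() <= ackDeadline(pagedAt).getTime();
}

// Assumption: the clock starts at resolution time.
function postmortemDue(resolvedAt: Date): Date {
  return new Date(resolvedAt.getTime() + POSTMORTEM_WINDOW_MS);
}
```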
What on-call is NOT
- Not a product channel. Feature requests route to GitHub issues.
- Not a customer success role. Refunds, account questions, and API-key provisioning all go through `support@gohatch.fun`.
- Not infinite-hour. If a page wakes you up, the next-day work schedule slides; document the trade in your handoff.
Handoff
Every Monday at 09:00 Asia/Kolkata, the outgoing on-call engineer writes a handoff note covering:
- Any open degradations or silenced alerts.
- Third-party deps with known flapping (e.g. Bitquery/GoPlus at elevated latency, Anthropic 529s during US peak).
- Recent deploys that should stay watched this week.
- Any user-facing communications that were promised but not yet sent.
Handoff is a message in #hatch-oncall — not a doc. If it's complex
enough to need a doc, it's an incident.
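Since the handoff is a chat message with four fixed topics, a small formatter can keep the notes consistent from week to week. The sketch below is hypothetical (the `HandoffNote` type and `renderHandoff` function are not in the repo) and only produces plain text to paste into the channel.

```typescript
// Hypothetical shape for the weekly handoff note; field names are illustrative.
interface HandoffNote {
  openDegradations: string[]; // open degradations or silenced alerts
  flappingDeps: string[]; // third-party deps with known flapping
  watchedDeploys: string[]; // recent deploys to keep an eye on
  pendingComms: string[]; // user-facing comms promised but not yet sent
}

// Renders one titled section; an empty list is stated explicitly as "None".
function section(title: string, items: string[]): string {
  const body =
    items.length > 0 ? items.map((item) => `- ${item}`).join("\n") : "- None";
  return `${title}\n${body}`;
}

// Builds the full message, always in the same order as the checklist.
function renderHandoff(note: HandoffNote): string {
  return [
    section("Open degradations / silenced alerts", note.openDegradations),
    section("Flapping third-party deps", note.flappingDeps),
    section("Deploys to watch", note.watchedDeploys),
    section("Pending user-facing comms", note.pendingComms),
  ].join("\n\n");
}
```

Printing "None" for an empty section is a deliberate choice: it shows the outgoing engineer checked that topic rather than skipped it.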
Escalation
| Situation | Who |
|---|---|
| Data loss suspected | Raj + immediate DB freeze |
| Security disclosure | Raj + legal (see I.5) |
| Outage > 2h and no clear root cause | Raj |
| Supabase / Vercel / Railway billing issue | Raj (ops owner) |
| Contract-level exploit (post-mainnet) | Auditor on file + multisig |
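The table above could also live as a typed lookup so that tooling, such as a bot command, stays in sync with it. This is a sketch under assumptions: the `Situation` keys, the `ESCALATION` map, and the `isLongOutage` helper are hypothetical names, and the values simply restate the table.

```typescript
// Hypothetical keys for the situations in the escalation table.
type Situation =
  | "data-loss"
  | "security-disclosure"
  | "long-outage"
  | "billing"
  | "contract-exploit";

interface Escalation {
  contacts: string[];
  action?: string; // extra step or pointer to take alongside contacting people
}

const ESCALATION: Record<Situation, Escalation> = {
  "data-loss": { contacts: ["Raj"], action: "immediate DB freeze" },
  "security-disclosure": { contacts: ["Raj", "legal"], action: "see I.5" },
  "long-outage": { contacts: ["Raj"] },
  billing: { contacts: ["Raj (ops owner)"] },
  "contract-exploit": { contacts: ["Auditor on file", "multisig"] },
};

const LONG_OUTAGE_MS = 2 * 60 * 60 * 1000;

// The "Outage > 2h and no clear root cause" row, expressed as a predicate.
function isLongOutage(startedAt: Date, now: Date, rootCauseKnown: boolean): boolean {
  return !rootCauseKnown && now.getTime() - startedAt.getTime() > LONG_OUTAGE_MS;
}
```

Using `Record<Situation, Escalation>` means the compiler flags any situation added to the union without a matching escalation entry.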
Contact paths (placeholder)
- Telegram: `#hatch-oncall` (private), the primary channel.
- Email: `oncall@gohatch.fun`, which forwards to Raj until rotation starts.
- Phone: added when PagerDuty is provisioned.