Govula's data is forensic-grade — audit_log, tenant_operation_log, and operator_audit_events are append-only via Postgres DO INSTEAD NOTHING rules and SHA-256 hash-chained (../audits/phase1-readiness-audit.md §4). Backups must preserve that integrity, and restores must be drilled.
Status:
- Neon PITR: Implemented (managed by Neon; nothing for the operator to install)
- Self-hosted pg_basebackup + WAL archiving: Partial (pattern documented; no shipped automation)
- Restore drill: Aspirational (no automated drill harness; recommended manual cadence below)
RPO / RTO targets
| Tier | RPO (data loss) | RTO (time to restore) | Topology |
|---|---|---|---|
| Standard | 5 min | 30 min | Neon PITR |
| Tighter | 1 min | 15 min | Neon paid + read replica failover |
| Self-hosted | 5–60 min depending on archive cadence | 30–120 min | pg_basebackup + WAL shipping |
1. Neon PITR (canonical Railway+Vercel+Neon path)
Neon takes continuous WAL backups; PITR is a UI operation:
- Neon → project → Branches tab → identify a branch named for the desired timestamp.
- Restore to create a new branch from a specific point in time (millisecond resolution within the retention window — typically 7 days on free, 30 days on paid).
- Update Railway's DATABASE_URL to the restored branch's connection string.
- Redeploy the backend.
- After verification, promote the restored branch as the new primary in Neon.
This is the same flow described in ../rollback-plan.md §"Database rollback".
2. Logical dumps (any topology)
For long-term archive (audit / regulatory):
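# Daily, retain 365 days, store off-host (S3-compatible bucket)
DUMP_FILE="govula-$(date -u +%Y%m%d).dump"
pg_dump --format=custom --no-owner --no-privileges \
  --file="$DUMP_FILE" \
  "$DATABASE_URL_DIRECT"
# Upload only today's file: aws s3 cp takes a single source,
# so a govula-*.dump glob fails once older dumps are present.
aws s3 cp "$DUMP_FILE" s3://govula-archive/ \
  --storage-class STANDARD_IA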
Logical dumps are slow to restore (every table is reloaded and every index rebuilt from scratch) and carry no WAL position, so they cannot be rolled forward to a point in time. Treat them as a complement to PITR, not a replacement.
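Because these archives may sit untouched for months, it can help to record a checksum next to each dump and to confirm the file is a readable archive before it leaves the host. The following is a minimal sketch, assuming `sha256sum` and `pg_restore` are on the PATH; the function names are illustrative and are not part of Govula.

```shell
# Hypothetical helpers; names are illustrative, not shipped with Govula.

# Write <dump>.sha256 next to the dump so the archive can be re-verified later.
record_checksum() (
  cd "$(dirname "$1")" && sha256sum "$(basename "$1")" > "$(basename "$1").sha256"
)

# Re-check a dump against its recorded checksum; non-zero exit on mismatch.
verify_checksum() (
  cd "$(dirname "$1")" && sha256sum -c "$(basename "$1").sha256" > /dev/null
)

# Confirm the file is a readable custom-format archive without restoring it:
# pg_restore --list only reads the archive's table of contents.
verify_archive() {
  pg_restore --list "$1" > /dev/null
}
```

In a daily job, `verify_archive` and `record_checksum` would run on the new dump before the upload step, and the `.sha256` file would be uploaded alongside it so a later restore can call `verify_checksum` first.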
3. Self-hosted: pg_basebackup + WAL archiving
For customer-controlled Postgres without a managed PITR feature:
- Configure WAL archiving in postgresql.conf:
    archive_mode = on
    archive_command = 'aws s3 cp %p s3://govula-wal/%f'
    wal_level = replica
- Take a base backup weekly:
    pg_basebackup -D /backups/$(date -u +%Y%m%d) -F tar -z -P
- Restore procedure (in DR runbook form):
    # 1. Stop Postgres
    systemctl stop postgresql
    # 2. Wipe the data dir (or move it aside)
    mv /var/lib/postgresql/16/main /var/lib/postgresql/16/main.bak
    # 3. Restore base backup into a fresh data dir
    install -d -o postgres -g postgres -m 700 /var/lib/postgresql/16/main
    tar xzf /backups/<date>/base.tar.gz -C /var/lib/postgresql/16/main
    # 4. Configure recovery target time in postgresql.conf:
    #      restore_command = 'aws s3 cp s3://govula-wal/%f %p'
    #      recovery_target_time = '2026-05-10 12:00:00 UTC'
    #    and request recovery mode (required on PostgreSQL 12+)
    touch /var/lib/postgresql/16/main/recovery.signal
    chown -R postgres:postgres /var/lib/postgresql/16/main
    # 5. Start Postgres; it replays WAL up to the target, then pauses
    #    (default recovery_target_action) until promoted
    systemctl start postgresql
This is the same procedure used by the backup sidecar in docker-compose.on-prem.yml (see ../ENTERPRISE-DEPLOYMENT.md §"Backup and Recovery").
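The self-hosted RPO is bounded by how recently a WAL segment reached the archive. PostgreSQL exposes this in the `pg_stat_archiver` view, and the `archive_timeout` setting forces a segment switch on a low-traffic server so that the archive does not go stale. The following is a rough monitoring sketch under those assumptions; the function names and the 5-minute threshold are illustrative, not something Govula ships.

```shell
# Hypothetical monitoring helpers; names and thresholds are illustrative.

# Seconds since the last successfully archived WAL segment, per pg_stat_archiver.
# Prints an empty line if no segment has been archived yet.
archive_lag_seconds() {
  psql "$1" --no-psqlrc --tuples-only --no-align --command \
    "SELECT floor(extract(epoch FROM now() - last_archived_time))::bigint FROM pg_stat_archiver"
}

# Succeeds only if the lag (seconds, $1) is within the RPO target (minutes, $2).
# An empty lag (nothing archived yet) is treated as a failure.
within_rpo() {
  [ -n "$1" ] && [ "$1" -le $(( $2 * 60 )) ]
}
```

A cron job could then run something like `within_rpo "$(archive_lag_seconds "$DATABASE_URL_DIRECT")" 5 || echo "WAL archive is behind the 5 min RPO" >&2`. Note that on an idle database without `archive_timeout` the lag grows even though no data is at risk, so the alert threshold should be chosen with that in mind.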
4. Hash-chain integrity after restore
After ANY restore, run the audit hash-chain replay to confirm no audit row was silently lost:
# This walks audit_log in (created_at ASC, id ASC) order and
# recomputes integrity_hash for every row.
curl -X POST "$BACKEND/api/v1/super-admin/tenants/<tenantId>/operations" \
  -H "Authorization: Bearer $OPERATOR_JWT" \
  -H "Content-Type: application/json" \
  -d '{"action":"force_audit_replay_validation","reason":"post-restore drill"}'
The validation result is itself written to the audit trail; see ../audits/phase1-readiness-audit.md §4 finding F-AU2.
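To see why a replay catches a silently lost row, consider a toy chain in which each row's hash covers the previous row's hash. This is an illustration only: it assumes `hash = SHA-256(previous_hash + payload)`, whereas the real inputs to Govula's integrity_hash are not specified in this document, and the function names are hypothetical.

```shell
# Illustration only: assumes each row's hash is SHA-256(previous_hash + payload).
# Govula's real integrity_hash inputs are not specified in this document.

GENESIS="0000000000000000000000000000000000000000000000000000000000000000"

# Hash one link of the chain: previous hash concatenated with the row payload.
chain_hash() {
  printf '%s%s' "$1" "$2" | sha256sum | cut -d ' ' -f 1
}

# Read "payload<TAB>stored_hash" lines in chain order from stdin.
# Prints the 1-based number of the first row that fails and returns 1,
# or prints nothing and returns 0 if the whole chain verifies.
verify_chain() (
  prev="$GENESIS"
  n=0
  tab=$(printf '\t')
  while IFS="$tab" read -r payload stored; do
    n=$((n + 1))
    expected=$(chain_hash "$prev" "$payload")
    if [ "$expected" != "$stored" ]; then
      echo "$n"
      exit 1
    fi
    prev="$stored"
  done
)
```

If a restore drops one row from the middle, the next surviving row was hashed against a predecessor that is no longer there, so verification fails at that row instead of passing quietly. This is the property the replay validation relies on.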
5. Backup of object-storage assets
Govula optionally archives reports to object storage (Replit Object Storage / S3-compatible). The bucket name is DEFAULT_OBJECT_STORAGE_BUCKET_ID. Backups for this surface depend on your provider:
- Replit Object Storage: managed; no operator action.
- S3: enable Versioning + cross-region Replication.
- Self-hosted MinIO: mc mirror to a second site.
6. Restore drill cadence
Backups you have not restored are not backups. Recommended cadence:
- Quarterly for the canonical Neon path: PITR-restore to a throwaway branch, point a staging backend at it, run the post-deploy verification from ../production-checklist.md.
- Quarterly for self-hosted: full restore on an isolated VM, verify hash chain.
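A drill is more useful when it also records how long the restore took, since that is the only way to learn whether the RTO targets are realistic. Since no drill harness ships today, the following is only a sketch of what a manual drill could wrap around its restore command; `timed_drill` and its output format are hypothetical.

```shell
# Hypothetical drill helper; the function name and log format are illustrative.

# Run a restore command, time it, and compare against an RTO target in minutes.
# Usage: timed_drill <rto_minutes> <command> [args...]
# Exit status: 0 within target, 1 restore failed, 2 restore exceeded target.
timed_drill() (
  rto_minutes=$1
  shift
  start=$(date -u +%s)
  "$@" || { echo "drill FAILED: restore command exited non-zero"; exit 1; }
  elapsed=$(( $(date -u +%s) - start ))
  if [ "$elapsed" -le $(( rto_minutes * 60 )) ]; then
    echo "drill OK: restored in ${elapsed}s (target ${rto_minutes} min)"
  else
    echo "drill SLOW: restored in ${elapsed}s (target ${rto_minutes} min)"
    exit 2
  fi
)
```

For the self-hosted tier this might be invoked as `timed_drill 120 ./restore-to-drill-vm.sh`, where the script name is a placeholder for whatever performs the restore on the isolated VM. Keeping the output line in the drill log gives a quarter-over-quarter record of actual restore times.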
7. What the system does NOT back up
- In-flight HTTP requests — the load balancer drains in-flight requests on instance shutdown but a sudden kill drops them. Idempotent endpoints retry safely.
- In-process entitlement cache — rebuilt on first request; no operator action.
- Sentry breadcrumbs / Pino in-flight logs — Sentry retains for its configured window; Pino logs are only as durable as your log shipper.
Where to read more
- ../rollback-plan.md: code rollback (separate from data restore)
- self-hosted.md: Postgres co-resident vs managed
- scaling.md: replica strategy
- In-app: /docs/deployment/backup-recovery