fix(deploy): self-healing pre-migrate bootstrap for SecretBackend rollout
Some checks failed
Why: clusters upgrading from the pre-SecretBackend schema crash-loop on the first rollout. `prisma db push` applies the Phase 0 migration as three sequential steps (add the Secret.backendId column with default '', create the SecretBackend table, add the FK), and the FK fails because the empty-string values reference no row in the empty SecretBackend table. This happened on the live cluster today; I fixed it by hand with psql. This PR makes the fix automatic so a fresh cluster or anyone replaying the migration doesn't hit the same trap.

- New `src/db/src/scripts/pre-migrate-bootstrap.ts`: idempotent node script. Checks if the SecretBackend table exists; if so, ensures a default row exists (INSERT with ON CONFLICT DO NOTHING), then backfills any Secret.backendId = '' to point at it. Uses Prisma raw queries so it runs against a partially-migrated schema.
- `deploy/entrypoint.sh` now catches a failed first push, runs the bootstrap, and retries. Fresh installs and fully-migrated clusters take the happy path (one push, no bootstrap needed). Pre-Phase-0 upgrades take the healing path (push fails → bootstrap seeds → retry succeeds).
- The bootstrap is deliberately non-fatal: even on unexpected errors it logs and exits 0 so the retry still runs. If that retry also fails, the push error surfaces normally and the pod crash-loops visibly rather than silently starting in a half-migrated state.

Verified the idempotent path logically: on the already-bootstrapped cluster (1 backend row, 0 empty-backendId Secrets), the script's default-row lookup finds the existing row, so the INSERT is skipped (and would hit ON CONFLICT DO NOTHING if it ran), and the UPDATE matches zero rows. Pure no-op.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
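The push, bootstrap, retry flow described for `deploy/entrypoint.sh` can be modelled roughly as below. This is a hypothetical TypeScript sketch, not the actual shell script (which is not shown in this diff): `migrateWithHealing`, `Step`, and `DeployResult` are illustrative names, and the two steps are injected as plain functions that return process-style exit codes.

```typescript
// Hypothetical model of the push -> bootstrap -> retry flow.
// Not the real entrypoint.sh; all names here are assumptions for illustration.
type Step = () => number; // process-style exit code, 0 = success

interface DeployResult {
  exitCode: number;
  path: 'happy' | 'healed' | 'failed';
}

function migrateWithHealing(dbPush: Step, bootstrap: Step): DeployResult {
  // First attempt: fresh installs and fully-migrated clusters succeed here.
  if (dbPush() === 0) {
    return { exitCode: 0, path: 'happy' };
  }
  // The bootstrap is non-fatal, so its exit code is deliberately ignored.
  bootstrap();
  // Retry once; if this also fails, surface the push error to the caller.
  const retry = dbPush();
  return retry === 0
    ? { exitCode: 0, path: 'healed' }
    : { exitCode: retry, path: 'failed' };
}

// Simulated pre-Phase-0 cluster: the push fails until the bootstrap has seeded.
let seeded = false;
const healed = migrateWithHealing(
  () => (seeded ? 0 : 1),
  () => {
    seeded = true;
    return 0;
  },
);
console.log(healed.path); // healed
```

Returning the retry's own exit code on the failure branch keeps the behaviour the message describes: a second failure is visible to the caller rather than swallowed.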
105
src/db/src/scripts/pre-migrate-bootstrap.ts
Normal file
@@ -0,0 +1,105 @@
/**
 * Self-healing pre-migration step for the SecretBackend rollout (Phase 0).
 *
 * Why this exists: `prisma db push` applies schema changes sequentially. When
 * a cluster upgrades from a pre-SecretBackend DB:
 *   1. `Secret.backendId` column is added with `DEFAULT ''`
 *   2. `SecretBackend` table is created (empty)
 *   3. The FK `Secret.backendId → SecretBackend.id` is added — and FAILS
 *      because every Secret row now has `backendId = ''` which references no
 *      row in SecretBackend.
 *
 * This script runs AFTER a failed `prisma db push` attempt:
 *   - If SecretBackend table doesn't exist yet → noop (fresh install case;
 *     db push will create everything and the FK succeeds because there are
 *     no Secret rows to violate it).
 *   - If SecretBackend exists but is empty → insert a default plaintext row.
 *   - If any Secret rows have `backendId = ''` → point them at the default.
 *
 * Idempotent: safe to run multiple times. No-op on a fully-migrated cluster.
 * Never throws; logs and exits 0 even on errors so the subsequent
 * `prisma db push` retry is still attempted.
 */
import { PrismaClient, Prisma } from '@prisma/client';

const DEFAULT_ID = 'cdefault000backend00000001';

async function main(): Promise<void> {
  const prisma = new PrismaClient();
  try {
    // Does the SecretBackend table exist yet? We check by querying the
    // information_schema rather than catching Prisma's error — cleaner, and
    // lets us distinguish "table missing" from "query succeeded but empty".
    const tableExists = await prisma.$queryRaw<Array<{ exists: boolean }>>`
      SELECT EXISTS (
        SELECT 1 FROM information_schema.tables
        WHERE table_schema = 'public' AND table_name = 'SecretBackend'
      ) AS exists
    `;
    if (!tableExists[0]?.exists) {
      console.log('bootstrap: SecretBackend table not present yet — skipping');
      return;
    }

    // Ensure at least one row exists, marked isDefault.
    const existingDefault = await prisma.$queryRaw<Array<{ id: string }>>`
      SELECT id FROM "SecretBackend" WHERE "isDefault" = true LIMIT 1
    `;
    let defaultId: string;
    if (existingDefault.length === 0) {
      await prisma.$executeRaw`
        INSERT INTO "SecretBackend"
          ("id", "name", "type", "config", "isDefault", "description", "version", "createdAt", "updatedAt")
        VALUES (
          ${DEFAULT_ID},
          'default',
          'plaintext',
          '{}'::jsonb,
          true,
          'Default in-database plaintext backend. Seeded by pre-migrate-bootstrap.',
          1,
          CURRENT_TIMESTAMP,
          CURRENT_TIMESTAMP
        )
        ON CONFLICT (name) DO NOTHING
      `;
      // Re-read — if there was an existing row with the same name but no
      // isDefault flag we need its id, not the one we tried to insert.
      const afterInsert = await prisma.$queryRaw<Array<{ id: string }>>`
        SELECT id FROM "SecretBackend" WHERE name = 'default' LIMIT 1
      `;
      if (afterInsert.length === 0) {
        console.log('bootstrap: could not establish a default SecretBackend — bailing');
        return;
      }
      defaultId = afterInsert[0]!.id;
      // Make sure it's flagged default.
      await prisma.$executeRaw`
        UPDATE "SecretBackend" SET "isDefault" = true WHERE id = ${defaultId}
      `;
      console.log(`bootstrap: seeded default SecretBackend (id=${defaultId})`);
    } else {
      defaultId = existingDefault[0]!.id;
    }

    // Backfill Secret.backendId for any rows left with an empty value.
    // Using $executeRaw returns affected row count.
    const updated = await prisma.$executeRaw(
      Prisma.sql`UPDATE "Secret" SET "backendId" = ${defaultId} WHERE "backendId" = ''`,
    );
    if (updated > 0) {
      console.log(`bootstrap: backfilled ${updated} Secret row(s) with default backendId`);
    }
  } catch (err) {
    // Never fail the deploy — worst case prisma db push tries again anyway.
    // Log the error so it's visible in pod logs.
    console.error('bootstrap: non-fatal error:', err instanceof Error ? err.message : err);
  } finally {
    await prisma.$disconnect();
  }
}

main().catch((err: unknown) => {
  console.error('bootstrap: fatal error (ignored):', err);
  // Intentionally exit 0 — we don't want to block the deploy on this.
});
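The idempotency claim for the script above can be illustrated without a database. The sketch below is an in-memory model under stated assumptions, not the script itself: `Db`, `Backend`, `Secret`, and this `bootstrap` function are hypothetical stand-ins, with `backends: null` representing a missing SecretBackend table and array operations standing in for the raw SQL.

```typescript
// In-memory model of the two tables. Field names mirror the script;
// the types and the function itself are assumptions for illustration.
interface Backend { id: string; name: string; isDefault: boolean }
interface Secret { key: string; backendId: string }
interface Db { backends: Backend[] | null; secrets: Secret[] } // null = table missing

const DEFAULT_ID = 'cdefault000backend00000001';

function bootstrap(db: Db): { seeded: boolean; backfilled: number } {
  // Table not created yet: nothing to do.
  if (db.backends === null) return { seeded: false, backfilled: 0 };

  let seeded = false;
  let def = db.backends.find((b) => b.isDefault);
  if (!def) {
    // Stands in for INSERT ... ON CONFLICT (name) DO NOTHING, then re-read by name.
    def = db.backends.find((b) => b.name === 'default');
    if (!def) {
      def = { id: DEFAULT_ID, name: 'default', isDefault: true };
      db.backends.push(def);
    }
    def.isDefault = true;
    seeded = true;
  }

  // Stands in for UPDATE "Secret" SET "backendId" = ... WHERE "backendId" = ''.
  let backfilled = 0;
  for (const s of db.secrets) {
    if (s.backendId === '') {
      s.backendId = def.id;
      backfilled++;
    }
  }
  return { seeded, backfilled };
}

// Half-migrated state: table exists but is empty, secrets carry ''.
const db: Db = {
  backends: [],
  secrets: [{ key: 'a', backendId: '' }, { key: 'b', backendId: '' }],
};
const first = bootstrap(db);  // seeds the default row and backfills both secrets
const second = bootstrap(db); // finds the default row, matches no secrets: no-op
console.log(first, second);
```

The second call changes nothing because both steps are guarded by the current state (a default row already exists, no `backendId` is empty), which is the same property the SQL version relies on.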