Appearance
Migration Coordination Design
How the system coordinates schema migrations with concurrent data operations across all endpoints, and how the frontend should respond to each migration state.
Problem Statement
When an admin triggers a schema migration (column type change, add NOT NULL, etc.), PostgreSQL acquires locks on the affected table. Depending on the operation, this can block all concurrent reads and writes — affecting:
- Data endpoints (
/api/data/{view}/rows, cell edits, bulk operations) - Workspace API (auto-generated REST API used by external consumers)
- MCP API (AI assistant tool calls)
- Other SSE-connected users viewing the same or related tables
- FK-related tables that cascade into the migrating table
Currently, none of these systems know a migration is happening. They hit lock waits, timeout, and return 500s.
PostgreSQL Lock Levels per Operation
Every ALTER TABLE acquires a lock. The lock type determines what concurrent operations are blocked.
| ALTER TABLE Operation | Lock Level | Table Rewrite? | Duration | Blocks SELECT? | Blocks INSERT/UPDATE/DELETE? |
|---|---|---|---|---|---|
ADD COLUMN (nullable, no default) | AccessExclusive | No | Instant | Briefly | Briefly |
ADD COLUMN ... DEFAULT x (PG 11+) | AccessExclusive | No | Instant | Briefly | Briefly |
ADD COLUMN ... DEFAULT x (PG < 11) | AccessExclusive | Yes | Proportional to rows | Yes | Yes |
DROP COLUMN | AccessExclusive | No | Instant | Briefly | Briefly |
ALTER COLUMN ... TYPE (type change) | AccessExclusive | Yes | Proportional to rows | Yes | Yes |
ALTER COLUMN ... SET NOT NULL | AccessExclusive | No (scan only) | Proportional to rows | Yes | Yes |
ALTER COLUMN ... DROP NOT NULL | AccessExclusive | No | Instant | Briefly | Briefly |
ALTER COLUMN ... SET DEFAULT | AccessExclusive | No | Instant | Briefly | Briefly |
ALTER COLUMN ... DROP DEFAULT | AccessExclusive | No | Instant | Briefly | Briefly |
ADD CONSTRAINT ... UNIQUE | ShareLock | No (index build) | Proportional to rows | No | Yes |
ADD CONSTRAINT ... FOREIGN KEY | ShareRowExclusive | No | Fast (validates rows) | No | Yes on both tables |
DROP CONSTRAINT | AccessExclusive | No | Instant | Briefly | Briefly |
RENAME COLUMN | AccessExclusive | No | Instant | Briefly | Briefly |
Key Insight
"AccessExclusive" sounds scary, but for instant operations the lock is held for microseconds — effectively invisible. The danger is table rewrites and full table scans where the lock is held for seconds to minutes.
MySQL Lock Levels per Operation
MySQL's InnoDB uses Online DDL (since 5.6/8.0) with three algorithms: INSTANT, INPLACE, and COPY. The algorithm determines lock behavior.
| ALTER TABLE Operation | Algorithm | Permits Concurrent DML? | Table Rebuild? | Duration | Blocks SELECT? | Blocks INSERT/UPDATE/DELETE? |
|---|---|---|---|---|---|---|
ADD COLUMN (last position, 8.0.12+) | INSTANT | Yes | No | Instant | No | No |
ADD COLUMN (any position, 8.0.29+) | INSTANT | Yes | No | Instant | No | No |
ADD COLUMN (older MySQL) | INPLACE | Yes | Yes | Proportional to rows | No | Brief (start/end only) |
DROP COLUMN (8.0.29+) | INSTANT | Yes | No | Instant | No | No |
DROP COLUMN (older MySQL) | INPLACE | Yes | Yes | Proportional to rows | No | Brief |
MODIFY COLUMN ... TYPE (type change) | COPY | No | Yes | Proportional to rows | No | Yes |
MODIFY COLUMN ... NOT NULL | INPLACE | Yes | Yes | Proportional to rows | No | Brief |
MODIFY COLUMN ... NULL | INPLACE | Yes | Yes | Proportional to rows | No | Brief |
ALTER COLUMN ... SET DEFAULT | INSTANT | Yes | No | Instant | No | No |
ALTER COLUMN ... DROP DEFAULT | INSTANT | Yes | No | Instant | No | No |
ADD INDEX / UNIQUE | INPLACE | Yes | No | Proportional to rows | No | Brief (start/end only) |
ADD FOREIGN KEY | INPLACE | Yes | No | Fast (validates rows) | No | Brief |
DROP FOREIGN KEY | INPLACE | Yes | No | Instant | No | No |
DROP INDEX | INPLACE | Yes | No | Instant | No | No |
RENAME COLUMN | INSTANT | Yes | No | Instant | No | No |
Key Differences from PostgreSQL
| Aspect | PostgreSQL | MySQL (InnoDB) |
|---|---|---|
ADD COLUMN (nullable) | AccessExclusive (brief) | INSTANT (no lock at all, 8.0.12+) |
DROP COLUMN | AccessExclusive (brief) | INSTANT (8.0.29+) or INPLACE (rebuild) |
ALTER COLUMN TYPE | Full table rewrite, blocks everything | COPY algorithm, blocks DML but not SELECT |
SET NOT NULL | Full table scan, blocks everything | INPLACE rebuild, allows concurrent reads AND writes |
ADD UNIQUE INDEX | ShareLock (blocks writes) | INPLACE, allows concurrent DML |
ADD FOREIGN KEY | ShareRowExclusive (blocks writes on both tables) | INPLACE, brief lock only |
| Concurrent reads during rewrite | Blocked | Allowed (except COPY algorithm) |
Critical Implication
The same migration can have completely different impact depending on the database vendor:
SET NOT NULL: PostgreSQL blocks all reads/writes during scan. MySQL allows full concurrent DML.ALTER TYPE: PostgreSQL blocks everything. MySQL blocks DML but still allows SELECT.ADD COLUMN: PostgreSQL takes AccessExclusive briefly. MySQL is truly zero-lock (INSTANT).
The impact classification must be vendor-aware. The system already knows the workspace database vendor — use it.
MySQL Version Detection
MySQL's INSTANT algorithm availability depends on version:
| Feature | MySQL Version Required |
|---|---|
| INSTANT add column (last position) | 8.0.12+ |
| INSTANT add column (any position) | 8.0.29+ |
| INSTANT drop column | 8.0.29+ |
| INSTANT rename column | 8.0 |
| Online DDL (INPLACE) | 5.6+ |
The processor should query SELECT VERSION() and cache the result per workspace to determine available algorithms.
Migration Impact Classification
Each migration should be classified at detection time (before queuing) into one of these impact levels. Classification is vendor-specific.
Level 1: Transparent
Lock duration: < 100ms (instant metadata-only operations) Affected operations: None in practice Frontend state: No change needed — migration completes before anyone notices Data endpoints: Normal operation
PostgreSQL operations:
- Add nullable column (no default)
- Drop column
- Drop NOT NULL
- Set/drop default value
- Rename column
MySQL operations (all of the above, plus):
- Add column with DEFAULT (INSTANT, 8.0.12+)
- Drop column (INSTANT, 8.0.29+)
- Set NOT NULL (INPLACE — allows concurrent DML, brief metadata lock only)
Level 2: Brief Lock
Lock duration: 100ms – 5s (index builds, FK validation on small-medium tables) Affected operations: Writes may queue briefly Frontend state: Show subtle migration indicator, don't lock the UI Data endpoints: May see brief latency spike, no errors expected
PostgreSQL operations:
- Add UNIQUE constraint (ShareLock — blocks writes during index build)
- Add FOREIGN KEY constraint (ShareRowExclusive — blocks writes on both tables)
- Add nullable column with DEFAULT (PG 11+, instant but AccessExclusive)
MySQL operations:
- Add UNIQUE index (INPLACE — brief metadata lock at start/end, concurrent DML allowed during build)
- Add FOREIGN KEY (INPLACE — brief metadata lock, concurrent DML allowed)
- Add column (INPLACE, pre-8.0.12 — table rebuild but concurrent DML allowed)
- Drop column (INPLACE, pre-8.0.29 — table rebuild but concurrent DML allowed)
Level 3: Blocking
Lock duration: 5s – minutes (table rewrite or full scan on large tables) Frontend state: Lock affected view(s), show migration progress, disable editing Data endpoints: Return 503 Service Unavailable with Retry-After header
PostgreSQL — blocks ALL reads and writes:
- ALTER COLUMN TYPE (type change) — full table rewrite
- SET NOT NULL — full table scan to validate no NULLs exist
- Add column with DEFAULT on PG < 11
MySQL — blocks DML but allows SELECT (COPY algorithm):
- MODIFY COLUMN TYPE (type change) — full table copy, reads still work
- Converting between incompatible charsets
Vendor-Specific Frontend Behavior for Level 3
Because MySQL's COPY algorithm still allows SELECT:
| State | PostgreSQL | MySQL |
|---|---|---|
| SELECT / scrolling | Blocked — show overlay | Allowed — read-only mode |
| INSERT / UPDATE / DELETE | Blocked | Blocked |
| Frontend state | Full migration overlay | Read-only banner + disable editing |
This means the blocksReads field in the impact analysis is critical — it's true for PostgreSQL type changes but false for MySQL type changes.
MySQL Storage Engine Considerations
The lock behavior above assumes InnoDB (the default since MySQL 5.5). Other storage engines have fundamentally different DDL behavior.
MyISAM
MyISAM has no Online DDL support — every ALTER TABLE performs a full table copy with a table-level lock that blocks all reads AND writes for the entire duration. There is no INSTANT or INPLACE algorithm.
| Operation | MyISAM Behavior | InnoDB Equivalent |
|---|---|---|
| Any ALTER TABLE | Full copy + table lock | Varies (INSTANT/INPLACE/COPY) |
| Concurrent SELECT | Blocked | Usually allowed |
| Concurrent INSERT/UPDATE | Blocked | Depends on algorithm |
MEMORY, ARCHIVE, CSV, etc.
These engines also lack Online DDL. All ALTER TABLE operations are full-copy with table locks.
What the System Must Do
- Detect the storage engine per table:
SELECT ENGINE FROM information_schema.TABLES WHERE TABLE_SCHEMA = ? AND TABLE_NAME = ? - If not InnoDB: Classify ALL migrations as Level 3 (BLOCKING, blocks reads + writes) regardless of operation type
- Recommend InnoDB: Show a warning in the admin UI if non-InnoDB tables are detected — "This table uses MyISAM, which blocks all operations during schema changes. Consider converting to InnoDB for better concurrent access."
- Store engine per view in metadata: Cache the storage engine in
VieworColumnMetadataso impact analysis doesn't need a live DB query every time
Engine Detection Query
sql
-- Per table
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'workspace_db'
AND TABLE_NAME = 'products';
-- All tables in workspace
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'workspace_db'
AND TABLE_TYPE = 'BASE TABLE';This should be cached during schema sync and stored in workspace metadata. Re-checked on each sync or drift detection.
Impact Analysis: What the System Must Know
When a migration is detected (in ViewColumnService.updateWithSchemaDetection()), the system must compute a MigrationImpact before queuing:
MigrationImpact {
// Classification
level: TRANSPARENT | BRIEF | BLOCKING
estimatedDurationMs: long
// Vendor context
databaseVendor: POSTGRESQL | MYSQL
mysqlVersion: string? // e.g. "8.0.35" — affects INSTANT availability
storageEngine: string? // e.g. "InnoDB", "MyISAM" — MySQL only
// Affected scope
primaryTable: string // The table being altered
primaryViewUuids: UUID[] // All views backed by this table
// FK cascade analysis
affectedFkTables: FkImpact[] // Tables with FK references to/from primaryTable
totalAffectedTables: int
// Lock analysis
lockType: string // AccessExclusive, ShareLock, ShareRowExclusive
blocksReads: boolean
blocksWrites: boolean
requiresTableRewrite: boolean
// Row count context
estimatedRowCount: long // From pg_stat_user_tables.n_live_tup
// Per-operation breakdown (a single PATCH can trigger multiple changes)
operations: MigrationOperation[]
}
MigrationOperation {
description: string // "Change type VARCHAR → INTEGER"
sqlPreview: string // "ALTER TABLE users ALTER COLUMN age TYPE INTEGER USING age::INTEGER"
lockType: string
estimatedDurationMs: long
requiresTableRewrite: boolean
}
FkImpact {
table: string
viewUuids: UUID[]
fkConstraint: string // Constraint name
cascadeType: string // RESTRICT, CASCADE, SET NULL, NO ACTION
direction: REFERENCES | REFERENCED_BY
impactOnMigration: string // "FK validation required" / "CASCADE may propagate lock"
blocksReads: boolean
blocksWrites: boolean
}Row Count Thresholds
Estimated duration depends on row count. These thresholds determine the impact level for table-rewrite operations:
| Row Count | Type Change Duration (est.) | Impact Level |
|---|---|---|
| < 10,000 | < 1s | Brief |
| 10,000 – 100,000 | 1s – 10s | Blocking |
| 100,000 – 1,000,000 | 10s – 60s | Blocking (warn) |
| > 1,000,000 | > 60s | Blocking (require confirmation) |
Row counts come from pg_stat_user_tables.n_live_tup (approximate but fast — no table scan needed).
FK Cascade Analysis
Foreign keys create cross-table dependencies that extend a migration's blast radius.
Scenario: Migration on a Referenced Table
Table categories is referenced by products.category_id (FK).
Adding NOT NULL to categories.name:
- Lock on
categoriesonly productsis unaffected (FK points FROM products TO categories)- But: INSERT/UPDATE on
productsthat referencescategorieswill block if the category table is locked
Dropping categories.id (referenced column):
- Depends on FK cascade rule:
RESTRICT/NO ACTION: Migration fails immediately (FK violation)CASCADE: Drops the FK constraint onproductstoo — both tables affectedSET NULL: Setsproducts.category_idto NULL for affected rows — writes to products
Scenario: Migration on a Referencing Table
Table products has FK category_id → categories.id.
Changing products.category_id type:
- Must drop FK first, alter column, recreate FK
- Lock on
products(table rewrite for type change) - Brief lock on
categories(FK recreation validates references) - Both tables affected
Scenario: Cascade Chain
order_items → orders → customers
Migration on customers.id type with CASCADE:
customers: table rewrite (type change)orders: FK must be dropped and recreated (validates all rows)order_items: FK must be dropped and recreated (validates all rows)- All three tables locked at various points
What the System Must Compute
For every migration, walk the FK graph in both directions:
- Outgoing FKs (this table references others): Will the migration break or need to drop/recreate these?
- Incoming FKs (other tables reference this one): Will the migration cascade? What's the cascade rule?
- Transitive FKs (chains): If cascade propagates, follow the chain
- For each affected table: Which views are backed by it? How many rows?
Endpoint Behavior During Migrations
Migration State per Table
The metadata DB tracks active migrations per table:
active_migration {
workspace_id: long
table_name: string
column_uuid: UUID
impact_level: TRANSPARENT | BRIEF | BLOCKING
blocks_reads: boolean
blocks_writes: boolean
affected_tables: string[] // Including FK cascade targets
started_at: timestamp
estimated_completion: timestamp
operation_id: string // Links to TaskCompletionMessage
}Data Endpoints (/api/data/...)
| Migration Level | SELECT (read) | INSERT | UPDATE | DELETE | Bulk Operations |
|---|---|---|---|---|---|
| No migration | Normal | Normal | Normal | Normal | Normal |
| TRANSPARENT | Normal | Normal | Normal | Normal | Normal |
| BRIEF | Normal | Normal (may queue 1-5s) | Normal (may queue) | Normal (may queue) | Normal |
| BLOCKING + blocks_reads | 503 + Retry-After | 503 | 503 | 503 | 503 |
| BLOCKING + writes only | Normal | 503 | 503 | 503 | 503 |
Response for 503:
json
{
"error": "schema_migration_in_progress",
"message": "A schema migration is in progress on this table. Please retry shortly.",
"table": "products",
"operationId": "abc-123",
"estimatedCompletionAt": "2026-03-16T20:35:00Z",
"retryAfterSeconds": 15,
"affectedOperations": ["INSERT", "UPDATE", "DELETE"]
}Workspace API (auto-generated REST)
Same behavior as data endpoints — the workspace API hits the same tables. External consumers (Zapier, n8n, etc.) must handle 503 + Retry-After gracefully.
MCP API
Same behavior. AI assistants calling MCP tools should see:
- Tool call result:
"error": "schema_migration_in_progress"with context - The AI can inform the user and retry after the estimated completion time
Implementation: Request Interceptors
Rather than adding migration checks to every endpoint, use framework-level interceptors that check the active_migration table before the request reaches the endpoint.
Quarkus (Data Endpoints) — JAX-RS ContainerRequestFilter
java
@Provider
@Priority(Priorities.AUTHORIZATION + 10) // Run after auth, before endpoint
public class MigrationGuardFilter implements ContainerRequestFilter {
@Inject
ActiveMigrationRepository activeMigrationRepository;
@Override
public void filter(ContainerRequestContext ctx) throws IOException {
// Extract table/view from request path (e.g. /api/data/{viewUuid}/rows)
String viewUuid = extractViewUuid(ctx);
if (viewUuid == null) return; // Not a data endpoint
String method = ctx.getMethod();
boolean isRead = "GET".equals(method) || "HEAD".equals(method);
ActiveMigration migration = activeMigrationRepository
.findBlockingByViewUuid(viewUuid);
if (migration == null) return; // No active migration
// Check if this operation type is blocked
if (isRead && !migration.isBlocksReads()) return; // Reads allowed
if (!isRead && !migration.isBlocksWrites()) return; // Writes allowed
// Block the request
ctx.abortWith(Response.status(503)
.header("Retry-After", migration.getRetryAfterSeconds())
.entity(migration.toErrorResponse())
.build());
}
}Covers: /api/data/... endpoints (cell edits, row inserts, bulk operations, selects)
Spring Boot (Workspace API + MCP API) — HandlerInterceptor
java
@Component
public class MigrationGuardInterceptor implements HandlerInterceptor {
private final ActiveMigrationService activeMigrationService;
@Override
public boolean preHandle(HttpServletRequest request,
HttpServletResponse response,
Object handler) throws Exception {
// Extract workspace + table from request path
String tableName = extractTableName(request);
Long workspaceId = extractWorkspaceId(request);
if (tableName == null || workspaceId == null) return true;
boolean isRead = "GET".equals(request.getMethod());
ActiveMigration migration = activeMigrationService
.findBlocking(workspaceId, tableName);
if (migration == null) return true;
if (isRead && !migration.isBlocksReads()) return true;
response.setStatus(503);
response.setHeader("Retry-After",
String.valueOf(migration.getRetryAfterSeconds()));
response.setContentType("application/json");
response.getWriter().write(migration.toErrorJson());
return false; // Short-circuit — don't invoke endpoint
}
}Covers: Workspace REST API (/api/workspace/{slug}/...) and MCP tool calls
Why Interceptors, Not Per-Endpoint Checks
- Single point of enforcement — every current and future data endpoint is protected automatically
- No code duplication — migration check logic lives in one place
- Cannot be forgotten — new endpoints get migration protection by default
- Fast path —
active_migrationlookup is a simple indexed query, cached in-memory for the duration of a migration - Clean separation — endpoints don't need to know about migrations at all
Caching the Active Migration State
The active_migration table will be queried on every data request during a migration. To avoid per-request DB hits:
- Cache active migrations in-memory per workspace (small dataset — typically 0-1 active migrations)
- Invalidate on
TaskCompletionMessage(migration success/failure) - TTL fallback: expire cache after 5 minutes to handle missed completion messages
- On Quarkus side: use CDI
@ApplicationScopedbean withConcurrentHashMap - On Spring Boot side: use
ConcurrentHashMapin the interceptor service
Frontend State Machine
Each view in the frontend should track migration state and adjust the UI accordingly.
States
NORMAL
│
├── view.column.schema.changing (TRANSPARENT)
│ → No visible change, migration completes instantly
│ → Transition to NORMAL on view.column.schema.changed
│
├── view.column.schema.changing (BRIEF)
│ → Show subtle indicator on affected column header
│ → Disable editing for that column only
│ → Other columns remain editable
│ → Transition to NORMAL on view.column.schema.changed
│
├── view.column.schema.changing (BLOCKING, writes only)
│ → Show migration banner: "Schema migration in progress — editing disabled"
│ → Disable all INSERT/UPDATE/DELETE on affected view(s)
│ → SELECT/scrolling still works
│ → Show progress indicator with estimated time
│ → Transition to NORMAL on view.column.schema.changed
│
└── view.column.schema.changing (BLOCKING, reads + writes)
→ Show full migration overlay: "Schema migration in progress — please wait"
→ Disable all operations on affected view(s)
→ Show progress indicator with estimated time
→ Transition to NORMAL on view.column.schema.changedMulti-View Impact
When FK cascade analysis shows multiple tables affected:
- The SSE event includes
affectedViewUuids[] - Frontend locks ALL listed views, not just the one where the migration was triggered
- Each affected view shows a banner: "Related table
{tableName}is being migrated"
What the SSE Event Must Carry
The view.column.schema.changing event needs to be extended:
typescript
interface SchemaChangingEvent {
// Existing fields
columnUuid: string;
columnName: string;
viewUuid: string;
reason: string;
// New: impact analysis
impactLevel: 'TRANSPARENT' | 'BRIEF' | 'BLOCKING';
blocksReads: boolean;
blocksWrites: boolean;
estimatedDurationMs: number;
estimatedCompletionAt: string; // ISO 8601
// New: cross-table impact
affectedViewUuids: string[]; // All views that should show migration state
affectedTables: string[]; // For admin visibility
// New: operation detail
operations: {
description: string;
requiresTableRewrite: boolean;
}[];
}Echo Filtering for Migration Events
Migration lock events should NOT be echo-filtered. The originating client also needs to show the migration state — they triggered it, and their view should reflect the lock too. Only the view.column.updated event (the metadata change) should be echo-filtered.
Pre-Migration Confirmation Flow
For BLOCKING migrations on tables with significant row counts, the system should require explicit confirmation.
Flow
Admin changes column type VARCHAR → INTEGER
│
▼
Backend computes MigrationImpact
│
├── Level: BLOCKING
├── Estimated duration: 45s
├── Blocks: reads + writes
├── Row count: 500,000
├── FK impact: 2 related tables (orders, order_items)
├── Total affected views: 3
│
▼
Response: 200 OK (dry-run)
{
"dryRun": true,
"impact": { ...full MigrationImpact... },
"confirmationRequired": true,
"message": "This migration will rewrite 500,000 rows and block 3 views for ~45 seconds. 2 related tables will also be affected.",
"confirmUrl": "/api/columns/item/{uuid}/confirm-migration",
"confirmToken": "one-time-token"
}
│
▼
Frontend shows confirmation dialog with full impact details
│
▼
Admin confirms
│
▼
POST /api/columns/item/{uuid}/confirm-migration
{ "confirmToken": "one-time-token" }
│
▼
Backend queues the migration (202 Accepted)When to Require Confirmation
| Condition | Require Confirmation? |
|---|---|
| TRANSPARENT migration | No |
| BRIEF migration, < 50k rows | No |
| BRIEF migration, > 50k rows | Optional (show warning, don't block) |
| BLOCKING migration, < 10k rows | No (fast enough) |
| BLOCKING migration, 10k–100k rows | Yes |
| BLOCKING migration, > 100k rows | Yes + strong warning |
| Any migration with FK cascade to large tables | Yes |
Concurrent Migration Prevention
Only one BLOCKING migration should run per table at a time.
Check Before Accepting
java
// In ViewColumnService.updateWithSchemaDetection()
if (impact.level == BLOCKING) {
ActiveMigration existing = activeMigrationRepository
.findByTableName(impact.primaryTable);
if (existing != null) {
throw new ConflictException(
"A migration is already in progress on table '" +
impact.primaryTable + "'. Please wait for it to complete."
);
// Returns 409 Conflict
}
}TRANSPARENT and BRIEF migrations: Allow concurrent
These are fast enough that serialization isn't worth the complexity. PostgreSQL's own lock management handles it.
BLOCKING migrations: One at a time per table
Enforced by the active_migration table with a unique constraint on (workspace_id, table_name) where completed_at IS NULL.
Implementation Phases
Phase 1: Vendor & Engine Detection (foundation)
- Detect database vendor (PostgreSQL vs MySQL) — already available in workspace config
- Query MySQL version (
SELECT VERSION()) and cache per workspace - Query storage engine per table (
information_schema.TABLES) during schema sync - Store vendor, version, and storage engine in workspace/view metadata
- Warn admins if non-InnoDB tables are detected
Phase 2: Migration Impact Analysis
Build vendor-aware operation classifier (same operation → different impact per vendor + engine)✅ Done —MigrationImpactAnalyzerwith PostgreSQL, MySQL InnoDB, and non-InnoDB classifiersCompute✅ Done —MigrationImpactfrom detected schema changesMigrationImpactDTOwithImpactLevel(TRANSPARENT/BRIEF/BLOCKING), per-operation breakdownReturn impact data in✅ Done — wired intoColumnUpdateResultDTOfor 202 responsesViewColumnService.updateWithSchemaDetection()Extend✅ Done —view.column.schema.changingSSE event with impact fieldsimpactLevel,blocksReads,blocksWrites,estimatedDurationMsAdd row count estimation:✅ Done —pg_stat_user_tables.n_live_tup(PostgreSQL) orinformation_schema.TABLES.TABLE_ROWS(MySQL)TableStatsServicequeries workspace DB via JDBC on worker threadQuery MySQL version (✅ Done —SELECT VERSION()) and cache per workspaceTableStatsService.getMysqlVersion()Query storage engine per table (✅ Done —information_schema.TABLES)TableStatsService.getMysqlStorageEngine()
Phase 3: Data Endpoint Protection
Add✅ Done — changesetactive_migrationtable to metadata DB (Liquibase changeset)metadata-1.50Write migration state on queue (before sending to processor)✅ Done —ActiveMigrationService.startMigration()called fromViewColumnServiceClear migration state on✅ Done —TaskCompletionMessage(success or failure)completeMigrationByColumnUuid()called fromColumnOperationHandlerImplement✅ Done — interceptsMigrationGuardFilter(JAX-RS ContainerRequestFilter) for Quarkus data endpoints/api/data/paths, checks viewUuid against cacheImplement✅ Done — interceptsMigrationGuardInterceptor(Spring HandlerInterceptor) for workspace API/api/v1/**, extracts table name from URL, checks viaActiveMigrationJdbcServiceAdd MCP migration guard✅ Done —McpAccessGuard.requireNoActiveMigration()protects all DataTools (query, get, create, update)Add in-memory cache for active migration state with TTL fallback✅ Done — Quarkus:ConcurrentHashMapwith 5-min TTL; Spring Boot: Caffeine cache with 5-sec TTLReturn 503 + Retry-After with structured error response✅ Done —SCHEMA_MIGRATION_IN_PROGRESSerror code withretryAfterSeconds,blocksReads,blocksWrites
Phase 4: FK Cascade Analysis
- Build FK graph from
ColumnMetadatarelationships andinformation_schema - Walk the graph in both directions for each migration to find all affected tables
- Account for cascade rules (RESTRICT, CASCADE, SET NULL, NO ACTION)
- Include affected tables in impact analysis and SSE events
- Extend interceptor to check FK-affected tables, not just the primary table
Phase 5: Pre-Migration Confirmation
- Add dry-run mode to the PATCH endpoint
- Generate confirmation tokens with TTL
- Add
POST .../confirm-migrationendpoint - Frontend confirmation dialog with full impact visualization (affected views, row counts, estimated duration, FK cascade)
Phase 6: Duration Estimation and Progress
Track actual migration durations to improve estimates over time✅ Done —MigrationDurationTrackerrecords estimated vs actual duration inmigration_duration_historytableBenchmark table rewrite speed per row✅ Done —MigrationDurationTracker.getRefinedMsPerRow()computes average ms/row from historical data, falling back to static defaultsAdd progress reporting from processor✅ Done —MigrationProgressPollerpollspg_stat_progress_alter_table(PG 12+) every 2s via second JDBC connection; falls back to time-based estimates for MySQL/older PGReal-time progress in SSE events✅ Done —MigrationProgressMessage→MigrationProgressConsumer→view.column.schema.progressSSE event with percent, phase, elapsed, rows processedMySQL: no equivalent progress view✅ Done — time-based progress estimation used automatically whenpg_stat_progress_alter_tableis unavailableWire✅ Done —getRefinedMsPerRow()intoMigrationImpactAnalyzerMigrationImpactAnalyzer.analyze()now accepts refined rates;ViewColumnServicepre-fetches workspace-specific rates viaMigrationDurationTracker
Open Questions
- Should TRANSPARENT migrations still send
schema.changingevents? They complete instantly — the event might arrive after theschema.changedevent. Consider skipping the lock event for Level 1. - Timeout handling: What if a BLOCKING migration takes longer than estimated? Should the system extend the Retry-After, or should there be a hard timeout with auto-rollback?
- Processor health: What if the processor crashes mid-migration? The
active_migrationrecord stays forever. Need a TTL or heartbeat mechanism. - Read replicas: If the workspace database has read replicas, SELECTs could be routed there during migrations. Out of scope for now but worth noting.
- Online schema change tools: For very large tables (> 10M rows), tools like
pg_repackorpgroll(PostgreSQL) orgh-ost/pt-online-schema-change(MySQL) can do non-blocking schema changes. Worth evaluating for the ZFS-hosted tier. - MyISAM conversion: Should the system offer a one-click "Convert to InnoDB" for MyISAM tables? This is itself a blocking migration, but a one-time improvement.
- MySQL ALGORITHM hint: Should the processor explicitly request
ALGORITHM=INSTANTorALGORITHM=INPLACEwhen available, to fail fast if MySQL would fall back to COPY? This prevents silent performance degradation.