
Production Deployment

Architecture

SchemaStack production runs on a Hetzner CAX21 ARM server (4 vCPU, 8GB RAM) with Docker, fronted by Cloudflare. Quarkus services compile to GraalVM native images; Spring Boot services run on JVM.

Cloudflare (all proxied — origin IP hidden)
├── schemastack.io (same-origin, zero CORS)
│   ├── /admin/*   → CF Worker → CF Pages (admin app)
│   ├── /app/*     → CF Worker → CF Pages (app)
│   ├── /api/*     → CF Worker → Hetzner (metadata-rest)
│   ├── /sse/*     → CF Worker → Hetzner (consumer-worker, streaming)
│   └── /*         → CF Worker → CF Pages (public website)
├── status.schemastack.io   → CF Worker (KV-backed status JSON)
├── data.schemastack.io     → Hetzner (workspace-api)
├── docs.schemastack.io     → CF Pages (public docs)
└── dev.schemastack.io      → CF Pages + CF Access (dev docs)

Hetzner CAX21 ARM (4 vCPU, 8GB RAM, Docker Compose)
├── Traefik         (reverse proxy, blue/green)
├── PostgreSQL 15   (metadata DB, localhost only)
├── PostgreSQL 15   (website DB, public :5433, SSL)
├── RabbitMQ 3.13
├── metadata-rest        (Quarkus native, 256MB)
├── consumer-worker      (Quarkus JVM, 512MB)
├── processor-service    (Spring Boot JVM, 576MB)
├── workspace-api        (Spring Boot JVM, 576MB)
├── pg-backup            (sidecar, 6-hourly pg_dump → S3)
├── Loki + Promtail      (log aggregation, 14d retention)
├── Grafana              (log search UI, SSH-tunnel only)
└── Dozzle              (log viewer, 32MB)

Services

| Service | Port | Runtime | Role |
| --- | --- | --- | --- |
| metadata-rest | 8080 | Native | Admin API — entities, views, columns, orgs, auth. Runs Liquibase on startup. |
| consumer-worker | 8080 | JVM | RabbitMQ task completion consumer + SSE broadcaster + file upload/thumbnails |
| processor-service | 8082 | JVM | Schema migration processor — consumes from RabbitMQ, executes DDL |
| workspace-api | 8083 | JVM | Dynamic CRUD API — runtime Hibernate via ByteBuddy |

Memory Budget

| Service | Container Limit |
| --- | --- |
| PostgreSQL (metadata) | 1 GB |
| PostgreSQL (website) | 256 MB |
| RabbitMQ | 512 MB |
| Traefik | 128 MB |
| metadata-rest (native) | 256 MB |
| consumer-worker (JVM, -Xmx256m) | 512 MB |
| processor-service (JVM) | 576 MB |
| workspace-api (JVM) | 576 MB |
| pg-backup (sidecar) | (alpine + aws-cli, negligible) |
| Loki | 384 MB |
| Promtail | 128 MB |
| Grafana | 192 MB |
| Dozzle | 32 MB |
| Steady-state total | ~4.4 GB |

During blue/green deploy both app sets run briefly: ~6.1 GB on the 8 GB host.

Initial Server Setup

Local Prerequisites

bash
brew install ansible

One-Time Server Setup

After creating a Hetzner CAX21 ARM server (Ubuntu 24.04) with your SSH key:

bash
# SSH in as root
ssh root@schemastack

# Create deploy user (no password — SSH key only)
adduser --disabled-password --gecos "" deploy
usermod -aG sudo deploy
echo "deploy ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/deploy

# Copy SSH keys to deploy user
mkdir -p /home/deploy/.ssh
cp ~/.ssh/authorized_keys /home/deploy/.ssh/
chown -R deploy:deploy /home/deploy/.ssh
chmod 700 /home/deploy/.ssh && chmod 600 /home/deploy/.ssh/authorized_keys
exit

From your Mac:

bash
# Generate deploy key + copy to server
ssh-keygen -t ed25519 -f ~/.ssh/schemastack_deploy
ssh-copy-id -i ~/.ssh/schemastack_deploy deploy@schemastack

Add the SSH config entry so you can use ssh schemastack everywhere:

# ~/.ssh/config
Host schemastack
  Hostname 46.225.228.56
  Port 9978
  User deploy
  ProxyCommand none

SSH Port

SSH runs on port 9978 (not the default 22). This is configured automatically by the provisioning playbook via:

  • sshd_config — managed block with Port 22 and Port 9978 (sshd listens on both)
  • ssh.socket override — systemd socket activation (Ubuntu 22.04+) also needs the port added
  • UFW — only port 9978 is allowed; port 22 is removed from the firewall

sshd still binds to port 22 internally as a safety net (e.g., for recovery via Hetzner console), but UFW blocks it from the outside.

For fresh servers, the playbook connects on port 22 initially (default), configures sshd + socket + UFW for port 9978, verifies sshd is listening on 9978, then removes the UFW rule for port 22. The ansible_port: 9978 in inventory.yml is used for subsequent runs.
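
To spot-check the result on the server (read-only commands; nothing here changes config):

bash
# Effective sshd config — should list both port 22 and port 9978
sudo sshd -T | grep -i '^port'

# systemd socket activation (Ubuntu 22.04+) — the drop-in should add ListenStream=9978
systemctl cat ssh.socket

# Firewall — only 9978 should be allowed for SSH
sudo ufw status | grep -E '22|9978'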

Cloudflare Origin Certificate

  1. Cloudflare dashboard → select schemastack.io → SSL/TLS → Origin Server
  2. Click Create Certificate
  3. Key type: RSA (2048), Hostnames: *.schemastack.io, schemastack.io, Validity: 15 years
  4. Copy Origin Certificate → save as traefik/certs/origin.pem
  5. Copy Private Key → save as traefik/certs/origin.key

The private key is only shown once — save it immediately.
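
A quick local sanity check on the two saved files (optional — plain openssl, not part of the Cloudflare flow):

bash
# Certificate subject, expiry (~15 years out) and covered hostnames
openssl x509 -in traefik/certs/origin.pem -noout -subject -enddate -ext subjectAltName

# Key matches certificate when the two hashes are identical
openssl x509 -in traefik/certs/origin.pem -noout -pubkey | openssl sha256
openssl pkey -in traefik/certs/origin.key -pubout | openssl sha256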

Cloudflare Pages & Worker

One-time setup — create all five Pages projects:

bash
# Frontend apps (Worker proxies to these)
npx wrangler pages project create schemastack-admin --production-branch main
npx wrangler pages project create schemastack-app --production-branch main
npx wrangler pages project create schemastack-website --production-branch main

# Docs sites (accessed via CNAME subdomains)
npx wrangler pages project create schemastack-docs --production-branch main
npx wrangler pages project create schemastack-dev --production-branch main

Set custom domains so the docs sites are reachable at their subdomains:

Option A: CLI

bash
npx wrangler pages project edit schemastack-docs --domains docs.schemastack.io
npx wrangler pages project edit schemastack-dev --domains dev.schemastack.io

Option B: Dashboard

  1. Go to Workers & Pages → select the project (e.g. schemastack-docs)
  2. Custom domains tab → Set up a custom domain
  3. Enter docs.schemastack.io → Cloudflare automatically creates the CNAME DNS record
  4. Repeat for schemastack-dev with dev.schemastack.io

TIP

If you already created the CNAME DNS records manually, Cloudflare will detect them and activate the custom domain immediately. If not, the dashboard creates them for you.

One-time setup — deploy the Worker:

bash
cd schemastack-deployment/cloudflare/worker
npm install
npx wrangler deploy
npx wrangler secret put ORIGIN_SECRET   # paste same value as in .env.production.local

Deploying Pages — build + deploy via deploy-pages.sh:

bash
cd schemastack-deployment

./scripts/deploy-pages.sh              # Deploy all 5 sites
./scripts/deploy-pages.sh docs         # Public docs only (docs.schemastack.io)
./scripts/deploy-pages.sh dev          # Dev docs only (dev.schemastack.io)
./scripts/deploy-pages.sh admin        # Admin app only (schemastack.io/admin)
./scripts/deploy-pages.sh app          # App only (schemastack.io/app)
./scripts/deploy-pages.sh website      # Public website only (schemastack.io)

The script builds each site from source and deploys to Cloudflare Pages:

| Site | Source repo | Build | CF Pages project |
| --- | --- | --- | --- |
| Public docs | schemastack-docs | npm run build:public | schemastack-docs |
| Dev docs | schemastack-docs | npm run build:dev | schemastack-dev |
| Admin app | schemastack-fe | ng build admin | schemastack-admin |
| App | schemastack-fe | ng build app | schemastack-app |
| Public website | schemastack-fe | None (static) | schemastack-website |
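
Under the hood each row is a build followed by a wrangler upload, roughly like this (illustrative — the exact output directory and flags live in deploy-pages.sh):

bash
# Example: public docs (the dist/ output directory is an assumption)
cd ../schemastack-docs
npm run build:public
npx wrangler pages deploy dist --project-name schemastack-docs --branch main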

Secrets (Ansible Vault)

All production secrets are stored in .env.production.local, encrypted with Ansible Vault:

bash
# Create env file from template and fill in real values
cp .env.production .env.production.local
# Edit — replace all CHANGE_ME values

# Encrypt all secrets with Ansible Vault (same vault password for all)
ansible-vault encrypt .env.production.local
ansible-vault encrypt traefik/certs/origin.key

The vault password is the only secret you need to remember. Encrypted files are safe to commit.

To avoid typing the vault password every time, store it in a hidden file:

bash
echo 'your-vault-password' > .vault_pass
chmod 600 .vault_pass
# .vault_pass is already in .gitignore

Then pass --vault-password-file .vault_pass (or its alias --vault-pass-file) to any vault or playbook command:

bash
# Edit encrypted files
ansible-vault edit .env.production.local --vault-password-file .vault_pass

# View without editing
ansible-vault view .env.production.local --vault-password-file .vault_pass

# Decrypt to stdout (useful for piping/diffing)
ansible-vault decrypt .env.production.local --output - --vault-password-file .vault_pass

Without .vault_pass, use --ask-vault-pass instead and enter the password interactively.

Provision Server

bash
./scripts/run.sh provision

The provisioning playbook supports tags for running specific sections:

| Tag | What it covers |
| --- | --- |
| system | apt updates and base packages |
| firewall | UFW rules (HTTP, HTTPS, PostgreSQL) |
| ssh | SSH port configuration and sshd |
| docker | Docker installation and config |
| app | App directory structure |
| certs | SSL certificates (PostgreSQL, Cloudflare) |
| config | Environment file, Traefik config, Docker Compose |
| infra | Start infrastructure containers |

bash
# Run only SSH-related tasks
./scripts/run.sh provision --tags ssh

# Update configs and restart infrastructure
./scripts/run.sh provision --tags config,infra

# Run everything except apt upgrades
./scripts/run.sh provision --skip-tags system

SSH Access to Services

PostgreSQL, RabbitMQ, Dozzle, and Grafana are only accessible via SSH tunnel:

| Service | Tunnel Command | Local access |
| --- | --- | --- |
| Grafana (log search) | ssh -N -L 3000:localhost:3000 schemastack | http://localhost:3000 |
| Dozzle (live tail) | ssh -N -L 8888:localhost:8888 schemastack | http://localhost:8888 |
| RabbitMQ Management | ssh -N -L 15672:localhost:15672 schemastack | http://localhost:15672 |
| PostgreSQL (metadata) | ssh -N -L 5432:localhost:5432 schemastack | psql -h localhost -p 5432 |
| PostgreSQL (demo) | ssh -N -L 5434:localhost:5434 schemastack | psql -h localhost -p 5434 |

PostgreSQL (demo) is a dedicated container for template-based demo workspaces. It holds the acme_store template database and all demo_{orgSlug} clones. Internal only — no external access, localhost-bound on port 5434 for SSH tunnel access. Credentials are in .env.production.local (vault-encrypted):

  • Admin: postgres / see DEMO_PG_ADMIN_PASSWORD in .env.production.local
  • Demo: demo_user / see DEMO_PG_DEMO_PASSWORD in .env.production.local
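
For example, to inspect the demo instance through the tunnel (acme_store is the template database; the password prompt expects DEMO_PG_ADMIN_PASSWORD from the vault-encrypted env file):

bash
ssh -N -L 5434:localhost:5434 schemastack &

# List tables in the template database as the admin user
psql -h localhost -p 5434 -U postgres -d acme_store -c '\dt'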

PostgreSQL (website) is publicly accessible on port 5433 with SSL (verify-full):

bash
# Connection string (requires server.crt as CA root cert)
psql "postgresql://website:<password>@46.225.228.56:5433/website?sslmode=verify-full&sslrootcert=pg-website-ca.crt"

# Fetch the CA cert from the server (one-time)
scp schemastack:/opt/schemastack/postgres-website-certs/server.crt ./pg-website-ca.crt

Combine multiple tunnels in a single command:

bash
ssh -N -L 8888:localhost:8888 -L 15672:localhost:15672 -L 5432:localhost:5432 -L 5434:localhost:5434 schemastack

Container Status & Logs

Checking Status

bash
ssh schemastack

# All containers with status
docker ps -a

# Resource usage (CPU, memory)
docker stats --no-stream

Viewing Logs

Two tools, complementary:

  • Grafana + Loki — persistent log aggregation with 30-day retention. Logs survive container redeploys. LogQL search across all containers, time-range queries, cross-blue/green correlation. Localhost-bound on 127.0.0.1:3000.
  • Dozzle — real-time live tail of all containers in one browser tab. Loses history on container redeploy but unbeatable for "what's happening right now." Localhost-bound on 127.0.0.1:8888.

Both via SSH tunnel — combine in one command:

bash
ssh -N -L 3000:localhost:3000 -L 8888:localhost:8888 schemastack

Open http://localhost:3000 for Grafana (login: admin + GRAFANA_ADMIN_PASSWORD from vault) or http://localhost:8888 for Dozzle (no login). Full LogQL examples and the persistence/retention story are in the dedicated Log Search page.

CLI Alternatives

bash
# Logs for a specific container
docker logs <container-name> --tail 100 -f

# All containers from the compose project
cd /opt/schemastack
docker compose logs -f --tail 50

TIP

Docker log rotation is configured at 50MB per file with 3 files max per container, so logs won't fill the disk. Loki ingests independently of this rotation, so search history goes back 30 days regardless of Docker's local file caps.

Building

Native Image Build (Quarkus)

Uses the docker-native Maven profile with the Mandrel builder:

bash
# Single service
./mvnw package -DskipTests -Ddocker-native -pl quarkus/metadata/metadata-rest -am

# All services (via deploy script)
cd schemastack-deployment && ./scripts/build-jars.sh

Full Build & Deploy

bash
# 1. Build all artifacts locally
./scripts/build-jars.sh

# 2. Deploy via Ansible (blue/green, zero downtime)
./scripts/run.sh deploy

Single-Service Build & Deploy

When you only changed one service and want a faster deploy:

bash
# Build only the changed service
./scripts/build-jars.sh workspace-api

# Deploy only that service (blue/green, other services untouched)
./scripts/run.sh deploy --service workspace-api

Valid service names: metadata-rest, consumer-worker, processor, workspace-api

This builds the Docker image, starts the new color, waits for health check, then stops the old color — only for the specified service. Other services keep running on their current color.

Rollback

Roll back to the previous deployment color using the existing Docker images (no rebuild):

bash
./scripts/run.sh rollback

This detects the currently active color, starts the opposite color from its existing images, waits for health checks, then stops the current color. Useful when a deploy introduces a regression and you need to revert quickly.

Manual Deploy (SSH)

bash
ssh deploy@<server>
/opt/schemastack/scripts/deploy.sh

All run.sh commands auto-detect .vault_pass; if not found, they prompt for the vault password interactively.

Environment Variable Architecture

Environment variables are split between two mechanisms to support safe blue/green overrides:

  • env_file (.env.production.local) — all shared config: database credentials, RabbitMQ, S3, email, logging, framework-specific vars (QUARKUS_DATASOURCE_*, SPRING_DATASOURCE_*, etc.). Loaded via env_file: in the YAML anchors.
  • environment (per blue/green service) — only color-specific overrides like DEPLOY_COLOR: blue. Since the anchors define no environment key, blue/green services can safely set environment without hitting the YAML anchor gotcha: <<: merges are shallow, so a key defined locally replaces the anchor's entire value for that key instead of being deep-merged.

Each blue/green service also has network aliases matching the base service name (e.g. consumer-worker-green is also reachable as consumer-worker on the Docker network). This allows inter-service communication using stable hostnames regardless of which color is active.

YAML Anchor Gotcha

Never add an environment key to the shared anchors (x-metadata-rest, etc.). If you do, any environment override in a blue/green service will silently replace the entire map, dropping all shared vars. Keep shared vars in env_file only.
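
A cheap way to catch a regression here is to render the merged config for one colored service on the server (the service name is an example):

bash
cd /opt/schemastack

# Check that DEPLOY_COLOR is present and that the service's network aliases include
# the base name (consumer-worker). Depending on the Compose version, env_file contents
# may also be inlined into the rendered environment — the shared vars must still be there.
docker compose --env-file .env.production.local config consumer-worker-green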

Blue/Green Flow

  1. Detect active color (blue or green)
  2. Build Docker images from pre-built artifacts
  3. Start opposite color — metadata-rest first (Liquibase migrations)
  4. Wait for healthchecks (/q/health/ready for native, /actuator/health for JVM) — see the sketch after this list
  5. Stop + remove old color — Traefik routes to remaining set
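
A sketch of the readiness wait in step 4 (illustrative only — the real logic and container names live in deploy.sh):

bash
# Container name is an example; the deploy script derives it from the target color
NEW=schemastack-metadata-rest-green

# Poll the Docker healthcheck, which hits /q/health/ready or /actuator/health inside the container
until [ "$(docker inspect -f '{{.State.Health.Status}}' "$NEW")" = "healthy" ]; do
  sleep 2
done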

Response Compression

Traefik applies gzip/brotli compression to all routed responses via a shared compress middleware defined in traefik/dynamic.yml. The middleware is attached to every router (metadata-rest, workspace-api, consumer-files, MCP) via Docker Compose labels (compress@file).

text/event-stream is excluded — SSE streams must not be buffered or compressed, since compression breaks chunked delivery.

Cloudflare additionally compresses responses on the edge for end users, but the Traefik middleware ensures origin→Cloudflare traffic is also compressed (important for large API responses that bypass CF cache).
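
To confirm compression at the origin itself, curl it directly with the secret header (the path is illustrative; -k is needed because the origin presents the Cloudflare Origin Certificate, which clients don't trust by default):

bash
# Very small responses may be skipped by the compress middleware — pick an endpoint with a non-trivial body
curl -sk -o /dev/null -D - "https://origin.schemastack.io/api/entities" \
  -H "X-Origin-Secret: $ORIGIN_SECRET" \
  -H "Accept-Encoding: gzip, br" | grep -i '^content-encoding'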

Routing

Cloudflare DNS

| Record | Type | Target | Proxy | Notes |
| --- | --- | --- | --- | --- |
| @ | — | — | — | No DNS record needed — the CF Worker route handles schemastack.io/* |
| origin | A | Hetzner IP | DNS only (grey cloud) | Worker fetches from here; protected by shared secret header |
| status | A | Hetzner IP | Proxied (orange cloud) | Status JSON API (CF Worker, CF Access bypassed) |
| data | A | Hetzner IP | Proxied (orange cloud) | workspace-api public API |
| docs | CNAME | CF Pages | Proxied | Public docs |
| dev | CNAME | CF Pages | Proxied | Dev docs (CF Access) |

Traefik (on Hetzner)

| Host | Path | Target |
| --- | --- | --- |
| origin.schemastack.io | /api/* | metadata-rest:8080 |
| origin.schemastack.io | /sse/* | consumer-worker:8080 |
| data.schemastack.io | /* | workspace-api:8083 |

CF Worker (schemastack.io)

typescript
/admin/*  → CF Pages Service Binding (admin)
/app/*    → CF Pages Service Binding (app)
/api/*    → fetch(origin.schemastack.io/api/...)
/sse/*    → fetch(origin.schemastack.io/sse/...)   // streams, no buffering
/*        → CF Pages (public website)

// status.schemastack.io (separate subdomain, bypasses CF Access)
GET /     → read STATUS_KV, return JSON { current, history }
cron */5  → fetch origin /api/status → write to STATUS_KV

PostgreSQL Website Database

A second PostgreSQL instance (postgres-website) runs on port 5433, publicly accessible over SSL. It serves website features (contact forms, newsletter subscriptions, etc.) from an external backend app.

How it works

  • Self-signed certificate with SAN = IP:46.225.228.56 for sslmode=verify-full
  • The cert is generated once by the provision playbook and stored at /opt/schemastack/postgres-website-certs/
  • UFW allows port 5433 from the internet
  • The self-signed server.crt doubles as the CA root cert (self-signed = its own CA)

Connection string

postgresql://website:<password>@46.225.228.56:5433/website?sslmode=verify-full&sslrootcert=pg-website-ca.crt

Fetching the CA cert (one-time)

The client app needs a copy of server.crt to verify the server identity:

bash
scp schemastack:/opt/schemastack/postgres-website-certs/server.crt ./pg-website-ca.crt

Regenerating the SSL certificate

If the server IP changes or the cert expires (valid 10 years):

bash
ssh schemastack

# Remove old cert so the provision playbook regenerates it
sudo rm /opt/schemastack/postgres-website-certs/server.crt /opt/schemastack/postgres-website-certs/server.key

# Re-run provision (or just the openssl command manually)
# Then restart the container:
cd /opt/schemastack
docker compose --env-file .env.production.local restart postgres-website

After regenerating, fetch the new server.crt to all clients.
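
For reference, the manual equivalent of what the playbook generates is roughly this (a sketch using this page's IP and paths; the authoritative task lives in the provision playbook):

bash
sudo openssl req -x509 -newkey rsa:4096 -nodes -days 3650 \
  -keyout /opt/schemastack/postgres-website-certs/server.key \
  -out /opt/schemastack/postgres-website-certs/server.crt \
  -subj "/CN=46.225.228.56" \
  -addext "subjectAltName=IP:46.225.228.56"

# PostgreSQL rejects world-readable keys; uid 999 is the postgres user in the official image
sudo chmod 600 /opt/schemastack/postgres-website-certs/server.key
sudo chown 999:999 /opt/schemastack/postgres-website-certs/server.key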

Verification

bash
# Container is healthy
docker ps | grep postgres-website

# SSL is on
psql "postgresql://website:<password>@46.225.228.56:5433/website?sslmode=verify-full&sslrootcert=pg-website-ca.crt" -c "SHOW ssl;"
# → on

# UFW allows port 5433
sudo ufw status | grep 5433
# → 5433/tcp ALLOW Anywhere

Security

  • Origin secret header: CF Worker sends X-Origin-Secret on every proxied request to origin.schemastack.io. Traefik router rules use HeadersRegexp to reject requests without a matching header (returns 404). This prevents direct access even though the IP is visible in DNS (grey cloud, required for Worker fetch). A quick curl check follows this list.
    • Set on Worker: wrangler secret put ORIGIN_SECRET
    • Set on server: ORIGIN_SECRET in .env.production.local
  • data.schemastack.io proxied via Cloudflare (orange cloud) — origin IP hidden, DDoS protected
  • UFW on Hetzner: ports 9978 (SSH), 80, 443, 5433
  • Cloudflare Origin Certificate for Hetzner ↔ CF (Full Strict mode)
  • Dev docs (Cloudflare Access + OTP): dev.schemastack.io and all *.schemastack-dev.pages.dev aliases are protected with email-based one-time PIN authentication. Setup:
    1. Zero Trust → Access → Applications → Add an application → Self-hosted
    2. Application name: Dev Docs
    3. Application domain: dev.schemastack.io
    4. Add second domain: *.schemastack-dev.pages.dev (prevents bypass via preview URLs)
    5. Create policy: name Team access, action Allow, include rule Emails → add team email addresses
    6. Authentication method: One-time PIN (default — sends OTP to the allowed email)
    7. Session duration: 24 hours (default, adjustable under application settings)
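
A quick check of the origin-secret behaviour from any machine (the path is illustrative; -k because the origin presents the Cloudflare Origin Certificate):

bash
# Without the header no Traefik router matches → 404
curl -sk -o /dev/null -w '%{http_code}\n' https://origin.schemastack.io/api/anything

# With the correct header the request is routed through to metadata-rest
curl -sk -o /dev/null -w '%{http_code}\n' https://origin.schemastack.io/api/anything \
  -H "X-Origin-Secret: $ORIGIN_SECRET"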

Future Improvements

Canary / IP-based deployments

Currently blue/green switches all traffic at once. Traefik natively supports two safer alternatives:

  • Canary (weighted routing) — send a percentage of traffic (e.g. 10%) to the new color before fully switching, using Traefik's weighted round-robin service
  • IP-based routing — route a specific IP to the new color for manual verification before switching, using Traefik's ClientIP() matcher with higher priority

Both approaches require moving some routing config from Docker Compose labels to traefik/dynamic.yml and updating deploy.sh to support a --canary or --test-ip flag instead of immediate switchover.

SchemaStack Internal Developer Documentation