Docker Multi-Stage Builds and Compose Strategy
By David Le -- Part 14 of the FhirHub Series
The first 13 posts covered building FhirHub -- the application code, the FHIR resources, the frontend, the API gateway. But shipping code is only half the problem. The other half is getting it into production reliably, repeatedly, and with confidence that nothing breaks along the way.
This post begins a 5-part DevOps sub-series covering the full deployment stack. We start with Docker -- multi-stage builds that produce small, secure images and a three-file Compose strategy that cleanly separates dev from production.
Why DevOps Matters for Healthcare
Healthcare applications have stricter deployment requirements than most software. You can't push untested code to a system that manages patient data. You need:
- Reproducible builds -- The same code produces the same artifact every time
- Automated testing -- Every change is validated before it reaches production
- Audit trails -- Git history and pipeline logs record who changed what and when
- Rollback capability -- A bad deployment can be reversed in minutes, not hours
- Environment parity -- Dev, staging, and production run the same containers
This isn't optional in regulated environments. It's the baseline.
Frontend Dockerfile
The frontend Dockerfile uses three stages to keep the final image small:
```dockerfile
# Stage 1: Install dependencies
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --ignore-scripts

# Stage 2: Build the application
FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

# Stage 3: Production runner
FROM node:20-alpine AS runner
WORKDIR /app
ENV NODE_ENV=production
RUN addgroup --system --gid 1001 nodejs && \
    adduser --system --uid 1001 nextjs
COPY --from=builder /app/public ./public
COPY --from=builder --chown=nextjs:nodejs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nodejs /app/.next/static ./.next/static
USER nextjs
EXPOSE 3000
CMD ["node", "server.js"]
```
Key decisions:
output: "standalone"innext.config.tsenables Next.js to produce a self-containedserver.jsthat doesn't neednode_modulesat runtime. The final image is ~120MB instead of ~800MB.- Non-root user (nextjs:1001) follows the principle of least privilege. If the container is compromised, the attacker can't modify system files.
- *
NEXT_PUBLIC_build args** are baked at build time because Next.js inlines them during compilation. They aren't secrets -- they're public URLs the browser needs. - Separate deps stage means changing source code doesn't re-run
npm ci. Docker layer caching makes rebuilds fast.
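For reference, the standalone output is a one-line config change. A minimal sketch of what `next.config.ts` might look like (the actual file contents aren't shown in this post):

```ts
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  // Emit a self-contained server.js plus a pruned node_modules subset
  // into .next/standalone, so the runner stage needs no full npm install
  output: "standalone",
};

export default nextConfig;
```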
Checkpoint: Build the Frontend Image
Before continuing, verify the frontend Dockerfile works:
```shell
docker build -t fhirhub-frontend:local -f frontend/Dockerfile frontend/
```
Expected output:
- The final line shows an image ID and the build completes without errors
- No `npm ERR!` or `COPY --from` failures
```shell
docker images fhirhub-frontend:local --format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}"
```
Expected output:
- Image size should be ~120MB, not ~800MB. If it's close to 800MB, the multi-stage build isn't working -- check that `output: "standalone"` is set in `next.config.ts`
```shell
docker run --rm fhirhub-frontend:local whoami
```
Expected output:
- Should print `nextjs`, confirming the container runs as a non-root user
If something went wrong:
- If `npm ci` fails, check that `package-lock.json` is up to date (`npm install` on the host, then rebuild)
- If the image is ~800MB, verify `next.config.ts` has `output: "standalone"` and that the final stage only copies `.next/standalone` and `.next/static`
Why Multi-Stage vs. Single Stage?
| Approach | Image Size | Build Cache | Security |
|---|---|---|---|
| Single stage | ~800MB | Poor (one layer) | Dev tools in production |
| Multi-stage (3 stages) | ~120MB | Excellent (deps cached) | Only runtime files |
| Distroless | ~80MB | Good | No shell for debugging |
I chose multi-stage over distroless because Alpine still gives you a shell for debugging in emergencies, and the 40MB difference isn't worth losing `sh` access when a production container misbehaves at 2 AM.
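For comparison, a distroless runner stage would look something like the sketch below. This is hypothetical -- the image tag and copied paths are assumptions based on the three-stage Dockerfile above, not code from the FhirHub repo:

```dockerfile
# Hypothetical alternative: distroless final stage -- smaller, but no shell
FROM gcr.io/distroless/nodejs20-debian12 AS runner
WORKDIR /app
ENV NODE_ENV=production
COPY --from=builder /app/public ./public
COPY --from=builder /app/.next/standalone ./
COPY --from=builder /app/.next/static ./.next/static
EXPOSE 3000
# The distroless nodejs image's entrypoint is the node binary itself,
# so CMD is just the script to run
CMD ["server.js"]
```

Note there's no `USER` setup here for brevity; distroless images ship a `nonroot` user you'd normally switch to.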
API Dockerfile Improvements
The existing .NET Dockerfile was functional but missing production hardening:
```dockerfile
# Added to final stage
ENV DOTNET_EnableDiagnostics=0
HEALTHCHECK --interval=30s --timeout=5s --start-period=15s --retries=3 \
    CMD curl -f http://localhost:8080/api/dashboard/metrics || exit 1
```
- `DOTNET_EnableDiagnostics=0` disables the diagnostic pipe in production. It reduces the attack surface and avoids creating unnecessary files in the container.
- `--no-restore` on `dotnet build` skips redundant package restoration since the previous `dotnet restore` step already did it. Saves ~10 seconds per build.
- OCI labels (`org.opencontainers.image.source`) let container registries link images back to their source repository.
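As a sketch, the OCI labels look like this in the Dockerfile. The URL and values here are placeholders, not the real FhirHub ones:

```dockerfile
# Standard OCI annotation keys -- understood by Docker Hub, GHCR, Harbor, etc.
LABEL org.opencontainers.image.source="https://github.com/your-username/fhirhub" \
      org.opencontainers.image.description="FhirHub API server" \
      org.opencontainers.image.licenses="MIT"
```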
Checkpoint: Build the API Image
Before continuing, verify the API Dockerfile works:
```shell
docker build -t fhirhub-api:local -f FhirHubServer/src/FhirHubServer.Api/Dockerfile FhirHubServer/
```
Expected output:
- Build completes successfully. You should see `HEALTHCHECK` in the build output, confirming the healthcheck instruction was processed
```shell
docker inspect fhirhub-api:local --format='{{.Config.Healthcheck}}'
```
Expected output:
- Should show the curl healthcheck command (e.g., `{[CMD-SHELL curl -f http://localhost:8080/api/dashboard/metrics || exit 1] ...}`). If it shows `<nil>`, the `HEALTHCHECK` instruction is missing from the Dockerfile
If something went wrong:
- If `dotnet restore` fails, check that `NuGet.Config` or `Directory.Packages.props` is accessible from the build context
- If the healthcheck shows `<nil>`, verify the `HEALTHCHECK` instruction is in the final stage of the Dockerfile, not a build stage
Why OCI Labels?
| Metadata Approach | Registry Support | Standardized | Machine-Readable |
|---|---|---|---|
| No labels | N/A | No | No |
| Custom `LABEL` keys | Docker Hub | No | Varies |
| OCI `org.opencontainers.image.*` | All OCI registries | Yes | Yes |
OCI labels are the industry standard. Docker Hub, GitHub Container Registry, and Harbor all understand them. They link your image to its source repo, commit SHA, and documentation URL without custom tooling.
Docker Compose: Three-File Strategy
Why Three Files vs. One?
| Approach | Pros | Cons |
|---|---|---|
| Single `docker-compose.yml` | Simple | Can't separate dev/prod concerns |
| `.env`-only switching | One file | Complex conditionals, hard to read |
| Three-file overlay | Clear separation, composable | Three files to manage |
I chose the three-file approach:
- `docker-compose.yml` -- Base configuration. Uses `${VARIABLE}` references for everything configurable.
- `docker-compose.override.yml` -- Dev overrides. Applied automatically by `docker compose up`. Volume mounts for hot-reload, Keycloak in `start-dev` mode, hardcoded dev credentials.
- `docker-compose.prod.yml` -- Production overrides. Pre-built images from Docker Hub, resource limits, log rotation, `restart: unless-stopped`.
Running dev is just `docker compose up`. Running prod is `docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d`. No environment variable gymnastics.
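The `${VARIABLE:-default}` syntax in the base file is plain shell parameter expansion, which Compose interprets the same way. A quick sketch of the behavior:

```shell
# Unset variables fall back to the default after ':-'
unset FRONTEND_PORT
echo "${FRONTEND_PORT:-7002}"   # prints 7002

# Set variables win over the default
FRONTEND_PORT=8000
echo "${FRONTEND_PORT:-7002}"   # prints 8000
```

This is why the stack runs out of the box with no `.env` file at all: every reference carries its own fallback.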
The Frontend Service
Post 10 covered the original five services. The sixth service is the frontend:
```yaml
fhirhub-frontend:
  build:
    context: ./frontend
    dockerfile: Dockerfile
    args:
      NEXT_PUBLIC_API_URL: ${NEXT_PUBLIC_API_URL:-http://localhost:5197}
      NEXT_PUBLIC_KEYCLOAK_URL: ${NEXT_PUBLIC_KEYCLOAK_URL:-http://localhost:8180}
  ports:
    - "${FRONTEND_PORT:-7002}:3000"
  depends_on:
    fhirhub-api:
      condition: service_healthy
    keycloak:
      condition: service_healthy
  healthcheck:
    test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:3000/"]
    interval: 30s
    timeout: 5s
    retries: 3
```
The frontend depends on both the API and Keycloak being healthy. Without this, the app would boot and immediately show auth errors because Keycloak isn't ready yet.
Dev Override Details
The `docker-compose.override.yml` applies automatically when you run `docker compose up`:
```yaml
fhirhub-frontend:
  volumes:
    - ./frontend/src:/app/src        # Hot-reload source changes
    - ./frontend/public:/app/public  # Static assets
  environment:
    - WATCHPACK_POLLING=true         # Enable polling for Docker volumes
```
Volume mounts let you edit code on the host and see changes instantly in the container. `WATCHPACK_POLLING=true` is necessary because filesystem events don't propagate reliably through Docker volume mounts on macOS.
Production Override Details
The `docker-compose.prod.yml` replaces local builds with pre-built images:
```yaml
fhirhub-frontend:
  image: ${DOCKERHUB_USERNAME}/fhirhub-frontend:${IMAGE_TAG:-latest}
  deploy:
    resources:
      limits:
        cpus: "1.0"
        memory: 512M
      reservations:
        cpus: "0.25"
        memory: 128M
  restart: unless-stopped
  logging:
    driver: json-file
    options:
      max-size: "10m"
      max-file: "3"
```
Key production concerns:
- Resource limits prevent a runaway container from consuming all host resources
- `restart: unless-stopped` auto-recovers from crashes without restarting after a manual `docker compose stop`
- Log rotation (`max-size: 10m`, `max-file: 3`) prevents disk fill from verbose logging
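The rotation settings also make worst-case log disk usage predictable. A back-of-envelope check, assuming all 6 services share the same settings:

```shell
# 10 MB per file x 3 files per container x 6 containers = worst-case log usage
echo "$((10 * 3 * 6)) MB"   # prints 180 MB
```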
The .env.example File
Every configurable value lives in `.env`:
```env
COMPOSE_PROJECT_NAME=fhirhub
DOCKERHUB_USERNAME=your-username
IMAGE_TAG=latest

# Ports
API_PORT=5197
FRONTEND_PORT=7002
HAPI_FHIR_PORT=8080
KEYCLOAK_PORT=8180

# Database
HAPI_DB_PASSWORD=changeme-hapi
KEYCLOAK_DB_PASSWORD=changeme-keycloak

# Keycloak
KEYCLOAK_ADMIN_PASSWORD=changeme-admin

# Frontend (baked at build time)
NEXT_PUBLIC_API_URL=http://localhost:5197
NEXT_PUBLIC_KEYCLOAK_URL=http://localhost:8180
```
`.env.example` is committed. `.env` is gitignored. Developers copy the example and customize. No secrets in version control.
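One cheap guardrail this convention enables: a check that every password in the committed example file is still a placeholder. A hypothetical pre-commit sketch, assuming the `changeme-` prefix convention shown above (it writes a sample file to `/tmp` so it's self-contained):

```shell
# Write a sample .env.example to check (stand-in for the real committed file)
cat > /tmp/.env.example <<'EOF'
HAPI_DB_PASSWORD=changeme-hapi
KEYCLOAK_DB_PASSWORD=changeme-keycloak
KEYCLOAK_ADMIN_PASSWORD=changeme-admin
EOF

# Any *_PASSWORD line that doesn't use the changeme- placeholder is suspicious
bad=$(grep '_PASSWORD=' /tmp/.env.example | grep -v '=changeme-' || true)
if [ -z "$bad" ]; then
  echo "ok: all passwords are placeholders"
else
  echo "possible secret committed:"
  echo "$bad"
  exit 1
fi
```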
Checkpoint: Run the Full Stack
Before continuing, verify the complete Docker Compose setup works:
```shell
cp .env.example .env
docker compose up -d
```
Wait for services to start, then check their status:
```shell
docker compose ps
```
Expected output:
- All 6 services (`fhirhub-api`, `fhirhub-frontend`, `hapi-fhir`, `keycloak`, `hapi-fhir-db`, `keycloak-db`) should show `healthy` status. This may take 1-2 minutes as services wait for their dependencies
```shell
curl -s http://localhost:5197/api/dashboard/metrics | head -1
```
Expected output:
- Should return a JSON response (starts with `{` or `[`). A connection refused error means the API isn't healthy yet -- wait and retry
```shell
curl -s -o /dev/null -w '%{http_code}' http://localhost:7002
```
Expected output:
- Should return `200`, confirming the frontend is serving pages
Verify the production overlay also renders correctly:
```shell
docker compose -f docker-compose.yml -f docker-compose.prod.yml config --services
```
Expected output:
- Should list all 6 services. If it errors, check for syntax issues in `docker-compose.prod.yml`
If something went wrong:
- If services aren't healthy after 2 minutes, check logs: `docker compose logs <service-name>`
- If the API can't reach HAPI FHIR or Keycloak, ensure the `.env` ports match the service configurations
- Clean start: `docker compose down -v && docker compose up -d`
What's Next
In Part 15, we'll build CI/CD pipelines with GitHub Actions -- reusable workflows, pull request checks, release pipelines, image tagging strategies, and security scanning that catches vulnerabilities before they reach production.