Docker Multi-Stage Builds and Compose Strategy
By David Le -- Part 14 of the FhirHub Series
The first 13 posts covered building FhirHub -- the application code, the FHIR resources, the frontend, the API gateway. But shipping code is only half the problem. The other half is getting it into production reliably, repeatedly, and with confidence that nothing breaks along the way.
This post begins a 5-part DevOps sub-series covering the full deployment stack. We start with Docker -- multi-stage builds that produce small, secure images and a three-file Compose strategy that cleanly separates dev from production.
Why DevOps Matters for Healthcare
Healthcare applications have stricter deployment requirements than most software. You can't push untested code to a system that manages patient data. You need:
- Reproducible builds -- The same code produces the same artifact every time
- Automated testing -- Every change is validated before it reaches production
- Audit trails -- Git history and pipeline logs record who changed what and when
- Rollback capability -- A bad deployment can be reversed in minutes, not hours
- Environment parity -- Dev, staging, and production run the same containers
This isn't optional in regulated environments. It's the baseline.
Frontend Dockerfile
The frontend Dockerfile uses three stages to keep the final image small:
```dockerfile
# Stage 1: Install dependencies
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --ignore-scripts

# Stage 2: Build the application
FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

# Stage 3: Production runner
FROM node:20-alpine AS runner
WORKDIR /app
ENV NODE_ENV=production
RUN addgroup --system --gid 1001 nodejs && \
    adduser --system --uid 1001 nextjs
COPY --from=builder /app/public ./public
COPY --from=builder --chown=nextjs:nodejs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nodejs /app/.next/static ./.next/static
USER nextjs
EXPOSE 3000
CMD ["node", "server.js"]
```
Key decisions:
output: "standalone"innext.config.tsenables Next.js to produce a self-containedserver.jsthat doesn't neednode_modulesat runtime. The final image is ~120MB instead of ~800MB.- Non-root user (nextjs:1001) follows the principle of least privilege. If the container is compromised, the attacker can't modify system files.
- *
NEXT_PUBLIC_build args** are baked at build time because Next.js inlines them during compilation. They aren't secrets -- they're public URLs the browser needs. - Separate deps stage means changing source code doesn't re-run
npm ci. Docker layer caching makes rebuilds fast.
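For reference, the standalone output is a one-line config change. A minimal sketch of what `next.config.ts` might look like (the actual file contents aren't shown in this post):

```ts
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  // Emit a self-contained server.js plus a pruned node_modules subset
  // into .next/standalone, so the runner stage needs no full npm install
  output: "standalone",
};

export default nextConfig;
```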
Checkpoint: Build the Frontend Image
Before continuing, verify the frontend Dockerfile works:
```shell
docker build -t fhirhub-frontend:local -f frontend/Dockerfile frontend/
```
Expected output:
- The final line shows an image ID and the build completes without errors
- No `npm ERR!` or `COPY --from` failures
```shell
docker images fhirhub-frontend:local --format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}"
```
Expected output:
- Image size should be ~120MB, not ~800MB. If it's close to 800MB, the multi-stage build isn't working -- check that `output: "standalone"` is set in `next.config.ts`
```shell
docker run --rm fhirhub-frontend:local whoami
```
Expected output:
- Should print `nextjs`, confirming the container runs as a non-root user
If something went wrong:
- If `npm ci` fails, check that `package-lock.json` is up to date (`npm install` on the host, then rebuild)
- If the image is ~800MB, verify `next.config.ts` has `output: "standalone"` and that the final stage only copies `.next/standalone` and `.next/static`
Why Multi-Stage vs. Single Stage?
| Approach | Image Size | Build Cache | Security |
|---|---|---|---|
| Single stage | ~800MB | Poor (one layer) | Dev tools in production |
| Multi-stage (3 stages) | ~120MB | Excellent (deps cached) | Only runtime files |
| Distroless | ~80MB | Good | No shell for debugging |
I chose multi-stage over distroless because Alpine still gives you a shell for debugging in emergencies, and the 40MB difference isn't worth losing `sh` access when a production container misbehaves at 2 AM.
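For comparison, a distroless runner stage would look something like the sketch below. This is hypothetical -- the image tag and copied paths are assumptions based on the three-stage Dockerfile above, not code from the FhirHub repo:

```dockerfile
# Hypothetical alternative: distroless final stage -- smaller, but no shell
FROM gcr.io/distroless/nodejs20-debian12 AS runner
WORKDIR /app
ENV NODE_ENV=production
COPY --from=builder /app/public ./public
COPY --from=builder /app/.next/standalone ./
COPY --from=builder /app/.next/static ./.next/static
EXPOSE 3000
# The distroless nodejs image's entrypoint is the node binary itself,
# so CMD is just the script to run
CMD ["server.js"]
```

Note there's no `USER` setup here for brevity; distroless images ship a `nonroot` user you'd normally switch to.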
API Dockerfile Improvements
The existing .NET Dockerfile was functional but missing production hardening:
```dockerfile
# Added to final stage
ENV DOTNET_EnableDiagnostics=0
HEALTHCHECK --interval=30s --timeout=5s --start-period=15s --retries=3 \
    CMD curl -f http://localhost:8080/api/dashboard/metrics || exit 1
```
- `DOTNET_EnableDiagnostics=0` disables the diagnostic pipe in production. It reduces the attack surface and avoids creating unnecessary files in the container.
- `--no-restore` on `dotnet build` skips redundant package restoration since the previous `dotnet restore` step already did it. Saves ~10 seconds per build.
- OCI labels (`org.opencontainers.image.source`) let container registries link images back to their source repository.
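As a sketch, the OCI labels look like this in the Dockerfile. The URL and values here are placeholders, not the real FhirHub ones:

```dockerfile
# Standard OCI annotation keys -- understood by Docker Hub, GHCR, Harbor, etc.
LABEL org.opencontainers.image.source="https://github.com/your-username/fhirhub" \
      org.opencontainers.image.description="FhirHub API server" \
      org.opencontainers.image.licenses="MIT"
```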
Checkpoint: Build the API Image
Before continuing, verify the API Dockerfile works:
```shell
docker build -t fhirhub-api:local -f FhirHubServer/src/FhirHubServer.Api/Dockerfile FhirHubServer/
```
Expected output:
- Build completes successfully. You should see `HEALTHCHECK` in the build output, confirming the healthcheck instruction was processed
```shell
docker inspect fhirhub-api:local --format='{{.Config.Healthcheck}}'
```
Expected output:
- Should show the curl healthcheck command (e.g., `{[CMD-SHELL curl -f http://localhost:8080/api/dashboard/metrics || exit 1] ...}`). If it shows `<nil>`, the `HEALTHCHECK` instruction is missing from the Dockerfile
If something went wrong:
- If `dotnet restore` fails, check that `NuGet.Config` or `Directory.Packages.props` is accessible from the build context
- If the healthcheck shows `<nil>`, verify the `HEALTHCHECK` instruction is in the final stage of the Dockerfile, not a build stage
Why OCI Labels?
| Metadata Approach | Registry Support | Standardized | Machine-Readable |
|---|---|---|---|
| No labels | N/A | No | No |
| Custom `LABEL` keys | Docker Hub | No | Varies |
| OCI `org.opencontainers.image.*` | All OCI registries | Yes | Yes |
OCI labels are the industry standard. Docker Hub, GitHub Container Registry, and Harbor all understand them. They link your image to its source repo, commit SHA, and documentation URL without custom tooling.
Docker Compose: Three-File Strategy
Why Three Files vs. One?
| Approach | Pros | Cons |
|---|---|---|
| Single `docker-compose.yml` | Simple | Can't separate dev/prod concerns |
| `.env`-only switching | One file | Complex conditionals, hard to read |
| Three-file overlay | Clear separation, composable | Three files to manage |
I chose the three-file approach:
- `docker-compose.yml` -- Base configuration. Uses `${VARIABLE}` references for everything configurable.
- `docker-compose.override.yml` -- Dev overrides. Applied automatically by `docker compose up`. Volume mounts for hot-reload, Keycloak in `start-dev` mode, hardcoded dev credentials.
- `docker-compose.prod.yml` -- Production overrides. Pre-built images from Docker Hub, resource limits, log rotation, `restart: unless-stopped`.
Running dev is just `docker compose up`. Running prod is `docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d`. No environment variable gymnastics.
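The `${VARIABLE:-default}` syntax in the base file is plain shell parameter expansion, which Compose interprets the same way. A quick sketch of the behavior:

```shell
# Unset variables fall back to the default after ':-'
unset FRONTEND_PORT
echo "${FRONTEND_PORT:-7002}"   # prints 7002

# Set variables win over the default
FRONTEND_PORT=8000
echo "${FRONTEND_PORT:-7002}"   # prints 8000
```

This is why the stack runs out of the box with no `.env` file at all: every reference carries its own fallback.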
The Frontend Service
Post 10 covered the original five services. The sixth service is the frontend:
```yaml
fhirhub-frontend:
  build:
    context: ./frontend
    dockerfile: Dockerfile
    args:
      NEXT_PUBLIC_API_URL: ${NEXT_PUBLIC_API_URL:-http://localhost:5197}
      NEXT_PUBLIC_KEYCLOAK_URL: ${NEXT_PUBLIC_KEYCLOAK_URL:-http://localhost:8180}
  ports:
    - "${FRONTEND_PORT:-7002}:3000"
  depends_on:
    fhirhub-api:
      condition: service_healthy
    keycloak:
      condition: service_healthy
  healthcheck:
    test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:3000/"]
    interval: 30s
    timeout: 5s
    retries: 3
```
The frontend depends on both the API and Keycloak being healthy. Without this, the app would boot and immediately show auth errors because Keycloak isn't ready yet.
Dev Override Details
The `docker-compose.override.yml` applies automatically when you run `docker compose up`:
```yaml
fhirhub-frontend:
  volumes:
    - ./frontend/src:/app/src        # Hot-reload source changes
    - ./frontend/public:/app/public  # Static assets
  environment:
    - WATCHPACK_POLLING=true         # Enable polling for Docker volumes
```
Volume mounts let you edit code on the host and see changes instantly in the container. `WATCHPACK_POLLING=true` is necessary because filesystem events don't propagate reliably through Docker volume mounts on macOS.
Production Override Details
The `docker-compose.prod.yml` replaces local builds with pre-built images:
```yaml
fhirhub-frontend:
  image: ${DOCKERHUB_USERNAME}/fhirhub-frontend:${IMAGE_TAG:-latest}
  deploy:
    resources:
      limits:
        cpus: "1.0"
        memory: 512M
      reservations:
        cpus: "0.25"
        memory: 128M
  restart: unless-stopped
  logging:
    driver: json-file
    options:
      max-size: "10m"
      max-file: "3"
```
Key production concerns:
- Resource limits prevent a runaway container from consuming all host resources
- `restart: unless-stopped` auto-recovers from crashes without restarting after a manual `docker compose stop`
- Log rotation (`max-size: 10m`, `max-file: 3`) prevents disk fill from verbose logging
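The rotation settings also make worst-case log disk usage predictable. A back-of-envelope check, assuming all 6 services share the same settings:

```shell
# 10 MB per file x 3 files per container x 6 containers = worst-case log usage
echo "$((10 * 3 * 6)) MB"   # prints 180 MB
```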
The .env.example File
Every configurable value lives in `.env`:
```env
COMPOSE_PROJECT_NAME=fhirhub
DOCKERHUB_USERNAME=your-username
IMAGE_TAG=latest

# Ports
API_PORT=5197
FRONTEND_PORT=7002
HAPI_FHIR_PORT=8080
KEYCLOAK_PORT=8180

# Database
HAPI_DB_PASSWORD=changeme-hapi
KEYCLOAK_DB_PASSWORD=changeme-keycloak

# Keycloak
KEYCLOAK_ADMIN_PASSWORD=changeme-admin

# Frontend (baked at build time)
NEXT_PUBLIC_API_URL=http://localhost:5197
NEXT_PUBLIC_KEYCLOAK_URL=http://localhost:8180
```
`.env.example` is committed. `.env` is gitignored. Developers copy the example and customize. No secrets in version control.
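One cheap guardrail this convention enables: a check that every password in the committed example file is still a placeholder. A hypothetical pre-commit sketch, assuming the `changeme-` prefix convention shown above (it writes a sample file to `/tmp` so it's self-contained):

```shell
# Write a sample .env.example to check (stand-in for the real committed file)
cat > /tmp/.env.example <<'EOF'
HAPI_DB_PASSWORD=changeme-hapi
KEYCLOAK_DB_PASSWORD=changeme-keycloak
KEYCLOAK_ADMIN_PASSWORD=changeme-admin
EOF

# Any *_PASSWORD line that doesn't use the changeme- placeholder is suspicious
bad=$(grep '_PASSWORD=' /tmp/.env.example | grep -v '=changeme-' || true)
if [ -z "$bad" ]; then
  echo "ok: all passwords are placeholders"
else
  echo "possible secret committed:"
  echo "$bad"
  exit 1
fi
```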
Checkpoint: Run the Full Stack
Before continuing, verify the complete Docker Compose setup works:
```shell
cp .env.example .env
docker compose up -d
```
Wait for services to start, then check their status:
```shell
docker compose ps
```
Expected output:
- All 6 services (`fhirhub-api`, `fhirhub-frontend`, `hapi-fhir`, `keycloak`, `hapi-fhir-db`, `keycloak-db`) should show `healthy` status. This may take 1-2 minutes as services wait for their dependencies
```shell
curl -s http://localhost:5197/api/dashboard/metrics | head -1
```
Expected output:
- Should return a JSON response (starts with `{` or `[`). A connection refused error means the API isn't healthy yet -- wait and retry
```shell
curl -s -o /dev/null -w '%{http_code}' http://localhost:7002
```
Expected output:
- Should return `200`, confirming the frontend is serving pages
Verify the production overlay also renders correctly:
```shell
docker compose -f docker-compose.yml -f docker-compose.prod.yml config --services
```
Expected output:
- Should list all 6 services. If it errors, check for syntax issues in `docker-compose.prod.yml`
If something went wrong:
- If services aren't healthy after 2 minutes, check logs: `docker compose logs <service-name>`
- If the API can't reach HAPI FHIR or Keycloak, ensure the `.env` ports match the service configurations
- Clean start: `docker compose down -v && docker compose up -d`
What's Next
In Part 15, we'll build CI/CD pipelines with GitHub Actions -- reusable workflows, pull request checks, release pipelines, image tagging strategies, and security scanning that catches vulnerabilities before they reach production.