Docker and CI/CD for a Small Dev Team: What We Actually Ship in Production
Kubernetes is powerful. It's also 40+ YAML files, a steep learning curve, and operational overhead that doesn't make sense for most teams under 20 engineers. Here's our actual production setup: Docker Compose, GitHub Actions, an Nginx reverse proxy, and a deployment script. It handles real traffic and costs a fraction of a managed Kubernetes cluster.
The Setup Overview
GitHub (code)
→ GitHub Actions (CI: test, build, push image)
→ GHCR (GitHub Container Registry)
→ VPS: GitHub Actions SSH deploy
→ docker-compose pull + up
→ Nginx routes traffic
One VPS (Hetzner CPX31, 4 vCPU, 8GB RAM, €12/month) runs 6 production apps via Docker Compose. We were paying $800/month for equivalent resources on AWS before this.
Dockerfile: Production-Ready Node.js
# Multi-stage build: builder stage doesn't ship to production
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install ALL deps here - devDependencies are needed for `next build`
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies so the node_modules we copy out stays lean
RUN npm prune --omit=dev
# Production stage: minimal image
FROM node:18-alpine AS production
WORKDIR /app
# Non-root user for security
RUN addgroup -g 1001 -S nodejs && adduser -S nextjs -u 1001
COPY --from=builder --chown=nextjs:nodejs /app/.next ./.next
COPY --from=builder --chown=nextjs:nodejs /app/public ./public
COPY --from=builder --chown=nextjs:nodejs /app/node_modules ./node_modules
COPY --from=builder --chown=nextjs:nodejs /app/package.json ./
USER nextjs
EXPOSE 3000
ENV NODE_ENV=production
CMD ["node_modules/.bin/next", "start"]
Multi-stage build keeps the production image small (no dev dependencies, no build tools). Result: 180MB image instead of 1.2GB.
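One caveat: `COPY . .` pulls everything in the build context into the builder stage unless it's excluded - including any local node_modules and .git history, which bloats the context and busts the layer cache. A minimal .dockerignore (entries are illustrative; adjust to your repo):

```
node_modules
.next
.git
.env*
*.md
```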
Docker Compose: Production Stack
# docker-compose.yml
version: '3.8'

services:
  frontend:
    image: ghcr.io/yourorg/frontend:${IMAGE_TAG:-latest}
    restart: unless-stopped
    ports:
      - "127.0.0.1:3000:3000"   # published only on localhost; Nginx fronts it
    environment:
      - NODE_ENV=production
      - NEXT_PUBLIC_API_URL=https://api.yoursite.com
    networks:
      - app-network
    healthcheck:
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:3000/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  backend:
    image: ghcr.io/yourorg/backend:${IMAGE_TAG:-latest}
    restart: unless-stopped
    ports:
      - "127.0.0.1:3001:3001"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=${DATABASE_URL}
      - JWT_SECRET=${JWT_SECRET}
    depends_on:
      postgres:
        condition: service_healthy
    networks:
      - app-network

  postgres:
    image: postgres:15-alpine
    restart: unless-stopped
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_DB=${DB_NAME}
      - POSTGRES_USER=${DB_USER}
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER}"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - app-network

  redis:
    image: redis:7-alpine
    restart: unless-stopped
    command: redis-server --requirepass ${REDIS_PASSWORD}
    volumes:
      - redis_data:/data
    networks:
      - app-network

volumes:
  postgres_data:
  redis_data:

networks:
  app-network:
    driver: bridge
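The ${VAR} references above resolve from a .env file that Compose reads from the same directory as docker-compose.yml. A sketch of its shape - every value here is a placeholder:

```
# /opt/yourapp/.env - never committed to git
IMAGE_TAG=latest
DB_NAME=app
DB_USER=app
DB_PASSWORD=change-me
DATABASE_URL=postgres://app:change-me@postgres:5432/app
JWT_SECRET=change-me
REDIS_PASSWORD=change-me
```

Note the host in DATABASE_URL is `postgres` - the compose service name, resolvable on app-network - not localhost.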
GitHub Actions: Build and Deploy
# .github/workflows/deploy.yml
name: Deploy to Production

on:
  push:
    branches: [main]

permissions:
  contents: read
  packages: write   # required for GITHUB_TOKEN to push to GHCR

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    outputs:
      image-tag: ${{ steps.meta.outputs.version }}
    steps:
      - uses: actions/checkout@v4
      - name: Docker meta
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/${{ github.repository }}
          tags: |
            type=sha,prefix=,format=short
      - name: Login to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  deploy:
    needs: build-and-push
    runs-on: ubuntu-latest
    steps:
      - name: Deploy via SSH
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.SERVER_HOST }}
          username: deploy
          key: ${{ secrets.SSH_PRIVATE_KEY }}
          script: |
            cd /opt/yourapp
            export IMAGE_TAG=${{ needs.build-and-push.outputs.image-tag }}
            docker compose pull frontend backend
            docker compose up -d --no-deps frontend backend
            docker image prune -f
Deploy time: typically 90 seconds from git push to live. The --no-deps flag keeps Postgres and Redis untouched while the app containers are recreated. One honest caveat: Compose stops the old container before starting the new one, so each service has a gap of a few seconds per deploy. True zero-downtime would need multiple replicas behind the proxy or a blue-green swap; at our traffic level, the brief gap has never been noticed.
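A cheap safeguard worth bolting onto the deploy step: poll the health endpoint after `up -d` and fail the job if the new container never comes up. A minimal POSIX-sh sketch - the function name and the endpoint URL are illustrative, not part of the workflow above:

```shell
#!/bin/sh
# wait_healthy CHECK_CMD ATTEMPTS:
# run CHECK_CMD up to ATTEMPTS times, one second apart;
# return 0 as soon as it succeeds, 1 if it never does.
wait_healthy() {
  check=$1
  attempts=$2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if sh -c "$check"; then
      return 0            # service answered - deploy is good
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1                # never came up - fail the job, investigate
}

# Illustrative usage at the end of the SSH deploy script:
#   wait_healthy "wget -q --spider http://localhost:3000/api/health" 30
```

If the gate fails, redeploying the previous image is just `export IMAGE_TAG=<previous sha>` followed by the same `docker compose up -d` - one advantage of tagging images by commit SHA rather than only `latest`.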
Nginx Reverse Proxy
# /etc/nginx/sites-available/yoursite.com
server {
    listen 443 ssl http2;
    server_name yoursite.com;

    ssl_certificate /etc/letsencrypt/live/yoursite.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/yoursite.com/privkey.pem;

    # Security headers
    add_header X-Frame-Options "SAMEORIGIN";
    add_header X-Content-Type-Options "nosniff";
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains";

    location / {
        proxy_pass http://localhost:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_cache_bypass $http_upgrade;
    }

    location /api/ {
        proxy_pass http://localhost:3001;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

# HTTP redirect to HTTPS
server {
    listen 80;
    server_name yoursite.com;
    return 301 https://$host$request_uri;
}
SSL certificates via Certbot (Let's Encrypt). Renewal is automatic via systemd timer.
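It's worth verifying that the renewal path actually works before a certificate gets near expiry; Certbot ships a dry-run mode for exactly this:

```shell
# Simulate a renewal against the Let's Encrypt staging environment
sudo certbot renew --dry-run

# Confirm the systemd timer that drives automatic renewal is scheduled
systemctl list-timers | grep certbot
```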
Backup Strategy
#!/bin/bash
# /opt/scripts/backup.sh - runs daily via cron
set -euo pipefail

DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backups"

# Cron runs with a near-empty environment - load DB credentials explicitly
source /opt/yourapp/.env

# PostgreSQL dump (compose service name; -T disables TTY allocation for pipes)
docker compose -f /opt/yourapp/docker-compose.yml exec -T postgres \
  pg_dump -U "$DB_USER" "$DB_NAME" | gzip > "$BACKUP_DIR/db_$DATE.sql.gz"

# Keep last 7 days, delete older
find "$BACKUP_DIR" -name "db_*.sql.gz" -mtime +7 -delete

# Upload offsite to S3-compatible object storage
aws s3 cp "$BACKUP_DIR/db_$DATE.sql.gz" \
  "s3://your-backup-bucket/postgres/" \
  --endpoint-url https://your-storage-box.hetzner.com
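A backup you've never restored is a hope, not a backup. A sketch of the restore side - the `latest_backup` helper and all paths are illustrative:

```shell
#!/bin/sh
# latest_backup DIR: print the newest db_*.sql.gz in DIR
# (empty output if none exist).
latest_backup() {
  ls -1t "$1"/db_*.sql.gz 2>/dev/null | head -n 1
}

# Restore sketch, assuming the compose stack above:
#   gunzip -c "$(latest_backup /backups)" | \
#     docker compose -f /opt/yourapp/docker-compose.yml \
#       exec -T postgres psql -U "$DB_USER" "$DB_NAME"
```

We run a restore into a scratch database periodically; it's the only way to learn about a silently truncated dump before you need it.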
What We'd Add With More Time
- Watchtower for automatic container updates (we update manually to maintain control)
- Prometheus + Grafana for metrics (we use Plausible for analytics, Sentry for errors, Uptime Robot for availability - good enough for our scale)
- Staging environment with the same compose setup (we have this, didn't detail it here)
The Kubernetes question comes up every quarter. Our answer: when we have dedicated ops engineers or when a single app needs 10+ replicas, we'll migrate. Until then, this setup handles everything we've thrown at it.