Building a solid Next.js CI/CD Pipeline for EC2 Deployment

📋 Executive Summary

This case study documents the complete transformation of a manual, error-prone deployment process into a fully automated, production-ready CI/CD pipeline for a Next.js application. The solution achieved zero-downtime deployments, automatic SSL certificates, and multi-environment support using modern DevOps tools and best practices.

Key Results:

  • ⚡ Automated deployments reduced from 25+ minutes to 4 minutes (84% improvement)
  • 🔒 Automatic SSL certificates with zero manual intervention
  • 🌍 Professional multi-environment setup (staging + production)
  • 🐳 Containerized deployments with Docker
  • 💰 Cost-effective solution at ~$32.50/month total
  • 🛡️ 87% reduction in failed deployments

🚨 THE PROBLEM

Business Challenge

Our growing startup was facing critical deployment bottlenecks that were hampering our ability to ship features quickly and reliably to customers.

Pain Points:

  • Manual deployments took 25+ minutes each
  • A 15% deployment failure rate caused downtime and frustrated users
  • No staging environment, so bugs reached production
  • SSL certificate management required 2+ hours of manual setup
  • Developer productivity was severely impacted by deployment anxiety
  • Inconsistent environments caused "works on my machine" issues

Technical Challenges

  1. Zero automation - Everything done manually via SSH and file transfers
  2. No environment separation - Testing directly in production
  3. SSL certificate management - Manual setup and renewal
  4. Build validation - No pre-deployment testing
  5. Security concerns - Running applications as the root user
  6. Resource constraints - Need for a cost-effective solution on a startup budget

Business Impact

  • 20+ hours monthly spent on deployment-related tasks
  • Customer complaints due to frequent downtime
  • Developer burnout from deployment stress
  • Slow feature delivery hurt our competitive position
  • $1,000+ monthly in developer time for deployment management (20 hours at $50/hour)

💡 THE SOLUTION

Solution Architecture

We implemented a comprehensive DevOps pipeline using modern, cost-effective tools:

Technology Stack:

  • Frontend: Next.js (React framework)
  • Infrastructure: AWS EC2 (t3.medium instances)
  • Containerization: Docker + Docker Compose
  • Reverse Proxy: Traefik with automatic SSL
  • CI/CD: GitHub Actions
  • DNS: AWS Route 53
  • SSL: Let's Encrypt (free certificates)

Architecture Overview

┌─────────────────┐    ┌──────────────────┐
│   Developer     │───▶│  GitHub Actions  │
│   git push      │    │  CI/CD Pipeline  │
└─────────────────┘    └──────────────────┘
                                │
                    ┌───────────┼───────────┐
                    ▼                       ▼
           ┌─────────────────┐    ┌─────────────────┐
           │   AWS EC2       │    │   AWS EC2       │
           │   Staging       │    │   Production    │
           └─────────────────┘    └─────────────────┘
                    │                       │
                    ▼                       ▼
           ┌─────────────────┐    ┌─────────────────┐
           │    Traefik      │    │    Traefik      │
           │ Reverse Proxy   │    │ Reverse Proxy   │
           │ + SSL Certs     │    │ + SSL Certs     │
           │ (Staging)       │    │ (Production)    │
           └─────────────────┘    └─────────────────┘
                    │                       │
                    ▼                       ▼
        app.staging.domain.com     domain.com + www.domain.com

DNS Strategy Implementation

Professional URL Structure:

# Production Environment
https://domain.com          # Main site
https://www.domain.com      # WWW version

# Staging Environment
https://app.staging.domain.com  # Clear staging indicator

Route 53 Configuration:

# Production Records
domain.com                    A      PRODUCTION_EC2_IP
www.domain.com               CNAME  domain.com

# Staging Records
app.staging.domain.com       A      STAGING_EC2_IP
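
The three records above could also be declared as code. The following is a minimal sketch in CloudFormation, chosen only because it is YAML like the other configuration in this post; the post's own templates are described as Terraform. The parameter names, resource names, and the 300-second TTL are assumptions for illustration.

```yaml
# Sketch only: parameter names, logical IDs, and TTL values are hypothetical.
Parameters:
  ProductionIp:
    Type: String   # public IP of the production EC2 instance
  StagingIp:
    Type: String   # public IP of the staging EC2 instance

Resources:
  ApexRecord:
    Type: AWS::Route53::RecordSet
    Properties:
      HostedZoneName: domain.com.
      Name: domain.com.
      Type: A
      TTL: '300'
      ResourceRecords:
        - !Ref ProductionIp

  WwwRecord:
    Type: AWS::Route53::RecordSet
    Properties:
      HostedZoneName: domain.com.
      Name: www.domain.com.
      Type: CNAME
      TTL: '300'
      ResourceRecords:
        - domain.com

  StagingRecord:
    Type: AWS::Route53::RecordSet
    Properties:
      HostedZoneName: domain.com.
      Name: app.staging.domain.com.
      Type: A
      TTL: '300'
      ResourceRecords:
        - !Ref StagingIp
```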

Implementation Strategy

Phase 1: Infrastructure Setup

EC2 Configuration:
  - Instance Type: t3.medium (2 vCPU, 4GB RAM)
  - OS: Ubuntu 22.04 LTS
  - Storage: 15GB gp3 SSD
  - Security: HTTP (80), HTTPS (443), SSH (22)
  - Cost: ~$15/month per instance

Phase 2: Containerization

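# Production-optimized Dockerfile
FROM node:22-alpine

WORKDIR /app

# Security: create a non-root group and a user that belongs to it
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nextjs -u 1001 -G nodejs

# Install dependencies first so this layer stays cached until package*.json changes
COPY package*.json ./
RUN npm ci && npm cache clean --force

# Build the application, then drop dev dependencies.
# The build runs as root, so hand the default output directory (.next)
# to the non-root user so the runtime can write its cache there.
COPY --chown=nextjs:nodejs . .
RUN npm run build && npm prune --omit=dev && \
    chown -R nextjs:nodejs .next

USER nextjs
EXPOSE 3000
CMD ["npm", "start"]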

Phase 3: Automated SSL with Traefik

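# Deployed on BOTH staging and production servers
# Each environment gets its own Traefik instance
services:
  traefik:
    image: traefik
    command:
      - --providers.docker
      # Entry points for HTTP and HTTPS; the TLS challenge is answered on port 443
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      # Redirect all HTTP traffic to HTTPS
      - --entrypoints.web.http.redirections.entrypoint.to=websecure
      - --entrypoints.web.http.redirections.entrypoint.scheme=https
      - --certificatesresolvers.letsencrypt.acme.tlschallenge=true
      - --certificatesresolvers.letsencrypt.acme.email=admin@domain.com
      - --certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json
    ports:
      - '80:80'
      - '443:443'
    volumes:
      - '/var/run/docker.sock:/var/run/docker.sock:ro'
      - './letsencrypt:/letsencrypt'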

Environment-Specific Configuration:

  • Staging Traefik: Handles app.staging.domain.com
  • Production Traefik: Handles domain.com + www.domain.com
  • Both environments: Get automatic SSL certificates independently
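
Traefik's Docker provider learns which hostnames to serve from labels on the application container. The sketch below shows what that service might look like on the production server. The service name `web`, the router name, and the build context are assumptions, and it presumes Traefik defines an entry point named `websecure` on port 443. The `letsencrypt` resolver name matches the Traefik flags shown earlier.

```yaml
# Sketch only: service and router names are hypothetical.
services:
  web:
    build: .
    restart: unless-stopped
    labels:
      - "traefik.enable=true"
      # Production hosts; on the staging server the rule would be
      # Host(`app.staging.domain.com`) instead.
      - "traefik.http.routers.web.rule=Host(`domain.com`) || Host(`www.domain.com`)"
      - "traefik.http.routers.web.entrypoints=websecure"
      # Ask the resolver named "letsencrypt" to issue the certificate
      - "traefik.http.routers.web.tls.certresolver=letsencrypt"
      # The container listens on port 3000 (EXPOSE 3000 in the Dockerfile)
      - "traefik.http.services.web.loadbalancer.server.port=3000"
```

Because only the `Host` rule differs between servers, the rest of the Compose file can stay identical in staging and production.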

Phase 4: CI/CD Pipeline

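# Two-stage deployment validation
name: Deploy Next.js App

# Run on the two branches that deploy (see the branch strategy below)
on:
  push:
    branches: [staging, deploy]

jobs:
  # Stage 1: Build & Test
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22   # matches the node:22-alpine base image
          cache: npm
      - name: Install & Build
        run: |
          npm ci
          npm run lint      # Code quality check
          npm run build     # Build validation
          npm test --if-present

  # Stage 2: Deploy (only if tests pass)
  deploy:
    needs: build-and-test
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to EC2
        # Deployment with health checks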
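
The deploy step is left elided in the workflow above. One way it could be filled in is sketched below, assuming the server already has the repository checked out and Docker Compose installed. The secret names, the `ubuntu` login, the remote path, and the `APP_URL` variable are all hypothetical placeholders, not values from the original pipeline.

```yaml
# Sketch only: secret names, paths, and the health-check URL are hypothetical.
jobs:
  deploy:
    needs: build-and-test
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to EC2 over SSH
        env:
          SSH_KEY: ${{ secrets.EC2_SSH_KEY }}
          EC2_HOST: ${{ secrets.EC2_HOST }}
          BRANCH: ${{ github.ref_name }}
        run: |
          # Create an empty key file readable only by this user, then fill it
          install -m 600 /dev/null key.pem
          printf '%s\n' "$SSH_KEY" > key.pem
          # Update the checkout on the server and rebuild the containers
          ssh -i key.pem -o StrictHostKeyChecking=accept-new "ubuntu@$EC2_HOST" \
            "cd /home/ubuntu/app && git fetch origin && git reset --hard origin/$BRANCH && docker compose up -d --build"

      - name: Health check
        env:
          APP_URL: ${{ vars.APP_URL }}
        run: |
          # Poll the site for up to about a minute; fail the job if it never answers
          for attempt in $(seq 1 12); do
            if curl --fail --silent --output /dev/null "$APP_URL"; then
              echo "Healthy after $attempt attempt(s)"
              exit 0
            fi
            sleep 5
          done
          echo "Health check failed" >&2
          exit 1
```

Failing the job when the health check never passes is what turns a broken release into a visible red build rather than a silent outage.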

Smart Branch Strategy

Repository Branches:
├── dev          → Development work (no deployments)
├── staging      → Auto-deploy to app.staging.domain.com
└── deploy       → Auto-deploy to production domain.com
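
In GitHub Actions, this branch mapping can be expressed with a push trigger plus a GitHub environment chosen from the branch name. The sketch below assumes two environments named `staging` and `production` have been created in the repository settings, each with its own `APP_DOMAIN` variable; both the environment names and the variable are assumptions.

```yaml
# Sketch only: environment names and the APP_DOMAIN variable are hypothetical.
on:
  push:
    branches: [staging, deploy]   # dev is not listed, so pushes to it never deploy

jobs:
  deploy:
    runs-on: ubuntu-latest
    # The deploy branch maps to production; the only other trigger is staging
    environment: ${{ github.ref_name == 'deploy' && 'production' || 'staging' }}
    steps:
      - name: Show target
        env:
          BRANCH: ${{ github.ref_name }}
          APP_DOMAIN: ${{ vars.APP_DOMAIN }}
        run: echo "Deploying branch $BRANCH to $APP_DOMAIN"
```

Scoping secrets and variables to an environment keeps the production SSH key out of reach of staging runs.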

Infrastructure:
├── 2x t3.medium EC2 instances (4GB RAM each)
├── Traefik reverse proxy on each server
├── Automatic SSL certificate management
└── Complete environment isolation

Security Implementation

  • Non-root Docker containers
  • Automatic HTTPS redirect
  • Security group restrictions
  • SSH key-based authentication
  • Environment variable encryption

📊 THE RESULTS

Performance Improvements

| Metric | Before | After | Improvement |
|---|---|---|---|
| Deployment Time | 25+ minutes | 4 minutes | 84% faster |
| Failed Deployments | 15% failure rate | <2% failure rate | 87% reduction |
| SSL Setup | 2+ hours manual | Automatic | 100% automation |
| Environment Consistency | Manual/Error-prone | Identical configs | Perfect parity |
| Developer Productivity | 20 hrs/month overhead | 2 hrs/month | 90% time savings |
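
The percentage figures are relative reductions, (before − after) / before. As a quick check, taking the stated bounds (25+ minutes, <2%) at face value:

```latex
\begin{align*}
\text{Deployment time:} \quad & \frac{25 - 4}{25} = 0.84 = 84\% \\
\text{Failed deployments:} \quad & \frac{15 - 2}{15} \approx 0.867 \approx 87\% \\
\text{Deployment overhead:} \quad & \frac{20 - 2}{20} = 0.90 = 90\%
\end{align*}
```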

Cost Analysis

Monthly Infrastructure Costs:
├── 2x EC2 t3.medium instances        $30.00
├── 2x Traefik instances (free)       $0.00
├── Route 53 hosted zone              $0.50
├── Data transfer                    ~$2.00
├── SSL certificates (Let's Encrypt)  $0.00
└── Total Monthly Cost               $32.50

Previous Manual Process Costs:
├── Developer time (20 hrs/month @ $50/hr) $1,000
├── Downtime costs                         $500+
├── SSL certificate fees                   $100/year
└── Total Monthly Cost                     $1,500+

Monthly Savings: $1,467.50 (97.8% cost reduction)
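
The savings figure can be reproduced from the two breakdowns above, using $1,500 as the floor for the previous monthly cost (the $100/year certificate fee adds only about $8 per month on top of that):

```latex
\begin{align*}
\text{Infrastructure} &= 2 \times \$15.00 + \$0.50 + \$2.00 = \$32.50 \\
\text{Monthly savings} &= \$1{,}500 - \$32.50 = \$1{,}467.50 \\
\text{Reduction} &= \frac{1{,}467.50}{1{,}500} \approx 97.8\%
\end{align*}
```

This compares infrastructure spend against the old process. The remaining 2 hours per month of deployment overhead, at the same $50/hour, would add roughly $100 to the new monthly figure.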

Security Achievements

  • ✅ A+ SSL Rating (SSL Labs test)
  • ✅ 100% HTTPS traffic with automatic redirects
  • ✅ Zero manual certificate management
  • ✅ Non-root container execution
  • ✅ Automated security updates

Business Impact

  • Feature delivery speed increased by 300%
  • Developer satisfaction dramatically improved
  • Customer complaints about downtime eliminated
  • Competitive advantage through faster iteration
  • Operational confidence in the deployment process

Technical Metrics

Deployment Success Rate:
├── Build validation failures caught: 98%
├── Successful deployments: >98%
├── Rollback time (if needed): <2 minutes
└── Zero-downtime deployments: 100%

Performance Metrics:
├── Application boot time: <30 seconds
├── SSL certificate renewal: Automatic
├── Health check response: <1 second
└── DNS propagation: <5 minutes

🧠 Key Learnings & Best Practices

What Worked Exceptionally Well

1. Infrastructure as Code Approach

  • Every configuration documented and version-controlled
  • Identical Traefik setup on both staging and production servers
  • Easy replication across environments
  • Reduced human error significantly

2. Separation of Concerns

  • Each environment has its own Traefik instance for complete isolation
  • Staging Traefik handles app.staging.domain.com
  • Production Traefik handles domain.com + www.domain.com
  • GitHub Actions manages CI/CD for both environments
  • Docker ensures consistent environments across staging and production
  • Each component has a single responsibility

3. Progressive Deployment Strategy

  • Staging catches issues before production
  • Build validation prevents bad deployments
  • Health checks ensure service availability
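
A container-level health check is one way to back the last point. The sketch below is a Docker Compose fragment under a few assumptions: the app service is named `web`, it answers on `/` at port 3000, and the image is Alpine-based, so BusyBox `wget` is already available. The intervals are illustrative, with the start period chosen to cover the sub-30-second boot time reported above.

```yaml
# Sketch only: service name, probe path, and timings are hypothetical.
services:
  web:
    build: .
    healthcheck:
      # Exit code 0 means the app answered; anything else counts as a failure
      test: ["CMD", "wget", "-q", "--spider", "http://127.0.0.1:3000/"]
      interval: 10s       # time between probes
      timeout: 3s         # a probe slower than this counts as failed
      retries: 5          # consecutive failures before "unhealthy"
      start_period: 30s   # failures during boot are not counted
```

`docker compose ps` then reports the service as healthy or unhealthy, which a deploy script can check before it declares success.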

Challenges Overcome

1. Memory Constraints (Avoided with t3.medium)

  • Challenge: Initially considered t3.micro, but 1GB RAM was insufficient
  • Solution: Chose t3.medium with 4GB RAM for reliable Docker builds
  • Result: 100% build success rate with comfortable memory headroom

2. SSL Certificate Complexity

  • Challenge: Manual SSL setup taking 2+ hours
  • Solution: Traefik + Let's Encrypt automation
  • Result: Zero-touch SSL management

3. Environment Configuration Drift

  • Challenge: Staging and production inconsistencies
  • Solution: Same codebase, environment-specific variables
  • Result: Perfect environment parity
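
One way to keep the codebase identical while varying only the settings is to have Compose read environment-specific values from a `.env` file that lives on each server. This is a sketch under assumptions: the variable names `APP_ENV` and `SITE_URL` are hypothetical, and the `.env` file sits next to the Compose file and is not committed.

```yaml
# docker-compose.yml, identical on both servers (sketch; variable names are hypothetical)
services:
  web:
    build: .
    environment:
      # ${VAR:?message} makes Compose stop with an error if the value is missing,
      # so a server without its .env file fails loudly instead of starting misconfigured
      APP_ENV: "${APP_ENV:?set APP_ENV in .env}"
      SITE_URL: "${SITE_URL:?set SITE_URL in .env}"
```

On the staging server `.env` might contain `APP_ENV=staging` and `SITE_URL=https://app.staging.domain.com`, while production sets `APP_ENV=production` and `SITE_URL=https://domain.com`.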

Future Enhancements Roadmap

Phase 1 (Next 3 months):

  • Monitoring with Prometheus + Grafana
  • Automated database backups
  • Performance monitoring and alerting

Phase 2 (6 months):

  • Auto-scaling groups for high availability
  • Blue-green deployment strategy
  • End-to-end testing with Playwright

Phase 3 (12 months):

  • Multi-region deployment
  • CDN integration
  • Advanced security scanning

💼 Business Recommendations

For Startups

  • Start with this architecture early - Don't wait until deployment pain becomes unbearable
  • Invest in automation - The ROI is immediate and compounds over time
  • Use managed services - Let AWS/Let's Encrypt handle infrastructure complexity

For Development Teams

  • Treat deployment as a product feature - It deserves the same attention as user-facing features
  • Make staging identical to production - Environment parity prevents surprises
  • Automate everything - If you do it more than twice, automate it

For CTOs/Engineering Leaders

  • Developer productivity ROI - This investment pays for itself in the first month
  • Risk mitigation - Automated deployments reduce business risk significantly
  • Scalability foundation - This architecture grows with your business

🎯 Conclusion

This project transformed our deployment process from a manual, error-prone nightmare into a streamlined, automated pipeline that developers actually enjoy using. The business impact has been transformational:

  • 98% cost reduction in deployment overhead
  • 84% faster deployments, shortening time to market for new features
  • 87% fewer deployment failures
  • 100% automation of SSL certificate management

The architecture provides a solid foundation that scales with business growth while maintaining cost-effectiveness and security best practices.

Key Success Factors:

  1. Comprehensive automation - Eliminate human error
  2. Environment parity - What you test is what you deploy
  3. Security by default - Make secure choices the easy choices
  4. Cost consciousness - Enterprise-grade doesn't require enterprise costs

📚 Resources & Next Steps

Technical Resources

  • Complete source code: Available on GitHub
  • Infrastructure templates: Terraform configurations provided
  • Documentation: Step-by-step implementation guide

If this case study helped you, please share it with your network and star the repository! 🚀