Building a Solid Next.js CI/CD Pipeline for EC2 Deployment
Author: Daniel Ayeni (@danthesage)
Executive Summary
This case study documents the complete transformation of a manual, error-prone deployment process into a fully automated, production-ready CI/CD pipeline for a Next.js application. The solution achieved zero-downtime deployments, automatic SSL certificates, and multi-environment support using modern DevOps tools and best practices.
Key Results:
- Deployment time reduced from 25+ minutes to 4 minutes (84% faster)
- Automatic SSL certificates with zero manual intervention
- Professional multi-environment setup (staging + production)
- Containerized deployments with Docker
- Cost-effective solution at ~$32.50/month total
- 87% reduction in failed deployments
THE PROBLEM
Business Challenge
Our growing startup was facing critical deployment bottlenecks that were hampering our ability to ship features quickly and reliably to customers.
Pain Points:
- Manual deployments took 25+ minutes each
- A 15% deployment failure rate caused downtime and frustrated users
- No staging environment existed, so bugs reached production
- SSL certificate setup required 2+ hours of manual work
- Deployment anxiety severely hurt developer productivity
- Inconsistent environments caused "works on my machine" issues
Technical Challenges
- Zero automation - Everything done manually via SSH and file transfers
- No environment separation - Testing directly in production
- SSL certificate management - Manual setup and renewal
- Build validation - No pre-deployment testing
- Security concerns - Running applications as the root user
- Resource constraints - Need for a cost-effective solution on a startup budget
Business Impact
- 20+ hours monthly spent on deployment-related tasks
- Customer complaints due to frequent downtime
- Developer burnout from deployment stress
- Slow feature delivery eroded our competitive advantage
- $1,000+ monthly in developer time for deployment management (20 hours at $50/hour)
THE SOLUTION
Solution Architecture
We implemented a comprehensive DevOps pipeline using modern, cost-effective tools:
Technology Stack:
- Frontend: Next.js (React framework)
- Infrastructure: AWS EC2 (t3.medium instances)
- Containerization: Docker + Docker Compose
- Reverse Proxy: Traefik with automatic SSL
- CI/CD: GitHub Actions
- DNS: AWS Route 53
- SSL: Let's Encrypt (free certificates)
Architecture Overview
┌─────────────────┐    ┌──────────────────┐
│    Developer    │───▶│  GitHub Actions  │
│    git push     │    │  CI/CD Pipeline  │
└─────────────────┘    └──────────────────┘
                                │
                    ┌───────────┼───────────┐
                    ▼                       ▼
           ┌─────────────────┐     ┌─────────────────┐
           │     AWS EC2     │     │     AWS EC2     │
           │     Staging     │     │   Production    │
           └─────────────────┘     └─────────────────┘
                    │                       │
                    ▼                       ▼
           ┌─────────────────┐     ┌─────────────────┐
           │     Traefik     │     │     Traefik     │
           │  Reverse Proxy  │     │  Reverse Proxy  │
           │   + SSL Certs   │     │   + SSL Certs   │
           │    (Staging)    │     │  (Production)   │
           └─────────────────┘     └─────────────────┘
                    │                       │
                    ▼                       ▼
        app.staging.domain.com   domain.com + www.domain.com
DNS Strategy Implementation
Professional URL Structure:
# Production Environment
https://domain.com # Main site
https://www.domain.com # WWW version
# Staging Environment
https://app.staging.domain.com # Clear staging indicator
Route 53 Configuration:
# Production Records
domain.com A PRODUCTION_EC2_IP
www.domain.com CNAME domain.com
# Staging Records
app.staging.domain.com A STAGING_EC2_IP
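The same records can be expressed as a Route 53 change batch, the JSON document that `aws route53 change-resource-record-sets` accepts. The sketch below builds one in TypeScript. It is an illustration, not the project's actual tooling: `buildChangeBatch` is a hypothetical helper, the IP addresses are placeholders from the range reserved for documentation, and the 300-second TTL is an assumption.

```typescript
// Hypothetical helper: builds a Route 53 change batch for the records listed above.
type RecordType = "A" | "CNAME";

interface DnsRecord {
  name: string;
  type: RecordType;
  value: string;
}

interface ChangeBatch {
  Changes: {
    Action: "UPSERT";
    ResourceRecordSet: {
      Name: string;
      Type: RecordType;
      TTL: number;
      ResourceRecords: { Value: string }[];
    };
  }[];
}

function buildChangeBatch(records: DnsRecord[], ttlSeconds: number): ChangeBatch {
  return {
    Changes: records.map((record) => ({
      // UPSERT creates the record if it is missing and updates it otherwise.
      Action: "UPSERT" as const,
      ResourceRecordSet: {
        Name: record.name,
        Type: record.type,
        TTL: ttlSeconds,
        ResourceRecords: [{ Value: record.value }],
      },
    })),
  };
}

// Placeholder addresses: 203.0.113.0/24 is reserved for documentation.
const records: DnsRecord[] = [
  { name: "domain.com", type: "A", value: "203.0.113.10" },
  { name: "www.domain.com", type: "CNAME", value: "domain.com" },
  { name: "app.staging.domain.com", type: "A", value: "203.0.113.20" },
];

console.log(JSON.stringify(buildChangeBatch(records, 300), null, 2));
```

Keeping the records in a typed list like this makes it harder for staging and production DNS to drift apart, since both come from one reviewed source.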
Implementation Strategy
Phase 1: Infrastructure Setup
EC2 Configuration:
- Instance Type: t3.medium (2 vCPU, 4GB RAM)
- OS: Ubuntu 22.04 LTS
- Storage: 15GB gp3 SSD
- Security group: inbound HTTP (80), HTTPS (443), and SSH (22)
- Cost: ~$15/month per instance
Phase 2: Containerization
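# Production-optimized Dockerfile
FROM node:22-alpine
WORKDIR /app
# Security: create a non-root user and group
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nextjs -u 1001
# Install dependencies first so this layer stays cached until package*.json changes
COPY package*.json ./
RUN npm ci && npm cache clean --force
# Build the application, drop devDependencies, and let the runtime user write the Next.js cache
COPY . .
RUN npm run build && npm prune --omit=dev && \
    chown -R nextjs:nodejs .next
# Run as the unprivileged user
USER nextjs
EXPOSE 3000
CMD ["npm", "start"]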
Phase 3: Automated SSL with Traefik
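# Deployed on BOTH staging and production servers
# Each environment gets its own Traefik instance
services:
  traefik:
    image: traefik  # consider pinning a specific version tag in production
    command:
      - --providers.docker
      # Assumed entry points (not shown in the original excerpt); the TLS challenge needs port 443
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      - --certificatesresolvers.letsencrypt.acme.tlschallenge=true
      - --certificatesresolvers.letsencrypt.acme.email=admin@domain.com
      - --certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json
    ports:
      - '80:80'
      - '443:443'
    volumes:
      - '/var/run/docker.sock:/var/run/docker.sock:ro'
      - './letsencrypt:/letsencrypt'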
Environment-Specific Configuration:
- Staging Traefik: Handles app.staging.domain.com
- Production Traefik: Handles domain.com + www.domain.com
- Both environments: Get automatic SSL certificates independently
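Traefik's Docker provider discovers which hostnames to serve from labels on the application container. The Compose excerpt above does not show those labels, so the following is a minimal sketch of how they could be generated per environment. `traefikLabels` is a hypothetical helper, the router name `app` is illustrative, and the `websecure` entry point (port 443) is an assumption; the `letsencrypt` resolver name and port 3000 come from the configuration shown earlier.

```typescript
// Hypothetical helper: not part of the original project.
// Assumes a Traefik entry point named "websecure" listening on port 443.
function traefikLabels(router: string, hosts: string[], port: number): string[] {
  if (hosts.length === 0) {
    throw new Error("at least one host is required");
  }
  // Traefik rule syntax: Host(`a`) || Host(`b`)
  const rule = hosts.map((host) => "Host(`" + host + "`)").join(" || ");
  return [
    "traefik.enable=true",
    "traefik.http.routers." + router + ".rule=" + rule,
    "traefik.http.routers." + router + ".entrypoints=websecure",
    // Must match the resolver name passed to Traefik on the command line.
    "traefik.http.routers." + router + ".tls.certresolver=letsencrypt",
    // The port the Next.js container listens on (EXPOSE 3000 in the Dockerfile).
    "traefik.http.services." + router + ".loadbalancer.server.port=" + port,
  ];
}

const stagingLabels = traefikLabels("app", ["app.staging.domain.com"], 3000);
const productionLabels = traefikLabels("app", ["domain.com", "www.domain.com"], 3000);

console.log(productionLabels.join("\n"));
```

Each line of output would go under `labels:` for the application service in that environment's Compose file. Only the host list differs between staging and production.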
Phase 4: CI/CD Pipeline
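# Two-stage deployment validation
name: Deploy Next.js App
on:
  push:
    branches: [staging, deploy]  # the two auto-deploy branches
jobs:
  # Stage 1: Build & Test
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: npm
      - name: Install & Build
        run: |
          npm ci
          npm run lint           # Code quality check
          npm run build          # Build validation
          npm test --if-present
  # Stage 2: Deploy (only if tests pass)
  deploy:
    needs: build-and-test
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to EC2
        # Deployment with health checks (commands omitted here)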
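The deploy step mentions health checks without showing them. One way to structure such a gate is to keep the decision logic separate from the network calls so it can be unit-tested. The sketch below is an assumption about how that might look, not the author's implementation: `evaluateRollout` is a hypothetical function that receives the HTTP status codes from successive probes of the new container and promotes it only after a run of consecutive successes.

```typescript
type Verdict = "promote" | "rollback";

// Decide whether a freshly started container may take traffic.
// statusCodes: status codes from successive health probes, in order
// (0 can stand for "connection refused").
function evaluateRollout(statusCodes: number[], requiredConsecutive: number): Verdict {
  if (requiredConsecutive < 1) {
    throw new RangeError("requiredConsecutive must be at least 1");
  }
  let streak = 0;
  for (const code of statusCodes) {
    // Any 2xx response extends the streak; anything else resets it.
    streak = code >= 200 && code < 300 ? streak + 1 : 0;
    if (streak >= requiredConsecutive) {
      return "promote";
    }
  }
  // The probe budget ran out before the service stabilized.
  return "rollback";
}

console.log(evaluateRollout([0, 502, 200, 200, 200], 3)); // "promote"
console.log(evaluateRollout([200, 200, 500, 200], 3));    // "rollback"
```

Requiring several consecutive successes, rather than a single 200, guards against a container that answers once during startup and then crashes.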
Smart Branch Strategy
Repository Branches:
├── dev → Development work (no deployments)
├── staging → Auto-deploy to app.staging.domain.com
└── deploy → Auto-deploy to production domain.com
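The branch-to-environment mapping can be captured in one small function. This is a sketch under assumptions: `resolveTarget` is a hypothetical helper, and it expects a Git ref in the form GitHub Actions provides for push events in `GITHUB_REF` (`refs/heads/<branch>`).

```typescript
type Environment = "staging" | "production";

interface DeployTarget {
  environment: Environment;
  hosts: string[];
}

// Hypothetical helper: maps a Git ref to the environment it deploys to.
// Returns null for branches that must never trigger a deployment.
function resolveTarget(ref: string): DeployTarget | null {
  const branch = ref.replace(/^refs\/heads\//, "");
  switch (branch) {
    case "staging":
      return { environment: "staging", hosts: ["app.staging.domain.com"] };
    case "deploy":
      return { environment: "production", hosts: ["domain.com", "www.domain.com"] };
    default:
      // dev and any other branch: build and test only.
      return null;
  }
}

console.log(resolveTarget("refs/heads/staging"));
console.log(resolveTarget("refs/heads/dev")); // null
```

Returning `null` for unknown branches keeps the default safe: a new branch name deploys nowhere until someone adds it explicitly.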
Infrastructure:
├── 2x t3.medium EC2 instances (4GB RAM each)
├── Traefik reverse proxy on each server
├── Automatic SSL certificate management
└── Complete environment isolation
Security Implementation
- Non-root Docker containers
- Automatic HTTPS redirect
- Security group restrictions
- SSH key-based authentication
- Environment variable encryption
THE RESULTS
Performance Improvements
| Metric | Before | After | Improvement |
|---|---|---|---|
| Deployment Time | 25+ minutes | 4 minutes | 84% faster |
| Failed Deployments | 15% failure rate | <2% failure rate | 87% reduction |
| SSL Setup | 2+ hours manual | Automatic | 100% automation |
| Environment Consistency | Manual/Error-prone | Identical configs | Perfect parity |
| Developer Productivity | 20 hrs/month overhead | 2 hrs/month | 90% time savings |
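The numeric entries in the Improvement column follow directly from the Before and After values. A short check, using the lower bound of "25+ minutes" and the upper bound of "<2%" (so the real improvements are at least this large):

```typescript
// Percentage reduction from a "before" value to an "after" value.
function percentReduction(before: number, after: number): number {
  if (before <= 0) {
    throw new RangeError("before must be positive");
  }
  return ((before - after) / before) * 100;
}

const deploymentTime = percentReduction(25, 4); // minutes per deployment
const failureRate = percentReduction(15, 2);    // percent of deployments that fail
const overhead = percentReduction(20, 2);       // hours per month

console.log(deploymentTime.toFixed(0)); // "84"
console.log(failureRate.toFixed(0));    // "87"
console.log(overhead.toFixed(0));       // "90"
```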
Cost Analysis
Monthly Infrastructure Costs:
├── 2x EC2 t3.medium instances          $30.00
├── 2x Traefik instances (free)         $0.00
├── Route 53 hosted zone                $0.50
├── Data transfer                       ~$2.00
├── SSL certificates (Let's Encrypt)    $0.00
└── Total Monthly Cost                  $32.50
Previous Manual Process Costs:
├── Developer time (20 hrs/month @ $50/hr)    $1,000
├── Downtime costs                            $500+
├── SSL certificate fees                      $100/year
└── Total Monthly Cost                        $1,500+
Monthly Savings: $1,467.50 (97.8% cost reduction)
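The savings figure can be reproduced from the line items. The sketch below takes the "+" figures at their lower bounds and spreads the yearly SSL fee over twelve months. It also shows that the line items sum to slightly more than $1,500, which the article rounds down before computing the savings.

```typescript
interface LineItem {
  label: string;
  monthlyUsd: number;
}

function sum(items: LineItem[]): number {
  return items.reduce((total, item) => total + item.monthlyUsd, 0);
}

const automated: LineItem[] = [
  { label: "2x EC2 t3.medium instances", monthlyUsd: 30.0 },
  { label: "2x Traefik instances", monthlyUsd: 0 },
  { label: "Route 53 hosted zone", monthlyUsd: 0.5 },
  { label: "Data transfer (approximate)", monthlyUsd: 2.0 },
  { label: "SSL certificates (Let's Encrypt)", monthlyUsd: 0 },
];

const manual: LineItem[] = [
  { label: "Developer time (20 h at $50/h)", monthlyUsd: 20 * 50 },
  { label: "Downtime (lower bound)", monthlyUsd: 500 },
  { label: "SSL certificate fees ($100/year)", monthlyUsd: 100 / 12 },
];

const automatedTotal = sum(automated);
const manualTotal = sum(manual);
const articleBaseline = 1500; // the "$1,500+" total, taken at its lower bound
const savings = articleBaseline - automatedTotal;
const savingsPercent = (savings / articleBaseline) * 100;

console.log(automatedTotal.toFixed(2)); // "32.50"
console.log(manualTotal.toFixed(2));    // "1508.33"
console.log(savings.toFixed(2));        // "1467.50"
console.log(savingsPercent.toFixed(1)); // "97.8"
```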
Security Achievements
- A+ SSL Rating (SSL Labs test)
- 100% HTTPS traffic with automatic redirects
- Zero manual certificate management
- Non-root container execution
- Automated security updates
Business Impact
- Feature delivery speed increased by 300%
- Developer satisfaction dramatically improved
- Customer complaints about downtime eliminated
- Competitive advantage through faster iteration
- Operational confidence in the deployment process
Technical Metrics
Deployment Success Rate:
├── Build validation failures caught: 98%
├── Successful deployments: >98%
├── Rollback time (if needed): <2 minutes
└── Zero-downtime deployments: 100%
Performance Metrics:
├── Application boot time: <30 seconds
├── SSL certificate renewal: Automatic
├── Health check response: <1 second
└── DNS propagation: <5 minutes
Key Learnings & Best Practices
What Worked Exceptionally Well
1. Infrastructure as Code Approach
- Every configuration documented and version-controlled
- Identical Traefik setup on both staging and production servers
- Easy replication across environments
- Reduced human error significantly
2. Separation of Concerns
- Each environment has its own Traefik instance for complete isolation
- Staging Traefik handles app.staging.domain.com
- Production Traefik handles domain.com + www.domain.com
- GitHub Actions manages CI/CD for both environments
- Docker ensures consistent environments across staging and production
- Each component has a single responsibility
3. Progressive Deployment Strategy
- Staging catches issues before production
- Build validation prevents bad deployments
- Health checks ensure service availability
Challenges Overcome
1. Memory Constraints (Avoided with t3.medium)
- Challenge: Initially considered t3.micro, but 1GB RAM was insufficient
- Solution: Chose t3.medium with 4GB RAM for reliable Docker builds
- Result: 100% build success rate with comfortable memory headroom
2. SSL Certificate Complexity
- Challenge: Manual SSL setup taking 2+ hours
- Solution: Traefik + Let's Encrypt automation
- Result: Zero-touch SSL management
3. Environment Configuration Drift
- Challenge: Staging and production inconsistencies
- Solution: Same codebase, environment-specific variables
- Result: Perfect environment parity
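"Same codebase, environment-specific variables" works best when the application validates those variables at startup, so a misconfigured server fails immediately instead of drifting silently. A minimal sketch, assuming two illustrative variable names (`APP_ENV` and `PUBLIC_URL`) that are not taken from the original project:

```typescript
type AppEnvironment = "staging" | "production";

interface AppConfig {
  environment: AppEnvironment;
  publicUrl: string;
}

// Hypothetical variable names: APP_ENV and PUBLIC_URL are illustrative only.
function loadConfig(env: { [key: string]: string | undefined }): AppConfig {
  const environment = env["APP_ENV"];
  const publicUrl = env["PUBLIC_URL"];
  if (environment !== "staging" && environment !== "production") {
    throw new Error("APP_ENV must be 'staging' or 'production'");
  }
  if (!publicUrl || publicUrl.indexOf("https://") !== 0) {
    throw new Error("PUBLIC_URL must be an https:// URL");
  }
  return { environment, publicUrl };
}

const stagingConfig = loadConfig({
  APP_ENV: "staging",
  PUBLIC_URL: "https://app.staging.domain.com",
});

console.log(stagingConfig.environment); // "staging"
```

In the running application the argument would be `process.env`. Passing it in as a parameter keeps the function easy to test.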
Future Enhancements Roadmap
Phase 1 (Next 3 months):
- Monitoring with Prometheus + Grafana
- Automated database backups
- Performance monitoring and alerting
Phase 2 (6 months):
- Auto-scaling groups for high availability
- Blue-green deployment strategy
- End-to-end testing with Playwright
Phase 3 (12 months):
- Multi-region deployment
- CDN integration
- Advanced security scanning
Business Recommendations
For Startups
- Start with this architecture early - Don't wait until deployment pain becomes unbearable
- Invest in automation - The ROI is immediate and compounds over time
- Use managed services - Let AWS/Let's Encrypt handle infrastructure complexity
For Development Teams
- Treat deployment as a product feature - It deserves the same attention as user-facing features
- Make staging identical to production - Environment parity prevents surprises
- Automate everything - If you do it more than twice, automate it
For CTOs/Engineering Leaders
- Developer productivity ROI - This investment pays for itself in the first month
- Risk mitigation - Automated deployments reduce business risk significantly
- Scalability foundation - This architecture grows with your business
Conclusion
This project transformed our deployment process from a manual, error-prone nightmare into a streamlined, automated pipeline that developers actually enjoy using. The business impact has been transformational:
- 98% cost reduction in deployment overhead
- 84% faster deployments (25+ minutes down to 4)
- 87% fewer deployment failures
- 100% automation of SSL certificate management
The architecture provides a solid foundation that scales with business growth while maintaining cost-effectiveness and security best practices.
Key Success Factors:
- Comprehensive automation - Eliminate human error
- Environment parity - What you test is what you deploy
- Security by default - Make secure choices the easy choices
- Cost consciousness - Enterprise-grade doesn't require enterprise costs
Resources & Next Steps
Technical Resources
- Complete source code: Available on GitHub
- Infrastructure templates: Terraform configurations provided
- Documentation: Step-by-step implementation guide
If this case study helped you, please share it with your network and star the repository!