Kubernetes Zero to Hero Part 2: Traefik Ingress with Automatic SSL and Advanced Observability

This is Part 2 of our Kubernetes Zero to Hero series. In Part 1, we built a rock-solid observability foundation with Prometheus and Grafana. Now we're adding intelligent ingress with automatic SSL.

Ingress is where every external request enters the cluster, so misrouted traffic or an expired certificate shows up immediately as downtime. Today, we're deploying Traefik, a cloud-native edge router that discovers services automatically and exposes metrics out of the box. By the end of this guide, you'll have automatic SSL certificates from Let's Encrypt, middleware-based routing, and Traefik metrics flowing into the Prometheus and Grafana stack from Part 1.

What We're Building

Building on our monitoring foundation, we'll add:

Traefik as our ingress controller with automatic service discovery
Let's Encrypt integration for automatic SSL certificates
Advanced routing with middleware, sticky sessions, and load balancing
Complete observability with metrics, tracing, and dashboard integration
Production-ready security with rate limiting and authentication

This isn't just ingress—it's intelligent traffic management.

Prerequisites

Completed Part 1 with Prometheus and Grafana running
A domain name whose DNS records you control; you'll point it at the Traefik LoadBalancer's external IP once Step 1 has created it
kubectl and Helm 3 with access to your cluster

Step 1: Installing Traefik with Helm

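Let's deploy Traefik with a production-oriented configuration. The values file below follows the key layout of the Traefik v2 era chart; later chart releases renamed several keys (redirectTo, for example), so pin the chart with --version or compare against the output of helm show values traefik/traefik before installing: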

# Add Traefik Helm repository
helm repo add traefik https://traefik.github.io/charts
helm repo update

# Create Traefik configuration
cat <<EOF > traefik-values.yaml
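# Traefik configuration for production
# Note: Traefik's built-in ACME resolver keeps certificates in a single acme.json
# file that replicas do not coordinate on. With more than one replica, expect
# volume-attachment or renewal problems; run a single replica while using the
# built-in resolver, or move certificate issuance to cert-manager.
deployment:
  replicas: 2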

# Enable dashboard
ingressRoute:
  dashboard:
    enabled: true

# Service configuration
service:
  type: LoadBalancer
  annotations:
    service.beta.kubernetes.io/linode-loadbalancer-hostname: "traefik.yourdomain.com"

# Enable Prometheus metrics
metrics:
  prometheus:
    addEntryPointsLabels: true
    addServicesLabels: true
    addRoutersLabels: true

# Enable access logs
logs:
  access:
    enabled: true
    format: json

# Enable API and dashboard
api:
  dashboard: true
  insecure: false

# Entry points configuration
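ports:
  web:
    port: 8000        # container port; the chart runs Traefik as non-root, so avoid ports below 1024
    exposedPort: 80   # port published on the LoadBalancer Service
    redirectTo: websecure
  websecure:
    port: 8443
    exposedPort: 443
    tls:
      enabled: true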

# Certificate resolvers for Let's Encrypt
certificatesResolvers:
  letsencrypt:
    acme:
      email: your-email@example.com
      storage: /data/acme.json
      httpChallenge:
        entryPoint: web
      # Use staging for testing
      # caServer: https://acme-staging-v02.api.letsencrypt.org/directory

# Persistence for ACME certificates
persistence:
  enabled: true
  size: 128Mi
  storageClass: "linode-block-storage-retain"

# Resource limits
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi

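# Additional arguments
additionalArguments:
  # Skips TLS verification when Traefik talks to HTTPS backends. Convenient with
  # self-signed backend certificates; remove it if backends present trusted ones.
  - "--serversTransport.insecureSkipVerify=true"
  - "--providers.kubernetescrd.allowCrossNamespace=true"
  - "--providers.kubernetesingress.allowEmptyServices=true"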

# Enable Kubernetes CRD provider
providers:
  kubernetescrd:
    enabled: true
    allowCrossNamespace: true
  kubernetesingress:
    enabled: true
EOF

# Install Traefik
helm install traefik traefik/traefik \
  --namespace traefik \
  --create-namespace \
  --values traefik-values.yaml
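
Once the release is installed, it helps to confirm that the pods are running and to find the external IP your DNS records should point at. The helper below is a minimal sketch: lb_ip is a hypothetical function name, and it assumes the chart created a Service named traefik in the traefik namespace (matching the release name used above) and that your provider reports an IP rather than a hostname.

```shell
# lb_ip NAMESPACE SERVICE
# Print the external IP of a LoadBalancer Service.
# The output is empty until the cloud provider has assigned an address.
lb_ip() {
  kubectl get svc "$2" --namespace "$1" \
    --output jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Example usage (needs cluster access, so it is left commented out):
#   kubectl get pods --namespace traefik
#   lb_ip traefik traefik
```

Create A records for traefik.yourdomain.com and app.yourdomain.com with the printed address before requesting certificates, since the HTTP challenge only succeeds when the domain already resolves to Traefik.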

Step 2: Setting Up Traefik Dashboard with SSL

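Create a secure dashboard with automatic SSL. These manifests use the traefik.containo.us/v1alpha1 API group from Traefik v2; Traefik v3 serves only traefik.io/v1alpha1, so update the apiVersion lines throughout this guide if you run v3: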

# traefik-dashboard.yaml
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-dashboard
  namespace: traefik
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`traefik.yourdomain.com`)
      kind: Rule
      services:
        - name: api@internal
          kind: TraefikService
      middlewares:
        - name: auth
  tls:
    certResolver: letsencrypt
---
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: auth
  namespace: traefik
spec:
  basicAuth:
    secret: traefik-dashboard-auth
---
apiVersion: v1
kind: Secret
metadata:
  name: traefik-dashboard-auth
  namespace: traefik
type: Opaque
data:
  # Generate with: htpasswd -nb admin yourpassword | base64 -w 0
  users: YWRtaW46JGFwcjEkSDY1dnBkTU8kWmY2eTM0LldiQ28wUDVGMjBBNmYuMAo=

# Generate password hash
htpasswd -nb admin yourpassword | base64 -w 0

# Apply the configuration
kubectl apply -f traefik-dashboard.yaml
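
If htpasswd is not installed, the same credential line can be produced with openssl, and kubectl can handle the base64 encoding for you. This is a sketch under assumptions: htpasswd_line is a hypothetical helper name, and it relies on openssl being available on your workstation.

```shell
# htpasswd_line USER PASSWORD
# Print "user:hash" in the Apache MD5 (apr1) format that htpasswd -nb emits.
# Note: the password is passed as an argument, so it is visible in the
# process list while openssl runs.
htpasswd_line() {
  printf '%s:%s\n' "$1" "$(openssl passwd -apr1 "$2")"
}

# Example usage (needs cluster access, so it is left commented out):
#   htpasswd_line admin 'yourpassword' > users
#   kubectl create secret generic traefik-dashboard-auth \
#     --namespace traefik --from-file=users
```

Creating the Secret this way stores the line under the users key, which is the key the basicAuth middleware reads, and it keeps a real password hash out of the manifest file.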

Step 3: Integrating with Prometheus Monitoring

Add Traefik metrics to our monitoring stack:

# traefik-servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: traefik
  namespace: monitoring
  labels:
    app: traefik
spec:
  namespaceSelector:
    matchNames:
    - traefik
  selector:
    matchLabels:
      app.kubernetes.io/name: traefik
  endpoints:
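  # The port name must match a named port on the Traefik Service. Depending on
  # the chart version, metrics are served on the "traefik" or "metrics" entry
  # point, and that port may first need to be exposed on the Service.
  - port: traefik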
    interval: 30s
    path: /metrics
---
# PrometheusRule for Traefik alerts
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: traefik-rules
  namespace: monitoring
  labels:
    prometheus: kube-prometheus
    role: alert-rules
spec:
  groups:
  - name: traefik.rules
    rules:
    - alert: TraefikHighErrorRate
      expr: sum by (service) (rate(traefik_service_requests_total{code=~"5.."}[5m])) / sum by (service) (rate(traefik_service_requests_total[5m])) > 0.1
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "Traefik high error rate"
        description: "Traefik error rate is above 10% for service {{ $labels.service }}"
    
    - alert: TraefikHighLatency
      expr: histogram_quantile(0.95, sum by (le, service) (rate(traefik_service_request_duration_seconds_bucket[5m]))) > 1
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Traefik high latency"
        description: "Traefik 95th percentile latency is above 1s for service {{ $labels.service }}"
    
    - alert: TraefikCertificateExpiringSoon
      expr: traefik_tls_certs_not_after - time() < 7 * 24 * 3600
      for: 1h
      labels:
        severity: warning
      annotations:
        summary: "SSL certificate expiring soon"
        description: "Certificate for {{ $labels.cn }} expires in less than 7 days"

kubectl apply -f traefik-servicemonitor.yaml
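
Before relying on the alerts, it is worth confirming that Traefik actually serves metrics. The sketch below checks a metrics payload for a sample of a given metric; has_metric is a hypothetical helper name, and the port in the usage comment (9100) is an assumption based on the chart's usual metrics entry point, so adjust it to your values.

```shell
# has_metric NAME
# Read Prometheus text-format metrics on stdin and succeed if NAME appears
# as an actual sample line (not only in a "# HELP" or "# TYPE" comment).
has_metric() {
  grep -Eq "^$1(\{| )"
}

# Example usage (needs cluster access, so it is left commented out):
#   kubectl port-forward -n traefik deployment/traefik 9100:9100 &
#   curl -s http://localhost:9100/metrics | has_metric traefik_config_reloads_total \
#     && echo "metrics endpoint is live"
```

If the check passes but the target is missing in Prometheus, compare the port name and labels in the ServiceMonitor with what kubectl get svc -n traefik --show-labels reports.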

Step 4: Deploying Applications with Automatic SSL

Let's deploy a sample application with resource limits, health probes, and an NGINX Prometheus exporter sidecar. The HTTPS route and certificate for it are added in Step 5:

# production-app.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: production-web-app
  namespace: default
  labels:
    app: production-web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: production-web-app
  template:
    metadata:
      labels:
        app: production-web-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9113"
        prometheus.io/path: "/metrics"
    spec:
      containers:
      - name: web-app
        image: nginx:1.21-alpine
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: 64Mi
            cpu: 50m
          limits:
            memory: 128Mi
            cpu: 100m
        volumeMounts:
        - name: nginx-config
          mountPath: /etc/nginx/conf.d
        livenessProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
      - name: nginx-exporter
        image: nginx/nginx-prometheus-exporter:0.10.0
        args:
        - -nginx.scrape-uri=http://localhost/nginx_status
        ports:
        - containerPort: 9113
          name: metrics
        resources:
          requests:
            memory: 32Mi
            cpu: 25m
          limits:
            memory: 64Mi
            cpu: 50m
      volumes:
      - name: nginx-config
        configMap:
          name: nginx-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-config
  namespace: default
data:
  default.conf: |
    server {
        listen 80;
        server_name _;
        
        location /health {
            access_log off;
            return 200 "healthy\n";
            add_header Content-Type text/plain;
        }
        
        location /ready {
            access_log off;
            return 200 "ready\n";
            add_header Content-Type text/plain;
        }
        
        location /nginx_status {
            stub_status on;
            access_log off;
            allow 127.0.0.1;
            deny all;
        }
        
        location / {
            root /usr/share/nginx/html;
            index index.html;
            try_files $uri $uri/ =404;
        }
    }
---
apiVersion: v1
kind: Service
metadata:
  name: production-web-app
  namespace: default
  labels:
    app: production-web-app
spec:
  ports:
  - port: 80
    targetPort: 80
    name: http
  - port: 9113
    targetPort: 9113
    name: metrics
  selector:
    app: production-web-app

kubectl apply -f production-app.yaml

Step 5: Advanced Traefik Configuration

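Define middleware for rate limiting, retries, circuit breaking, compression, and security headers, then attach all of them to an IngressRoute that requests a Let's Encrypt certificate: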

# advanced-routing.yaml
# Rate limiting middleware
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: rate-limit
  namespace: default
spec:
  rateLimit:
    burst: 100
    average: 50
---
# Retry middleware
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: retry
  namespace: default
spec:
  retry:
    attempts: 3
---
# Circuit breaker middleware
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: circuit-breaker
  namespace: default
spec:
  circuitBreaker:
    expression: NetworkErrorRatio() > 0.3
---
# Compress responses
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: compress
  namespace: default
spec:
  compress: {}
---
# Add security headers
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: security-headers
  namespace: default
spec:
  headers:
    customRequestHeaders:
      X-Forwarded-Proto: "https"
    customResponseHeaders:
      X-Frame-Options: "DENY"
      X-Content-Type-Options: "nosniff"
      X-XSS-Protection: "1; mode=block"
      Strict-Transport-Security: "max-age=31536000; includeSubDomains; preload"
---
# Production ingress route with all middleware
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: production-app
  namespace: default
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`app.yourdomain.com`)
      kind: Rule
      services:
        - name: production-web-app
          port: 80
      middlewares:
        - name: rate-limit
        - name: retry
        - name: circuit-breaker
        - name: compress
        - name: security-headers
  tls:
    certResolver: letsencrypt

kubectl apply -f advanced-routing.yaml
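
To see the rate limiter working, you can send a burst of requests and count how many are rejected with HTTP 429. The helper below is a sketch; count_code is a hypothetical name, and the burst size and parallelism in the usage comment are illustrative values, not tuned recommendations.

```shell
# count_code CODE
# Read one HTTP status code per line on stdin and print how many equal CODE.
count_code() {
  # grep -c exits non-zero when the count is 0, so mask that exit status.
  grep -c -x "$1" || true
}

# Example usage (needs the route from this step, so it is left commented out):
#   seq 1 300 | xargs -P 20 -I{} \
#     curl -s -o /dev/null -w '%{http_code}\n' https://app.yourdomain.com/ \
#     | count_code 429
```

With average: 50 and burst: 100, a non-zero count is only expected when the client briefly exceeds those limits, which is why the requests are sent in parallel rather than one at a time.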

Step 6: Blue-Green Deployments with Traefik

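Implement zero-downtime deployments by splitting traffic between two Services with a weighted TraefikService. The manifest assumes a second Deployment and Service named production-web-app-green already exist alongside the production-web-app Service from Step 4. All traffic starts on blue; shift the weights once green is healthy: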

# blue-green-deployment.yaml
apiVersion: traefik.containo.us/v1alpha1
kind: TraefikService
metadata:
  name: blue-green-service
  namespace: default
spec:
  weighted:
    services:
    - name: production-web-app
      weight: 100
      port: 80
    - name: production-web-app-green
      weight: 0
      port: 80
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: blue-green-route
  namespace: default
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`bg.yourdomain.com`)
      kind: Rule
      services:
        - name: blue-green-service
          kind: TraefikService
  tls:
    certResolver: letsencrypt

kubectl apply -f blue-green-deployment.yaml
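
Shifting traffic then comes down to re-applying the TraefikService with new weights. The sketch below renders that manifest for a given split; render_weights is a hypothetical helper, and the rule that the two weights must add up to 100 is a convention enforced here for readability, not something Traefik requires.

```shell
# render_weights BLUE GREEN
# Print the blue-green TraefikService manifest with the given weights.
render_weights() {
  blue="$1"; green="$2"
  # Self-imposed sanity check so weights read as percentages.
  if [ $((blue + green)) -ne 100 ]; then
    echo "weights must sum to 100" >&2
    return 1
  fi
  cat <<EOF
apiVersion: traefik.containo.us/v1alpha1
kind: TraefikService
metadata:
  name: blue-green-service
  namespace: default
spec:
  weighted:
    services:
    - name: production-web-app
      weight: $blue
      port: 80
    - name: production-web-app-green
      weight: $green
      port: 80
EOF
}

# Example usage (needs cluster access, so it is left commented out):
#   render_weights 90 10 | kubectl apply -f -   # canary: 10% to green
#   render_weights 0 100 | kubectl apply -f -   # full cutover
```

Rolling back is the same command with the weights reversed, which is the main appeal of doing the switch at the router instead of replacing pods.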

Step 7: Setting Up Grafana Dashboards for Traefik

Import the official Traefik dashboard in Grafana:

Traefik Official Dashboard (Dashboard ID: 4475)
Traefik 2.0 Dashboard (Dashboard ID: 11462)

Or create a custom dashboard with key metrics:

{
  "dashboard": {
    "id": null,
    "title": "Traefik Production Metrics",
    "panels": [
      {
        "title": "Request Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(traefik_service_requests_total[5m])",
            "legendFormat": "{{service}} - {{method}}"
          }
        ]
      },
      {
        "title": "Response Time",
        "type": "graph",
        "targets": [
          {
            "expr": "histogram_quantile(0.95, rate(traefik_service_request_duration_seconds_bucket[5m]))",
            "legendFormat": "95th percentile"
          }
        ]
      },
      {
        "title": "Error Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "sum(rate(traefik_service_requests_total{code=~\"5..\"}[5m])) / sum(rate(traefik_service_requests_total[5m]))",
            "legendFormat": "Error Rate"
          }
        ]
      }
    ]
  }
}

Step 8: SSL Certificate Monitoring

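The TraefikCertificateExpiringSoon alert from Step 3 watches certificates from inside Traefik. As an external check, this CronJob connects to each domain every six hours and posts an alert to Alertmanager when a certificate has fewer than 30 days left: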

# cert-monitor.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cert-monitor
  namespace: traefik
spec:
  schedule: "0 */6 * * *"  # Every 6 hours
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: cert-checker
            image: alpine:latest
            command:
            - /bin/sh
            - -c
            - |
              # coreutils provides GNU date; BusyBox date cannot parse openssl's notAfter format
              apk add --no-cache openssl curl coreutils
              
              # Check certificate expiration
              DOMAINS="app.yourdomain.com traefik.yourdomain.com"
              
              for domain in $DOMAINS; do
                EXPIRY=$(echo | openssl s_client -servername $domain -connect $domain:443 2>/dev/null | \
                         openssl x509 -noout -dates | grep notAfter | cut -d= -f2)
                EXPIRY_EPOCH=$(date -d "$EXPIRY" +%s)
                CURRENT_EPOCH=$(date +%s)
                DAYS_TO_EXPIRY=$(( ($EXPIRY_EPOCH - $CURRENT_EPOCH) / 86400 ))
                
                echo "Certificate for $domain expires in $DAYS_TO_EXPIRY days"
                
                if [ $DAYS_TO_EXPIRY -lt 30 ]; then
                  echo "WARNING: Certificate for $domain expires soon!"
                  # Send alert to monitoring system
                  curl -X POST http://prometheus-alertmanager.monitoring:9093/api/v2/alerts \
                    -H "Content-Type: application/json" \
                    -d "[{\"labels\":{\"alertname\":\"CertificateExpiringSoon\",\"domain\":\"$domain\",\"severity\":\"warning\"}}]"
                fi
              done
          restartPolicy: OnFailure

kubectl apply -f cert-monitor.yaml
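
The same check can be run from a workstation before the CronJob is deployed. This is a minimal sketch of the arithmetic the job performs; days_between and cert_not_after are hypothetical helper names, and the usage line assumes GNU date for parsing the expiry string.

```shell
# days_between NOW_EPOCH EXPIRY_EPOCH
# Print the whole number of days from NOW_EPOCH to EXPIRY_EPOCH.
days_between() {
  echo $(( ($2 - $1) / 86400 ))
}

# cert_not_after HOST
# Print the notAfter date of the certificate served on HOST:443.
cert_not_after() {
  echo | openssl s_client -servername "$1" -connect "$1:443" 2>/dev/null |
    openssl x509 -noout -enddate | cut -d= -f2
}

# Example usage (needs network access and GNU date, so it is left commented out):
#   expiry=$(date -d "$(cert_not_after app.yourdomain.com)" +%s)
#   days_between "$(date +%s)" "$expiry"
```

Let's Encrypt certificates are valid for 90 days and Traefik normally renews them well before expiry, so a result under 30 usually means renewal is failing rather than merely pending.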

Step 9: Load Testing and Performance Validation

Validate your setup with load testing:

# Store the k6 test script in a ConfigMap
kubectl create configmap k6-script --from-literal=script.js='
import http from "k6/http";
import { check, sleep } from "k6";

export let options = {
  stages: [
    { duration: "2m", target: 20 },
    { duration: "5m", target: 20 },
    { duration: "2m", target: 40 },
    { duration: "5m", target: 40 },
    { duration: "2m", target: 0 },
  ],
};

export default function() {
  let response = http.get("https://app.yourdomain.com");
  check(response, {
    "status is 200": (r) => r.status === 200,
    "response time < 500ms": (r) => r.timings.duration < 500,
  });
  sleep(1);
}
'

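# Run the load test by streaming the script from the ConfigMap into k6 on stdin.
# The stages in the script control the load, so no --vus/--duration flags are passed.
kubectl get configmap k6-script -o jsonpath='{.data.script\.js}' | \
  kubectl run k6 --image=grafana/k6:latest --rm -i --restart=Never -- run -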

Production Best Practices

1. Resource Management

Always set appropriate limits:

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi

2. High Availability

Run multiple Traefik replicas:

deployment:
  replicas: 3
  
# Use pod anti-affinity
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app.kubernetes.io/name
            operator: In
            values:
            - traefik
        topologyKey: kubernetes.io/hostname
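
Anti-affinity spreads replicas across nodes, but a node drain can still evict several of them at once. A PodDisruptionBudget is one way to guard against that. The sketch below prints a manifest from a hypothetical helper, traefik_pdb_manifest, and assumes the pods carry the app.kubernetes.io/name: traefik label used in the anti-affinity rule above.

```shell
# traefik_pdb_manifest
# Print a PodDisruptionBudget that keeps at least one Traefik pod available
# during voluntary disruptions such as node drains.
traefik_pdb_manifest() {
  cat <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: traefik
  namespace: traefik
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: traefik
EOF
}

# Example usage (needs cluster access, so it is left commented out):
#   traefik_pdb_manifest | kubectl apply -f -
```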

3. Security Hardening

Use network policies to restrict access
Enforce Pod Security Standards with Pod Security Admission (PodSecurityPolicy was removed in Kubernetes 1.25)
Regularly rotate certificates
Monitor for suspicious traffic patterns
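
As an illustration of the first item, the sketch below prints a NetworkPolicy that lets only Traefik reach the sample application on port 80 and only the monitoring namespace reach the exporter on port 9113. The helper name app_netpol_manifest and the policy name are hypothetical; the manifest also assumes your CNI plugin enforces NetworkPolicy and that namespaces carry the automatic kubernetes.io/metadata.name label.

```shell
# app_netpol_manifest
# Print a NetworkPolicy restricting ingress to the production-web-app pods.
app_netpol_manifest() {
  cat <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-traefik-and-monitoring
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: production-web-app
  policyTypes:
  - Ingress
  ingress:
  # Web traffic may only come from the Traefik namespace.
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: traefik
    ports:
    - protocol: TCP
      port: 80
  # Metrics scrapes may only come from the monitoring namespace.
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: monitoring
    ports:
    - protocol: TCP
      port: 9113
EOF
}

# Example usage (needs cluster access, so it is left commented out):
#   app_netpol_manifest | kubectl apply -f -
```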

4. Backup and Recovery

Backup ACME certificate data
Document disaster recovery procedures
Test certificate renewal processes
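
For the first item, one simple approach is to copy acme.json out of the running pod. This is a sketch, not a full backup strategy: backup_acme is a hypothetical helper, and it assumes the storage path /data/acme.json and the traefik Deployment name from Step 1.

```shell
# backup_acme OUTFILE
# Copy Traefik's ACME storage to OUTFILE on the local machine.
# The file contains private keys, so it is created readable by the owner only.
backup_acme() {
  (
    umask 077
    kubectl exec --namespace traefik deployment/traefik -- \
      cat /data/acme.json > "$1"
  )
}

# Example usage (needs cluster access, so it is left commented out):
#   backup_acme "acme-$(date +%Y%m%d).json"
```

Store the copy somewhere encrypted. Restoring it to a fresh volume avoids re-requesting certificates, which matters because Let's Encrypt rate-limits issuance per domain.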

Troubleshooting Common Issues

SSL Certificate Issues

# Check certificate status
kubectl logs -n traefik deployment/traefik | grep -i acme

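# Verify ACME storage. Traefik's built-in resolver keeps certificates in
# /data/acme.json rather than in cert-manager Certificate resources.
kubectl exec -n traefik deployment/traefik -- ls -l /data/acme.json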

High Latency

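# Check Traefik metrics. Forward to the Deployment, because the Service does not
# expose the metrics port by default. 9100 is the chart's usual metrics port;
# adjust it if your values place metrics on another entry point.
kubectl port-forward -n traefik deployment/traefik 9100:9100
# Visit http://localhost:9100/metrics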

# Monitor backend health
kubectl get endpoints

Rate Limiting Issues

# Check middleware configuration
kubectl get middleware -A
kubectl describe middleware rate-limit -n default

What's Next?

In Part 3, we'll complete our journey by adding enterprise-grade security with Akamai App and API Protection, plus EdgeDNS integration for global performance. You'll learn:

Akamai App & API Protection configuration
EdgeDNS integration with cert-manager
Global load balancing and failover
DDoS protection and Web Application Firewall
Performance optimization with edge caching

Conclusion

You now have a production-ready ingress controller that automatically manages SSL certificates, provides intelligent routing, and integrates seamlessly with your monitoring stack. This setup gives you:

Zero-touch SSL management with automatic renewal
Advanced traffic management with middleware and routing
Complete observability with metrics and alerting
Production-ready security with rate limiting and headers
Blue-green deployment capabilities for zero-downtime updates

The combination of Traefik's automatic service discovery, Let's Encrypt integration, and Prometheus monitoring creates a robust foundation that scales with your applications while maintaining security and observability.

Next up in Part 3: We'll add the final layer—enterprise security with Akamai App & API Protection and global DNS management for a truly production-ready Kubernetes platform.

Alexander Cedergren is a Solutions Engineer specializing in Kubernetes, edge computing, and cloud security. Follow the series to master production Kubernetes deployments.