Deployment Guide Creator

An expert in creating production-ready deployment documentation.

Core Principles

Structure & Organization

Prerequisites listed first
Environment-specific instructions
Verification steps after each phase
Rollback procedures documented
Operational readiness covered

Documentation Standards

Imperative tone for instructions
Exact commands with expected outputs
Version specifications for all tools
Context explaining why each step matters
Estimated execution times per phase
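The "exact commands with expected outputs" and "estimated execution times" standards can be sketched as a small wrapper. This is a minimal illustration, not part of any tool: `run_step` is a hypothetical helper name, and it assumes that a substring match against the combined stdout and stderr is a good enough definition of "expected output".

```shell
# run_step DESCRIPTION EXPECTED COMMAND [ARGS...]
# Hypothetical helper: runs one documented command, checks that its output
# contains the EXPECTED substring, and reports how long the step took.
run_step() {
  desc=$1
  expected=$2
  shift 2
  start=$(date +%s)
  # Capture stdout and stderr together, as an operator would see them.
  output=$("$@" 2>&1)
  status=$?
  elapsed=$(( $(date +%s) - start ))
  if [ "$status" -eq 0 ] && printf '%s\n' "$output" | grep -qF -- "$expected"; then
    echo "OK: $desc (${elapsed}s)"
    return 0
  fi
  # On failure, show the exit code and the full output for the guide reader.
  echo "FAIL: $desc (${elapsed}s, exit $status)"
  printf '%s\n' "$output"
  return 1
}
```

A guide step could then read `run_step "Verify cluster connectivity" "is running" kubectl cluster-info`, which records both the expected text and the measured duration in one place.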

Standard Guide Structure

Deployment Guide: [Application Name]

Overview

Application description
Deployment strategy (blue-green, rolling, canary)
Architecture diagram
Key contacts

Prerequisites

System Requirements

OS: Ubuntu 22.04 LTS
RAM: 8GB minimum
Disk: 50GB SSD
Network: 100Mbps

Required Tools

| Tool | Version | Purpose |
|------|---------|---------|
| Docker | 24.0+ | Containerization |
| kubectl | 1.28+ | Kubernetes CLI |
| Helm | 3.12+ | Package management |
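Minimum versions like the ones in this table can be checked mechanically. The sketch below assumes GNU `sort` with `-V` (version sort) is available; `version_ge` is a hypothetical helper name, not a standard command.

```shell
# version_ge ACTUAL MINIMUM
# Succeeds when ACTUAL is greater than or equal to MINIMUM.
# Relies on `sort -V`, which orders strings as version numbers.
version_ge() {
  lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
  # If the smaller of the two is the minimum, the actual version is new enough.
  [ "$lowest" = "$2" ]
}
```

For example, `version_ge "$(docker version --format '{{.Client.Version}}')" 24.0 || echo "Docker is too old"` would flag a client older than the documented requirement.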

Access Requirements

SSH access to jump server
Kubernetes cluster credentials
Container registry credentials
Secrets management access

Security Checklist

VPN connection established
MFA configured
SSH keys rotated (< 90 days)

Pre-Deployment Checklist

Code Readiness

All tests passing in CI
Code review approved
Security scan completed
Documentation updated

Environment Checks

Target cluster healthy
Database backups verified
Monitoring alerts silenced
Maintenance window scheduled

Rollback Preparation

Previous version tagged
Rollback procedure tested
Data migration reversible
Communication plan ready

Deployment Phases

Phase 1: Infrastructure Prep

Estimated time: 10 minutes

1. Verify cluster connectivity

kubectl cluster-info

Expected: Kubernetes control plane is running

2. Check node readiness

kubectl get nodes

Expected: All nodes in "Ready" state

3. Verify namespace exists

kubectl get namespace production

If it does not exist:

kubectl create namespace production
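Step 3 can be folded into a single idempotent helper so the guide does not depend on the operator reading the output of the first command. This is a sketch; `ensure_namespace` is a hypothetical name, and it assumes that any failure of `kubectl get namespace` means the namespace is absent (a connectivity error would also take that branch).

```shell
# ensure_namespace NAME
# Creates the namespace only when `kubectl get namespace` cannot find it.
ensure_namespace() {
  ns=$1
  if kubectl get namespace "$ns" >/dev/null 2>&1; then
    echo "namespace $ns already exists"
  else
    kubectl create namespace "$ns"
  fi
}
```

Calling `ensure_namespace production` is then safe to repeat on every deployment.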

Phase 2: Application Deployment

Estimated time: 15 minutes

1. Pull latest configuration

git pull origin main
cd deployment/kubernetes

2. Update image tag in values

export IMAGE_TAG=v1.2.3
sed -i "s/tag: .*/tag: ${IMAGE_TAG}/" values.yaml

3. Deploy with Helm

helm upgrade --install myapp ./charts/myapp \
  --namespace production \
  --values values.yaml \
  --wait \
  --timeout 10m

Expected output:

Release "myapp" has been upgraded. Happy Helming!
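The `sed` command in step 2 inserts the tag into the expression unchecked, so a tag containing `/` or other special characters would break it or write something unintended. A guarded version might look like the sketch below. `set_image_tag` is a hypothetical helper, the `vMAJOR.MINOR.PATCH` tag shape is an assumption based on the `v1.2.3` example, and `sed -i` is used in its GNU form.

```shell
# set_image_tag FILE TAG
# Validates TAG, confirms FILE has a "tag:" line, then rewrites it in place.
set_image_tag() {
  file=$1
  tag=$2
  # Accept only tags shaped like v1.2.3 (assumed convention).
  if ! printf '%s\n' "$tag" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'; then
    echo "refusing tag '$tag': expected the form vMAJOR.MINOR.PATCH" >&2
    return 1
  fi
  # Fail loudly instead of silently changing nothing.
  if ! grep -q 'tag: ' "$file"; then
    echo "no 'tag:' line found in $file" >&2
    return 1
  fi
  # Same substitution as step 2: every "tag:" line in the file is rewritten.
  sed -i "s/tag: .*/tag: ${tag}/" "$file"
}
```

Usage would be `set_image_tag values.yaml "$IMAGE_TAG"` in place of the bare `sed` call.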

Phase 3: Database Migration

Estimated time: 5-30 minutes (depends on data size)

1. Create backup before migration

kubectl exec -n production deploy/myapp -- \
  pg_dump -Fc > backup_$(date +%Y%m%d_%H%M%S).dump

2. Run migrations

kubectl exec -n production deploy/myapp -- \
  npm run migrate

3. Verify migration status

kubectl exec -n production deploy/myapp -- \
  npm run migrate:status
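Because the redirect in step 1 is performed by the local shell, a failed `kubectl exec` still leaves an empty backup file behind. A quick sanity check before running migrations could look like this sketch. `check_backup` is a hypothetical helper; it relies on custom-format archives from `pg_dump -Fc` starting with the bytes `PGDMP`, and it does not replace a real restore test.

```shell
# check_backup FILE
# Rejects missing or empty files and files without the custom-format header.
check_backup() {
  file=$1
  if [ ! -s "$file" ]; then
    echo "backup $file is missing or empty" >&2
    return 1
  fi
  # Read only the first five bytes, which hold the archive magic string.
  magic=$(head -c 5 "$file")
  if [ "$magic" != "PGDMP" ]; then
    echo "backup $file does not look like a pg_dump custom-format archive" >&2
    return 1
  fi
  echo "backup $file looks like a custom-format dump"
}
```

Running `check_backup backup_20240101_120000.dump` between steps 1 and 2 stops the migration when the dump is obviously unusable.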

Kubernetes Deployment Example

deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: production
  labels:
    app: myapp
    version: v1.2.3
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: registry.example.com/myapp:v1.2.3
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
          env:
            - name: NODE_ENV
              value: "production"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: myapp-secrets
                  key: database-url

Post-Deployment Verification

Verification Checklist

Health Checks

All pods running: kubectl get pods -n production
Endpoints healthy: curl -s https://api.example.com/health
Database connected: Check application logs

Performance Validation

Response time < 200ms (p95)
Error rate < 0.1%
Memory usage stable
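The p95 target above can be checked from raw measurements. The sketch below uses the nearest-rank method, which is one of several percentile definitions; `p95` is a hypothetical helper that expects one latency value in milliseconds per line on stdin.

```shell
# p95: print the 95th percentile (nearest-rank) of the numbers on stdin.
# Returns non-zero when no input is given.
p95() {
  sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      # ceil(0.95 * N) computed with integer-friendly arithmetic
      rank = int((NR * 95 + 99) / 100)
      print v[rank]
    }'
}
```

With whole-millisecond samples in a hypothetical `latencies.txt`, the threshold check becomes `[ "$(p95 < latencies.txt)" -lt 200 ]`.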

Security Checks

TLS certificates valid
No sensitive data in logs
Rate limiting active

Verification Script

#!/bin/bash
# verify-deployment.sh

echo "=== Deployment Verification ==="

# Check pod status
echo "Checking pods..."
READY_PODS=$(kubectl get pods -n production -l app=myapp \
  -o jsonpath='{.items[*].status.containerStatuses[0].ready}' | tr ' ' '\n' | grep -c true)
TOTAL_PODS=$(kubectl get pods -n production -l app=myapp --no-headers | wc -l)

if [ "$READY_PODS" -eq "$TOTAL_PODS" ]; then
  echo "✅ All $TOTAL_PODS pods ready"
else
  echo "❌ Only $READY_PODS of $TOTAL_PODS pods ready"
  exit 1
fi

# Check endpoints
echo "Checking health endpoint..."
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" https://api.example.com/health)
if [ "$HTTP_CODE" -eq 200 ]; then
  echo "✅ Health endpoint returning 200"
else
  echo "❌ Health endpoint returning $HTTP_CODE"
  exit 1
fi

# Check logs for errors
echo "Checking for errors in logs..."
ERROR_COUNT=$(kubectl logs -n production -l app=myapp --since=5m | grep -c "ERROR")
if [ "$ERROR_COUNT" -lt 5 ]; then
  echo "✅ Error count acceptable: $ERROR_COUNT"
else
  echo "⚠️ High error count: $ERROR_COUNT"
fi

echo "=== Verification Complete ==="

Rollback Procedures

Automatic Rollback Triggers

Health check failures > 3 consecutive
Error rate > 5% for 5 minutes
P99 latency > 2 seconds for 5 minutes
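The first trigger, more than three consecutive health check failures, can be sketched as a polling loop. `should_rollback` is a hypothetical helper, and the probe command, interval, and attempt count are assumptions that each team would tune. A production setup would more likely rely on its monitoring system than on a script like this.

```shell
# should_rollback THRESHOLD INTERVAL ATTEMPTS COMMAND [ARGS...]
# Runs the probe up to ATTEMPTS times, INTERVAL seconds apart. Returns 0
# as soon as more than THRESHOLD consecutive probes fail, otherwise 1.
should_rollback() {
  threshold=$1
  interval=$2
  attempts=$3
  shift 3
  failures=0
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      failures=0                      # any success resets the streak
    else
      failures=$((failures + 1))
    fi
    if [ "$failures" -gt "$threshold" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  return 1
}
```

For example, `should_rollback 3 10 30 curl -sf https://api.example.com/health && helm rollback myapp -n production --wait` would roll back to the previous release after four failed probes in a row; `helm rollback` without a revision targets the previous release.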

Manual Rollback Steps

Estimated time: 5 minutes

1. Identify previous release

helm history myapp -n production

2. Rollback to previous version

helm rollback myapp [REVISION] -n production --wait

3. Verify rollback

kubectl get pods -n production -l app=myapp
curl -s https://api.example.com/health

4. If database migration needs reversal

kubectl exec -n production deploy/myapp -- \
  npm run migrate:down

Data Recovery

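Restore from backup if needed. The dump file was written on the machine that ran the backup command, not inside the pod, so stream it to pg_restore over stdin:

kubectl exec -i -n production deploy/myapp -- \
  pg_restore -d myapp_production < backup_20240101_120000.dump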

Troubleshooting

Common Issues

Issue: Pods stuck in ImagePullBackOff

Symptoms:

Pods show ImagePullBackOff status
Events show "Failed to pull image"

Resolution:

Verify image exists: docker pull registry.example.com/myapp:v1.2.3
Check registry credentials: kubectl get secret regcred -n production

Recreate secret if needed:

kubectl create secret docker-registry regcred \
  --docker-server=registry.example.com \
  --docker-username=user \
  --docker-password=pass \
  -n production

Issue: Health checks failing

Symptoms:

Pods restarting frequently
Readiness probe failures in events

Resolution:

Check application logs: kubectl logs -n production deploy/myapp
Verify environment variables: kubectl exec -n production deploy/myapp -- env
Test health endpoint manually: kubectl port-forward -n production deploy/myapp 8080:8080
Increase probe timeouts if startup is slow

Log Locations

| Log Type | Location | Command |
|----------|----------|---------|
| Application | Pod stdout | `kubectl logs -n production deploy/myapp` |
| Ingress | Ingress controller | `kubectl logs -n ingress deploy/nginx` |
| Events | Kubernetes events | `kubectl get events -n production` |
| Audit | Cluster audit logs | `/var/log/kubernetes/audit.log` |

Emergency Contacts

| Role | Name | Contact |
|------|------|---------|
| On-call Engineer | PagerDuty | #ops-escalation |
| Database Admin | DBA Team | dba@example.com |
| Security | Security Team | security@example.com |

CI/CD Integration

# .github/workflows/deploy.yml
name: Deploy to Production

on:
  push:
    tags:
      - 'v*'

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production

    steps:
      - uses: actions/checkout@v4

      - name: Configure kubectl
        uses: azure/k8s-set-context@v3
        with:
          kubeconfig: ${{ secrets.KUBE_CONFIG }}

      - name: Deploy with Helm
        run: |
          helm upgrade --install myapp ./charts/myapp \
            --namespace production \
            --set image.tag=${{ github.ref_name }} \
            --wait \
            --timeout 10m

      - name: Verify deployment
        run: ./scripts/verify-deployment.sh

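      - name: Notify on failure
        if: failure()
        uses: slackapi/slack-github-action@v1
        with:
          payload: |
            {"text": "⚠️ Deployment failed for ${{ github.ref_name }}"}
        env:
          # Assumed secret name; v1 of the action reads the webhook from the environment
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
          SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK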

Best Practices

- Test rollback: exercise rollback procedures regularly

- Incremental deploys: start with a small percentage of traffic

- Feature flags: separate deploy from release

- Monitoring first: set up monitoring before deploying

- Document everything: every step must be reproducible

- Automate verification: use scripts instead of manual checks