Nginx 负载均衡

Nginx 负载均衡配置策略和最佳实践


📋 目录


负载均衡基础

什么是负载均衡

负载均衡是将客户端请求分发到多个后端服务器,以提高性能、可靠性和可扩展性。

客户端请求
    ↓
负载均衡器(Nginx)
    ↓
┌───────────┬───────────┬───────────┐
│ 后端服务器 │ 后端服务器 │ 后端服务器 │
│ Server 1  │ Server 2  │ Server 3  │
└───────────┴───────────┴───────────┘

负载均衡的好处:

  • 提高并发处理能力
  • 增强系统可靠性(故障转移)
  • 实现水平扩展
  • 优化资源利用率

Nginx 负载均衡架构

# 1. 定义后端服务器组
upstream backend {
    server backend1.example.com:8080;
    server backend2.example.com:8080;
    server backend3.example.com:8080;
}
 
# 2. 使用 upstream
server {
    listen 80;
    server_name example.com;
 
    location / {
        proxy_pass http://backend;
    }
}

Upstream 配置

基本 Upstream

upstream backend {
    server backend1.example.com:8080;
    server backend2.example.com:8080;
    server backend3.example.com:8080;
}

带权重的 Upstream

upstream backend {
    # weight 越大,分配的请求越多
    server backend1.example.com:8080 weight=3;  # 3/6 = 50%
    server backend2.example.com:8080 weight=2;  # 2/6 = 33%
    server backend3.example.com:8080 weight=1;  # 1/6 = 17%
}

带参数的 Upstream

upstream backend {
    # max_fails: 最大失败次数
    # fail_timeout: 失败超时时间
    # backup: 备用服务器
    # down: 标记为不可用
    # max_conns: 最大连接数
 
    server backend1.example.com:8080 max_fails=3 fail_timeout=30s weight=3 max_conns=100;
    server backend2.example.com:8080 max_fails=3 fail_timeout=30s weight=2 max_conns=100;
    server backend3.example.com:8080 max_fails=3 fail_timeout=30s weight=1 backup;
    server backend4.example.com:8080 down;  # 手动下线
}

长连接配置

upstream backend {
    server backend1.example.com:8080;
    server backend2.example.com:8080;
 
    keepalive 32;              # 每个 worker 缓存的空闲长连接数上限
    keepalive_timeout 60s;     # 空闲长连接超时(需要 nginx 1.15.3+)
    keepalive_requests 100;    # 每个长连接最大请求数(需要 nginx 1.15.3+)
}
 
server {
    location / {
        proxy_pass http://backend;
 
        # HTTP 1.1 和长连接
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}

DNS 解析

# 注意:resolver 指令不能写在 upstream 块内,
# 只能配置在 http/server/location 级别
resolver 8.8.8.8 8.8.4.4 valid=300s;  # DNS 结果缓存 5 分钟
resolver_timeout 5s;                   # DNS 解析超时

upstream backend {
    # 域名默认只在启动/重载时解析一次;
    # 运行时动态重解析需要 nginx-plus 的 resolve 参数,
    # 或在 proxy_pass 中使用变量来触发每次请求时解析
    server api.example.com:8080;
}

负载均衡算法

1. 轮询 (Round Robin) - 默认

upstream backend {
    # 默认就是轮询
    server backend1.example.com:8080;
    server backend2.example.com:8080;
    server backend3.example.com:8080;
}
 
# 请求分配顺序:
# Request 1 → backend1
# Request 2 → backend2
# Request 3 → backend3
# Request 4 → backend1
# Request 5 → backend2
# ...

2. 加权轮询 (Weighted Round Robin)

upstream backend {
    server backend1.example.com:8080 weight=3;  # 性能好的服务器
    server backend2.example.com:8080 weight=2;
    server backend3.example.com:8080 weight=1;  # 性能差的服务器
}
 
# 请求分配顺序:
# Request 1 → backend1
# Request 2 → backend1
# Request 3 → backend1
# Request 4 → backend2
# Request 5 → backend2
# Request 6 → backend3
# Request 7 → backend1
# ...

3. IP 哈希 (IP Hash)

upstream backend {
    ip_hash;  # 根据客户端 IP 哈希
 
    server backend1.example.com:8080;
    server backend2.example.com:8080;
    server backend3.example.com:8080;
}
 
# 同一个客户端 IP 总是分配到同一个后端
# 用于需要会话保持的场景

4. 最少连接 (Least Connections)

upstream backend {
    least_conn;  # 选择连接数最少的后端
 
    server backend1.example.com:8080;
    server backend2.example.com:8080;
    server backend3.example.com:8080;
}
 
# 适合请求处理时间差异较大的场景

5. 通用哈希 (Generic Hash)

upstream backend {
    hash $request_uri consistent;  # 根据请求 URI 哈希
 
    server backend1.example.com:8080;
    server backend2.example.com:8080;
    server backend3.example.com:8080;
}
 
# 同一个 URI 总是分配到同一个后端
# consistent 参数使用一致性哈希,添加/删除服务器时影响较小
 
# 也可以根据其他变量哈希
hash $remote_addr;          # 根据客户端 IP
hash $cookie_sessionid;     # 根据 Cookie
hash $arg_user;             # 根据 URL 参数

6. 随机 (Random)

upstream backend {
    random;  # 随机选择
 
    server backend1.example.com:8080;
    server backend2.example.com:8080;
    server backend3.example.com:8080;
}
 
# 随机选择,配合权重使用(random 指令需要 nginx 1.15.1+)
upstream backend {
    random two;  # 先随机选出两台服务器,再从中选择连接数较少的一台

    server backend1.example.com:8080 weight=3;
    server backend2.example.com:8080 weight=2;
    server backend3.example.com:8080 weight=1;
}

健康检查

被动健康检查

upstream backend {
    # max_fails: 最大失败次数
    # fail_timeout: 失败超时时间
    server backend1.example.com:8080 max_fails=3 fail_timeout=30s;
    server backend2.example.com:8080 max_fails=3 fail_timeout=30s;
    server backend3.example.com:8080 max_fails=3 fail_timeout=30s;
}
 
server {
    location / {
        proxy_pass http://backend;
 
        # 在哪些情况下认为失败
        proxy_next_upstream error timeout http_500 http_502 http_503 http_504;
 
        # 重试次数
        proxy_next_upstream_tries 3;
 
        # 重试超时
        proxy_next_upstream_timeout 10s;
    }
}

主动健康检查(需要 nginx-plus)

upstream backend {
    zone backend 64k;
 
    server backend1.example.com:8080;
    server backend2.example.com:8080;
    server backend3.example.com:8080;
 
    # 主动健康检查
    health_check;
}

使用第三方模块(nginx_upstream_check_module)

upstream backend {
    server backend1.example.com:8080;
    server backend2.example.com:8080;
    server backend3.example.com:8080;
 
    # 主动健康检查
    check interval=3000 rise=2 fall=3 timeout=1000 type=http;
    check_http_send "GET /health HTTP/1.0\r\n\r\n";
    check_http_expect_alive http_2xx http_3xx;
}

健康检查配置说明

| 参数 | 说明 | 推荐值 |
| --- | --- | --- |
| max_fails | 最大失败次数 | 3 |
| fail_timeout | 失败超时时间 | 30s |
| backup | 标记为备用服务器 | - |
| down | 标记为不可用 | - |
| max_conns | 最大连接数 | 根据后端能力 |
| slow_start | 慢启动时间(nginx-plus 商业版参数) | 30s |

会话保持

IP 哈希会话保持

upstream backend {
    ip_hash;  # 根据 IP 哈希,同一 IP 访问同一后端
 
    server backend1.example.com:8080;
    server backend2.example.com:8080;
    server backend3.example.com:8080;
}
Sticky Cookie 会话保持(需要 nginx-plus)

upstream backend {
    server backend1.example.com:8080;
    server backend2.example.com:8080;
    server backend3.example.com:8080;
 
    # 根据 Cookie 粘性(sticky 为 nginx-plus 商业版指令)
    sticky cookie srv_id expires=1h domain=.example.com path=/;
}

通用哈希会话保持

upstream backend {
    hash $cookie_sessionid consistent;  # 根据 sessionid Cookie
 
    server backend1.example.com:8080;
    server backend2.example.com:8080;
    server backend3.example.com:8080;
}
 
# 其他哈希方式
hash $remote_addr;          # 根据 IP
hash $arg_userid;           # 根据 URL 参数
hash $http_authorization;   # 根据认证信息

应用层会话保持

upstream backend {
    server backend1.example.com:8080;
    server backend2.example.com:8080;
    server backend3.example.com:8080;
}
 
map $cookie_backend $backend_name {
    default backend;
    backend1.example.com:8080 backend1;
    backend2.example.com:8080 backend2;
    backend3.example.com:8080 backend3;
}
 
server {
    location / {
        # 如果有后端 cookie,使用 map 解析出的后端
        # (注意使用 map 的输出变量 $backend_name,而不是原始 cookie 值)
        if ($cookie_backend) {
            proxy_pass http://$backend_name;
            break;
        }
 
        # 否则负载均衡
        proxy_pass http://backend;
 
        # 设置后端 cookie
        add_header Set-Cookie "backend=$upstream_addr; Path=/; Max-Age=3600";
    }
}

高级配置

动态配置后端

# 使用变量作为后端
upstream backend {
    server backend1.example.com:8080;
    server backend2.example.com:8080;
    server backend3.example.com:8080;
}
 
upstream backend_backup {
    server backup1.example.com:8080;
    server backup2.example.com:8080;
}
 
map $http_authorization $backend_pool {
    default "backend";
    ~*premium "backend_premium";
    ~*free    "backend_free";
}
 
server {
    location / {
        proxy_pass http://$backend_pool;
    }
}

根据地理位置负载均衡

# geo 模块只能按 IP/CIDR 匹配,不支持国家代码
geo $backend_pool {
    default "backend_us";
    192.168.1.0/24 "backend_local";
    10.0.0.0/8     "backend_internal";
}
 
# 按国家分流需要 GeoIP 模块提供的 $geoip_country_code 变量
map $geoip_country_code $backend_pool_by_country {
    default "backend_us";
    CN      "backend_cn";
    JP      "backend_jp";
    ~^(DE|FR|GB|IT|ES)$ "backend_eu";  # 欧洲国家(无 "EU" 代码,需逐一列举)
}
 
upstream backend_us { server us1.example.com:8080; }
upstream backend_local { server local1.example.com:8080; }
upstream backend_internal { server internal1.example.com:8080; }
upstream backend_cn { server cn1.example.com:8080; }
upstream backend_jp { server jp1.example.com:8080; }
upstream backend_eu { server eu1.example.com:8080; }
 
server {
    location / {
        proxy_pass http://$backend_pool;
    }
}

基于响应时间的动态负载均衡

upstream backend {
    server backend1.example.com:8080 max_fails=3 fail_timeout=30s;
    server backend2.example.com:8080 max_fails=3 fail_timeout=30s;
    server backend3.example.com:8080 max_fails=3 fail_timeout=30s;
 
    # 使用最少时间算法(nginx-plus)
    least_time header;
    # least_time last_byte;  # 或者根据最后字节时间
}

熔断机制

upstream backend {
    server backend1.example.com:8080 max_fails=3 fail_timeout=30s;
    server backend2.example.com:8080 max_fails=3 fail_timeout=30s;
    server backend3.example.com:8080 max_fails=3 fail_timeout=30s;
}
 
server {
    location / {
        proxy_pass http://backend;
 
        # 熔断配置
        proxy_next_upstream error timeout http_500 http_502 http_503 http_504;
        proxy_next_upstream_tries 3;
        proxy_next_upstream_timeout 5s;
 
        # 如果所有后端都失败,返回错误
        error_page 500 502 503 504 /error.html;
    }
}

灰度发布

upstream backend_old {
    server backend-old1.example.com:8080;
    server backend-old2.example.com:8080;
}
 
upstream backend_new {
    server backend-new1.example.com:8080;
    server backend-new2.example.com:8080;
}
 
# 灰度规则
map $cookie_gray $backend_pool {
    default "backend_old";
    "new"   "backend_new";
    "old"   "backend_old";
}
 
map $remote_addr $backend_ip {
    default "backend_old";
    ~^192\.168\.1\.[0-5]$ "backend_new";  # 特定 IP 段使用新版本
}
 
server {
    location / {
        # 优先使用 Cookie 规则,其次使用 IP 规则
        set $final_backend $backend_pool;
        if ($final_backend = "backend_old") {
            set $final_backend $backend_ip;
        }
 
        proxy_pass http://$final_backend;
 
        # 设置灰度 Cookie
        add_header Set-Cookie "gray=$cookie_gray; Path=/; Max-Age=86400";
    }
}

性能优化

长连接优化

upstream backend {
    server backend1.example.com:8080 max_fails=3 fail_timeout=30s;
    server backend2.example.com:8080 max_fails=3 fail_timeout=30s;
 
    keepalive 32;              # 每个 worker 的长连接数
    keepalive_timeout 60s;     # 长连接超时
    keepalive_requests 100;    # 每个长连接最大请求数
}
 
server {
    location / {
        proxy_pass http://backend;
 
        # HTTP 1.1 和长连接
        proxy_http_version 1.1;
        proxy_set_header Connection "";
 
        # 长连接超时
        proxy_connect_timeout 5s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }
}

缓冲区优化

server {
    location / {
        proxy_pass http://backend;
 
        # 响应缓冲
        proxy_buffering on;
        proxy_buffers 16 4k;      # 增加缓冲区数量
        proxy_buffer_size 8k;     # 增加缓冲区大小
        proxy_busy_buffers_size 16k;
 
        # 请求缓冲
        client_body_buffer_size 128k;
        proxy_request_buffering on;
    }
}

超时优化

server {
    location / {
        proxy_pass http://backend;
 
        # 连接超时
        proxy_connect_timeout 3s;   # 短连接超时
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
 
        # 后端失败重试
        proxy_next_upstream error timeout http_500 http_502 http_503 http_504;
        proxy_next_upstream_tries 3;
        proxy_next_upstream_timeout 10s;
    }
}

上游服务器配置优化

upstream backend {
    server backend1.example.com:8080
        max_fails=3              # 最大失败次数
        fail_timeout=30s         # 失败超时时间
        max_conns=100            # 最大连接数
        slow_start=30s;          # 慢启动时间(逐渐恢复流量,nginx-plus 商业版参数)
 
    server backend2.example.com:8080
        max_fails=3
        fail_timeout=30s
        max_conns=100
        slow_start=30s;
 
    keepalive 32;
    keepalive_timeout 60s;
    keepalive_requests 100;
}

🔧 监控和调试

监控负载均衡状态

server {
    listen 80;
    server_name example.com;
 
    location / {
        proxy_pass http://backend;
 
        # 添加后端信息到响应头(调试用)
        add_header X-Upstream $upstream_addr always;
        add_header X-Upstream-Status $upstream_status always;
        add_header X-Upstream-Response-Time $upstream_response_time always;
    }
 
    # 状态监控(需要第三方模块 nginx_upstream_check_module;
    # nginx-plus 则使用 api / dashboard 功能)
    location /upstream_status {
        check_status;
        access_log off;
    }
}

日志记录

# 自定义日志格式,包含负载均衡信息
log_format upstream_log '$remote_addr - $remote_user [$time_local] '
                       '"$request" $status $body_bytes_sent '
                       '"$http_referer" "$http_user_agent" '
                       'rt=$request_time uct="$upstream_connect_time" '
                       'uht="$upstream_header_time" urt="$upstream_response_time" '
                       'addr="$upstream_addr" status="$upstream_status"';
 
access_log /var/log/nginx/upstream.log upstream_log;

📚 相关链接


相关笔记