Linux System Operations Log – Page 174 – Yet Another WordPress Site

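[Command usage] Skipping a large directory with rsync

This note does not cover rsync in general; it only records how to skip an unwanted large directory during a sync. The background:

1. A 1 TB disk was added to an Ubuntu server.

2. The user directories under the existing /home need to be synced to the directory where the new disk is mounted (/home2).

3. The sync should skip one large directory (T3) under one user's (test) home directory; the full path of the directory to skip is "/home/test/T3".

rsync's --exclude option does this. The command: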

sudo rsync -aux --exclude "test/T3" /home/ /home2/

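Note that the pattern after --exclude is matched against paths relative to the transfer root (here, relative to /home/), so it must be written as test/T3, or anchored as /test/T3. The full filesystem path /home/test/T3 never matches, and the directory would not be skipped.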
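The relative-path rule can be sketched in Python as a small helper that turns an absolute path into an anchored exclude pattern. This is only an illustration: build_rsync_command is a hypothetical name, and the helper builds the argument list without running rsync.

```python
import shlex
from pathlib import PurePosixPath


def build_rsync_command(src_root, dest_root, skip_abs_path):
    """Build an rsync argv that skips skip_abs_path (an absolute path under src_root)."""
    # relative_to() raises ValueError if the path is not inside src_root
    relative = PurePosixPath(skip_abs_path).relative_to(PurePosixPath(src_root))
    # A leading "/" anchors the pattern at the transfer root, not at the filesystem root
    pattern = "/" + str(relative)
    return [
        "rsync", "-aux", "--exclude", pattern,
        src_root.rstrip("/") + "/",
        dest_root.rstrip("/") + "/",
    ]


cmd = build_rsync_command("/home", "/home2", "/home/test/T3")
print(shlex.join(cmd))
```

Passing a path outside the source root, such as /var/test/T3, raises ValueError instead of silently producing a pattern that can never match.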

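Problem

Servers often end up with log directories of tens of gigabytes, or oversized attachment directories. Deleting them with a plain rm can be very slow.

rsync can clear such a directory much faster: sync an empty directory onto it with --delete-before.

How it works

Solution

First create an empty directory, empty_dir, then sync the contents of the empty directory onto the non-empty one. Both paths end with a slash so that rsync compares the contents of the two directories rather than copying empty_dir itself into the target.

rsync --delete-before -a -H -v --progress --stats /www/webdev/newhouse/house/empty_dir/ /www/webdev/xxx.com/house/log/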

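Chapter 1 Introduction to real-time data synchronization

1.1 What real-time synchronization is and how to implement it

A. Use a monitoring service (inotify) to watch for changes in the directory being synchronized on the data server.

B. When the data in the directory changes, use rsync to push it to the backup server.

1.2 Ways to implement real-time synchronization

inotify + rsync

sersync

1.2.1 How real-time synchronization works

1.3 Data synchronization with inotify + rsync

1.3.1 About inotify

inotify is a powerful, fine-grained, asynchronous file system event monitoring mechanism. The Linux kernel has supported it since 2.6.13. Through inotify you can monitor events such as creation, deletion, modification, and moves in a file system; third-party software can use this kernel interface to track changes to files, and inotify-tools is one such program. Zhou Yang, while at Kingsoft, developed a similar real-time synchronization tool, sersync.

Note:

sersync is built on top of inotify and is more capable: it adds scheduled retransmission of failed transfers, a filtering mechanism, an interface for CDN integration, and multithreaded operation.

inotify is an event-driven mechanism: it lets applications respond to file system events as they happen, without polling through something like cron. Polling cannot be real-time and spends system resources on checks that find nothing. Because inotify is event-driven, it responds to events as they occur and avoids the cost of polling, which makes it a natural event-notification interface.

Several tools are built on inotify:

inotify-tools, sersync, lsyncd

1.3.2 How inotify and rsync are combined

inotify watches the directory being synchronized.

rsync performs the actual synchronization.

A script ties the two together.

1.4 Prerequisites for deploying inotify

inotify requires kernel 2.6.13 or later. On such a kernel, the following three files should exist even before inotify-tools is installed.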

[root@backup ~]# ll /proc/sys/fs/inotify/

total 0

-rw-r--r-- 1 root root 0 Oct 17 10:12 max_queued_events

-rw-r--r-- 1 root root 0 Oct 17 10:12 max_user_instances

-rw-r--r-- 1 root root 0 Oct 17 10:12 max_user_watches

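1.4.1 The three important files

1.4.2 [Service tuning] The values in these three files can be raised to watch a larger scope

1.4.3 [Official description] The three important files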

[root@nfs01 ~]# man proc

/proc/sys/fs/inotify (since Linux 2.6.13)

      This  directory  contains    files    max_queued_events,

      max_user_instances, and max_user_watches, that can be used

      to limit the amount of kernel memory consumed by the  inotify interface. 

for further details, see inotify(7).

Section 7 of the manual describes these inotify files in detail.

[root@nfs01 ~]# man 7 inotify

/proc/sys/fs/inotify/max_queued_events

      The  value  in this file is used when an application calls

      inotify_init(2) to set an upper limit  on  the  number  of

      events  that  can  be  queued to the corresponding inotify

      instance.  Events in excess of this limit are dropped, but

      an IN_Q_OVERFLOW event is always generated.



/proc/sys/fs/inotify/max_user_instances

      This  specifies  an  upper  limit on the number of inotify

      instances that can be created per real user ID.



/proc/sys/fs/inotify/max_user_watches

      This specifies an upper limit on  the  number  of  watches

      that can be created per real user ID.
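To check the current values before tuning them, the three files can simply be read. A minimal Python sketch, assuming a Linux host; read_inotify_limits and its base parameter are names made up for this illustration.

```python
from pathlib import Path

INOTIFY_KEYS = ("max_queued_events", "max_user_instances", "max_user_watches")


def read_inotify_limits(base="/proc/sys/fs/inotify"):
    """Return the inotify limits found under base as a dict of name -> int."""
    limits = {}
    for key in INOTIFY_KEYS:
        path = Path(base) / key
        # Skip silently on kernels or systems that do not expose the file
        if path.is_file():
            limits[key] = int(path.read_text().strip())
    return limits


for name, value in read_inotify_limits().items():
    print(f"{name} = {value}")
```

On a system without inotify support the function returns an empty dict rather than raising.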

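1.5 inotify-tools and its options

1.5.1 Two ways to install

1) yum install -y inotify-tools

2) Build from source

Notes:

Installing with yum requires the EPEL repository.

http://mirrors.aliyun.com

Building from source requires downloading the package from GitHub.

Reference material for inotify-tools:

https://github.com/rvoicilas/inotify-tools/wiki

1.5.2 The two main programs that inotify-tools installs

inotifywait (the main one):

Waits for specific file system events (open, close, delete, and so on) on the watched files or directories. It blocks after starting, which makes it suitable for shell scripts.

inotifywatch:

Collects usage statistics for the watched file system, that is, counts of how often each file system event occurred.

Note: real-time synchronization mainly relies on inotifywait to monitor the directory.

1.5.3 inotifywait command options

1.5.4 Event types that can be given to -e

1.5.4.1 [Example] Testing events with inotifywait

1. Create event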

[root@nfs01 data]# touch test2.txt

[root@nfs01 ~]# inotifywait -mrq  /data --timefmt "%d-%m-%y %H:%M" --format "%T %w%f 事件信息: %e" -e create

17-10-17 11:19 /data/test2.txt 事件信息: CREATE

2. Delete event

[root@nfs01 data]# rm -f test1.txt

[root@nfs01 ~]# inotifywait -mrq  /data --timefmt "%d-%m-%y %H:%M" --format "%T %w%f 事件信息: %e" -e delete

17-10-17 11:28 /data/test1.txt 事件信息: DELETE

3. Modify event

[root@nfs01 data]# echo "132" > test.txt

[root@nfs01 ~]# inotifywait -mrq  /data --timefmt "%d-%m-%y %H:%M" --format "%T %w%f 事件信息: %e" -e close_write

17-10-17 11:30 /data/test.txt 事件信息: CLOSE_WRITE,CLOSE

4. Move event: moved_to

[root@nfs01 data]# mv /etc/hosts .

[root@nfs01 ~]# inotifywait -mrq  /data --timefmt "%d-%m-%y %H:%M" --format "%T %w%f 事件信息: %e" -e moved_to

17-10-17 11:33 /data/hosts 事件信息: MOVED_TO

Move event: moved_from

[root@nfs01 data]# mv ./hosts  /tmp/

[root@nfs01 ~]# inotifywait -mrq  /data --timefmt "%d-%m-%y %H:%M" --format "%T %w%f 事件信息: %e" -e moved_from

17-10-17 11:34 /data/hosts 事件信息: MOVED_FROM

1.5.5 The inotifywait --format option (output format)

1.5.6 The inotifywait --timefmt option (time format)

1.5.6.1 Changing the date format of the output

[root@nfs01 ~]# inotifywait -mrq  /data --timefmt "%d/%m/%y %H:%M" --format "%T %w%f"

17/10/17 11:12 /data/test1.txt

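1.5.7 Summary of the important events for -e

1.6 Testing the inotifywait command

About these tests:

Two terminal windows are needed:

Window 1 runs inotifywait.
Window 2 performs operations on the directory; the events recorded by inotifywait show up in window 1.

1.6.1 What happens when a file is created ↓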

[root@nfs01 ~]# inotifywait /data

Setting up watches.

Watches established.

/data/ CREATE test1.txt

/data/ OPEN test1.txt

/data/ ATTRIB test1.txt

/data/ CLOSE_WRITE,CLOSE test1.txt

The output above ↑ is the sequence of events inotifywait reports when a file is created with:

[root@nfs01 data]# touch test1.txt

1.6.2 What happens when a directory is created ↓

[root@nfs01 data]# mkdir testdir

[root@nfs01 ~]#

/data/ CREATE,ISDIR testdir

1.6.3 Watching files in a subdirectory ↓

[root@nfs01 data]# touch  testdir/test01.txt

[root@nfs01 ~]# inotifywait -mrq  /data

/data/testdir/ OPEN test01.txt

/data/testdir/ ATTRIB test01.txt

/data/testdir/ CLOSE_WRITE,CLOSE test01.txt

1.6.4 What happens when sed edits a file in place

[root@nfs01 data]# sed 's#132#123#g' test.txt -i



[root@nfs01 ~]# inotifywait -mrq  /data --timefmt "%d-%m-%y %H:%M" --format "%T %w%f 事件信息: %e"

 /data/test.txt 事件信息: OPEN

 /data/sedDh5R8v 事件信息: CREATE

 /data/sedDh5R8v 事件信息: OPEN

 /data/test.txt 事件信息: ACCESS

 /data/sedDh5R8v 事件信息: MODIFY

 /data/sedDh5R8v 事件信息: ATTRIB

 /data/sedDh5R8v 事件信息: ATTRIB

 /data/test.txt 事件信息: CLOSE_NOWRITE,CLOSE

 /data/sedDh5R8v 事件信息: CLOSE_WRITE,CLOSE

 /data/sedDh5R8v 事件信息: MOVED_FROM

 /data/test.txt 事件信息: MOVED_TO

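How sed -i performs the replacement:

1. Create a temporary file.
2. Copy the original file's content into the temporary file and apply the substitution there; the original file is left untouched.
3. Rename the temporary file over the original file.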
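The same write-to-a-temporary-file-then-rename pattern can be reproduced in Python. This is a sketch of the general technique, not of sed's actual implementation; replace_in_file is a hypothetical helper, and unlike sed -i it does not copy the original file's permissions onto the new file.

```python
import os
import tempfile


def replace_in_file(path, old, new):
    """Rewrite path with old replaced by new, using a temp file plus rename."""
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, "r", encoding="utf-8") as src:
        content = src.read()
    # Step 1: create the temporary file next to the original (same filesystem)
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="sed")
    try:
        # Step 2: write the modified content; the original is still untouched
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(content.replace(old, new))
        # Step 3: rename over the original (watchers see MOVED_FROM / MOVED_TO)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "test.txt")
    with open(target, "w", encoding="utf-8") as f:
        f.write("132\n")
    replace_in_file(target, "132", "123")
    with open(target, encoding="utf-8") as f:
        result = f.read()

print(repr(result))
```

Because the final step is a rename, a watcher that only listens for close_write on the original file name would miss the change, which is why the sync scripts later in this post also listen for moved_to.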

1.6.5 Using -e with inotifywait

inotifywait -mrq /data --timefmt "%d/%m/%y %H:%M" --format "%T %w%f 事件信息: %e" -e create

Note: this listens for create events only.

inotifywait -mrq /data --timefmt "%d/%m/%y %H:%M" --format "%T %w%f 事件信息: %e"

Note: without -e, all events are reported.

Delete event (delete)

    # inotifywait -mrq /data --timefmt "%F %H:%M" --format "%T %w%f 事件信息: %@e" -e delete

    2017-10-17 11:28 /data/02.txt 事件信息: DELETE

    2017-10-17 11:28 /data/03.txt 事件信息: DELETE

    2017-10-17 11:28 /data/04.txt 事件信息: DELETE

Modify event (close_write)

    # inotifywait -mrq /data --timefmt "%F %H:%M" --format "%T %w%f 事件信息: %@e" -e delete,close_write

    2017-10-17 11:30 /data/oldgirl.txt 事件信息: CLOSE_WRITE@CLOSE

    2017-10-17 11:30 /data/.oldgirl.txt.swx 事件信息: CLOSE_WRITE@CLOSE

    2017-10-17 11:30 /data/.oldgirl.txt.swx 事件信息: DELETE

    2017-10-17 11:30 /data/.oldgirl.txt.swp 事件信息: CLOSE_WRITE@CLOSE

    2017-10-17 11:30 /data/.oldgirl.txt.swp 事件信息: DELETE

    2017-10-17 11:30 /data/.oldgirl.txt.swp 事件信息: CLOSE_WRITE@CLOSE

    2017-10-17 11:30 /data/.oldgirl.txt.swp 事件信息: DELETE

Move event (moved_to)

    inotifywait -mrq /data --timefmt "%F %H:%M" --format "%T %w%f 事件信息: %@e" -e delete,close_write,moved_to

    2017-10-17 11:34 /data/hosts 事件信息: MOVED_TO

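1.7 Diagram of the real-time synchronization command options

Chapter 2 Deploying real-time synchronization with inotify + rsync

2.1 Milestone 1: deploy the rsync service

2.1.1 rsync server deployment

1) Check whether the software is installed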

[root@backup ~]# rpm -qa |grep rsync

rsync-3.0.6-12.el6.x86_64

Tip: when you know a command is useful but not which package provides it:

    yum  provides  rsync

    provides  Find what package provides the given value

2) Configure the service

[root@backup ~]# vim /etc/rsyncd.conf

uid = rsync

gid = rsync

use chroot = no

max connections = 200

timeout = 300

pid file = /var/run/rsyncd.pid

lock file = /var/run/rsync.lock

log file = /var/log/rsyncd.log

ignore errors

read only = false

list = false

hosts allow = 172.16.1.0/24

auth users = rsync_backup

secrets file = /etc/rsync.password

[backup]

comment = "backup dir by oldboy"

path = /backup

[nfsbackup]

comment = "nfsbackup dir by hzs"

path = /nfsbackup

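3) Create the user that the rsync daemon runs as

[root@backup ~]# useradd -s /sbin/nologin -M rsync

4) Create the backup directory and hand it to the rsync user

[root@backup ~]# mkdir /nfsbackup/

[root@backup ~]# chown -R rsync.rsync /nfsbackup/

5) Create the password file for the authentication user and set its mode to 600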

echo "rsync_backup:oldboy123" >>/etc/rsync.password

chmod 600 /etc/rsync.password

6) Start the rsync service

rsync --daemon

The server side is now configured.

[root@backup ~]# ps -ef |grep rsync

root      2076      1  0 17:05 ?        00:00:00 rsync --daemon

root      2163  1817  0 17:38 pts/1    00:00:00 grep --color=auto rsync

2.1.2 rsync client configuration

1) Check whether the software is installed

[root@backup ~]# rpm -qa |grep rsync

rsync-3.0.6-12.el6.x86_64

2) Create the password file and set its mode to 600

echo "oldboy123" >>/etc/rsync.password

chmod 600 /etc/rsync.password

3) Test a transfer

[root@nfs01 sersync]# rsync -avz /data  rsync_backup@172.16.1.41::nfsbackup  --password-file=/etc/rsync.password

sending incremental file list

data/

data/.hzs

data/.tar.gz

data/.txt

2.2 Milestone 2: deploy inotify

First confirm that the EPEL repository is available, since inotify-tools is installed from it.

[root@nfs01 ~]# yum repolist

Loaded plugins: fastestmirror, security

Loading mirror speeds from cached hostfile

 * base: mirrors.aliyun.com

 * epel: mirrors.aliyun.com

 * extras: mirrors.aliyun.com

 * updates: mirrors.aliyun.com

repo id  repo name                                      status

base    CentOS-6 - Base - mirrors.aliyun.com            6,706

epel    Extra Packages for Enterprise Linux 6 - x86_64  12,401

extras  CentOS-6 - Extras - mirrors.aliyun.com              46

updates  CentOS-6 - Updates - mirrors.aliyun.com            722

repolist: 19,875

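2.2.1 Install inotify-tools

Two ways to install:

1) yum install -y inotify-tools

2) Build from source

Note:

Building from source requires downloading the package from GitHub.

Reference material for inotify-tools:

https://github.com/rvoicilas/inotify-tools/wiki

2.2.2 The two commands that inotify-tools installs (inotifywait, inotifywatch)

[root@nfs01 ~]# rpm -ql inotify-tools

/usr/bin/inotifywait      # the main one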

/usr/bin/inotifywatch

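2.2.2.1 What inotifywait and inotifywatch do

Two tools (commands) are installed: inotifywait and inotifywatch.

inotifywait: waits for specific file system events (open, close, delete, and so on) on the watched files or directories. It blocks after starting, which makes it suitable for shell scripts.

inotifywatch: collects usage statistics for the watched file system, that is, counts of how often each file system event occurred.

Note: after a yum install the commands can be used directly; after a source build, run them from the bin directory of the install location.

# What the man pages say

# man inotifywait

inotifywait - wait for changes to files using inotify

# man inotifywatch

inotifywatch - gather filesystem access statistics using inotify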

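2.3 Milestone 3: write a script that combines rsync and inotify

2.3.1 The rsync command

rsync -avz --delete /data/ rsync_backup@172.16.1.41::nfsbackup --password-file=/etc/rsync.password

2.3.2 The inotifywait command

inotifywait -mrq /data --format "%w%f"  -e create,delete,moved_to,close_write

2.3.3 The script

[root@nfs01 sersync]# vim /server/scripts/inotify.sh
#!/bin/bash
inotifywait -mrq /data --format "%w%f" -e create,delete,moved_to,close_write|
while read line
do
        rsync -az --delete /data/ rsync_backup@172.16.1.41::nfsbackup --password-file=/etc/rsync.password
done

Notes on the script:

A for loop iterates over a fixed list and stops when the list is exhausted.

A while loop keeps running as long as its condition holds. Here the condition is that read receives another line from inotifywait, so the loop runs for as long as inotifywait keeps reporting events.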

2.3.4 Improving the script

#!/bin/bash

Path=/data
backup_Server=172.16.1.41


# Sync only the changed file while it still exists; otherwise (deletes, directories) resync the whole tree.
/usr/bin/inotifywait -mrq --format '%w%f' -e create,close_write,delete "$Path" | while read -r line
do
    if [ -f "$line" ];then
        rsync -az "$line" --delete rsync_backup@$backup_Server::nfsbackup --password-file=/etc/rsync.password
    else
        cd "$Path" &&
        rsync -az ./ --delete rsync_backup@$backup_Server::nfsbackup --password-file=/etc/rsync.password
    fi

done
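The decision the improved script makes for each event can be written as a pure function, which makes it easy to check without touching a real server. A Python sketch under stated assumptions: plan_sync is a hypothetical name, the remote string reuses the values from the script, and the single-file branch drops --delete because it has no effect there.

```python
import os
import tempfile

REMOTE = "rsync_backup@172.16.1.41::nfsbackup"
PASSWORD_OPT = "--password-file=/etc/rsync.password"


def plan_sync(event_path, watch_root, remote=REMOTE):
    """Return the rsync argv to run for one path reported by inotifywait."""
    if os.path.isfile(event_path):
        # The file still exists: push just that file
        return ["rsync", "-az", event_path, remote, PASSWORD_OPT]
    # Deleted path or directory: resync the whole tree and propagate deletions
    root = watch_root.rstrip("/") + "/"
    return ["rsync", "-az", "--delete", root, remote, PASSWORD_OPT]


with tempfile.NamedTemporaryFile() as tmp:
    file_cmd = plan_sync(tmp.name, "/data")
dir_cmd = plan_sync("/data/removed-example.txt", "/data")

print(file_cmd)
print(dir_cmd)
```

Like the shell version, pushing a single file places it at the top of the module, so files in subdirectories lose their relative path; the whole-tree branch does not have that problem.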

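2.4 Milestone 4: test the script

2.4.1 Run the script in the background

Start the script, then create six files in /data.

[root@nfs01 data]# sh  /server/scripts/inotify.sh &

[root@nfs01 data]# touch {1..6}.txt

On the backup server, the six files have already been synchronized.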

[root@backup ~]# ll /nfsbackup/

total 8

-rw-r--r-- 1 rsync rsync 0 Oct 17 12:06 1.txt

-rw-r--r-- 1 rsync rsync 0 Oct 17 12:06 2.txt

-rw-r--r-- 1 rsync rsync 0 Oct 17 12:06 3.txt

-rw-r--r-- 1 rsync rsync 0 Oct 17 12:06 4.txt

-rw-r--r-- 1 rsync rsync 0 Oct 17 12:06 5.txt

-rw-r--r-- 1 rsync rsync 0 Oct 17 12:06 6.txt

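2.5 Stopping a script built around a while loop (kill)

Either suspend the program with Ctrl+Z and then kill it with kill -9,
or leave it running and kill the process directly with one of the three kill commands (kill, killall, pkill).

Note: the three kill commands are not a cure-all. A suspended process does not act on the default termination signal until it is resumed; kill -9 still works, but it is dangerous.

2.5.1 List the jobs running in the background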

[root@nfs01 data]# jobs

[1]+  Running                sh /server/scripts/inotify.sh &

2.5.2 Bring a background job to the foreground with fg

[root@nfs01 data]# fg 1

sh /server/scripts/inotify.sh

2.6 Running a process in the foreground and background

    fg    -- foreground

    bg    -- background

2.6.1 Ways to run a script in the background

    01. sh inotify.sh &

    02. nohup sh inotify.sh &

    03. Use screen

2.7 Running a script in the background with screen

2.7.1 yum shows that the screen command belongs to the screen package

[root@test ~]# yum provides screen

Loaded plugins: fastestmirror, security

Loading mirror speeds from cached hostfile

 * base: mirrors.aliyun.com

 * epel: mirrors.aliyun.com

 * extras: mirrors.aliyun.com

 * updates: mirrors.aliyun.com

base                                                      | 3.7 kB    00:00   

epel                                                      | 4.3 kB    00:00   

extras                                                    | 3.4 kB    00:00   

updates                                                  | 3.4 kB    00:00   

screen-4.0.3-19.el6.x86_64 : A screen manager that supports multiple logins on

                          : one terminal

Repo        : base

Matched from:

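2.7.2 Install screen

[root@test ~]# yum install -y  screen

2.7.3 screen usage

Typing screen in the shell opens a screen session.

[root@test ~]# screen

Basic commands for running programs in the background with screen:

  screen -ls : list screen sessions

  screen -r ID : reattach to the given screen session

Key bindings inside screen

  Ctrl+a c : create a window

  Ctrl+a w : list windows

  Ctrl+a n : next window

  Ctrl+a p : previous window

  Ctrl+a 0-9 : switch to window 0 through 9

  Ctrl+a K (uppercase) : close the current window and switch to the next one

  (when the last window is closed, the session ends and you return to the original shell)

  exit : close the current window and switch to the next one

  (when the last window is closed, the session ends and you return to the original shell)

  Ctrl+a d : detach from the current session and return to the shell from which screen was started

  Ctrl+a " : list windows, in a different form from w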

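2.8 Real-time synchronization with sersync

http://www.linuxidc.com/Linux/2017-10/147899.htm

Note: the system used in this post is CentOS release 6.9 (Final) with kernel 2.6.32-696.10.1.el6.x86_64.

rsync server configuration steps

Create the configuration file:

The file does not exist by default and has to be created by hand.

vi /etc/rsyncd.conf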

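# Rsync server
# created by kendall 2017.10.18
## rsyncd.conf start ##
# rsyncd.conf only treats a line that starts with # as a comment, so each note sits on its own line.
# user and group the daemon uses for file access on behalf of clients
uid = rsync
gid = rsync
# security-related; enabling chroot limits the damage if the program has a bug
use chroot = no
# maximum number of simultaneous client connections
max connections = 2000
# I/O timeout in seconds before a connection is dropped
timeout = 300
# file that records the daemon's process ID
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsync.lock
# log file location
log file = /var/log/rsyncd.log
# ignore I/O errors
ignore errors
# not read-only, so clients may upload
read only = false
# do not show the modules in module listings
list = false
# allowed client network
hosts allow = 172.16.1.0/24
# denied clients
#hosts deny = 0.0.0.0/32
# user name for remote connections (purely virtual, not a system user)
auth users = rsync_backup
# file that holds the user names and passwords
secrets file = /etc/rsync.password
# first module and the directory it shares
[backup]
path = /backup
# second module and the directory it shares
[oldboy]
path = /data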

Create the user and the shared directories

useradd rsync -s /sbin/nologin -M
id rsync
mkdir /backup /data
chown -R rsync.rsync /backup/ /data/

Create the password file

echo "rsync_backup:654321" >/etc/rsync.password
chmod 600 /etc/rsync.password

Start rsync

rsync --daemon
netstat -lntup|grep rsync
ps -ef|grep rsync|grep -v grep

Start rsync at boot

echo "rsync --daemon" >>/etc/rc.local
cat /etc/rc.local

rsync client configuration steps

Create the password file

echo "654321" >/etc/rsync.password
chmod 600 /etc/rsync.password
ll /etc/rsync.password
cat /etc/rsync.password

Test pushing files

rsync -avz /tmp/ rsync_backup@server_ip::backup --password-file=/etc/rsync.password
rsync -avz /tmp/ rsync://rsync_backup@server_ip/backup/tmp/  --password-file=/etc/rsync.password

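Merging multiple dictionaries in Python

Example

x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}

Keys that appear in both are merged, with the later value overriding the earlier one. After merging x and y:

>>> z
{'a': 1, 'b': 3, 'c': 4}

Python 3.5

Python 3.5 added syntax that merges dictionaries in a single expression:

z = {**x, **y}

Here ** is the dictionary unpacking operator.

For details see: https://docs.python.org/dev/whatsnew/3.5.html#pep-448-additional-unpacking-generalizations

Python 2 and Python 3.0-3.4

Before Python 3.5 you have to write the merge function yourself.

def merge_dicts(*dicts):
    """Merge any number of dicts; later dicts override earlier ones."""
    result = {}
    for d in dicts:  # avoid shadowing the built-in name dict
        result.update(d)
    return result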
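A short check that the two approaches agree, and that neither modifies its inputs. The variable names z_unpack and z_func are only for this illustration.

```python
def merge_dicts(*dicts):
    """Merge any number of dicts; later dicts override earlier ones."""
    result = {}
    for d in dicts:
        result.update(d)
    return result


x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}

z_unpack = {**x, **y}        # Python 3.5+ unpacking syntax
z_func = merge_dicts(x, y)   # helper that also works on older versions

print(z_unpack)
print(z_func == z_unpack)
print(x, y)                  # the inputs are left unchanged
```

Swapping the argument order changes which value wins for the shared key 'b', so merge_dicts(y, x) gives 'b': 2.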

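A quick look at pygal

The examples come from the book 《Python编程从入门到实战》 (Python Crash Course) by Eric Matthes (US).

pygal is an SVG charting library. SVG (Scalable Vector Graphics) is a vector image format.

Opening an SVG file in a browser makes it easy to interact with the chart.

All of the code below was run in Jupyter Notebook.

Simulating dice rolls

Start with a simple example that simulates rolling a die.

import random

class Die:
    """
    A single die.
    """
    def __init__(self, num_sides=6):
        self.num_sides = num_sides

    def roll(self):
        return random.randint(1, self.num_sides)

Simulating rolls and visualizing the result

import pygal

die = Die()
result_list = []
# Roll 1000 times
for roll_num in range(1000):
    result = die.roll()
    result_list.append(result)

frequencies = []
# For each value from 1 to 6, count how often it came up
for value in range(1, die.num_sides + 1):
    frequency = result_list.count(value)
    frequencies.append(frequency)

# Bar chart
hist = pygal.Bar()
hist.title = 'Results of rolling one D6 1000 times'
# x-axis labels
hist.x_labels = [1, 2, 3, 4, 5, 6]
# Axis titles
hist.x_title = 'Result'
hist.y_title = 'Frequency of Result'
# Add the data; the first argument is the series title
hist.add('D6', frequencies)
# Save to disk; render_to_file writes SVG
hist.render_to_file('die_visual.svg')

Open the file in a browser and hover over a bar: the tooltip shows the series title "D6" together with the x and y values.

The six values come up about equally often (the theoretical probability of each is 1/6, and the more rolls, the closer the frequencies get to it).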
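The counting step can also be done in one pass with collections.Counter instead of calling list.count once per face. A sketch without the charting part; roll_frequencies and the fixed seed are choices made for this illustration, not part of the book's code.

```python
import random
from collections import Counter


def roll_frequencies(num_rolls, num_sides=6, seed=None):
    """Roll a die num_rolls times and return the count for each face, in order."""
    rng = random.Random(seed)  # a private generator keeps the demo reproducible
    counts = Counter(rng.randint(1, num_sides) for _ in range(num_rolls))
    # Counter returns 0 for a face that never came up
    return [counts[value] for value in range(1, num_sides + 1)]


frequencies = roll_frequencies(1000, seed=42)
print(frequencies)
print([round(f / 1000, 3) for f in frequencies])  # each should be near 1/6
```

The returned list has the same shape as the frequencies list above, so it can be passed straight to hist.add.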

Rolling two dice at once

Only a small change is needed: instantiate a second die.

die_1 = Die()
die_2 = Die()

result_list = []
for roll_num in range(5000):
    # Sum of the two dice
    result = die_1.roll() + die_2.roll()
    result_list.append(result)

frequencies = []
# Largest sum that can be rolled
max_result = die_1.num_sides + die_2.num_sides

for value in range(2, max_result + 1):
    frequency = result_list.count(value)
    frequencies.append(frequency)

# Visualize
hist = pygal.Bar()
hist.title = 'Results of rolling two D6 dice 5000 times'
hist.x_labels = [x for x in range(2, max_result + 1)]
hist.x_title = 'Result'
hist.y_title = 'Frequency of Result'
# Add the data
hist.add('two D6', frequencies)
# render_to_file writes SVG
hist.render_to_file('2_die_visual.svg')

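The chart shows that a sum of 7 comes up most often and a sum of 2 among the least often. Only one outcome gives 2, namely (1, 1), whereas six outcomes give 7: (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1). No other sum has as many outcomes, so 7 is the most likely result.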
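The counting argument can be verified by enumerating all 36 ordered outcomes of two six-sided dice. A small sketch; the name ways is only for this illustration.

```python
from collections import Counter
from itertools import product

# Count how many ordered (die_1, die_2) outcomes produce each sum
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in range(2, 13):
    print(total, ways[total], f"{ways[total] / 36:.3f}")
```

The table is symmetric around 7, so 12 is exactly as rare as 2 in theory; which of the two ends up lowest in a 5000-roll simulation is down to chance.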

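Working with JSON data: a world population map

Population data is needed.

Download population.json: http://pan.baidu.com/s/1pKLB9N1 (the data originally comes from okfn.org).

The file is one long list holding population figures for many countries from 1960 to 2015. The first entry is shown below; every later entry has the same keys.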
[ 
{
 "Country Name":"Arab World",
 "Country Code":"ARB",
 "Year":"1960",
 "Value":"92496099"
 },
...

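There are only four keys. Country Code is the country code, three letters here. Value is the population.

import json

filename = r'F:\Jupyter Notebook\matplotlib_pygal_csv_json\population.json'
with open(filename) as f:
    # json.load() turns the JSON file into Python objects, here a list of dicts
    pop_data = json.load(f)

cc_populations = {}
for pop_dict in pop_data:
    if pop_dict['Year'] == '2015':
        country_name = pop_dict['Country Name']
        # Some values are decimals, so convert to float first and then to int
        population = int(float(pop_dict['Value']))
        # str() is needed: concatenating a str with an int raises TypeError
        print(country_name + ': ' + str(population))

The program prints the 2015 population of each country; to analyze 2014 instead, just change the year in the code.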

Arab World: 392168030
Caribbean small states: 7116360
Central Europe and the Baltics: 103256779
Early-demographic dividend: 3122757473.68203
East Asia & Pacific: 2279146555
...

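Note that some population values are decimals, surprisingly. The values are stored as strings (str), and a string such as '35435.12432' cannot be passed to int() directly; it has to be converted to float first and then truncated to int, losing the fractional part.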
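The difference between the two conversions is easy to demonstrate. A minimal sketch; parse_population and int_directly are hypothetical helper names.

```python
def parse_population(raw_value):
    """Convert a population string such as '35435.12432' to an int, truncating."""
    return int(float(raw_value))


def int_directly(raw_value):
    """Try int() on the raw string; return None when Python rejects it."""
    try:
        return int(raw_value)
    except ValueError:
        return None


print(int_directly('35435.12432'))       # int() refuses a decimal string
print(parse_population('35435.12432'))   # float first, then int
print(parse_population('92496099'))      # whole-number strings still work
```

Truncation always rounds toward zero; if rounding to the nearest whole person matters, round(float(raw_value)) is the alternative.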

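Getting the two-letter country codes

The country codes in our data have three letters, but pygal's map tools use two-letter codes. Drawing a world map with pygal also requires an extra package:

pip install pygal_maps_world

The country codes live in the i18n module:

from pygal_maps_world.i18n import COUNTRIES

COUNTRIES is a dictionary whose keys are two-letter country codes and whose values are country names.

key -> value
af Afghanistan
al Albania
dz Algeria
ad Andorra
ao Angola

Write a function that returns pygal's two-letter country code for a given country name:

def get_country_code(country_name):
    """
    Return the two-letter country code for a country name.
    """
    for code, name in COUNTRIES.items():
        if name == country_name:
            return code
    return None

Drawing the world population map

The full code comes first; it uses the World class.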

import json

from pygal_maps_world.i18n import COUNTRIES
from pygal_maps_world.maps import World
# Color styles
from pygal.style import RotateStyle
from pygal.style import LightColorizedStyle

def get_country_code(country_name):
    """
    Return the two-letter country code for a country name.
    """
    for code, name in COUNTRIES.items():
        if name == country_name:
            return code
    return None

filename = r'F:\Jupyter Notebook\matplotlib_pygal_csv_json\population.json'
with open(filename) as f:
    pop_data = json.load(f)

cc_populations = {}
for pop_dict in pop_data:
    if pop_dict['Year'] == '2015':
        country_name = pop_dict['Country Name']

        # Some values are decimals, so convert to float first and then to int
        population = int(float(pop_dict['Value']))
        code = get_country_code(country_name)
        if code:
            cc_populations[code] = population

# Split into tiers so the color gradation is clearer
cc_populations_1,cc_populations_2, cc_populations_3 = {}, {}, {}
for cc, population in cc_populations.items():
    if population < 10000000:
        cc_populations_1[cc] = population
    elif population < 1000000000:
        cc_populations_2[cc] = population
    else:
        cc_populations_3[cc] = population

wm_style = RotateStyle('#336699', base_style=LightColorizedStyle)
world = World(style=wm_style)
world.title = 'World Populations in 2015, By Country'
world.add('0-10m', cc_populations_1)
world.add('10m-1bn', cc_populations_2)
world.add('>1bn', cc_populations_3)
world.render_to_file('world_population_2015.svg')

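A few variables matter here:

cc_populations is a dict of two-letter country code to population.
cc_populations_1, cc_populations_2, cc_populations_3 are three dicts that split the countries into tiers by population: under ten million in cc_populations_1, ten million to one billion in cc_populations_2, and over one billion in cc_populations_3. The point is to make the color gradation clearer and to make it easy to spot the most populous country in each tier. Because there are three tiers, add() is called three times, once per dict.
world = World(style=wm_style) creates the map object; the class is imported with from pygal_maps_world.maps import World.
wm_style = RotateStyle('#336699', base_style=LightColorizedStyle) changes pygal's default color theme. The first argument is a hexadecimal RGB color: the first two digits are R, the middle two G, and the last two B, and a larger value means a brighter channel. The second argument sets the base style to a light theme; pygal uses a darker theme by default, and this is how to override it.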
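To make the hexadecimal color notation concrete, here is how '#336699' breaks down into its three channels. hex_to_rgb is a hypothetical helper written for this illustration, not a pygal function.

```python
def hex_to_rgb(color):
    """Split a '#RRGGBB' string into a (red, green, blue) tuple of ints 0-255."""
    digits = color.lstrip('#')
    # Two hex digits per channel: positions 0-1, 2-3, 4-5
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


print(hex_to_rgb('#336699'))
print(hex_to_rgb('#000000'), hex_to_rgb('#ffffff'))  # darkest and brightest
```

Each channel ranges from 0 (none of that color) to 255 (full intensity), so '#336699' is a medium blue with little red.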

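China, No. 1

The chart uses three color families: purple for over one billion, blue for ten million to one billion, and green for under ten million. Within each family, the darkest shade marks the most populous country in that tier.

Look closely and some areas of the map are blank. That does not mean nobody lives there; the country names in our JSON data simply do not match the names in pygal's COUNTRIES dictionary. This is a shortcoming, but get_country_code can be changed so that it no longer returns None for a mismatched name: look up the right code in COUNTRIES for each such country and return it. For example:

# country_name comes from the JSON data and may differ from the name in COUNTRIES; the code above then returns None, which leaves that country blank on the map
if country_name == 'Yemen, Rep':
    return 'ye'

I have not actually tried this, though.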
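One way to organize such special cases is a small override table that is consulted before the normal lookup. This is an untested-against-the-real-data sketch: SAMPLE_COUNTRIES stands in for pygal's COUNTRIES dict so the example runs without the package, and NAME_OVERRIDES contains only the one mismatch mentioned above.

```python
# Stand-in for pygal_maps_world.i18n.COUNTRIES (assumption: same code -> name shape)
SAMPLE_COUNTRIES = {'af': 'Afghanistan', 'al': 'Albania', 'ye': 'Yemen'}

# Names as they appear in the JSON data, mapped to the two-letter code
NAME_OVERRIDES = {'Yemen, Rep': 'ye'}


def get_country_code(country_name, countries=SAMPLE_COUNTRIES, overrides=NAME_OVERRIDES):
    """Return the two-letter code for a country name, checking overrides first."""
    if country_name in overrides:
        return overrides[country_name]
    for code, name in countries.items():
        if name == country_name:
            return code
    return None


print(get_country_code('Albania'))
print(get_country_code('Yemen, Rep'))
print(get_country_code('Arab World'))  # aggregates have no country code
```

New mismatches then only require another entry in NAME_OVERRIDES, not another if branch in the function.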

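Analyzing data with a web API

Take GitHub as an example: I want to see the most popular Python repositories, sorted by stars.

Visit https://api.github.com/search/repositories?q=language:python&sort=stars to see them. The data looks roughly like this: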

{
  "total_count": 1767997,
  "incomplete_results": false,
  "items": [{
     {
       "id": 21289110,
      "name": "awesome-python",
      "full_name": "vinta/awesome-python",
      "owner": {
        "login": "vinta",
        ...
          },
       ...    
       "html_url": "https://github.com/vinta/awesome-python",
        ...
          "stargazers_count": 35234,
        ...

  }, {...}
      ...]
}

The third key, items, holds the 30 repositories with the most stars. name is the repository name, login under owner is the repository's owner, and html_url is the repository's URL (owner also has an html_url, but that one is the user's GitHub profile; we want the specific repository, so do not use the one under owner). stargazers_count is the key figure: the number of stars.

As for the first key, total_count is the total number of Python repositories. The second, incomplete_results, says whether the response is incomplete; it is normally false, meaning the response is complete.

import requests

url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
response = requests.get(url)
# 200 means the request succeeded
print('Status code:', response.status_code)
response_dict = response.json()

total_repo = response_dict['total_count']
repo_list = response_dict['items']
print('Total repositories: ', total_repo)
print('top', len(repo_list))
for repo_dict in repo_list:
    print('\nName: ', repo_dict['name'])
    print('Owner: ', repo_dict['owner']['login'])
    print('Stars: ', repo_dict['stargazers_count'])
    print('Repo: ', repo_dict['html_url'])
    print('Description: ', repo_dict['description'])

That already gives the result:

Status code: 200
Total repositories:  1768021
top 30

Name:  awesome-python
Owner:  vinta
Stars:  35236
Repo:  https://github.com/vinta/awesome-python
Description:  A curated list of awesome Python frameworks, libraries, software and resources

Name:  httpie
Owner:  jakubroztocil
Stars:  30149
Repo:  https://github.com/jakubroztocil/httpie
Description:  Modern command line HTTP client – user-friendly curl alternative with intuitive UI, JSON support, syntax highlighting, wget-like downloads, extensions, etc.  https://httpie.org

Name:  thefuck
Owner:  nvbn
Stars:  28535
Repo:  https://github.com/nvbn/thefuck
Description:  Magnificent app which corrects your previous console command.
...

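A chart makes this easier to take in.

Visualizing the data with pygal

The code is not hard. The key piece is plot_dict, which holds what is shown when the mouse hovers over a bar. Its keys are fixed names: value is the bar height, label is the tooltip text, and xlink is the repository URL, so clicking a bar opens that page in a new browser tab.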

import requests

import pygal
from pygal.style import LightColorizedStyle, LightenStyle

url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
response = requests.get(url)
# 200 means the request succeeded
print('Status code:', response.status_code)
response_dict = response.json()

total_repo = response_dict['total_count']
repo_list = response_dict['items']
print('Total repositories: ', total_repo)
print('top', len(repo_list))

names, plot_dicts = [], []
for repo_dict in repo_list:
    names.append(repo_dict['name'])
    # str() is needed because some repositories have no description (None),
    # and slicing None raises 'NoneType' object is not subscriptable
    plot_dict = {
        'value' : repo_dict['stargazers_count'],
        # Some descriptions are very long, so keep only the beginning
        'label' : str(repo_dict['description'])[:200]+'...',
        'xlink' : repo_dict['html_url']
    }
    plot_dicts.append(plot_dict)

# Change the default theme color to a bluish tone
my_style = LightenStyle('#333366', base_style=LightColorizedStyle)
# Configuration
my_config = pygal.Config()
# Rotate the x-axis labels by 45 degrees
my_config.x_label_rotation = -45
# Hide the legend in the top-left corner
my_config.show_legend = False
# Title font size
my_config.title_font_size = 30
# Minor labels: most of the labels on the x and y axes
my_config.label_font_size = 20
# Major labels: the y-axis ticks at round multiples, which make key values stand out
my_config.major_label_font_size = 24
# Truncate labels to 15 characters; the rest is shown as ...
my_config.truncate_label = 15
# Hide the horizontal guide lines
my_config.show_y_guides = False
# Chart width
my_config.width = 1000

# The configuration can be passed as the first argument
chart = pygal.Bar(my_config, style=my_style)
chart.title = 'Most-Starred Python Projects on GitHub'
# x-axis data
chart.x_labels = names
# Add the y-axis data; no series title is needed, so it is left empty.
# Note that a list of dicts is passed; the 'value' key is the bar height
chart.add('', plot_dicts)
chart.render_to_file('most_stars_python_repo.svg')

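This is how it renders in Chrome. Some of the config settings do not seem to take effect: the x and y axis labels are still small. The three values in plot_dict all show up, though, and clicking a bar opens the repository.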
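The str() workaround for missing descriptions produces the tooltip text "None..." for repositories without one. A slightly more careful variant, as a sketch: make_label and its placeholder text are made up for this illustration.

```python
def make_label(description, limit=200):
    """Build tooltip text: a placeholder for None, an ellipsis only when truncated."""
    text = description or 'No description provided'
    if len(text) <= limit:
        return text
    return text[:limit] + '...'


print(make_label(None))
print(make_label('Magnificent app which corrects your previous console command.'))
print(len(make_label('x' * 500)))  # 200 characters plus the three dots
```

In the chart code this would replace the expression on the 'label' line with make_label(repo_dict['description']).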

That is enough tinkering for now; this library is not especially mainstream anyway.

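Fixing connection timeouts between nginx and php-fpm

Preface

Our production architecture is roughly this: apart from the cache proxy machines, each project has an nginx proxy machine, followed by nginx webservers with php-fpm. The proxy nginx logs sometimes show abnormal status codes such as 499, 502, and 504. What causes them?

Architecture

nginx proxy => nginx webserver => php-fpm

Status codes

499: the client (or the proxy) closed the connection
502: Bad Gateway
504: Gateway Timeout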

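1. The proxy cannot connect to the webserver

1.1 The proxy_pass IP does not exist

The proxy keeps sending ARP requests, gives up after about 3 seconds, and returns 502.

1.2 The proxy_pass IP exists

Case 1: nothing is listening on the port on the webserver machine.

The kernel on the webserver machine replies with a RST packet straight away, so there is no extra delay, and the proxy returns 502.

Case 2: a service is listening on the port, but iptables DROPs the proxy's packets.

Because the webserver drops the proxy's packets (iptables -I INPUT -s xxx.xxx.xxx.xxx -j DROP), the proxy keeps retrying the TCP connection. By default it retries for 60 seconds and then returns 504. The 60 seconds comes from proxy_connect_timeout, and the retry interval is TCP's own backoff (1, 2, 4, ... seconds).

If the client closes the connection before the timeout (for example, the user stops the request in the browser), the proxy logs status 499; $request_time records how long the proxy had been handling the request, and $upstream_response_time is "-". Once the client has gone away, the proxy stops retrying the webserver.

If proxy_ignore_client_abort on is set on the proxy, however, the proxy keeps retrying the webserver until the timeout even after the client has closed the connection, and the logged status is the one returned by the webserver.

Case 3: a service is listening on the port, but iptables REJECTs the proxy's packets.

Because the webserver rejects the proxy's packets (iptables -I INPUT -s xxx.xxx.xxx.xxx -j REJECT), it sends the proxy an ICMP port-unreachable packet, which is the difference from DROP. The proxy retries once and returns 502 to the client after about 1 second.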
2. The proxy and webserver connect normally (the request takes too long)

The settings used in the tests below:

proxy nginx.conf: proxy_read_timeout=60
webserver nginx.conf: fastcgi_read_timeout=300
php-fpm: request_terminate_timeout=120

2.1 PHP runs longer than the proxy's proxy_read_timeout

Suppose test.php under php-fpm takes 100 seconds, longer than the default proxy_read_timeout of 60. After one minute the proxy closes its connection to the webserver; the webserver logs 499, the proxy logs 504, and the client sees 504.

One more point about proxy_read_timeout. The nginx documentation describes it like this:

The timeout is set only between two successive read operations, not for the transmission of the whole response.

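In other words, it is not a limit on transmitting the whole response but on the gap between two successive reads. For example, set proxy_read_timeout=10 on the proxy and test with this test.php:

<?php
sleep(7);
echo "haha";
ob_flush();
flush();
sleep(7);
echo "haha after 7s";
?>

The whole request takes 14 seconds, yet it does not time out, because the gap between successive reads is 7 seconds, less than 10. Note the calls to ob_flush() and flush(): ob_flush() flushes PHP's output buffer, and flush() flushes the system-level buffer. With output_buffering=off in /etc/php5/fpm/php.ini the ob_flush() call is unnecessary, but flush() is still required. Without flushing, PHP waits until the whole response is ready before handing it to the webserver, which passes it on to the proxy; nothing arrives for 14 seconds, which exceeds the 10-second proxy_read_timeout, so the proxy closes its connection to the webserver and returns 504. For test.php not to time out, the webserver's nginx also needs fastcgi_buffering off: even though PHP has sent data, the webserver's nginx buffers the FastCGI response and does not pass it to the proxy promptly, which again leads to a timeout.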
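The "gap between reads" rule can be expressed as a tiny model. This is a simplified sketch, not nginx's implementation: proxy_read_times_out is a hypothetical function, and the lists of gaps are the idle periods, in seconds, that the proxy sees between pieces of data.

```python
def proxy_read_times_out(read_gaps, proxy_read_timeout):
    """Return True if any idle gap between reads exceeds the timeout."""
    return any(gap > proxy_read_timeout for gap in read_gaps)


# test.php with flush(): data arrives after 7 s, then again after another 7 s
flushed = proxy_read_times_out([7, 7], proxy_read_timeout=10)

# test.php without flush(), or with buffering on the webserver: one 14 s wait
buffered = proxy_read_times_out([14], proxy_read_timeout=10)

print(flushed, buffered)
```

The total duration is 14 seconds in both cases; only the longest silent stretch differs, and that is what the timeout measures.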

With the configuration above, the browser prints hahahaha after 7s. But the two strings appear at the same time rather than 7 seconds apart as the code suggests. Why? The proxy's nginx buffers too. Set proxy_buffering off on the proxy and the two strings appear one after the other.

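2.2 PHP runs longer than the webserver's fastcgi_read_timeout

Set fastcgi_read_timeout=10 with test.php taking 100 seconds. After 10 seconds the webserver closes its connection to PHP; the webserver logs 504, and so does the proxy.

2.3 PHP runs longer than php-fpm's request_terminate_timeout

Set request_terminate_timeout=5 with test.php still taking 100 seconds. After 5 seconds php-fpm terminates the PHP worker process; the webserver logs status 404, and so does the proxy. (These are the codes observed in this test; nginx more commonly logs 502 when the upstream worker is killed in the middle of a request.)

Note: in testing, max_execution_time in php.ini had no visible effect under php-fpm.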
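Sections 2.1 to 2.3 boil down to one question: which of the three limits is reached first. A rough Python model of that, assuming the script produces no output until it finishes and treating a value of 0 as "disabled"; first_limit_hit is a hypothetical helper and the defaults are the values listed at the start of section 2.

```python
def first_limit_hit(script_seconds,
                    proxy_read_timeout=60,
                    fastcgi_read_timeout=300,
                    request_terminate_timeout=120):
    """Return the name of the limit that cuts the request off first, or None."""
    limits = {
        'proxy_read_timeout': proxy_read_timeout,
        'fastcgi_read_timeout': fastcgi_read_timeout,
        'request_terminate_timeout': request_terminate_timeout,
    }
    # Ignore limits that are switched off (0)
    active = {name: value for name, value in limits.items() if value > 0}
    if not active:
        return None
    name, value = min(active.items(), key=lambda item: item[1])
    return name if script_seconds > value else None


print(first_limit_hit(100))                                   # case 2.1
print(first_limit_hit(100, proxy_read_timeout=600,
                      fastcgi_read_timeout=10))               # case 2.2
print(first_limit_hit(100, proxy_read_timeout=600,
                      request_terminate_timeout=5))           # case 2.3
print(first_limit_hit(30))                                    # finishes in time
```

In practice the three values are usually chosen so that the innermost layer gives up first, which keeps the logged status codes easier to interpret.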

3. File limits

Some Linux limits can be viewed with ulimit -a. On my Debian 8.2 system the output is:

# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 96537
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1000000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 96537
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

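open files is the number of files a single process may have open at once; exceeding it produces a too many open files error. It can be changed with ulimit -n xxx. max user processes is the maximum number of processes a user may create.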
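The same two limits can be read from inside a program through Python's standard resource module, which is available on Unix-like systems only. A minimal sketch; describe is a helper name made up for this illustration.

```python
import resource


def describe(value):
    """Render a limit the way ulimit does, with 'unlimited' for no limit."""
    return 'unlimited' if value == resource.RLIM_INFINITY else str(value)


# (soft, hard) pairs for open files and for processes per user
nofile_soft, nofile_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
nproc_soft, nproc_hard = resource.getrlimit(resource.RLIMIT_NPROC)

print('open files:         soft', describe(nofile_soft), 'hard', describe(nofile_hard))
print('max user processes: soft', describe(nproc_soft), 'hard', describe(nproc_hard))
```

The soft value is the one that is enforced; a process may raise its own soft limit up to the hard limit, which is what ulimit -n does for the current shell.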

The system-wide maximum number of open files is in file-max:

# cat /proc/sys/fs/file-max
2471221

file-max can be changed with sysctl -w fs.file-max=1000000. To make the change permanent, add the line fs.file-max=1000000 to /etc/sysctl.conf and run sysctl -p.

To limit open files and similar resources per user, edit /etc/security/limits.conf. The format is:

<domain> <type> <item> <value>
## For example, limit the number of files one process of user bob may have open
## Example hard limit for max opened files
bob hard nofile 4096
## Example soft limit for max opened files
bob soft nofile 1024

worker_rlimit_nofile in the nginx configuration can be set to the open files value.

Fixing the nginx "upstream timed out" error

The following nginx error turned up in the server logs during an investigation:

Upstream timed out (110: Connection timed out) while reading response header from upstream

The timeout occurred on an nginx/Apache proxy setup in which nginx serves all static content and Apache serves all dynamic content.

Fixes for Nginx Upstream Timed Out

After looking into the error and trying a few fixes, it turned out that it can occur in two situations:

1) Nginx as a proxy

Try adding the proxy_read_timeout option to the virtual host configuration, like this:

proxy_read_timeout 150;

Place it in the root location block:

location / {

…

proxy_read_timeout 150;

…

}

2) Nginx as a standalone server with php-fpm or another application

In this case, try adding the fastcgi_read_timeout option:

fastcgi_read_timeout 150;

With php-fpm, the configuration should look like this:

location ~* \.php$ {

include fastcgi_params;

fastcgi_index index.php;

fastcgi_read_timeout 150;

fastcgi_pass 127.0.0.1:9000;

fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;

}

In both cases, restart nginx to apply the change.


Setting up nginx + PHP with apt-get on Debian (Alibaba Cloud)

1. Update the apt-get package index

apt-get update

2. Install Nginx

apt-get install nginx

Managing nginx

service nginx start
service nginx restart
service nginx stop

3. Install PHP

apt-get install php5-fpm php5-gd php5-mysql php5-memcache php5-curl

4. Configure Nginx for PHP

cd /etc/nginx/conf.d # go to the nginx virtual host configuration directory
vi xxx.com.conf # create the configuration file for the domain

Then paste in the following:

server {
    listen 80;
    server_name phpmyadmin.xxx.com;
    root /home/wwwroot/phpMyadmin;
    index index.php;
    location / {
        try_files $uri $uri/ =404;
    }
    location ~ \.php$ {
        include snippets/fastcgi-php.conf;
    #
    #   # With php5-cgi alone:
    #   fastcgi_pass 127.0.0.1:9000;
    #   # With php5-fpm:
        fastcgi_pass unix:/var/run/php5-fpm.sock;
    }
}

Also included: an nginx configuration for reverse proxying to Node.js

server{
    listen 80;
    server_name open.xxx.com;
    access_log off;
    location / {
        #proxy_cache_key "$scheme://$host$request_uri";
        #proxy_cache cache_one;
        #proxy_cache_valid  200 304 3h;
        #proxy_cache_valid 301 3d;
        #proxy_cache_valid any 10s;
        proxy_set_header   X-Real-IP  $remote_addr;
        proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
        #proxy_set_header   Referer http://xxxx;
        proxy_set_header   Host $host;
        #proxy_hide_header Set-Cookie;
        proxy_pass http://xx.xx.xx.xx:8080;
    }
}

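CentOS 7: load balancing Nginx with HAProxy

HAProxy is a powerful, flexible open-source reverse proxy that provides load balancing and proxying. It was written in C by Willy Tarreau and supports SSL, compression, keep-alive, custom log formats, and header rewriting.

HAProxy is a lightweight load balancer and proxy that uses few system resources. Many large sites use it, for example GitHub and Stack Overflow.

Below, HAProxy is installed and configured as the load balancer for two Nginx servers. Three servers are needed in total: one machine runs HAProxy and the other two run Nginx.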

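HAProxy basics

Layer 4 and layer 7

HAProxy can run in two modes: TCP (layer 4) and HTTP (layer 7). In TCP mode HAProxy forwards the raw TCP stream from the client to the application server. In HTTP mode it parses the HTTP request and then forwards it to a web server. We will use HTTP (layer 7) mode.

Load-balancing algorithms

HAProxy uses a load-balancing algorithm to decide which server receives a request. The algorithms:

Roundrobin

The simplest algorithm. Each new connection goes to the next backend server; after the last backend server, it starts again from the first.

Leastconn

The backend server with the fewest connections handles the new request. Works well when request volume is high.

Source

The backend server is chosen from the client IP: a hash of the IP address decides the server, so if server1 handled IP1, all requests from IP1 go to server1.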
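The three selection strategies can be sketched in a few lines of Python. This only illustrates the ideas; it is not how HAProxy implements them, the function names are made up, and the CRC32 hash in particular is an arbitrary stand-in for whatever hash HAProxy uses.

```python
import itertools
import zlib

SERVERS = ['nginx1', 'nginx2']


def roundrobin(servers):
    """Yield servers in order, wrapping around after the last one."""
    return itertools.cycle(servers)


def leastconn(active_connections):
    """Pick the server with the fewest active connections."""
    return min(active_connections, key=active_connections.get)


def source(client_ip, servers):
    """Pick a server from a hash of the client IP, so one IP sticks to one server."""
    return servers[zlib.crc32(client_ip.encode()) % len(servers)]


rr = roundrobin(SERVERS)
print([next(rr) for _ in range(3)])
print(leastconn({'nginx1': 5, 'nginx2': 2}))
print(source('10.0.0.7', SERVERS) == source('10.0.0.7', SERVERS))
```

Note that with a plain modulo, adding or removing a server changes the mapping for most client IPs; that trade-off is inherent to this simple form of source hashing.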

Requirements

Three CentOS 7 servers:

HAProxy load balancer: 192.168.0.101
Nginx1 server: 192.168.0.108
Nginx2 server: 192.168.0.109

Step 1

Edit /etc/hosts on the HAProxy server (192.168.0.101):

vim /etc/hosts

Add the host names of Nginx1 and Nginx2:

192.168.0.108    nginx1.your_domain.com     nginx1
192.168.0.109    nginx2.your_domain.com     nginx2

Save and exit.

Likewise, edit /etc/hosts on the two Nginx servers and add:

192.168.0.101    loadbalancer

This has to be done on both Nginx servers.

Step 2

Install HAProxy on the HAProxy server:

# yum update
# yum install haproxy

The HAProxy configuration lives in /etc/haproxy/. To be safe, back up the original configuration file first:

# cd /etc/haproxy/
# mv haproxy.cfg haproxy.cfg.backup

Edit the configuration file:

# vim haproxy.cfg

Enter the following:

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
log         127.0.0.1 local2     # logging
chroot      /var/lib/haproxy
pidfile     /var/run/haproxy.pid
maxconn     4000
user        haproxy             # HAProxy runs as the haproxy user and group
group       haproxy
daemon
# enable the stats unix socket
stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# Defaults
#---------------------------------------------------------------------
defaults
mode                    http
log                     global
option                  httplog
option                  dontlognull
option http-server-close
option forwardfor       except 127.0.0.0/8
option                  redispatch
retries                 3
timeout http-request    10s
timeout queue           1m
timeout connect         10s
timeout client          1m
timeout server          1m
timeout http-keep-alive 10s
timeout check           10s
maxconn                 3000
#---------------------------------------------------------------------
# HAProxy monitoring
#---------------------------------------------------------------------
listen haproxy3-monitoring *:8080                # the monitoring page listens on port 8080
mode http
option forwardfor
option httpclose
stats enable
stats show-legends
stats refresh 5s
stats uri /stats                            # URL of the monitoring page
stats realm Haproxy\ Statistics
stats auth testuser:test1234                # user name and password for the monitoring page
stats admin if TRUE
default_backend app-main
#---------------------------------------------------------------------
# Frontend
#---------------------------------------------------------------------
frontend main
bind *:80
option http-server-close
option forwardfor
default_backend app-main
#---------------------------------------------------------------------
# Backend using roundrobin as the load-balancing algorithm
#---------------------------------------------------------------------
backend app-main
balance roundrobin                                    # load-balancing algorithm
option httpchk HEAD / HTTP/1.1\r\nHost:\ localhost    # health check: expects a 200 status from nginx
server nginx1 192.168.0.108:80 check                  # Nginx1
server nginx2 192.168.0.109:80 check                  # Nginx2

Configuring rsyslog

HAProxy's logs are recorded through rsyslog. Edit rsyslog.conf to open UDP port 514:

# vim /etc/rsyslog.conf

Uncomment the following lines:

$ModLoad imudp
$UDPServerRun 514

To bind to a specific IP, change the following line:

$UDPServerAddress 127.0.0.1