Supervisord: A Linux Process Restart and Management Tool

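In operations work, services occasionally stop unexpectedly. The usual remedies are to write a monitoring shell script on the server itself that watches the service process and restarts it when something goes wrong, or to configure third-party monitoring such as Zabbix to check the process and run a shell script that restarts it on failure. Besides these two, there is a tool that is somewhat simpler than both: supervisord.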

What is supervisord? See the introduction from the official documentation:
Introduction
Overview
Supervisor is a client/server system that allows its users to control a number of processes on UNIX-like operating systems. It was inspired by the following:

Convenience

It is often inconvenient to need to write rc.d scripts for every single process instance. rc.d scripts are a great lowest-common-denominator form of process initialization/autostart/management, but they can be painful to write and maintain. Additionally, rc.d scripts cannot automatically restart a crashed process and many programs do not restart themselves properly on a crash. Supervisord starts processes as its subprocesses, and can be configured to automatically restart them on a crash. It can also automatically be configured to start processes on its own invocation.
Accuracy

It’s often difficult to get accurate up/down status on processes on UNIX. Pidfiles often lie. Supervisord starts processes as subprocesses, so it always knows the true up/down status of its children and can be queried conveniently for this data.
Delegation

Users who need to control process state often need only to do that. They don’t want or need full-blown shell access to the machine on which the processes are running. Processes which listen on “low” TCP ports often need to be started and restarted as the root user (a UNIX misfeature). It’s usually the case that it’s perfectly fine to allow “normal” people to stop or restart such a process, but providing them with shell access is often impractical, and providing them with root access or sudo access is often impossible. It’s also (rightly) difficult to explain to them why this problem exists. If supervisord is started as root, it is possible to allow “normal” users to control such processes without needing to explain the intricacies of the problem to them. Supervisorctl allows a very limited form of access to the machine, essentially allowing users to see process status and control supervisord-controlled subprocesses by emitting “stop”, “start”, and “restart” commands from a simple shell or web UI.
Process Groups

Processes often need to be started and stopped in groups, sometimes even in a “priority order”. It’s often difficult to explain to people how to do this. Supervisor allows you to assign priorities to processes, and allows user to emit commands via the supervisorctl client like “start all”, and “restart all”, which starts them in the preassigned priority order. Additionally, processes can be grouped into “process groups” and a set of logically related processes can be stopped and started as a unit.
Features
Simple

Supervisor is configured through a simple INI-style config file that’s easy to learn. It provides many per-process options that make your life easier like restarting failed processes and automatic log rotation.
Centralized

Supervisor provides you with one place to start, stop, and monitor your processes. Processes can be controlled individually or in groups. You can configure Supervisor to provide a local or remote command line and web interface.
Efficient

Supervisor starts its subprocesses via fork/exec and subprocesses don’t daemonize. The operating system signals Supervisor immediately when a process terminates, unlike some solutions that rely on troublesome PID files and periodic polling to restart failed processes.
Extensible

Supervisor has a simple event notification protocol that programs written in any language can use to monitor it, and an XML-RPC interface for control. It is also built with extension points that can be leveraged by Python developers.
Compatible

Supervisor works on just about everything except for Windows. It is tested and supported on Linux, Mac OS X, Solaris, and FreeBSD. It is written entirely in Python, so installation does not require a C compiler.
Proven

While Supervisor is very actively developed today, it is not new software. Supervisor has been around for years and is already in use on many servers.
Supervisor Components
supervisord

The server piece of supervisor is named supervisord. It is responsible for starting child programs at its own invocation, responding to commands from clients, restarting crashed or exited subprocesses, logging its subprocess stdout and stderr output, and generating and handling “events” corresponding to points in subprocess lifetimes.

The server process uses a configuration file. This is typically located in /etc/supervisord.conf. This configuration file is a “Windows-INI” style config file. It is important to keep this file secure via proper filesystem permissions because it may contain unencrypted usernames and passwords.
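
The behaviour described above (start a child as a subprocess, notice when it exits, start it again) can be sketched in a few lines of Python. This is only an illustration of the idea, not how supervisord is actually implemented; the names `supervise`, `max_starts` and `backoff_seconds` are hypothetical, and the "crashing" child is a stand-in for a real service.

```python
import subprocess
import sys
import time


def supervise(command, max_starts=3, backoff_seconds=0.1):
    """Run `command` as a child process and start it again each time it exits.

    Returns the list of exit codes observed, one per start.
    """
    exit_codes = []
    for attempt in range(max_starts):
        if attempt:
            time.sleep(backoff_seconds)    # brief pause between starts
        child = subprocess.Popen(command)  # child runs in the foreground, no daemonizing
        exit_codes.append(child.wait())    # blocks until the OS reports the exit
    return exit_codes


# Hypothetical child that "crashes" immediately with exit status 1.
crashing_child = [sys.executable, "-c", "import sys; sys.exit(1)"]
codes = supervise(crashing_child, max_starts=3)
print(codes)  # [1, 1, 1]
```

Because the parent waits on its own child, it learns the true exit status directly from the operating system and does not need a pidfile or periodic polling.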

supervisorctl

The command-line client piece of the supervisor is named supervisorctl. It provides a shell-like interface to the features provided by supervisord. From supervisorctl, a user can connect to different supervisord processes (one at a time), get status on the subprocesses controlled by, stop and start subprocesses of, and get lists of running processes of a supervisord.

The command-line client talks to the server across a UNIX domain socket or an internet (TCP) socket. The server can assert that the user of a client should present authentication credentials before it allows him to perform commands. The client process typically uses the same configuration file as the server but any configuration file with a [supervisorctl] section in it will work.

Web Server

A (sparse) web user interface with functionality comparable to supervisorctl may be accessed via a browser if you start supervisord against an internet socket. Visit the server URL (e.g. http://localhost:9001/) to view and control process status through the web interface after activating the configuration file’s [inet_http_server] section.
XML-RPC Interface

The same HTTP server which serves the web UI serves up an XML-RPC interface that can be used to interrogate and control supervisor and the programs it runs. See XML-RPC API Documentation.
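
As a rough sketch of what talking to that XML-RPC interface might look like from Python, the snippet below uses the standard library's `xmlrpc.client` and the `supervisor.getAllProcessInfo` method. It assumes the `[inet_http_server]` section is enabled on 127.0.0.1:9001 without authentication; the helper names `connect` and `summarize_processes` are hypothetical.

```python
from xmlrpc.client import ServerProxy


def connect(url="http://127.0.0.1:9001/RPC2"):
    """Build a proxy for supervisord's XML-RPC endpoint (assumed URL).

    No network traffic happens until a method is called on the proxy.
    """
    return ServerProxy(url)


def summarize_processes(proxy):
    """Return {process name: state name} for every process supervisord manages."""
    return {
        info["name"]: info["statename"]
        for info in proxy.supervisor.getAllProcessInfo()
    }


# Usage against a live server:
#
#   print(summarize_processes(connect()))
```

Passing the proxy in as a parameter keeps the helper easy to try out with a stand-in object when no supervisord is running.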
Platform Requirements
Supervisor has been tested and is known to run on Linux (Ubuntu 9.10), Mac OS X (10.4/10.5/10.6), and Solaris (10 for Intel) and FreeBSD 6.1. It will likely work fine on most UNIX systems.

Supervisor will not run at all under any version of Windows.

Supervisor is intended to work on Python 3 version 3.4 or later and on Python 2 version 2.7.

supervisord can be installed with yum, built from source, or installed with easy_install. yum is used here for the demonstration.
[root@redis ~]#  yum install supervisor
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
Package supervisor-3.1.4-1.el7.noarch already installed and latest version
Nothing to do
[root@redis ~]#

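After a yum install, the supervisor configuration file is already generated by default. That is not always the case, though: with easy_install or a source build you have to generate the configuration file yourself (for example with the bundled echo_supervisord_conf command). A brief walkthrough of the configuration file follows.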

[unix_http_server]
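file=/tmp/supervisor.sock   ; UNIX socket file, used by supervisorctl
;chmod=0700                 ; mode of the socket file, default 0700
;chown=nobody:nogroup       ; owner of the socket file, format: uid:gid
;[inet_http_server]         ; HTTP server that provides the web management interface
;port=127.0.0.1:9001        ; IP and port the web management interface listens on; mind security if it is exposed to the public internet
;username=user              ; username for logging in to the management interface
;password=123               ; password for logging in to the management interface
[supervisord]
logfile=/tmp/supervisord.log ; log file, default $CWD/supervisord.log
logfile_maxbytes=50MB        ; log file size; rotated when exceeded, default 50MB
logfile_backups=10           ; number of log backups to keep, default 10
loglevel=info                ; log level, default info; others: debug, warn, trace
pidfile=/tmp/supervisord.pid ; pid file
nodaemon=false               ; whether to run in the foreground; default false, i.e. run as a daemon
minfds=1024                  ; minimum number of available file descriptors, default 1024
minprocs=200                 ; minimum number of available process descriptors, default 200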
; the below section must remain in the config file for RPC
; (supervisorctl/web interface) to work, additional interfaces may be
; added by defining them in separate rpcinterface: sections
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; connect to supervisord over the UNIX socket; the path must match file in the unix_http_server section
;serverurl=http://127.0.0.1:9001 ; connect to supervisord over HTTP
; include other configuration files
[include]
files = relative/directory/*.ini    ; can be *.conf or *.ini
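
One detail in the listing above is that the `serverurl` under `[supervisorctl]` has to point at the same socket as `file` under `[unix_http_server]`. A minimal sketch of checking that with Python's `configparser` follows. Supervisor has its own config parser, so `configparser` is only an approximation here, and `socket_paths_match` and `SAMPLE_CONFIG` are hypothetical names.

```python
import configparser

# A trimmed copy of the two relevant sections from the listing above.
SAMPLE_CONFIG = """
[unix_http_server]
file=/tmp/supervisor.sock   ; UNIX socket file

[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; must match the file above
"""


def socket_paths_match(config_text):
    """Return True if supervisorctl's serverurl points at the server's socket file."""
    parser = configparser.ConfigParser(
        interpolation=None,              # leave %(...)s expressions untouched
        delimiters=("=",),               # so "unix://..." is not split on ":"
        inline_comment_prefixes=(";",),  # strip trailing "; comment" text
    )
    parser.read_string(config_text)
    socket_file = parser["unix_http_server"]["file"]
    server_url = parser["supervisorctl"]["serverurl"]
    return server_url == "unix://" + socket_file


print(socket_paths_match(SAMPLE_CONFIG))  # True
```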

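Like nginx, supervisord accepts a -c option. Here it specifies the configuration file to load, and syntax errors in that file are reported when supervisord starts.
Here I create a program entry that monitors the redis process: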

[root@redis ~]# vim /etc/supervisord.d/redis.ini
[program:redisd]
directory=/root/
command=/usr/local/redis/bin/redis-server   /usr/local/redis/etc/redis.conf
stdout_logfile_maxbytes=10MB
stdout_logfile_backups=10
stdout_capture_maxbytes=10MB
process_name=%(program_name)s_%(process_num)02d
numprocs=1
autostart=true
autorestart=true
user=root
startsecs=1
startretries=10
stdout_logfile=/tmp/supervisor.redisd.log

Explanation of the program configuration file

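[program:redis]  ; program name; anything you like, but it must be unique because it identifies the program
directory=/root ; working directory
command=/usr/local/redis/bin/redis-server   /usr/local/redis/etc/redis.conf ; command executed in the working directory
process_name=%(program_name)s_%(process_num)02d   ; process name
numprocs=1         ; number of process instances. Note: with celery, for example, this is the number of processes, not the number of workers; numprocs=10 is like running the command 10 times, not like passing -c 10 to celery
autostart=true      ; start the program automatically when supervisor starts
autorestart=true    ; restart automatically (for example after the worker has been killed)
;user=root        ; user to run as
;startsecs=1 ; seconds the program must stay running after startup to be considered successfully started
;startretries=10 ; maximum number of retries when startup fails
stopsignal=INT
; stop signal, default TERM
; Interrupt: INT (like Ctrl+C) (kill -INT pid); the process writes out files or logs on exit (recommended)
; Terminate: TERM (kill -TERM pid)
; Hangup: HUP (kill -HUP pid); note this is different from Ctrl+Z / kill -STOP pid
; Graceful stop: QUIT (kill -QUIT pid)
stdout_logfile_maxbytes=10MB ; log settings
stdout_logfile_backups=10
stdout_capture_maxbytes=10MB
stdout_logfile=/tmp/supervisor.redisd.log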
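
The `process_name` value is a Python string expression, so its expansion can be reproduced with ordinary `%` formatting. The sketch below assumes process numbering starts at 0; the helper name `expand_process_names` is hypothetical.

```python
def expand_process_names(template, program_name, numprocs):
    """Expand a process_name template once per process instance."""
    return [
        template % {"program_name": program_name, "process_num": num}
        for num in range(numprocs)
    ]


template = "%(program_name)s_%(process_num)02d"
print(expand_process_names(template, "redisd", 1))  # ['redisd_00']
print(expand_process_names(template, "redisd", 3))  # ['redisd_00', 'redisd_01', 'redisd_02']
```

With `numprocs` greater than 1 the template needs to include `%(process_num)s` in some form so that every instance gets a distinct name.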

Demonstration

[root@redis ~]# netstat -tulpn | grep redis 
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      23210/redis-server  
[root@redis ~]# kill -9 23210
[root@redis ~]# netstat -tulpn | grep redis 
[root@redis ~]# systemctl start supervisord
[root@redis ~]# netstat -tulpn | grep redis 
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      23274/redis-server  
[root@redis ~]# kill -9 23274
[root@redis ~]# netstat -tulpn | grep redis 
[root@redis ~]# netstat -tulpn | grep redis 
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      23300/redis-server  
[root@redis ~]# 
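
The manual netstat check above could also be scripted. The following is a minimal sketch that tests whether something accepts TCP connections on a port and waits for it to come back; the port 6379 comes from the netstat output, while the function names and the timing values are hypothetical choices.

```python
import socket
import time


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host, port, deadline_seconds=10.0, interval=0.5):
    """Poll until host:port accepts connections or the deadline passes."""
    end = time.monotonic() + deadline_seconds
    while time.monotonic() < end:
        if is_listening(host, port):
            return True
        time.sleep(interval)
    return False


# Usage after killing redis-server, to confirm that it was restarted:
#
#   print(wait_for_port("127.0.0.1", 6379))
```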

Last modification:June 3rd, 2019 at 11:54 am