Supervisor is a client/server system that allows its users to control a number of processes on UNIX-like operating systems. It was inspired by the following:
It is often inconvenient to need to write rc.d scripts for every single process instance. rc.d scripts are a great lowest-common-denominator form of process initialization/autostart/management, but they can be painful to write and maintain. Additionally, rc.d scripts cannot automatically restart a crashed process, and many programs do not restart themselves properly on a crash. Supervisord starts processes as its subprocesses, and can be configured to automatically restart them on a crash. It can also be configured to start processes automatically on its own invocation.
It’s often difficult to get accurate up/down status on processes on UNIX. Pidfiles often lie. Supervisord starts processes as subprocesses, so it always knows the true up/down status of its children and can be queried conveniently for this data.
Users who need to control process state often need only to do that. They don’t want or need full-blown shell access to the machine on which the processes are running. Processes which listen on “low” TCP ports often need to be started and restarted as the root user (a UNIX misfeature). It’s usually the case that it’s perfectly fine to allow “normal” people to stop or restart such a process, but providing them with shell access is often impractical, and providing them with root access or sudo access is often impossible. It’s also (rightly) difficult to explain to them why this problem exists. If supervisord is started as root, it is possible to allow “normal” users to control such processes without needing to explain the intricacies of the problem to them. Supervisorctl allows a very limited form of access to the machine, essentially allowing users to see process status and control supervisord-controlled subprocesses by emitting “stop”, “start”, and “restart” commands from a simple shell or web UI.
Processes often need to be started and stopped in groups, sometimes even in a “priority order”. It’s often difficult to explain to people how to do this. Supervisor allows you to assign priorities to processes, and allows users to emit commands via the supervisorctl client like “start all” and “restart all”, which start them in the preassigned priority order. Additionally, processes can be grouped into “process groups” and a set of logically related processes can be stopped and started as a unit.
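The priority ordering described above can be modelled in a few lines of Python. This is an illustrative sketch, not Supervisor's own code: the `Program` tuple and the program names are hypothetical. The only facts taken from Supervisor are that lower `priority` values start first and shut down last, and that the default priority is 999.

```python
from typing import NamedTuple


class Program(NamedTuple):
    name: str
    priority: int = 999  # Supervisor's documented default priority


def start_order(programs):
    # Lower priority values start first; sorted() is stable, so ties keep input order.
    return [p.name for p in sorted(programs, key=lambda p: p.priority)]


def stop_order(programs):
    # Assumption for this sketch: shutdown is simply the reverse of start order.
    return list(reversed(start_order(programs)))


programs = [Program("web", 20), Program("redis", 10), Program("worker")]
print(start_order(programs))  # ['redis', 'web', 'worker']
print(stop_order(programs))   # ['worker', 'web', 'redis']
```

In a real configuration the same effect comes from setting `priority=` in each `[program:x]` section and then issuing “start all” from supervisorctl.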
Supervisor is configured through a simple INI-style config file that’s easy to learn. It provides many per-process options that make your life easier like restarting failed processes and automatic log rotation.
Supervisor provides you with one place to start, stop, and monitor your processes. Processes can be controlled individually or in groups. You can configure Supervisor to provide a local or remote command line and web interface.
Supervisor starts its subprocesses via fork/exec and subprocesses don’t daemonize. The operating system signals Supervisor immediately when a process terminates, unlike some solutions that rely on troublesome PID files and periodic polling to restart failed processes.
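The difference from PID-file polling can be seen in a small parent/child sketch. It uses Python's `subprocess` module as a stand-in for Supervisor's own fork/exec code, and the helper name `run_with_restarts` is hypothetical. The point it illustrates is that a parent waiting on a direct, non-daemonized child learns of the exit as soon as the operating system reports it.

```python
import subprocess
import sys


def run_with_restarts(argv, max_restarts):
    """Run argv as a direct child; restart it after a non-zero exit.

    Returns the list of exit codes observed, one per attempt.
    """
    exit_codes = []
    for _ in range(max_restarts + 1):
        child = subprocess.Popen(argv)   # fork/exec; the child does not daemonize
        exit_codes.append(child.wait())  # returns once the OS reports the exit
        if exit_codes[-1] == 0:
            break                        # clean exit: nothing to restart
    return exit_codes


# A child that always "crashes" with exit status 3.
crashing = [sys.executable, "-c", "import sys; sys.exit(3)"]
print(run_with_restarts(crashing, max_restarts=2))  # [3, 3, 3]
```

A program that daemonizes itself would break this model, because the process the parent is waiting on exits immediately while the real work continues in a process the parent never started.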
Supervisor has a simple event notification protocol that programs written in any language can use to monitor it, and an XML-RPC interface for control. It is also built with extension points that can be leveraged by Python developers.
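As a rough sketch of the event notification protocol, the listener below follows the documented handshake: it writes `READY`, reads a header line of `key:value` tokens, reads `len` characters of payload, and answers with `RESULT 2` followed by `OK`. The function names and the sample event are made up for illustration. The demo feeds it an in-memory stream instead of a real supervisord, and it assumes ASCII payloads so that the byte length equals the character length.

```python
import io


def parse_header(line):
    """Turn 'key:value key:value ...' into a dict."""
    return dict(token.split(":", 1) for token in line.split())


def run_listener(stdin, stdout, handle):
    """Process events until stdin reaches EOF."""
    while True:
        stdout.write("READY\n")       # tell supervisord we can accept an event
        stdout.flush()
        line = stdin.readline()
        if not line:                  # EOF: supervisord closed the pipe
            break
        header = parse_header(line)
        payload = stdin.read(int(header["len"]))
        handle(header, payload)
        stdout.write("RESULT 2\nOK")  # acknowledge: body length, newline, body
        stdout.flush()


# Simulate one PROCESS_STATE_EXITED event (sample values are illustrative).
payload = "processname:redisd groupname:redisd from_state:RUNNING expected:0 pid:23274"
header = (
    "ver:3.0 server:supervisor serial:1 pool:demo poolserial:1 "
    "eventname:PROCESS_STATE_EXITED len:%d\n" % len(payload)
)
seen = []
out = io.StringIO()
run_listener(
    io.StringIO(header + payload),
    out,
    lambda h, p: seen.append((h["eventname"], parse_header(p)["processname"])),
)
print(seen)  # [('PROCESS_STATE_EXITED', 'redisd')]
```

Under a real supervisord the listener would be called with `sys.stdin` and `sys.stdout`, and any diagnostic output should go to stderr, since stdout is reserved for the protocol tokens.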
Supervisor works on just about everything except for Windows. It is tested and supported on Linux, Mac OS X, Solaris, and FreeBSD. It is written entirely in Python, so installation does not require a C compiler.
While Supervisor is very actively developed today, it is not new software. Supervisor has been around for years and is already in use on many servers.
The server piece of supervisor is named supervisord. It is responsible for starting child programs at its own invocation, responding to commands from clients, restarting crashed or exited subprocesses, logging its subprocess stdout and stderr output, and generating and handling “events” corresponding to points in subprocess lifetimes.
The server process uses a configuration file. This is typically located in /etc/supervisord.conf. This configuration file is a “Windows-INI” style config file. It is important to keep this file secure via proper filesystem permissions because it may contain unencrypted usernames and passwords.
The command-line client piece of supervisor is named supervisorctl. It provides a shell-like interface to the features provided by supervisord. From supervisorctl, a user can connect to different supervisord processes (one at a time), get the status of the subprocesses a supervisord controls, stop and start those subprocesses, and list the running processes of a supervisord.
The command-line client talks to the server across a UNIX domain socket or an internet (TCP) socket. The server can require that the user of a client present authentication credentials before it allows them to perform commands. The client process typically uses the same configuration file as the server, but any configuration file with a [supervisorctl] section in it will work.
A (sparse) web user interface with functionality comparable to supervisorctl may be accessed via a browser if you start supervisord against an internet socket. Visit the server URL (e.g. http://localhost:9001/) to view and control process status through the web interface after activating the configuration file’s [inet_http_server] section.
The same HTTP server which serves the web UI serves up an XML-RPC interface that can be used to interrogate and control supervisor and the programs it runs. See XML-RPC API Documentation.
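A minimal sketch of using that XML-RPC interface from Python follows. `supervisor.getAllProcessInfo` is part of Supervisor's XML-RPC API and `/RPC2` is its endpoint. The helper names and the `FakeProxy` stand-in, which lets the example run without a live server, are assumptions made for illustration.

```python
from xmlrpc.client import ServerProxy


def connect(url="http://localhost:9001/RPC2"):
    """Build a proxy; requires [inet_http_server] to be enabled on the server."""
    return ServerProxy(url)


def process_states(proxy):
    """Map each process name to its state name, e.g. {'redisd': 'RUNNING'}."""
    return {
        info["name"]: info["statename"]
        for info in proxy.supervisor.getAllProcessInfo()
    }


# Stand-in for a live server so the sketch runs offline (hypothetical data).
class _FakeSupervisorNamespace:
    def getAllProcessInfo(self):
        return [
            {"name": "redisd", "statename": "RUNNING"},
            {"name": "worker", "statename": "STOPPED"},
        ]


class FakeProxy:
    supervisor = _FakeSupervisorNamespace()


print(process_states(FakeProxy()))
```

Against a running supervisord, `process_states(connect())` would return the live states instead of the canned ones.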
Supervisor has been tested and is known to run on Linux (Ubuntu 9.10), Mac OS X (10.4/10.5/10.6), and Solaris (10 for Intel) and FreeBSD 6.1. It will likely work fine on most UNIX systems.
Supervisor will not run at all under any version of Windows.
Supervisor is intended to work on Python 3 version 3.4 or later and on Python 2 version 2.7.
[root@redis ~]# yum install supervisor
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
Package supervisor-3.1.4-1.el7.noarch already installed and latest version
Nothing to do
[root@redis ~]#
[unix_http_server]
file=/tmp/supervisor.sock   ; UNIX socket file, used by supervisorctl
;chmod=0700                 ; mode of the socket file, default 0700
;chown=nobody:nogroup       ; owner of the socket file, format: uid:gid

;[inet_http_server]         ; HTTP server that provides the web management UI
;port=127.0.0.1:9001        ; IP and port of the web UI; mind security if exposed to the public internet
;username=user              ; username for logging in to the web UI
;password=123               ; password for logging in to the web UI

[supervisord]
logfile=/tmp/supervisord.log ; log file, default $CWD/supervisord.log
logfile_maxbytes=50MB        ; maximum log file size before rotation, default 50MB
logfile_backups=10           ; number of rotated log backups to keep, default 10
loglevel=info                ; log level, default info; others: debug, warn, trace
pidfile=/tmp/supervisord.pid ; pid file
nodaemon=false               ; run in the foreground? default false, i.e. start as a daemon
minfds=1024                  ; minimum number of available file descriptors, default 1024
minprocs=200                 ; minimum number of available process descriptors, default 200

; the below section must remain in the config file for RPC
; (supervisorctl/web interface) to work, additional interfaces may be
; added by defining them in separate rpcinterface: sections
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; connect to supervisord over the UNIX socket; path must match file in [unix_http_server]
;serverurl=http://127.0.0.1:9001      ; connect to supervisord over HTTP

; include other configuration files
[include]
files = relative/directory/*.ini      ; may be *.conf or *.ini
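The `serverurl` in `[supervisorctl]` has to point at the same socket as `file` in `[unix_http_server]`. A quick consistency check can be sketched with Python's standard `configparser`. Note the assumptions: Supervisor uses its own config loader, so `configparser` with `;` inline comments is only an approximation of its parsing, and the helper name `socket_paths_match` and the `SAMPLE` text are hypothetical.

```python
import configparser

SAMPLE = """\
[unix_http_server]
file=/tmp/supervisor.sock ; socket path

[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; must match the file above
"""


def socket_paths_match(config_text):
    """Return True if the client URL refers to the server's UNIX socket file."""
    parser = configparser.ConfigParser(
        inline_comment_prefixes=(";",),  # strip ' ; comment' after values
        interpolation=None,              # leave %(program_name)s-style values alone
    )
    parser.read_string(config_text)
    server_file = parser.get("unix_http_server", "file")
    client_url = parser.get("supervisorctl", "serverurl")
    return client_url == "unix://" + server_file


print(socket_paths_match(SAMPLE))  # True
```

A mismatch here is a common reason for supervisorctl failing to connect even though supervisord is running.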
[root@redis ~]# vim /etc/supervisord.d/redis.ini
[program:redisd]
directory=/root/
command=/usr/local/redis/bin/redis-server /usr/local/redis/etc/redis.conf
stdout_logfile_maxbytes=10MB
stdout_logfile_backups=10
stdout_capture_maxbytes=10MB
process_name=%(program_name)s_%(process_num)02d
numprocs=1
autostart=true
autorestart=true
user=root
startsecs=1
startretries=10
stdout_logfile=/tmp/supervisor.redisd.log
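; program name: arbitrary, but must not be repeated, since it uniquely identifies the program
[program:redis]
directory=/root                 ; working directory
command=/usr/local/redis/bin/redis-server /usr/local/redis/etc/redis.conf ; command to run, executed from the working directory
process_name=%(program_name)s_%(process_num)02d ; process name
numprocs=1                      ; number of process instances; note: for celery this is the number of celery processes, not the number of workers (10 means running the command 10 times, not passing -c 10 to celery)
autostart=true                  ; start the program automatically when supervisor starts
autorestart=true                ; restart automatically (e.g. after the worker is killed)
;user=root                      ; user to run as
;startsecs=1                    ; seconds the program must stay running after startup to count as started
;startretries=10                ; maximum number of retries when startup fails
stopsignal=INT                  ; stop signal, default TERM
; interrupt: INT (like Ctrl+C) (kill -INT pid); the process writes out its files or logs before exiting (recommended)
; terminate: TERM (kill -TERM pid)
; hang up: HUP (kill -HUP pid); note that this differs from Ctrl+Z / kill -STOP pid
; graceful stop: QUIT (kill -QUIT pid)
stdout_logfile_maxbytes=10MB    ; log settings
stdout_logfile_backups=10
stdout_capture_maxbytes=10MB
stdout_logfile=/tmp/supervisor.redisd.log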
[root@redis ~]# netstat -tulpn | grep redis
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      23210/redis-server
[root@redis ~]# kill -9 23210
[root@redis ~]# netstat -tulpn | grep redis
[root@redis ~]# systemctl start supervisord
[root@redis ~]# netstat -tulpn | grep redis
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      23274/redis-server
[root@redis ~]# kill -9 23274
[root@redis ~]# netstat -tulpn | grep redis
[root@redis ~]# netstat -tulpn | grep redis
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      23300/redis-server
[root@redis ~]#
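In the session above, the first netstat after `kill -9 23274` shows nothing because the restart takes a moment. Rather than re-running netstat by hand, the wait can be scripted. The sketch below is one possible approach using only the standard `socket` module; the helper name `wait_for_port` and its defaults are hypothetical, and the demo uses a throwaway local listener instead of Redis so it runs anywhere.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.2):
    """Poll until a TCP connection to host:port succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True           # something is listening again
        except OSError:
            time.sleep(interval)      # not up yet; retry shortly
    return False


# Demo against a temporary local listener on an OS-assigned port.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]
print(wait_for_port("127.0.0.1", demo_port, timeout=2))  # True
listener.close()
```

For the Redis case the call would be `wait_for_port("127.0.0.1", 6379)` right after killing the process, which returns `True` once supervisord has brought redis-server back up.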