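All SONiC service processes run inside containers. So how does a container launch its services once it has started?
To answer this, we first need to understand how the container image is built. We take the docker-orchagent container as the example.
In SONiC, each Dockerfile is generated from a Dockerfile.j2 template.
docker-orchagent/Dockerfile.j2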
FROM docker-config-engine

ARG docker_container_name
RUN [ -f /etc/rsyslog.conf ] && sed -ri "s/%syslogtag%/$docker_container_name\/%syslogtag%/;" /etc/rsyslog.conf

## Make apt-get non-interactive
ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update

RUN apt-get install -f -y ifupdown arping libdbus-1-3 libdaemon0 libjansson4

## Install redis-tools dependencies
## TODO: implicitly install dependencies
RUN apt-get -y install libjemalloc1

COPY \
{% for deb in docker_orchagent_debs.split(' ') -%}
debs/{{ deb }}{{' '}}
{%- endfor -%}
debs/

RUN dpkg -i \
{% for deb in docker_orchagent_debs.split(' ') -%}
debs/{{ deb }}{{' '}}
{%- endfor %}

## Clean up
RUN apt-get clean -y; apt-get autoclean -y; apt-get autoremove -y
RUN rm -rf /debs

COPY ["files/arp_update", "/usr/bin"]
COPY ["enable_counters.py", "/usr/bin"]
COPY ["start.sh", "orchagent.sh", "swssconfig.sh", "/usr/bin/"]
COPY ["supervisord.conf", "/etc/supervisor/conf.d/"]

## Copy all Jinja2 template files into the templates folder
COPY ["*.j2", "/usr/share/sonic/templates/"]

# Program entry point
ENTRYPOINT ["/usr/bin/supervisord"]
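To make the two Jinja2 loops in the template easier to read, here is a minimal Python sketch of what they expand to for a given docker_orchagent_debs value. It does not use Jinja2 itself, the result matches the rendered Dockerfile only up to whitespace, and the package file names are hypothetical.

```python
def render_deb_instructions(docker_orchagent_debs: str) -> tuple:
    """Mimic the COPY and dpkg loops of Dockerfile.j2 for a space-separated deb list."""
    paths = ["debs/" + deb for deb in docker_orchagent_debs.split(" ")]
    copy_line = "COPY " + " ".join(paths) + " debs/"
    dpkg_line = "RUN dpkg -i " + " ".join(paths)
    return copy_line, dpkg_line


# Hypothetical package names, used only for illustration.
copy_line, dpkg_line = render_deb_instructions("swss_example.deb libswsscommon_example.deb")
print(copy_line)
print(dpkg_line)
```

Every listed package is first copied into debs/ inside the image and then installed with a single dpkg call.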
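From the configuration above, the program that runs when the container starts is /usr/bin/supervisord.
The host starts the docker-orchagent container as the swss service, as the output of the following command shows: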
admin@sonic:~$ sudo config reload -y
Running command: systemctl stop swss
Running command: systemctl stop pmon
Running command: systemctl stop teamd
Running command: sonic-cfggen -j /etc/sonic/config_db.json --write-to-db
Running command: systemctl restart hostname-config
Running command: systemctl restart interfaces-config
Running command: systemctl restart ntp-config
Running command: systemctl restart rsyslog-config
Running command: systemctl restart swss
Running command: systemctl restart teamd
Running command: systemctl restart pmon
admin@sonic:~$
Let's look at the service file for swss:
admin@sonic:~$ cat /etc/systemd/system/swss.service
[Unit]
Description=switch state service
Requires=database.service updategraph.service
After=database.service updategraph.service
After=interfaces-config.service

[Service]
User=root
# Wait for redis server start before database clean
ExecStartPre=/bin/bash -c 'until [[ $(/usr/bin/docker exec database redis-cli ping | grep -c PONG) -gt 0 ]]; do sleep 1; done'
ExecStartPre=/usr/bin/docker exec database redis-cli -n 0 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 1 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 2 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 5 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 6 FLUSHDB

ExecStartPre=/usr/bin/swss.sh start
ExecStartPre=/usr/bin/syncd.sh start
ExecStart=/usr/bin/swss.sh attach

ExecStop=/usr/bin/swss.sh stop
ExecStopPost=/usr/bin/syncd.sh stop

[Install]
WantedBy=multi-user.target
The main command of the swss service is /usr/bin/swss.sh attach. Before it runs, systemd executes the following ExecStartPre commands:
# Wait for redis server start before database clean
# Loop until redis is available: once it is, ping returns PONG, so grep -c PONG prints 1, which is greater than 0
ExecStartPre=/bin/bash -c 'until [[ $(/usr/bin/docker exec database redis-cli ping | grep -c PONG) -gt 0 ]]; do sleep 1; done'
ExecStartPre=/usr/bin/docker exec database redis-cli -n 0 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 1 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 2 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 5 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 6 FLUSHDB

ExecStartPre=/usr/bin/swss.sh start
ExecStartPre=/usr/bin/syncd.sh start
These commands flush databases 0, 1, 2, 5 and 6 but not database 4 (config_db), so the configuration is preserved. They also run /usr/bin/swss.sh start and /usr/bin/syncd.sh start.
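The two pre-start steps can be sketched in Python as follows. This is only an illustration of the logic, not what systemd runs: the ping function is injected so the loop can be tried without a redis server, and the max_attempts bound is an assumption added here (the until loop in the unit file retries forever).

```python
import re
import time
from typing import Callable, List

# The FLUSHDB lines copied from the unit file above.
UNIT_LINES = """\
ExecStartPre=/usr/bin/docker exec database redis-cli -n 0 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 1 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 2 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 5 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 6 FLUSHDB
"""


def wait_for_pong(ping: Callable[[], str], interval: float = 1.0,
                  max_attempts: int = 60) -> int:
    """Poll until the ping output contains PONG; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if "PONG" in ping():
            return attempt
        time.sleep(interval)
    raise TimeoutError("redis did not answer PONG")


def flushed_databases(unit_text: str) -> List[int]:
    """Collect the database numbers passed to 'redis-cli -n <N> FLUSHDB'."""
    return [int(n) for n in re.findall(r"redis-cli -n (\d+) FLUSHDB", unit_text)]


# Simulated replies: redis is not ready for the first two pings.
replies = iter(["", "", "PONG\n"])
print(wait_for_pong(lambda: next(replies), interval=0))  # 3
print(flushed_databases(UNIT_LINES))                     # [0, 1, 2, 5, 6]
```

Database 4 never appears in the list, which is why config_db survives a restart of the swss service.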
Let's look at the /usr/bin/swss.sh script:
#!/bin/bash

function getMountPoint()
{
    echo $1 | python -c "import sys, json, os; mnts = [x for x in json.load(sys.stdin)[0]['Mounts'] if x['Destination'] == '/usr/share/sonic/hwsku']; print '' if len(mnts) == 0 else os.path.basename(mnts[0]['Source'])" 2>/dev/null
}

function postStartAction()
{
    docker exec swss rm -f /ready   # remove cruft
    if [[ -d /host/fast-reboot ]];
    then
        test -e /host/fast-reboot/fdb.json && docker cp /host/fast-reboot/fdb.json swss:/
        test -e /host/fast-reboot/arp.json && docker cp /host/fast-reboot/arp.json swss:/
        test -e /host/fast-reboot/default_routes.json && docker cp /host/fast-reboot/default_routes.json swss:/
        rm -fr /host/fast-reboot
    fi
    docker exec swss touch /ready   # signal swssconfig.sh to go
}

# Obtain our platform as we will mount directories with these names in each docker
PLATFORM=`sonic-cfggen -H -v DEVICE_METADATA.localhost.platform`
# Obtain our HWSKU as we will mount directories with these names in each docker
HWSKU=`sonic-cfggen -d -v 'DEVICE_METADATA["localhost"]["hwsku"]'`

# Start the container
start() {
    DOCKERCHECK=`docker inspect --type container swss 2>/dev/null`
    if [ "$?" -eq "0" ]; then
        DOCKERMOUNT=`getMountPoint "$DOCKERCHECK"`
        if [ "$DOCKERMOUNT" == "$HWSKU" ]; then
            echo "Starting existing swss container with HWSKU $HWSKU"
            docker start swss
            postStartAction
            exit 0
        fi

        # docker created with a different HWSKU, remove and recreate
        echo "Removing obsolete swss container with HWSKU $DOCKERMOUNT"
        docker rm -f swss
    fi
    echo "Starting new swss container with HWSKU $HWSKU"
    docker run -d --net=host --privileged -t -v /etc/network/interfaces:/etc/network/interfaces:ro -v /etc/network/interfaces.d/:/etc/network/interfaces.d/:ro -v /host/machine.conf:/host/machine.conf:ro -v /etc/sonic:/etc/sonic:ro -v /var/log/swss:/var/log/swss:rw \
        --log-opt max-size=2M --log-opt max-file=5 \
        -v /var/run/redis:/var/run/redis:rw \
        -v /usr/share/sonic/device/$PLATFORM:/usr/share/sonic/platform:ro \
        -v /usr/share/sonic/device/$PLATFORM/$HWSKU:/usr/share/sonic/hwsku:ro \
        --tmpfs /tmp \
        --tmpfs /var/tmp \
        --name=swss docker-orchagent-bfn:latest || {
            echo "Failed to docker run" >&1
            exit 4
    }

    postStartAction
}

attach() {
    docker attach --no-stdin swss
}

stop() {
    docker stop swss
}

case "$1" in
    start|stop|attach)
        $1
        ;;
    *)
        echo "Usage: $0 {start|stop|attach}"
        exit 1
        ;;
esac
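The decision made in start() may be easier to follow as a small Python 3 sketch. get_mount_point mirrors the Python 2 one-liner in getMountPoint, and decide_start_action is a hypothetical helper that names the three branches of start(). The platform and HWSKU names in the sample docker inspect output are made up.

```python
import json
import os

HWSKU_DEST = "/usr/share/sonic/hwsku"


def get_mount_point(inspect_json: str) -> str:
    """Return the basename of the host directory mounted at HWSKU_DEST, or ''."""
    try:
        mounts = json.loads(inspect_json)[0]["Mounts"]
        sources = [m["Source"] for m in mounts if m["Destination"] == HWSKU_DEST]
    except (ValueError, LookupError, TypeError):
        # The shell version hides errors with 2>/dev/null and prints nothing.
        return ""
    return os.path.basename(sources[0]) if sources else ""


def decide_start_action(container_exists: bool, mounted_hwsku: str, wanted_hwsku: str) -> str:
    """Name the branch that start() would take."""
    if container_exists and mounted_hwsku == wanted_hwsku:
        return "start-existing"   # docker start swss
    if container_exists:
        return "remove-and-run"   # docker rm -f swss, then docker run
    return "run-new"              # docker run


# Hypothetical, heavily trimmed 'docker inspect' output.
SAMPLE = json.dumps([{"Mounts": [
    {"Source": "/etc/sonic", "Destination": "/etc/sonic"},
    {"Source": "/usr/share/sonic/device/x86_64-example-r0/example-hwsku",
     "Destination": HWSKU_DEST},
]}])

mounted = get_mount_point(SAMPLE)
print(mounted)
print(decide_start_action(True, mounted, "example-hwsku"))
print(decide_start_action(True, mounted, "other-hwsku"))
print(decide_start_action(False, "", "example-hwsku"))
```

An existing container is reused only when its hwsku mount matches the HWSKU currently in the configuration; otherwise it is removed and recreated.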
As the script shows, the host creates the container with the following command:
docker run -d --net=host --privileged -t -v /etc/network/interfaces:/etc/network/interfaces:ro -v /etc/network/interfaces.d/:/etc/network/interfaces.d/:ro -v /host/machine.conf:/host/machine.conf:ro -v /etc/sonic:/etc/sonic:ro -v /var/log/swss:/var/log/swss:rw \
    --log-opt max-size=2M --log-opt max-file=5 \
    -v /var/run/redis:/var/run/redis:rw \
    -v /usr/share/sonic/device/$PLATFORM:/usr/share/sonic/platform:ro \
    -v /usr/share/sonic/device/$PLATFORM/$HWSKU:/usr/share/sonic/hwsku:ro \
    --tmpfs /tmp \
    --tmpfs /var/tmp \
    --name=swss docker-orchagent-bfn:latest
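If docker run is used without -d, the container runs in the foreground of the calling shell, and a container whose main process exits immediately is gone before docker exec can be used on it. The script therefore passes --detach (-d) so that the container keeps running in the background.
The docker run command itself carries no CMD.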
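The options of the docker run command above can also be inspected programmatically. The following sketch splits the command with shlex and lists the volume mounts; the command string is copied from the script (with the line continuations removed), while list_volumes is a hypothetical helper written for this illustration.

```python
import shlex
from typing import List, Tuple

DOCKER_RUN = (
    "docker run -d --net=host --privileged -t "
    "-v /etc/network/interfaces:/etc/network/interfaces:ro "
    "-v /etc/network/interfaces.d/:/etc/network/interfaces.d/:ro "
    "-v /host/machine.conf:/host/machine.conf:ro "
    "-v /etc/sonic:/etc/sonic:ro "
    "-v /var/log/swss:/var/log/swss:rw "
    "--log-opt max-size=2M --log-opt max-file=5 "
    "-v /var/run/redis:/var/run/redis:rw "
    "-v /usr/share/sonic/device/$PLATFORM:/usr/share/sonic/platform:ro "
    "-v /usr/share/sonic/device/$PLATFORM/$HWSKU:/usr/share/sonic/hwsku:ro "
    "--tmpfs /tmp --tmpfs /var/tmp "
    "--name=swss docker-orchagent-bfn:latest"
)


def list_volumes(command: str) -> List[Tuple[str, str, str]]:
    """Return (host_path, container_path, mode) for every '-v' option."""
    args = shlex.split(command)
    volumes = []
    for flag, value in zip(args, args[1:]):
        if flag == "-v":
            host, container, mode = value.split(":")
            volumes.append((host, container, mode))
    return volumes


args = shlex.split(DOCKER_RUN)
print("-d" in args)   # True: the container is detached
print(args[-1])       # the image name is the last word, so no CMD follows it
for host, container, mode in list_volumes(DOCKER_RUN):
    if mode == "rw":
        print("writable:", container)
```

Only the log directory and the redis socket directory are mounted read-write; everything else the container sees from the host is read-only.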
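Dockerfile.j2 sets the image's entry point to ENTRYPOINT ["/usr/bin/supervisord"], so supervisord is responsible for starting and supervising the processes. Let's look at how supervisord is configured.
After entering the swss container, we first check which processes are running.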
root@switch:/# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 09:48 ?        00:00:01 /usr/bin/python /usr/bin/supervisord
root        20     1  0 09:48 ?        00:00:00 /usr/bin/watcherd
root        42     1  0 09:48 ?        00:00:00 /usr/sbin/rsyslogd -n
root        47     1  0 09:48 ?        00:00:00 /usr/bin/orchagent -d /var/log/swss -b 8192 -m 00:90:fb:60:e2:8b
root        59     1  1 09:48 ?        00:00:23 /usr/bin/portsyncd -p /usr/share/sonic/hwsku/port_config.ini
root        62     1  0 09:48 ?        00:00:00 /usr/bin/intfsyncd
root        65     1  0 09:48 ?        00:00:00 /usr/bin/neighsyncd
root        77     1  0 09:49 ?        00:00:00 /usr/bin/vlanmgrd
root        94     1  0 09:49 ?        00:00:00 /usr/bin/intfmgrd
root       102     1  0 09:49 ?        00:00:00 /usr/bin/buffermgrd -l /usr/share/sonic/hwsku/pg_profile_lookup.ini
root       112     1  0 09:49 ?        00:00:00 /bin/bash /usr/bin/arp_update
root       335   112  0 10:24 ?        00:00:00 sleep 300
root       344     0  1 10:25 ?        00:00:00 bash
root       349   344  0 10:25 ?        00:00:00 ps -ef
root@switch:/#
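The entry /usr/bin/python /usr/bin/supervisord in the output above shows that supervisord was started without specifying a configuration file, so it uses the default configuration file /etc/supervisor/supervisord.conf: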
; supervisor config file

[unix_http_server]
file=/var/run/supervisor.sock   ; (the path to the socket file)
chmod=0700                       ; sockef file mode (default 0700)
username=dummy
password=dummy

[supervisord]
logfile=/var/log/supervisor/supervisord.log ; (main log file;default $CWD/supervisord.log)
pidfile=/var/run/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
childlogdir=/var/log/supervisor            ; ('AUTO' child log dir, default $TEMP)
user=root

; the below section must remain in the config file for RPC
; (supervisorctl/web interface) to work, additional interfaces may be
; added by defining them in separate rpcinterface: sections
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///var/run/supervisor.sock ; use a unix:// URL for a unix socket
username=dummy
password=dummy

; The [include] section can just contain the "files" setting. This
; setting can list multiple files (separated by whitespace or
; newlines). It can also contain wildcards. The filenames are
; interpreted as relative to this file. Included files *cannot*
; include files themselves.

[include]
files = /etc/supervisor/conf.d/*.conf
Next we look at the included configuration files, files = /etc/supervisor/conf.d/*.conf.
The /etc/supervisor/conf.d/ directory contains only one file, supervisord.conf:
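How the [include] setting picks up files can be sketched with Python's configparser and glob modules. This is not supervisord's own code, only an illustration of expanding whitespace-separated wildcard patterns; a temporary directory stands in for /etc/supervisor/conf.d/, and the pattern is absolute here, so the relative-path rule mentioned in the config comments is not exercised.

```python
import configparser
import glob
import os
import tempfile


def included_files(main_conf_text: str) -> list:
    """Expand the whitespace-separated glob patterns listed under [include] files."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(main_conf_text)
    found = []
    for pattern in parser.get("include", "files").split():
        found.extend(sorted(glob.glob(pattern)))
    return found


# Build a throwaway directory that stands in for /etc/supervisor/conf.d/.
with tempfile.TemporaryDirectory() as conf_dir:
    for name in ("supervisord.conf", "notes.txt"):
        with open(os.path.join(conf_dir, name), "w") as handle:
            handle.write("")
    main_conf = "[include]\nfiles = {}/*.conf\n".format(conf_dir)
    matched = [os.path.basename(path) for path in included_files(main_conf)]

print(matched)  # only the file ending in .conf is picked up
```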
[supervisord]
logfile_maxbytes=1MB
logfile_backups=2
nodaemon=true

# Run start.sh, priority 1
[program:start.sh]
command=/usr/bin/start.sh
priority=1
autostart=true
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

# rsyslogd, priority 2
[program:rsyslogd]
command=/usr/sbin/rsyslogd -n
priority=2
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

# orchagent, priority 3
[program:orchagent]
command=/usr/bin/orchagent.sh
priority=3
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

# portsyncd, priority 4
[program:portsyncd]
command=/usr/bin/portsyncd -p /usr/share/sonic/hwsku/port_config.ini
priority=4
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

# intfsyncd, priority 5
[program:intfsyncd]
command=/usr/bin/intfsyncd
priority=5
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

# neighsyncd, priority 6
[program:neighsyncd]
command=/usr/bin/neighsyncd
priority=6
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

# swssconfig.sh, priority 7
[program:swssconfig]
command=/usr/bin/swssconfig.sh
priority=7
autostart=false
autorestart=unexpected
startretries=0
stdout_logfile=syslog
stderr_logfile=syslog

# arp_update, priority 8
[program:arp_update]
command=/usr/bin/arp_update
priority=8
autostart=false
autorestart=unexpected
stdout_logfile=syslog
stderr_logfile=syslog

# vlanmgrd, priority 9
[program:vlanmgrd]
command=/usr/bin/vlanmgrd
priority=9
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

[program:intfmgrd]
command=/usr/bin/intfmgrd
priority=10
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

[program:buffermgrd]
command=/usr/bin/buffermgrd -l /usr/share/sonic/hwsku/pg_profile_lookup.ini
priority=10
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

[program:enable_counters]
command=/usr/bin/enable_counters.py
priority=11
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

[eventlistener:mylistener]
command=/usr/bin/watcherd
events=PROCESS_STATE
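To see at a glance which programs supervisord launches on its own, the file can be read with Python's configparser. The sketch below uses a shortened copy of the configuration (three program sections plus the event listener) and a hypothetical helper named programs; the fallback values 999 and true are supervisor's documented defaults for priority and autostart.

```python
import configparser

# Shortened copy of /etc/supervisor/conf.d/supervisord.conf.
CONF_D = """\
# Run start.sh, priority 1
[program:start.sh]
command=/usr/bin/start.sh
priority=1
autostart=true

[program:rsyslogd]
command=/usr/sbin/rsyslogd -n
priority=2
autostart=false

[program:orchagent]
command=/usr/bin/orchagent.sh
priority=3
autostart=false

[eventlistener:mylistener]
command=/usr/bin/watcherd
events=PROCESS_STATE
"""


def programs(conf_text: str) -> list:
    """Return (priority, name, autostart) for each [program:x] section, sorted by priority."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    result = []
    for section in parser.sections():
        if section.startswith("program:"):
            name = section.split(":", 1)[1]
            priority = parser.getint(section, "priority", fallback=999)
            autostart = parser.getboolean(section, "autostart", fallback=True)
            result.append((priority, name, autostart))
    return sorted(result)


for priority, name, autostart in programs(CONF_D):
    print(priority, name, "autostart" if autostart else "manual")

autostarted = [name for _, name, auto in programs(CONF_D) if auto]
print(autostarted)  # ['start.sh']
```

Among the program sections only start.sh has autostart=true; every other program waits until something asks supervisord to start it.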
The only program with autostart=true is start.sh. It generates the configuration files and then starts the remaining programs one by one with supervisorctl:
#!/usr/bin/env bash

mkdir -p /etc/swss/config.d/

sonic-cfggen -d -t /usr/share/sonic/templates/switch.json.j2 > /etc/swss/config.d/switch.json
sonic-cfggen -d -t /usr/share/sonic/templates/ipinip.json.j2 > /etc/swss/config.d/ipinip.json
sonic-cfggen -d -t /usr/share/sonic/templates/ports.json.j2 > /etc/swss/config.d/ports.json

export platform=`sonic-cfggen -H -v DEVICE_METADATA.localhost.platform`

rm -f /var/run/rsyslogd.pid

supervisorctl start rsyslogd
supervisorctl start orchagent
supervisorctl start portsyncd
supervisorctl start intfsyncd
supervisorctl start neighsyncd
supervisorctl start swssconfig
supervisorctl start vlanmgrd
supervisorctl start intfmgrd
supervisorctl start buffermgrd
supervisorctl start enable_counters

# Start arp_update when VLAN exists
VLAN=`sonic-cfggen -d -v 'VLAN.keys() | join(" ") if VLAN'`
if [ "$VLAN" != "" ]; then
    supervisorctl start arp_update
fi
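The start order that this script produces can be summarised in a short Python sketch. start_sequence is a hypothetical helper, config_db is a plain dictionary standing in for the data that sonic-cfggen -d reads, and the VLAN name is made up.

```python
# Programs the script starts unconditionally, in script order.
ALWAYS_STARTED = [
    "rsyslogd", "orchagent", "portsyncd", "intfsyncd", "neighsyncd",
    "swssconfig", "vlanmgrd", "intfmgrd", "buffermgrd", "enable_counters",
]


def start_sequence(config_db: dict) -> list:
    """Return the programs in the order the script starts them."""
    sequence = list(ALWAYS_STARTED)
    # Mirrors: VLAN.keys() | join(" ") if VLAN
    vlan_names = " ".join((config_db.get("VLAN") or {}).keys())
    if vlan_names != "":
        sequence.append("arp_update")   # started only when a VLAN exists
    return sequence


print(start_sequence({}))
print(start_sequence({"VLAN": {"Vlan1000": {}}}))
```

The order is fixed by the sequence of supervisorctl start calls in the script; arp_update comes last and is skipped when no VLAN is configured.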