SONiC swss container startup process

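SONiC's service processes all run inside containers. So how does a container start its services once it is up?

To answer this, we first need to understand how the container is built. We use the docker-orchagent container as the example.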

The Dockerfile

In SONiC, each Dockerfile is generated from a Dockerfile.j2 template.

docker-orchagent/Dockerfile.j2

FROM docker-config-engine

ARG docker_container_name
RUN [ -f /etc/rsyslog.conf ] && sed -ri "s/%syslogtag%/$docker_container_name\/%syslogtag%/;" /etc/rsyslog.conf

## Make apt-get non-interactive
ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update

RUN apt-get install -f -y ifupdown arping libdbus-1-3 libdaemon0 libjansson4

## Install redis-tools dependencies
## TODO: implicitly install dependencies
RUN apt-get -y install libjemalloc1

COPY \
{% for deb in docker_orchagent_debs.split(' ') -%}
debs/{{ deb }}{{' '}}
{%- endfor -%}
debs/

RUN dpkg -i \
{% for deb in docker_orchagent_debs.split(' ') -%}
debs/{{ deb }}{{' '}}
{%- endfor %}

## Clean up
RUN apt-get clean -y; apt-get autoclean -y; apt-get autoremove -y
RUN rm -rf /debs

COPY ["files/arp_update", "/usr/bin"]
COPY ["enable_counters.py", "/usr/bin"]
COPY ["start.sh", "orchagent.sh", "swssconfig.sh", "/usr/bin/"]
COPY ["supervisord.conf", "/etc/supervisor/conf.d/"]

## Copy all Jinja2 template files into the templates folder
COPY ["*.j2", "/usr/share/sonic/templates/"]
## Entry point of the container
ENTRYPOINT ["/usr/bin/supervisord"]

From the configuration above, the program that runs when the container starts is /usr/bin/supervisord.
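To make the two Jinja2 loops in Dockerfile.j2 concrete, here is a minimal Python sketch that mimics how they expand, without using Jinja2 itself. It assumes Jinja2's default whitespace settings. The package names are hypothetical placeholders; the real value of docker_orchagent_debs is supplied by the SONiC build.

```python
def expand_deb_lines(docker_orchagent_debs):
    """Mimic the COPY and RUN dpkg loops of Dockerfile.j2.

    Each loop emits "debs/<name> " for every space-separated package name.
    The "-" whitespace control in the template tags removes the newlines
    between iterations, so all items end up on one line.
    """
    items = "".join("debs/%s " % deb for deb in docker_orchagent_debs.split(" "))
    copy_line = "COPY \\\n" + items + "debs/"
    dpkg_line = "RUN dpkg -i \\\n" + items
    return copy_line, dpkg_line


# Hypothetical package names, for illustration only.
copy_line, dpkg_line = expand_deb_lines("swss_1.0.0_amd64.deb redis-tools_3.2_amd64.deb")
print(copy_line)
print(dpkg_line)
```

The COPY instruction therefore lists every package followed by the destination directory debs/, and the dpkg instruction installs the same list.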

Starting the container from the host

The host starts the docker-orchagent container as the swss service, as the output of the following command shows:

admin@sonic:~$ sudo config reload -y
Running command: systemctl stop swss
Running command: systemctl stop pmon
Running command: systemctl stop teamd
Running command: sonic-cfggen -j /etc/sonic/config_db.json --write-to-db
Running command: systemctl restart hostname-config
Running command: systemctl restart interfaces-config
Running command: systemctl restart ntp-config
Running command: systemctl restart rsyslog-config
Running command: systemctl restart swss
Running command: systemctl restart teamd
Running command: systemctl restart pmon
admin@sonic:~$

Let's look at the service file for swss:

admin@sonic:~$ cat /etc/systemd/system/swss.service

[Unit]
Description=switch state service
Requires=database.service updategraph.service

After=database.service updategraph.service
After=interfaces-config.service


[Service]
User=root
# Wait for redis server start before database clean
ExecStartPre=/bin/bash -c 'until [[ $(/usr/bin/docker exec database redis-cli ping | grep -c PONG) -gt 0 ]]; do sleep 1; done'
ExecStartPre=/usr/bin/docker exec database redis-cli -n 0 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 1 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 2 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 5 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 6 FLUSHDB



ExecStartPre=/usr/bin/swss.sh start 
ExecStartPre=/usr/bin/syncd.sh start
ExecStart=/usr/bin/swss.sh attach

ExecStop=/usr/bin/swss.sh stop
ExecStopPost=/usr/bin/syncd.sh stop



[Install]
WantedBy=multi-user.target

The main command of the swss service (ExecStart) is /usr/bin/swss.sh attach. Before the service starts, the following ExecStartPre commands are run:

# Wait for redis server start before database clean
# Wait until redis is available: once it is, ping returns PONG, so grep -c PONG prints 1, which is greater than 0
ExecStartPre=/bin/bash -c 'until [[ $(/usr/bin/docker exec database redis-cli ping | grep -c PONG) -gt 0 ]]; do sleep 1; done'
ExecStartPre=/usr/bin/docker exec database redis-cli -n 0 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 1 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 2 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 5 FLUSHDB
ExecStartPre=/usr/bin/docker exec database redis-cli -n 6 FLUSHDB
ExecStartPre=/usr/bin/swss.sh start 
ExecStartPre=/usr/bin/syncd.sh start

These commands flush databases 0, 1, 2, 5 and 6 but not database 4 (config_db), so the configuration is preserved. They then run /usr/bin/swss.sh start and /usr/bin/syncd.sh start.
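The wait-then-flush logic of these ExecStartPre lines can be sketched in Python as follows. This is only an illustration: the ping function is injected so that the example runs without docker or redis, and the function names are made up for this sketch.

```python
import time

# Database indexes flushed by the ExecStartPre lines of swss.service.
FLUSHED_DBS = [0, 1, 2, 5, 6]


def wait_for_redis(ping, delay=1.0, sleep=time.sleep):
    """Poll until the output of ping() contains PONG; return the attempt count."""
    attempts = 1
    while "PONG" not in ping():
        sleep(delay)
        attempts += 1
    return attempts


def flush_commands(dbs=FLUSHED_DBS):
    """Build the docker exec command lines that flush each database."""
    return [
        ["/usr/bin/docker", "exec", "database", "redis-cli", "-n", str(n), "FLUSHDB"]
        for n in dbs
    ]


# Simulated redis that only answers on the third ping (no real docker call).
replies = iter(["", "", "PONG\n"])
attempts = wait_for_redis(lambda: next(replies), sleep=lambda _: None)
flushed = [cmd[5] for cmd in flush_commands()]
print(attempts, flushed)
```

Database 4 never appears in the generated commands, which is why config_db survives a restart of the swss service.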

Let's look at the /usr/bin/swss.sh script:

#!/bin/bash

function getMountPoint()
{
    echo $1 |  python -c "import sys, json, os; mnts = [x for x in json.load(sys.stdin)[0]['Mounts'] if x['Destination'] == '/usr/share/sonic/hwsku']; print '' if len(mnts) == 0 else os.path.basename(mnts[0]['Source'])" 2>/dev/null
}

function postStartAction()
{
    docker exec swss rm -f /ready   # remove cruft
    if [[ -d /host/fast-reboot ]];
    then
        test -e /host/fast-reboot/fdb.json && docker cp /host/fast-reboot/fdb.json swss:/
        test -e /host/fast-reboot/arp.json && docker cp /host/fast-reboot/arp.json swss:/
        test -e /host/fast-reboot/default_routes.json && docker cp /host/fast-reboot/default_routes.json swss:/
        rm -fr /host/fast-reboot
    fi
    docker exec swss touch /ready   # signal swssconfig.sh to go
}

# Obtain our platform as we will mount directories with these names in each docker
PLATFORM=`sonic-cfggen -H -v DEVICE_METADATA.localhost.platform`
# Obtain our HWSKU as we will mount directories with these names in each docker
HWSKU=`sonic-cfggen -d -v 'DEVICE_METADATA["localhost"]["hwsku"]'`
# Start the container
start() {
    DOCKERCHECK=`docker inspect --type container swss 2>/dev/null`
    if [ "$?" -eq "0" ]; then
        DOCKERMOUNT=`getMountPoint "$DOCKERCHECK"`
        if [ "$DOCKERMOUNT" == "$HWSKU" ]; then
            echo "Starting existing swss container with HWSKU $HWSKU"
            docker start swss
            postStartAction
            exit 0
        fi

        # docker created with a different HWSKU, remove and recreate
        echo "Removing obsolete swss container with HWSKU $DOCKERMOUNT"
        docker rm -f swss
    fi
    echo "Starting new swss container with HWSKU $HWSKU"
    docker run -d --net=host --privileged -t -v /etc/network/interfaces:/etc/network/interfaces:ro -v /etc/network/interfaces.d/:/etc/network/interfaces.d/:ro -v /host/machine.conf:/host/machine.conf:ro -v /etc/sonic:/etc/sonic:ro -v /var/log/swss:/var/log/swss:rw  \
        --log-opt max-size=2M --log-opt max-file=5 \
        -v /var/run/redis:/var/run/redis:rw \
        -v /usr/share/sonic/device/$PLATFORM:/usr/share/sonic/platform:ro \
        -v /usr/share/sonic/device/$PLATFORM/$HWSKU:/usr/share/sonic/hwsku:ro \
        --tmpfs /tmp \
        --tmpfs /var/tmp \
        --name=swss docker-orchagent-bfn:latest || {
            echo "Failed to docker run" >&1
            exit 4
    }

    postStartAction
}

attach() {
    docker attach --no-stdin swss
}

stop() {
    docker stop swss
}

case "$1" in
    start|stop|attach)
        $1
        ;;
    *)
        echo "Usage: $0 {start|stop|attach}"
        exit 1
        ;;
esac
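The inline Python in getMountPoint() is dense, so here is a hedged Python 3 sketch of the same idea together with the decision that start() makes. The function names, the sample docker inspect output and the platform and HWSKU names are illustrative assumptions, not part of SONiC.

```python
import json
import os

HWSKU_DEST = "/usr/share/sonic/hwsku"


def get_mount_point(inspect_json):
    """Return the basename of the host directory mounted at HWSKU_DEST, or ''."""
    try:
        mounts = json.loads(inspect_json)[0]["Mounts"]
    except (ValueError, KeyError, IndexError, TypeError):
        return ""  # the shell version also yields an empty string on errors
    sources = [m.get("Source", "") for m in mounts
               if m.get("Destination") == HWSKU_DEST]
    return os.path.basename(sources[0]) if sources else ""


def start_action(inspect_json, hwsku):
    """Decide what start() does: 'run', 'start' or 'recreate'."""
    if inspect_json is None:       # docker inspect failed: no swss container yet
        return "run"
    if get_mount_point(inspect_json) == hwsku:
        return "start"             # reuse the existing container
    return "recreate"              # docker rm -f swss, then docker run


# Hypothetical docker inspect output, reduced to the fields that are used.
sample = json.dumps([{"Mounts": [
    {"Source": "/usr/share/sonic/device/my-platform/my-hwsku",
     "Destination": "/usr/share/sonic/hwsku"},
    {"Source": "/etc/sonic", "Destination": "/etc/sonic"},
]}])
print(get_mount_point(sample))
print(start_action(sample, "my-hwsku"), start_action(sample, "other-hwsku"))
```

In all three cases the script ends up with a running swss container, and postStartAction is then called to copy any fast-reboot files and create /ready.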

As swss.sh shows, the host starts the container with the following command:

docker run -d --net=host --privileged -t -v /etc/network/interfaces:/etc/network/interfaces:ro -v /etc/network/interfaces.d/:/etc/network/interfaces.d/:ro -v /host/machine.conf:/host/machine.conf:ro -v /etc/sonic:/etc/sonic:ro -v /var/log/swss:/var/log/swss:rw  \
        --log-opt max-size=2M --log-opt max-file=5 \
        -v /var/run/redis:/var/run/redis:rw \
        -v /usr/share/sonic/device/$PLATFORM:/usr/share/sonic/platform:ro \
        -v /usr/share/sonic/device/$PLATFORM/$HWSKU:/usr/share/sonic/hwsku:ro \
        --tmpfs /tmp \
        --tmpfs /var/tmp \
        --name=swss docker-orchagent-bfn:latest
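  • -d, --detach: run the container in the background and print the container ID.

    Without -d, docker run stays in the foreground, attached to the container, so swss.sh start would not return while the container is running and postStartAction (which uses docker exec) would not get to run.
    With -d, docker run returns as soon as the container has been started, and the container keeps running in the background.

  • --net=host: share the host's network namespace.
  • --privileged: give extended privileges to this container. With this option, root inside the container has real root privileges.
  • -v: mount host files and directories into the container.
  • --name=swss: name the container swss.
  • docker-orchagent-bfn:latest: the image to run.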

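The command does not pass a command (CMD) after the image name, so the container runs the image's ENTRYPOINT.

Container entry point

Dockerfile.j2 shows that the container's entry point is ENTRYPOINT ["/usr/bin/supervisord"], which means supervisord is used to supervise the processes. Let's look at how supervisord is configured.

After entering the swss container, we can list the processes that are running: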

root@switch:/# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 09:48 ?        00:00:01 /usr/bin/python /usr/bin/supervisord
root        20     1  0 09:48 ?        00:00:00 /usr/bin/watcherd
root        42     1  0 09:48 ?        00:00:00 /usr/sbin/rsyslogd -n
root        47     1  0 09:48 ?        00:00:00 /usr/bin/orchagent -d /var/log/swss -b 8192 -m 00:90:fb:60:e2:8b
root        59     1  1 09:48 ?        00:00:23 /usr/bin/portsyncd -p /usr/share/sonic/hwsku/port_config.ini
root        62     1  0 09:48 ?        00:00:00 /usr/bin/intfsyncd
root        65     1  0 09:48 ?        00:00:00 /usr/bin/neighsyncd
root        77     1  0 09:49 ?        00:00:00 /usr/bin/vlanmgrd
root        94     1  0 09:49 ?        00:00:00 /usr/bin/intfmgrd
root       102     1  0 09:49 ?        00:00:00 /usr/bin/buffermgrd -l /usr/share/sonic/hwsku/pg_profile_lookup.ini
root       112     1  0 09:49 ?        00:00:00 /bin/bash /usr/bin/arp_update
root       335   112  0 10:24 ?        00:00:00 sleep 300
root       344     0  1 10:25 ?        00:00:00 bash
root       349   344  0 10:25 ?        00:00:00 ps -ef
root@switch:/#

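The line /usr/bin/python /usr/bin/supervisord in the output above shows that supervisord was started without a -c option naming a configuration file, so it uses the default configuration file, here /etc/supervisor/supervisord.conf: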

; supervisor config file

[unix_http_server]
file=/var/run/supervisor.sock   ; (the path to the socket file)
chmod=0700                       ; sockef file mode (default 0700)
username=dummy
password=dummy

[supervisord]
logfile=/var/log/supervisor/supervisord.log ; (main log file;default $CWD/supervisord.log)
pidfile=/var/run/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
childlogdir=/var/log/supervisor            ; ('AUTO' child log dir, default $TEMP)
user=root

; the below section must remain in the config file for RPC
; (supervisorctl/web interface) to work, additional interfaces may be
; added by defining them in separate rpcinterface: sections
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///var/run/supervisor.sock ; use a unix:// URL  for a unix socket
username=dummy
password=dummy

; The [include] section can just contain the "files" setting.  This
; setting can list multiple files (separated by whitespace or
; newlines).  It can also contain wildcards.  The filenames are
; interpreted as relative to this file.  Included files *cannot*
; include files themselves.

[include]
files = /etc/supervisor/conf.d/*.conf

Next, look at the included configuration files, files = /etc/supervisor/conf.d/*.conf.

The /etc/supervisor/conf.d/ directory contains a single file, supervisord.conf:

[supervisord]
logfile_maxbytes=1MB
logfile_backups=2
nodaemon=true

# Run start.sh, priority 1
[program:start.sh]
command=/usr/bin/start.sh
priority=1
autostart=true
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

# rsyslogd, priority 2
[program:rsyslogd]
command=/usr/sbin/rsyslogd -n
priority=2
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

# orchagent, priority 3
[program:orchagent]
command=/usr/bin/orchagent.sh
priority=3
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

# portsyncd, priority 4
[program:portsyncd]
command=/usr/bin/portsyncd -p /usr/share/sonic/hwsku/port_config.ini
priority=4
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

# intfsyncd, priority 5
[program:intfsyncd]
command=/usr/bin/intfsyncd
priority=5
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

# neighsyncd, priority 6
[program:neighsyncd]
command=/usr/bin/neighsyncd
priority=6
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

# swssconfig.sh, priority 7
[program:swssconfig]
command=/usr/bin/swssconfig.sh
priority=7
autostart=false
autorestart=unexpected
startretries=0
stdout_logfile=syslog
stderr_logfile=syslog

# arp_update, priority 8
[program:arp_update]
command=/usr/bin/arp_update
priority=8
autostart=false
autorestart=unexpected
stdout_logfile=syslog
stderr_logfile=syslog

# vlanmgrd, priority 9
[program:vlanmgrd]
command=/usr/bin/vlanmgrd
priority=9
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

[program:intfmgrd]
command=/usr/bin/intfmgrd
priority=10
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

[program:buffermgrd]
command=/usr/bin/buffermgrd -l /usr/share/sonic/hwsku/pg_profile_lookup.ini
priority=10
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

[program:enable_counters]
command=/usr/bin/enable_counters.py
priority=11
autostart=false
autorestart=false
stdout_logfile=syslog
stderr_logfile=syslog

[eventlistener:mylistener]
command=/usr/bin/watcherd
events=PROCESS_STATE
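A point that is easy to miss in this file is that only start.sh has autostart=true. The Python sketch below parses a trimmed copy of the configuration with the standard configparser module and lists what supervisord launches by itself. The helper name is made up, and treating a missing autostart as true reflects supervisord's documented default.

```python
import configparser

# Trimmed copy of /etc/supervisor/conf.d/supervisord.conf (a few sections only).
CONF = """
[program:start.sh]
command=/usr/bin/start.sh
priority=1
autostart=true

[program:rsyslogd]
command=/usr/sbin/rsyslogd -n
priority=2
autostart=false

[program:orchagent]
command=/usr/bin/orchagent.sh
priority=3
autostart=false

[eventlistener:mylistener]
command=/usr/bin/watcherd
events=PROCESS_STATE
"""


def autostarted(conf_text):
    """Return the names of programs and event listeners with autostart enabled."""
    parser = configparser.RawConfigParser()
    parser.read_string(conf_text)
    names = []
    for section in parser.sections():
        kind, _, name = section.partition(":")
        if kind not in ("program", "eventlistener"):
            continue
        # A missing autostart setting is treated as true.
        if parser.get(section, "autostart", fallback="true").lower() == "true":
            names.append(name)
    return names


result = autostarted(CONF)
print(result)
```

This agrees with the ps output earlier in the article: watcherd and the daemons are all children of supervisord, but every daemon other than the listener has to be started explicitly, which is the job of start.sh.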

start.sh

#!/usr/bin/env bash

mkdir -p /etc/swss/config.d/

sonic-cfggen -d -t /usr/share/sonic/templates/switch.json.j2 > /etc/swss/config.d/switch.json
sonic-cfggen -d -t /usr/share/sonic/templates/ipinip.json.j2 > /etc/swss/config.d/ipinip.json
sonic-cfggen -d -t /usr/share/sonic/templates/ports.json.j2 > /etc/swss/config.d/ports.json

export platform=`sonic-cfggen -H -v DEVICE_METADATA.localhost.platform`

rm -f /var/run/rsyslogd.pid

supervisorctl start rsyslogd

supervisorctl start orchagent

supervisorctl start portsyncd

supervisorctl start intfsyncd

supervisorctl start neighsyncd

supervisorctl start swssconfig

supervisorctl start vlanmgrd

supervisorctl start intfmgrd

supervisorctl start buffermgrd

supervisorctl start enable_counters

# Start arp_update when VLAN exists
VLAN=`sonic-cfggen -d -v 'VLAN.keys() | join(" ") if VLAN'`
if [ "$VLAN" != "" ]; then
    supervisorctl start arp_update
fi
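The order in which start.sh brings up the daemons, including the conditional start of arp_update, can be summarised with a small Python sketch. The function name and the VLAN name are illustrative; vlan_table stands in for the VLAN table that sonic-cfggen reads from config_db.

```python
# Programs started unconditionally by start.sh, in order.
BASE_ORDER = [
    "rsyslogd", "orchagent", "portsyncd", "intfsyncd", "neighsyncd",
    "swssconfig", "vlanmgrd", "intfmgrd", "buffermgrd", "enable_counters",
]


def programs_to_start(vlan_table):
    """Return the supervisorctl start order for a given VLAN table (dict or None)."""
    # Mirrors the template expression: VLAN.keys() | join(" ") if VLAN
    vlan = " ".join(vlan_table.keys()) if vlan_table else ""
    order = list(BASE_ORDER)
    if vlan != "":
        order.append("arp_update")  # only started when at least one VLAN exists
    return order


with_vlan = programs_to_start({"Vlan1000": {}})   # hypothetical VLAN name
without_vlan = programs_to_start({})
print(with_vlan[-1], without_vlan[-1])
```

Because every supervisorctl start call returns before the next one is issued, the daemons come up in the order written in the script rather than being launched all at once by supervisord.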