Source Code Analysis of hyper Container Networking

1. Network initialization

1.1 hyperd/daemon/daemon.go

func NewDaemon(cfg *apitypes.HyperConfig) (*Daemon, error)

This function calls daemon.initNetworks(cfg) directly.

 

1.2 hyperd/daemon/daemon.go

func (daemon *Daemon) initNetworks(c *apitypes.HyperConfig) error

This function only calls hypervisor.InitNetwork(c.Bridge, c.BridgeIP, c.DisableIptables), so all of the actual network setup is done in runv.

 

1.3 runv/hypervisor/hypervisor.go

func InitNetwork(bIface, bIP string, disableIptables bool) error

If HDriver.BuildinNetwork() is true, return HDriver.InitNetwork(bIface, bIP, disableIptables)  // false for QEMU

Otherwise, return network.InitNetwork(bIface, bIP, disableIptables)

 

1.4 runv/hypervisor/network/network_linux.go

func InitNetwork(bIface, bIP string, disable bool) error

(1) First set BridgeIface and BridgeIP: BridgeIface defaults to "hyper0" and BridgeIP defaults to "192.168.123.0/24". disableIptables is set to disable

(2) Call addr, err := GetIfaceAddr(BridgeIface). If err is not nil, the bridge does not exist and has to be created. Otherwise the bridge exists, but its configuration still has to be checked against the requested settings

(3) If the bridge does not exist, call configureBridge(BridgeIP, BridgeIface) to create it, call addr, err = GetIfaceAddr(BridgeIface) again to read the bridge information back, and set BridgeIPv4Net = addr.(*net.IPNet)

(4) Call setupIPTables(addr)

(5) Call setupIPForwarding()

(6) Finally, call IpAllocator.RequestIP(BridgeIPv4Net, BridgeIPv4Net.IP)

 

// Return the first IPv4 address for the specified network interface

1.5 runv/hypervisor/network/network_linux.go

func GetIfaceAddr(name string) (net.Addr, error)

(1) First call iface, err := net.InterfaceByName(name) and addrs, err := iface.Addrs() to get the address information

(2) Declare var addr4 []net.Addr, collect the IPv4 addresses parsed from addrs into it, and return addr4[0]
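The filtering step can be sketched as follows. This is a minimal illustration rather than the runv implementation: it returns the first IPv4 entry directly instead of building the addr4 slice, and firstIPv4OfCIDRs is a hypothetical helper that stands in for iface.Addrs() so the filter can be tried without a real interface.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// firstIPv4 returns the first address in addrs that holds an IPv4 IP.
func firstIPv4(addrs []net.Addr) (net.Addr, error) {
	for _, a := range addrs {
		if ipNet, ok := a.(*net.IPNet); ok && ipNet.IP.To4() != nil {
			return a, nil
		}
	}
	return nil, errors.New("interface has no IPv4 address")
}

// firstIPv4OfCIDRs is a hypothetical helper: it turns CIDR strings into the
// *net.IPNet values that iface.Addrs() would return on Linux, then applies
// firstIPv4. It returns "" when no IPv4 address is found.
func firstIPv4OfCIDRs(cidrs ...string) string {
	var addrs []net.Addr
	for _, c := range cidrs {
		ip, ipNet, err := net.ParseCIDR(c)
		if err != nil {
			continue // skip malformed input in this sketch
		}
		ipNet.IP = ip // keep the host address rather than the network address
		addrs = append(addrs, ipNet)
	}
	addr, err := firstIPv4(addrs)
	if err != nil {
		return ""
	}
	return addr.String()
}

func main() {
	// The IPv6 link-local entry is skipped; the IPv4 entry is returned.
	fmt.Println(firstIPv4OfCIDRs("fe80::1/64", "192.168.123.1/24"))
}
```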

 

// create and setup network bridge

1.6 runv/hypervisor/network/network_linux.go

func configureBridge(bridgeIP, bridgeIface string) error

(1) Validate bridgeIP and assign it to ifaceAddr

(2) Call CreateBridgeIface(bridgeIface), ignoring an "exists" error

(3) Call iface, err := net.InterfaceByName(bridgeIface) to get the interface

(4) Call ipAddr, ipNet, err := net.ParseCIDR(ifaceAddr) (Note: for example, ParseCIDR("198.51.100.1/24") returns the IP address 198.51.100.1 and the network 198.51.100.0/24.)

(5) If ipAddr.Equal(ipNet.IP), call ipAddr, err = IpAllocator.RequestIP(ipNet, nil)

Otherwise, call ipAddr, err = IpAllocator.RequestIP(ipNet, ipAddr)

(6) Call NetworkLinkAddIp(iface, ipAddr, ipNet)

(7) Call NetworkLinkUp(iface)   ---> both end up making low-level syscall.Syscall() calls
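Steps (4) and (5) hinge on whether the configured CIDR names the network itself or a specific host in it. A small sketch of that decision, using only the standard net package; bridgeRequest is a hypothetical name, and the empty string stands in for the nil passed to RequestIP:

```go
package main

import (
	"fmt"
	"net"
)

// bridgeRequest reports, for a bridge CIDR, the network and the IP that would
// be handed to RequestIP. requested == "" means "pass nil and let the
// allocator pick the next free address".
func bridgeRequest(cidr string) (network, requested string, err error) {
	ipAddr, ipNet, err := net.ParseCIDR(cidr)
	if err != nil {
		return "", "", err
	}
	if ipAddr.Equal(ipNet.IP) {
		// The CIDR names the network itself, e.g. 192.168.123.0/24.
		return ipNet.String(), "", nil
	}
	// The CIDR names a specific host, e.g. 192.168.123.1/24.
	return ipNet.String(), ipAddr.String(), nil
}

func main() {
	fmt.Println(bridgeRequest("192.168.123.0/24"))
	fmt.Println(bridgeRequest("192.168.123.1/24"))
}
```

With the default BridgeIP of "192.168.123.0/24" the first branch is taken, so the bridge address is whatever the allocator hands out first.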

 

// Create the actual bridge device. This is more backward-compatible than netlink and works on RHEL 6.

1.7 runv/hypervisor/network/network_linux.go

func CreateBridgeIface(name string) error

This function creates the bridge with a raw syscall.Syscall(...) call.

 

The IPAllocator struct is shown below:

type IPAllocator struct {
  allocatedIPs networkSet
  mutex        sync.Mutex
}

networkSet is defined as follows:

type networkSet map[string]*allocatedMap

The allocatedMap struct is shown below:

type allocatedMap struct {
  p     map[string]struct{}
  last  *big.Int
  begin *big.Int
  end   *big.Int
}

  

// When ip is nil, return the next available IP address in network; when ip is not nil, check that the given ip is valid

1.8 runv/hypervisor/network/ipallocator/ipallocator.go

func (a *IPAllocator) RequestIP(network *net.IPNet, ip net.IP) (net.IP, error)

(1) Call key := network.String() to get the string form of the network, then look it up with allocated, ok := a.allocatedIPs[key]

(2) If the network is not present, create one with allocated = newAllocatedMap(network) and store it with a.allocatedIPs[key] = allocated

(3) If ip == nil, return allocated.getNextIP(); otherwise call allocated.checkIP(ip)
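To make the allocator's shape concrete, here is a deliberately simplified sketch. It is not the runv code: it handles IPv4 only, tracks used addresses as uint32 values instead of big.Int, and scans from the start of the range instead of continuing from last. The names allocator, requestIP and request are assumptions made for this illustration.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"net"
	"sync"
)

// allocator is a simplified stand-in for IPAllocator: one set of used
// addresses per network, keyed by the network's string form.
type allocator struct {
	mu        sync.Mutex
	allocated map[string]map[uint32]struct{}
}

func newAllocator() *allocator {
	return &allocator{allocated: make(map[string]map[uint32]struct{})}
}

// requestIP reserves ip in network, or the next free host address if ip is nil.
func (a *allocator) requestIP(network *net.IPNet, ip net.IP) (net.IP, error) {
	base4 := network.IP.To4()
	if base4 == nil {
		return nil, errors.New("only IPv4 networks are handled in this sketch")
	}
	a.mu.Lock()
	defer a.mu.Unlock()

	key := network.String()
	used, ok := a.allocated[key]
	if !ok {
		used = make(map[uint32]struct{})
		a.allocated[key] = used
	}

	if ip != nil { // explicit request: validate, then reserve
		if !network.Contains(ip) {
			return nil, errors.New("ip is outside the network")
		}
		n := binary.BigEndian.Uint32(ip.To4())
		if _, taken := used[n]; taken {
			return nil, errors.New("ip already allocated")
		}
		used[n] = struct{}{}
		return ip, nil
	}

	base := binary.BigEndian.Uint32(base4)
	ones, bits := network.Mask.Size()
	hosts := uint32(1) << uint(bits-ones)
	// Skip offset 0 (network address) and the last offset (broadcast).
	for off := uint32(1); off+1 < hosts; off++ {
		if _, taken := used[base+off]; !taken {
			used[base+off] = struct{}{}
			out := make(net.IP, net.IPv4len)
			binary.BigEndian.PutUint32(out, base+off)
			return out, nil
		}
	}
	return nil, errors.New("no available ip in network")
}

// request is a string-based convenience wrapper; ip == "" means nil.
func request(a *allocator, cidr, ip string) string {
	_, n, err := net.ParseCIDR(cidr)
	if err != nil {
		return "error: " + err.Error()
	}
	var want net.IP
	if ip != "" {
		want = net.ParseIP(ip)
	}
	got, err := a.requestIP(n, want)
	if err != nil {
		return "error: " + err.Error()
	}
	return got.String()
}

func main() {
	a := newAllocator()
	fmt.Println(request(a, "192.168.123.0/24", "192.168.123.1")) // bridge address
	fmt.Println(request(a, "192.168.123.0/24", ""))              // next free address
}
```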

 

// This function is identical to: ip addr add $ip/$ipNet dev $iface

1.9 runv/hypervisor/network/network_linux.go

func NetworkLinkAddIp(iface *net.Interface, ip net.IP, ipNet *net.IPNet) error

(1) This function directly returns networkLinkIpAction(syscall.RTM_NEWADDR, syscall.NLM_F_CREAT|syscall.NLM_F_EXCL|syscall.NLM_F_ACK, IfAddr{iface, ip, ipNet})

networkLinkIpAction(...) itself simply issues the request over netlink.

 

1.10 runv/hypervisor/network/network_linux.go

func setupIPTables(addr net.Addr) error

(1) Enable NAT:

`iptables -t nat -I POSTROUTING -s 192.168.123.0/24 ! -o hyper0 -j MASQUERADE` applies SNAT to container traffic that enters the host and is not destined for another local container

(2) Create HYPER iptables Chain

(3) Goto HYPER chain

`iptables -t filter -I FORWARD -o hyper0 -j HYPER` hands traffic forwarded to hyper0 over to the HYPER chain

(4) Accept all outgoing packets

`iptables -t filter -I FORWARD -i hyper0 -j ACCEPT` accepts all traffic coming in from hyper0

(5) Accept incoming packets for existing connections

`iptables -t filter -I FORWARD -o hyper0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT`

(6) Create HYPER iptables Chain in the nat table

`iptables -t nat -N HYPER`

(7) Goto HYPER chain

`iptables -t nat -I OUTPUT -m addrtype --dst-type LOCAL ! -d 127.0.0.1/8 -j HYPER`

`iptables -t nat -I PREROUTING -m addrtype --dst-type LOCAL -j HYPER`
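The rule set can be written down as data, parameterized by the bridge subnet and interface name. The sketch below only builds argument lists and executes nothing. The `-t filter -N HYPER` entry for step (2) is an assumption, since the text gives no command for that step, and iptablesRules and commandLine are hypothetical helper names.

```go
package main

import (
	"fmt"
	"strings"
)

// iptablesRules returns the iptables arguments for the seven steps above,
// in order. Step (2) is assumed to be "-t filter -N HYPER".
func iptablesRules(subnet, bridge string) [][]string {
	return [][]string{
		// (1) NAT for traffic leaving through any interface but the bridge.
		{"-t", "nat", "-I", "POSTROUTING", "-s", subnet, "!", "-o", bridge, "-j", "MASQUERADE"},
		// (2) Assumed: create the HYPER chain in the filter table.
		{"-t", "filter", "-N", "HYPER"},
		// (3) Send traffic forwarded to the bridge through HYPER.
		{"-t", "filter", "-I", "FORWARD", "-o", bridge, "-j", "HYPER"},
		// (4) Accept everything coming from the bridge.
		{"-t", "filter", "-I", "FORWARD", "-i", bridge, "-j", "ACCEPT"},
		// (5) Accept replies for existing connections.
		{"-t", "filter", "-I", "FORWARD", "-o", bridge, "-m", "conntrack",
			"--ctstate", "RELATED,ESTABLISHED", "-j", "ACCEPT"},
		// (6) Create the HYPER chain in the nat table.
		{"-t", "nat", "-N", "HYPER"},
		// (7) Route locally addressed traffic through the nat HYPER chain.
		{"-t", "nat", "-I", "OUTPUT", "-m", "addrtype", "--dst-type", "LOCAL",
			"!", "-d", "127.0.0.1/8", "-j", "HYPER"},
		{"-t", "nat", "-I", "PREROUTING", "-m", "addrtype", "--dst-type", "LOCAL", "-j", "HYPER"},
	}
}

// commandLine renders one rule as a shell-style command for display.
func commandLine(args []string) string {
	return "iptables " + strings.Join(args, " ")
}

func main() {
	for _, rule := range iptablesRules("192.168.123.0/24", "hyper0") {
		fmt.Println(commandLine(rule))
	}
}
```

Keeping the rules as argument slices avoids shell quoting issues if they are later passed to a process runner.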

 

1.11 runv/hypervisor/network/network_linux.go

func setupIPForwarding() error

(1)、Get current IPv4 forward setup

(2)、Enable IPv4 forwarding only if it is not already enabled
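A minimal sketch of this check-then-enable logic. It assumes the usual Linux procfs switch /proc/sys/net/ipv4/ip_forward, which the text does not name, and takes the path as a parameter so the behavior can be tried on a temporary file without root privileges. enableForwarding and demo are hypothetical names.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// ipForwardPath is the usual Linux switch for IPv4 forwarding (assumed here).
const ipForwardPath = "/proc/sys/net/ipv4/ip_forward"

// enableForwarding writes "1" to path unless it already holds "1".
// It reports whether a write was needed.
func enableForwarding(path string) (changed bool, err error) {
	cur, err := os.ReadFile(path)
	if err != nil {
		return false, err
	}
	if bytes.Equal(bytes.TrimSpace(cur), []byte("1")) {
		return false, nil // already enabled, nothing to do
	}
	return true, os.WriteFile(path, []byte("1\n"), 0644)
}

// demo runs enableForwarding twice against a temporary file that starts
// out as "0", returning whether each call had to write.
func demo() (first, second bool, err error) {
	f, err := os.CreateTemp("", "ip_forward")
	if err != nil {
		return false, false, err
	}
	path := f.Name()
	defer os.Remove(path)
	if _, err = f.WriteString("0\n"); err != nil {
		f.Close()
		return false, false, err
	}
	if err = f.Close(); err != nil {
		return false, false, err
	}
	if first, err = enableForwarding(path); err != nil {
		return first, false, err
	}
	second, err = enableForwarding(path)
	return first, second, err
}

func main() {
	first, second, err := demo()
	fmt.Println(first, second, err)
}
```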

 

2. Network configuration in hyperd

// hyperd/daemon/pod/provision.go

2.1 func CreateXPod(factory *PodFactory, spec *apitypes.UserPod) (*XPod, error)

...

(1) Call p.initResources(spec, true)

(2) Call err = p.prepareResources()

(3) Call err = p.addResourcesToSandbox()

....

 

// hyperd/daemon/pod/provision.go

2.2 func (p *XPod) initResources(spec *apitypes.UserPod, allowCreate bool) error

....

(1) When len(spec.Interfaces) == 0, call spec.Interfaces = append(spec.Interfaces, &apitypes.UserInterface{})

(2) Iterate over spec.Interfaces, calling inf := newInterface(p, nspec) and p.interfaces[nspec.Ifname] = inf

newInterface() only returns an Interface{} struct; if spec.Ifname is "", it is set to "eth-default"
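The effect of these two steps is that a pod always ends up with at least one interface, and an unnamed one is registered as "eth-default". A compact sketch, with userInterface reduced to the single field needed here and normalize as a hypothetical name:

```go
package main

import "fmt"

// userInterface keeps only the field this sketch needs from
// apitypes.UserInterface.
type userInterface struct{ Ifname string }

// normalize ensures there is at least one interface, names an unnamed one
// "eth-default", and indexes the result by interface name.
func normalize(specs []*userInterface) map[string]*userInterface {
	if len(specs) == 0 {
		specs = append(specs, &userInterface{})
	}
	out := make(map[string]*userInterface, len(specs))
	for _, s := range specs {
		if s.Ifname == "" {
			s.Ifname = "eth-default"
		}
		out[s.Ifname] = s
	}
	return out
}

func main() {
	for name := range normalize(nil) {
		fmt.Println(name)
	}
}
```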

....

The Interface{} data structure is shown below:

type Interface struct {
  p        *XPod
  spec     *apitypes.UserInterface
  descript *runv.InterfaceDescription
}

The apitypes.UserInterface struct is shown below:

type UserInterface struct {
  Bridge  string
  Ip      string
  Ifname  string
  Mac     string
  Gateway string
  Tap     string
}

  

// hyperd/daemon/pod/provision.go  

2.3 func (p *XPod) prepareResources() error

....

(1) Iterate over p.interfaces, calling inf.prepare()

....

 

// hyperd/daemon/pod/networks.go

2.4 func (inf *Interface) prepare() error

(1) If inf.spec.Ip == "" and inf.spec.Bridge != "", return an error --> if configured a bridge, must specify the IP address

(2) If inf.spec.Ip == "", call setting, err := network.AllocateAddr(""), fill a &runv.InterfaceDescription{} struct from setting, and assign it to inf.descript

Otherwise, fill the &runv.InterfaceDescription{} struct directly from the contents of inf and assign it to inf.descript
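The branching in prepare() can be sketched as below. The struct types are trimmed copies of the ones shown in this article, and the allocate callback is a hypothetical stand-in for network.AllocateAddr(""). Which Settings fields end up in the description is an assumption here, since the text does not list them.

```go
package main

import (
	"errors"
	"fmt"
)

type userInterface struct{ Bridge, Ip, Mac, Gateway, Tap string }
type interfaceDescription struct{ Bridge, Ip, Mac, Gw, TapName string }

// prepare validates the spec and builds the description. allocate stands in
// for network.AllocateAddr("") and returns an assumed (ip, mac, gateway).
func prepare(s userInterface, allocate func() (ip, mac, gw string, err error)) (*interfaceDescription, error) {
	if s.Ip == "" && s.Bridge != "" {
		return nil, errors.New("if configured a bridge, must specify the IP address")
	}
	if s.Ip == "" {
		// No address given: take everything from the allocator.
		ip, mac, gw, err := allocate()
		if err != nil {
			return nil, err
		}
		return &interfaceDescription{Ip: ip, Mac: mac, Gw: gw}, nil
	}
	// Address given: copy the user's settings through.
	return &interfaceDescription{
		Bridge: s.Bridge, Ip: s.Ip, Mac: s.Mac, Gw: s.Gateway, TapName: s.Tap,
	}, nil
}

func main() {
	alloc := func() (string, string, string, error) {
		return "192.168.123.2/24", "52:54:00:00:00:01", "192.168.123.1", nil
	}
	d, err := prepare(userInterface{}, alloc)
	fmt.Println(d, err)
}
```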

 

// runv/hypervisor/network/network_linux.go

2.5 func AllocateAddr(requestedIP string) (*Settings, error)

(1) Call ip, err := IpAllocator.RequestIP(BridgeIPv4Net, net.ParseIP(requestedIP))

(2) Call maskSize, _ := BridgeIPv4Net.Mask.Size() and mac, err := GenRandomMac()

(3) return &Settings{...}
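Two of the building blocks in step (2) can be shown with the standard library alone. prefixLen demonstrates what Mask.Size() returns. randomMAC is only one plausible way to generate an address (random bytes, locally administered, unicast); how runv's GenRandomMac() actually works is not described in the text, so treat it as an assumption.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"net"
)

// prefixLen returns the number of leading one bits in the mask of cidr,
// which is the first value returned by IPMask.Size().
func prefixLen(cidr string) (int, error) {
	_, ipNet, err := net.ParseCIDR(cidr)
	if err != nil {
		return 0, err
	}
	ones, _ := ipNet.Mask.Size()
	return ones, nil
}

// randomMAC returns a random unicast, locally administered MAC address.
func randomMAC() (string, error) {
	buf := make([]byte, 6)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	buf[0] = (buf[0] | 0x02) & 0xfe // set the local bit, clear the multicast bit
	return net.HardwareAddr(buf).String(), nil
}

func main() {
	n, err := prefixLen("192.168.123.0/24")
	fmt.Println(n, err)
	mac, err := randomMAC()
	fmt.Println(mac, err)
}
```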

 

// hyperd/daemon/pod/provision.go

// addResourcesToSandbox() add resources to sandbox parallelly, it issues runV API parallelly to send the

// NIC, Vols, and Containers to sandbox

2.6 func (p *XPod) addResourcesToSandbox() error

...

(1) Call future := utils.NewFutureSet()

(2) Call future.Add("addInterface", func() error {}); inside the func, loop with for _, inf := range p.interfaces and call err := inf.add()

After all of p.interfaces have been processed, call p.sandbox.AddRoute()
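The pattern here is "run named tasks in parallel, but keep NICs-then-routes ordered inside one task". The sketch below uses a hypothetical futureSet built on sync.WaitGroup. It is not the hyperd utils package, whose API beyond NewFutureSet() and Add() is not shown in the text.

```go
package main

import (
	"fmt"
	"sync"
)

// futureSet is a hypothetical, minimal analogue of utils.NewFutureSet():
// each added task runs in its own goroutine and Wait collects the results.
type futureSet struct {
	wg   sync.WaitGroup
	mu   sync.Mutex
	errs map[string]error
}

func newFutureSet() *futureSet { return &futureSet{errs: make(map[string]error)} }

// Add starts task in a new goroutine and records its error under name.
func (f *futureSet) Add(name string, task func() error) {
	f.wg.Add(1)
	go func() {
		defer f.wg.Done()
		err := task()
		f.mu.Lock()
		f.errs[name] = err
		f.mu.Unlock()
	}()
}

// Wait blocks until all tasks have finished and returns their errors by name.
func (f *futureSet) Wait() map[string]error {
	f.wg.Wait()
	return f.errs
}

// addNetwork mirrors the "addInterface" task: add every NIC in order,
// and only then add the routes.
func addNetwork(nics []string, addNic func(string) error, addRoute func() error) error {
	for _, n := range nics {
		if err := addNic(n); err != nil {
			return err
		}
	}
	return addRoute()
}

func main() {
	f := newFutureSet()
	f.Add("addInterface", func() error {
		return addNetwork([]string{"eth0"},
			func(n string) error { fmt.Println("add nic", n); return nil },
			func() error { fmt.Println("add routes"); return nil })
	})
	f.Add("addVolume", func() error { return nil })
	fmt.Println(f.Wait())
}
```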

...

 

// hyperd/daemon/pod/networks.go

2.7 func (inf *Interface) add() error

(1) If inf.descript == nil or inf.descript.Ip is "", return an error

(2) Call inf.p.sandbox.AddNic(inf.descript)

 

3. Network configuration in runv

(1) Adding a NIC

The NIC data structure is shown below:

type InterfaceDescription struct {
  Id      string
  Lo      bool
  Bridge  string
  Ip      string
  Mac     string
  Gw      string
  TapName string
  Options string
}

 

// runv/hypervisor/vm.go

3.1 func (vm *Vm) AddNic(info *api.InterfaceDescription)

(1) Create client := make(chan api.Result, 1), which is used for synchronization

(2) Call vm.ctx.AddInterface(info, client)

(3) Call ev, ok := <-client to wait until the NIC has been created

(4) return vm.ctx.updateInterface(info.Id)
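The synchronization idiom is a one-slot buffered channel: the asynchronous side can always deliver its single result without blocking, and the caller waits for it before running the follow-up step. A sketch with a hypothetical result type in place of api.Result, whose fields are not shown in the text:

```go
package main

import (
	"errors"
	"fmt"
)

// result is a hypothetical stand-in for api.Result.
type result struct {
	success bool
	message string
}

// addNic starts the asynchronous request, waits for exactly one result on a
// buffered channel, and only then runs the follow-up update step.
func addNic(start func(chan<- result), update func() error) error {
	client := make(chan result, 1) // capacity 1: the sender never blocks
	start(client)
	ev, ok := <-client
	if !ok {
		return errors.New("channel closed without a result")
	}
	if !ev.success {
		return errors.New(ev.message)
	}
	return update()
}

func main() {
	err := addNic(
		func(c chan<- result) { go func() { c <- result{success: true} }() },
		func() error { fmt.Println("updateInterface"); return nil },
	)
	fmt.Println("err:", err)
}
```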

 

// runv/hypervisor/context.go

3.2 func (ctx *VmContext) AddInterface(inf *api.InterfaceDescription, result chan api.Result)

(1) If ctx.current != StateRunning, report an error with result <- NewNotReadyError(ctx.Id)

(2) Call ctx.networks.addInterface(inf, result)

 

// runv/hypervisor/network.go

3.3 func (nc *NetworkContext) addInterface(inf *api.InterfaceDescription, result chan api.Result)

(1) If inf.Lo is true, fill i := &InterfaceCreated{...}, set nc.lo[inf.Ip] = i and nc.idMap[inf.Id] = i, and return success

(2) Start a goroutine that calls idx := nc.applySlot() to get the ethernet slot for the interface, then calls

nc.configureInterface(idx, nc.sandbox.netPciAddr(), fmt.Sprintf("eth%d", idx), inf, devChan)

(3) Start a goroutine that waits for the device-inserted event and reports through result whether inserting the NIC succeeded or failed

 

// runv/hypervisor/network.go

3.4 func (nc *NetworkContext) configureInterface(index, pciAddr int, name string, inf *api.InterfaceDescription, result chan<- VmEvent)

(1) Call settings, err = network.Configure(nc.sandbox.Id, "", false, inf)

(2) Call created, err := interfaceGot(inf.Id, index, pciAddr, name, settings)

(3) Fill h := &HostNicInfo{} and g := &GuestNicInfo{} from created

(4) Set nc.eth[index] = created and nc.idMap[created.Id] = created

(5) Finally, call nc.sandbox.DCtx.AddNic(nc.sandbox, h, g, result)

 

The HostNicInfo struct is shown below:

type HostNicInfo struct {
  Id      string
  Fd      uint64
  Device  string
  Mac     string
  Bridge  string
  Gateway string
}

The GuestNicInfo struct is shown below:

type GuestNicInfo struct {
  Device  string
  Ipaddr  string
  Index   int
  Busaddr int
}

   

// runv/hypervisor/network/network_linux.go

3.5 func Configure(vmId, requestIP string, addrOnly bool, inf *api.InterfaceDescription) (*Settings, error)

(1) Call ip, mask, err := ipParser(inf.Ip) to get the configured IP, then call maskSize, _ := mask.Size() to get the mask length

(2) Set mac := inf.Mac; if mac is "", call mac, err := GenRandomMac() to generate one

(3) If addrOnly is true, return &Settings{...} with Device set to inf.TapName and File set to nil

(4) Otherwise, call device, tapFile, err := GetTapFd(inf.TapName, inf.Bridge, inf.Options). GetTapFd creates a tap device, attaches it to the bridge, and brings it up

Finally, return &Settings{...} with Device set to device and File set to tapFile

 

The Settings struct is shown below:

type Settings struct {
  Mac       string
  IPAddress    string
  IPPrefixLen  int
  Gateway    string
  Bridge     string
  Device     string
  File      *os.File
  Automatic   bool
}

  

// runv/hypervisor/network.go

3.6 func interfaceGot(id string, index int, pciAddr int, name string, inf *network.Settings) (*InterfaceCreated, error)

(1) Call ip, nw, err := net.ParseCIDR(fmt.Sprintf("%s/%d", inf.IPAddress, inf.IPPrefixLen))

(2) Create rt := []*RouteRule{}. If this is the first interface and inf.Automatic is true (it defaults to false), or if a gateway is configured and inf.Automatic is false, add the corresponding RouteRule:

rt = append(rt, &RouteRule{Destination: "0.0.0.0/0", Gateway: inf.Gateway, ViaThis: true,})

(3) Finally, return &InterfaceCreated{}, nil
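The default-route condition in step (2) is easy to misread, so here is a small sketch of it. Two points are assumptions made for illustration: "first interface" is taken to mean index == 0, and the netmask is rendered in dotted form. routesFor and the trimmed settings type are hypothetical.

```go
package main

import (
	"fmt"
	"net"
)

type routeRule struct {
	Destination string
	Gateway     string
	ViaThis     bool
}

// settings keeps only the fields used here from network.Settings.
type settings struct {
	IPAddress   string
	IPPrefixLen int
	Gateway     string
	Automatic   bool
}

// routesFor parses the address and decides whether a default route is added:
// either the first interface with Automatic set, or any interface with an
// explicit gateway when Automatic is not set.
func routesFor(index int, inf settings) (ip, mask string, rt []*routeRule, err error) {
	addr, nw, err := net.ParseCIDR(fmt.Sprintf("%s/%d", inf.IPAddress, inf.IPPrefixLen))
	if err != nil {
		return "", "", nil, err
	}
	if (index == 0 && inf.Automatic) || (inf.Gateway != "" && !inf.Automatic) {
		rt = append(rt, &routeRule{Destination: "0.0.0.0/0", Gateway: inf.Gateway, ViaThis: true})
	}
	return addr.String(), net.IP(nw.Mask).String(), rt, nil
}

func main() {
	ip, mask, rt, err := routesFor(0, settings{
		IPAddress: "192.168.123.2", IPPrefixLen: 24, Gateway: "192.168.123.1",
	})
	fmt.Println(ip, mask, len(rt), err)
}
```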

 

The InterfaceCreated struct is shown below:

type InterfaceCreated struct {
  Id         string
  Index      int
  PCIAddr    int
  Fd         *os.File
  Bridge     string
  HostDevice string
  DeviceName string
  MacAddr    string
  IpAddr     string
  NetMask    string
  RouteTable []*RouteRule
}

The RouteRule struct is shown below:

type RouteRule struct {
  Destination string
  Gateway     string
  ViaThis     bool
}

 

// When the VM driver is QEMU

// runv/hypervisor/qemu/qemu.go

3.7 func (qc *QemuContext) AddNic(ctx *hypervisor.VmContext, host *hypervisor.HostNicInfo, guest *hypervisor.GuestNicInfo, result chan<- hypervisor.VmEvent)

This function directly calls newNetworkAddSession(ctx, qc, host.Id, host.Fd, guest.Device, host.Mac, guest.Index, guest.Busaddr, result)

 

// runv/hypervisor/qemu/qmp_wrapper_amd64.go

3.8 func newNetworkAddSession(ctx *hypervisor.VmContext, qc *QemuContext, id string, fd uint64, device, mac string, index, addr int, result chan<- hypervisor.VmEvent)

(1) First create three QmpCommand values: "getfd", "netdev_add" and "device_add"

(2) Then assemble the three commands into a QmpSession and send it to QEMU
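For orientation, the three QMP commands could look roughly like the JSON built below. The command names come from the text. The argument sets are an assumption based on QEMU's QMP interface (fdname for getfd, a tap netdev bound to that fd, a virtio-net-pci device on pci.0), not a copy of what runv sends. The tap file descriptor itself travels alongside getfd as ancillary socket data, which this sketch does not model.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// qmpCommand is a minimal QMP request: a command name plus its arguments.
type qmpCommand struct {
	Execute   string                 `json:"execute"`
	Arguments map[string]interface{} `json:"arguments,omitempty"`
}

// nicCommands builds the getfd / netdev_add / device_add sequence.
// All argument names and values are illustrative assumptions.
func nicCommands(id, device, mac string, addr int) []qmpCommand {
	fdName := "fd" + device
	return []qmpCommand{
		{Execute: "getfd", Arguments: map[string]interface{}{"fdname": fdName}},
		{Execute: "netdev_add", Arguments: map[string]interface{}{
			"type": "tap", "id": device, "fd": fdName,
		}},
		{Execute: "device_add", Arguments: map[string]interface{}{
			"driver": "virtio-net-pci", "netdev": device, "mac": mac,
			"bus": "pci.0", "addr": fmt.Sprintf("0x%x", addr), "id": id,
		}},
	}
}

// encode renders one command as the JSON line sent over the QMP socket.
func encode(c qmpCommand) string {
	b, err := json.Marshal(c)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	for _, c := range nicCommands("nic0", "eth0", "52:54:00:00:00:01", 5) {
		fmt.Println(encode(c))
	}
}
```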

 

// runv/hypervisor/vm_states.go

3.9 func (ctx *VmContext) updateInterface(id string) error

(1) First call inf := ctx.networks.getInterface(id) to get the interface that was created

(2) If inf is not nil, call ctx.hyperstart.UpdateInterface(inf.DeviceName, inf.IpAddr, inf.NetMask)

 

// runv/hyperstart/libhyperstart/json.go

3.10 func (h *jsonBasedHyperstart) UpdateInterface(dev, ip, mask string) error

This function directly returns h.hyperstartCommand(hyperstartapi.INIT_SETUPINTERFACE, hyperstartapi.NetworkInf{Device: dev, IpAddress: ip, NetMask: mask})

hyperstartCommand() in turn calls hyperstartCommandWithRetMsg(), which finally builds a hyperstartCmd{} and sends the request to QEMU to carry out the command

 

(2) Adding routes

// runv/hypervisor/vm.go

3.11 func (vm *Vm) AddRoute()

(1) Call routes := vm.ctx.networks.getRoutes() to get the route information

(2) Then return vm.ctx.hyperstart.AddRoute(routes) to add the routes

 

3.12 func (h *jsonBasedHyperstart) AddRoute(r []hyperstartapi.Route) error

This function only returns h.hyperstartCommand(hyperstartapi.INIT_SETUPROUTE, hyperstartapi.Routes{Routes: r}); the rest works the same way as updating the NIC information

 

4. Network configuration in hyperstart

// hyperstart/src/init.c

4.1 static int hyper_setup_pod(struct hyper_pod *pod)

...

(1) Call hyper_setup_network(pod)

...

 

// hyperstart/src/net.c

4.2 int hyper_setup_network(struct hyper_pod *pod)

(1) First call hyper_rescan()

(2) Declare struct rtnl_handle rth and call netlink_open(&rth)

(3) Loop over pod->iface[], calling ret = hyper_setup_interface(&rth, iface, pod) to configure each NIC

(4) Call ret = hyper_up_nic(&rth, 1) to bring up lo

(5) Loop over pod->rt[], calling ret = hyper_setup_route(&rth, rt, pod) to create each route

 

Note: NICs and routes can be set up when the pod is created, or configured separately by sending a cmd directly to hyperstart

In hyperstart this is ultimately handled by hyper_cmd_setup_interface and hyper_cmd_setup_route (which in turn call hyper_setup_interface and hyper_setup_route directly)

 

The hyper_interface struct is shown below:

struct hyper_interface {
  char    *device;
  struct list_head  ipaddresses;
  char    *new_device_name;
  unsigned int  mtu;
};

The hyper_route struct is shown below:

struct hyper_route {
  char    *dst;
  char    *gw;
  char    *device;
};

  

// hyperstart/src/net.c

4.3 static int hyper_setup_interface(struct rtnl_handle *rth, struct hyper_interface *iface, struct hyper_pod *pod)

(1) Build the netlink request and call ifindex = hyper_get_ifindex(iface->device, pod) to get the NIC's index  ---> hyper_get_ifindex reads the index from /sys/class/net/$NIC/ifindex

(2) Iterate over iface->ipaddresses and set the NIC's IP addresses

(3) If iface->new_device_name is not empty and differs from iface->device, call hyper_set_interface_name() to rename the NIC

(4) If iface->mtu is greater than 0, call hyper_set_interface_mtu to set the NIC's MTU

(5) Call hyper_up_nic(rth, ifindex) to bring the NIC up
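hyperstart itself is written in C, but the sysfs lookup from step (1) is simple enough to illustrate in Go. The sketch takes the sysfs root as a parameter (on a real Linux system it would be /sys/class/net) so it can be exercised against a temporary directory. ifindex and demo are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// ifindex reads <root>/<nic>/ifindex and parses it as an integer.
// On Linux, root would be /sys/class/net.
func ifindex(root, nic string) (int, error) {
	data, err := os.ReadFile(filepath.Join(root, nic, "ifindex"))
	if err != nil {
		return -1, err
	}
	return strconv.Atoi(strings.TrimSpace(string(data)))
}

// demo builds a fake sysfs tree with one NIC and looks up its index.
func demo() (int, error) {
	root, err := os.MkdirTemp("", "sysnet")
	if err != nil {
		return -1, err
	}
	defer os.RemoveAll(root)
	if err := os.MkdirAll(filepath.Join(root, "eth0"), 0755); err != nil {
		return -1, err
	}
	if err := os.WriteFile(filepath.Join(root, "eth0", "ifindex"), []byte("2\n"), 0644); err != nil {
		return -1, err
	}
	return ifindex(root, "eth0")
}

func main() {
	fmt.Println(demo())
}
```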

 

// hyperstart/src/net.c

4.4 static int hyper_setup_route(struct rtnl_handle *rth, struct hyper_route *rt, struct hyper_pod *pod)

(1) Build the netlink request

(2) If rt->gw is not NULL, first call get_addr_ipv4(...) to parse the gateway, then set it with addattr_l(...)

(3) If rt->device is not NULL, first call hyper_get_ifindex(...) to get the NIC's index, then set the outgoing network device with addattr_l(...)

(4) If rt->dst is not "default", "any" or "all", it is not the default route. First call char *slash = strchr(rt->dst, '/')

Then call get_addr_ipv4(...) to get the IP address of dst and add it with addattr_l(...). Next, if slash is not NULL, call get_netmask(...) to get the netmask

Finally, call rtnl_talk(...) to install the route
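The destination handling in step (4) amounts to a small parser. The Go sketch below mirrors its shape only; it is not a port of the C code. Two assumptions are made for illustration: the part after the slash is a numeric prefix length, and a destination without a slash is treated as a /32 host route. parseDst is a hypothetical name.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// parseDst splits a route destination into address and prefix length.
// "default", "any" and "all" all denote the default route.
func parseDst(dst string) (ip string, prefix int, isDefault bool, err error) {
	switch dst {
	case "default", "any", "all":
		return "", 0, true, nil
	}
	host, bits := dst, 32 // assumption: no slash means a host route
	if i := strings.IndexByte(dst, '/'); i >= 0 {
		n, convErr := strconv.Atoi(dst[i+1:])
		if convErr != nil || n < 0 || n > 32 {
			return "", 0, false, fmt.Errorf("invalid prefix in %q", dst)
		}
		host, bits = dst[:i], n
	}
	parsed := net.ParseIP(host).To4()
	if parsed == nil {
		return "", 0, false, fmt.Errorf("invalid IPv4 address in %q", dst)
	}
	return parsed.String(), bits, false, nil
}

func main() {
	fmt.Println(parseDst("default"))
	fmt.Println(parseDst("10.0.0.0/8"))
	fmt.Println(parseDst("192.168.123.7"))
}
```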
