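When we use cloud platform services built on OpenStack, the overview page usually has a prominent spot showing the cluster's current resource usage, such as the total, used, and remaining amounts of CPU, memory, and disk. Whenever we scale out the cluster, the resource totals on the overview page also grow automatically. We all know that the Nova service in OpenStack manages these compute resources, but have you ever wondered how the Nova service obtains them?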
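
How Nova Counts Resources

We know that resource counting is a mechanism internal to the Nova service. Given how important its result is to later operations (such as creating virtual machines or disks), we infer that this mechanism must run ahead of the other services.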
From this simple analysis, plus some necessary debugging, we find that the mechanism is triggered in the nova.service.WSGIService.start method:

    def start(self):
        """Start serving this service using loaded configuration.

        Also, retrieve updated port number in case '0' was passed in, which
        indicates a random port should be used.

        :returns: None

        """
        if self.manager:
            self.manager.init_host()
            self.manager.pre_start_hook()
            if self.backdoor_port is not None:
                self.manager.backdoor_port = self.backdoor_port
        self.server.start()
        if self.manager:
            self.manager.post_start_hook()

Here, self.manager.pre_start_hook() is the call that fetches the resource information. It resolves directly to nova.compute.manager.pre_start_hook, shown below:

    def pre_start_hook(self):
        """After the service is initialized, but before we fully bring
        the service up by listening on RPC queues, make sure to update
        our available resources (and indirectly our available nodes).
        """
        self.update_available_resource(nova.context.get_admin_context())

    ...

    @periodic_task.periodic_task
    def update_available_resource(self, context):
        """See driver.get_available_resource()

        Periodic process that keeps that the compute host's understanding of
        resource availability and usage in sync with the underlying hypervisor.

        :param context: security context
        """
        new_resource_tracker_dict = {}
        nodenames = set(self.driver.get_available_nodes())
        for nodename in nodenames:
            rt = self._get_resource_tracker(nodename)
            rt.update_available_resource(context)
            new_resource_tracker_dict[nodename] = rt

        # Delete orphan compute node not reported by driver but still in db
        compute_nodes_in_db = self._get_compute_nodes_in_db(context,
                                                            use_slave=True)

        for cn in compute_nodes_in_db:
            if cn.hypervisor_hostname not in nodenames:
                LOG.audit(_("Deleting orphan compute node %s") % cn.id)
                cn.destroy()

        self._resource_tracker_dict = new_resource_tracker_dict

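The orphan cleanup at the end of update_available_resource comes down to a set-membership check: any database record whose hypervisor_hostname the driver no longer reports gets destroyed. Below is a minimal sketch of just that check, outside Nova. FakeComputeNode, find_orphans, and the host names are hypothetical and exist only for illustration.

```python
class FakeComputeNode:
    """Hypothetical stand-in for a compute node database record."""

    def __init__(self, node_id, hypervisor_hostname):
        self.id = node_id
        self.hypervisor_hostname = hypervisor_hostname


def find_orphans(driver_nodenames, nodes_in_db):
    """Return the DB records whose hostname the driver no longer reports."""
    reported = set(driver_nodenames)
    return [cn for cn in nodes_in_db if cn.hypervisor_hostname not in reported]


# Made-up data: the driver reports two nodes, the database still holds three.
db_nodes = [
    FakeComputeNode(1, "compute-1"),
    FakeComputeNode(2, "compute-2"),
    FakeComputeNode(3, "compute-3"),
]
orphans = find_orphans(["compute-1", "compute-3"], db_nodes)
orphan_ids = [cn.id for cn in orphans]
print(orphan_ids)
```

In the manager code the same check runs inline, and each orphan is removed with cn.destroy() rather than collected into a list.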
In the manager code, the rt.update_available_resource() call resolves to nova.compute.resource_tracker.update_available_resource(), shown below:

    def update_available_resource(self, context):
        """Override in-memory calculations of compute node resource usage based
        on data audited from the hypervisor layer.

        Add in resource claims in progress to account for operations that have
        declared a need for resources, but not necessarily retrieved them from
        the hypervisor layer yet.
        """
        LOG.audit(_("Auditing locally available compute resources"))
        resources = self.driver.get_available_resource(self.nodename)

        if not resources:
            # The virt driver does not support this function
            LOG.audit(_("Virt driver does not support "
                        "'get_available_resource' Compute tracking is disabled."))
            self.compute_node = None
            return
        resources['host_ip'] = CONF.my_ip

        # TODO(berrange): remove this once all virt drivers are updated
        # to report topology
        if "numa_topology" not in resources:
            resources["numa_topology"] = None

        self._verify_resources(resources)

        self._report_hypervisor_resource_view(resources)

        return self._update_available_resource(context, resources)

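In the resource tracker code, self._update_available_resource syncs the database records with the actual resource usage on the compute node; we will not go into it here. self.driver.get_available_resource() is what fetches the node's hardware resource information, and its actual implementation is: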

    class LibvirtDriver(driver.ComputeDriver):

        def get_available_resource(self, nodename):
            """Retrieve resource information.

            This method is called when nova-compute launches, and
            as part of a periodic task that records the results in the DB.

            :param nodename: will be put in PCI device
            :returns: dictionary containing resource info
            """
            # Temporary: convert supported_instances into a string, while keeping
            # the RPC version as JSON. Can be changed when RPC broadcast is removed
            stats = self.get_host_stats(refresh=True)
            stats['supported_instances'] = jsonutils.dumps(
                stats['supported_instances'])
            return stats

        def get_host_stats(self, refresh=False):
            """Return the current state of the host.

            If 'refresh' is True, run update the stats first.
            """
            return self.host_state.get_host_stats(refresh=refresh)

        def _get_vcpu_total(self):
            """Get available vcpu number of physical computer.

            :returns: the number of cpu core instances can be used.
            """
            if self._vcpu_total != 0:
                return self._vcpu_total

            try:
                total_pcpus = self._conn.getInfo()[2]
            except libvirt.libvirtError:
                LOG.warn(_LW("Cannot get the number of cpu, because this "
                             "function is not implemented for this platform. "))
                return 0

            if CONF.vcpu_pin_set is None:
                self._vcpu_total = total_pcpus
                return self._vcpu_total

            available_ids = hardware.get_vcpu_pin_set()
            if sorted(available_ids)[-1] >= total_pcpus:
                raise exception.Invalid(_("Invalid vcpu_pin_set config, "
                                          "out of hypervisor cpu range."))
            self._vcpu_total = len(available_ids)
            return self._vcpu_total

    .....

    class HostState(object):
        """Manages information about the compute node through libvirt."""

        def __init__(self, driver):
            super(HostState, self).__init__()
            self._stats = {}
            self.driver = driver
            self.update_status()

        def get_host_stats(self, refresh=False):
            """Return the current state of the host.

            If 'refresh' is True, run update the stats first.
            """
            if refresh or not self._stats:
                self.update_status()
            return self._stats

        def update_status(self):
            """Retrieve status info from libvirt."""
            ...
            data["vcpus"] = self.driver._get_vcpu_total()
            data["memory_mb"] = self.driver._get_memory_mb_total()
            data["local_gb"] = disk_info_dict['total']
            data["vcpus_used"] = self.driver._get_vcpu_used()
            data["memory_mb_used"] = self.driver._get_memory_mb_used()
            data["local_gb_used"] = disk_info_dict['used']
            data["hypervisor_type"] = self.driver._get_hypervisor_type()
            data["hypervisor_version"] = self.driver._get_hypervisor_version()
            data["hypervisor_hostname"] = self.driver._get_hypervisor_hostname()
            data["cpu_info"] = self.driver._get_cpu_info()
            data['disk_available_least'] = _get_disk_available_least()
            ...

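Note the docstring of get_available_resource: it matches our initial inference exactly. From here on we take vcpus alone as the example and keep tracing the resource-counting flow. self.driver._get_vcpu_total resolves to LibvirtDriver._get_vcpu_total (shown in the LibvirtDriver code). If the vcpu_pin_set option is not set, _vcpu_total is simply self._conn.getInfo()[2]. Here self._conn can be thought of as a libvirt adapter representing an abstract connection to underlying virtualization tools such as kvm and qemu, and getInfo() is a thin wrapper around libvirtmod.virNodeGetInfo; it returns a list whose third element is the number of vcpus. We can stop here, because anything further down is libvirt's C code and outside Python territory.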
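To make the indexing concrete, here is a minimal sketch that uses a stub in place of a real libvirt connection, so it runs without libvirt installed. FakeConnection, vcpu_total_without_pinning, and every value in the returned list are hypothetical; only the field order follows libvirt's node info structure.

```python
class FakeConnection:
    """Hypothetical stand-in for a libvirt connection; the values are made up."""

    def getInfo(self):
        # Field order of libvirt's node info:
        # [model, memory, cpus, mhz, nodes, sockets, cores, threads]
        return ["x86_64", 16384, 8, 2400, 1, 1, 4, 2]


def vcpu_total_without_pinning(conn):
    """Read the CPU count from the third element (index 2) of getInfo()."""
    return conn.getInfo()[2]


total = vcpu_total_without_pinning(FakeConnection())
print(total)
```

With a real hypervisor, the connection would come from the libvirt Python bindings instead of the stub.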
On the other hand, if the vcpu_pin_set option is set, hardware.get_vcpu_pin_set parses it into a set of usable CPU indices, and taking the length of that set gives us the final vcpus count.
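The pinned path can be sketched in the same spirit. The parser below is a simplified illustration, not Nova's hardware.get_vcpu_pin_set: it assumes a comma-separated spec in which "a-b" is an inclusive range and "^n" excludes an index, and it does no validation of malformed input. parse_pin_set and vcpu_total_with_pinning are hypothetical names.

```python
def parse_pin_set(spec):
    """Parse a spec such as "4-12,^8,15" into a set of CPU indices.

    Simplified sketch: malformed input is not validated.
    """
    included, excluded = set(), set()
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        target = included
        if item.startswith("^"):
            # A leading caret marks an index (or range) to exclude.
            target = excluded
            item = item[1:]
        if "-" in item:
            start, end = (int(part) for part in item.split("-", 1))
            target.update(range(start, end + 1))
        else:
            target.add(int(item))
    return included - excluded


def vcpu_total_with_pinning(spec, total_pcpus):
    """Count pinned CPUs, rejecting indices beyond the host's CPU range."""
    available_ids = parse_pin_set(spec)
    if not available_ids:
        raise ValueError("pin set selects no CPUs")
    if max(available_ids) >= total_pcpus:
        raise ValueError("pin set is out of the hypervisor CPU range")
    return len(available_ids)


pinned_total = vcpu_total_with_pinning("4-12,^8,15", total_pcpus=16)
print(pinned_total)
```

The range check mirrors the comparison in _get_vcpu_total: the largest pinned index has to be smaller than the physical CPU count reported by the hypervisor.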
That is the whole logic by which Nova counts a node's hardware resources (using vcpus as the example).