Kilo - nova rbd virtual machine snapshots

Environment

Version: RDO OpenStack Kilo

Ceph version 0.94.7



Background

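First, a look at the traditional way OpenStack nova snapshots a virtual machine (the implementation is the same no matter which storage backend nova uses).

virt/libvirt/driver.py -> def snapshot (every snapshot request ends up here)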

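1. Get the format of the VM's disk file.

    By default it matches the format of the image uploaded to glance; it can also be forced with CONF.libvirt.snapshot_image_format.

    If the VM backend type is lvm or rbd, the image uploaded to glance is always raw.

2. Create the snapshot metadata, because to nova a snapshot is an image.

3. Generate the snapshot name with uuid.uuid4().hex.

4. Get the VM's power state (a live snapshot cannot be taken in the shutdown state; this feeds into the live-snapshot check in the next step).

5. Determine whether a live snapshot is supported.

    This requires qemu 1.3 or later and libvirt 1.0.0 or later, and both CONF.ephemeral_storage_encryption.enabled and CONF.workarounds.disable_libvirt_livesnapshot must be False.

6. Detach the VM's PCI devices and SR-IOV ports (preconditions: the virtualization type is not lxc, and the VM is running or paused).

7. Change the VM's task state to IMAGE_PENDING_UPLOAD.

8. Create the temporary directory used for the snapshot.

9. Run the snapshot_extract operation (in practice a qemu-img convert).

10. Reattach the VM's PCI devices and SR-IOV ports.

11. Upload the result to glance.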
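
Steps 1, 4 and 5 boil down to two small decisions, which can be sketched in plain Python. This is a minimal illustration built only from the conditions listed above; the function names and parameters are hypothetical and are not nova's actual code.

```python
# Hypothetical helpers that mirror the conditions described in the steps above.

MIN_QEMU_VERSION = (1, 3, 0)      # qemu 1.3 or later
MIN_LIBVIRT_VERSION = (1, 0, 0)   # libvirt 1.0.0 or later


def snapshot_image_format(source_format, forced_format=None):
    """Pick the format of the image that will be uploaded to glance.

    forced_format stands in for CONF.libvirt.snapshot_image_format.
    """
    image_format = forced_format or source_format
    # lvm and rbd backends are always uploaded as raw.
    if image_format in ("lvm", "rbd"):
        image_format = "raw"
    return image_format


def can_live_snapshot(qemu_version, libvirt_version, is_shutdown,
                      encryption_enabled, livesnapshot_disabled):
    """Return True only when every live-snapshot condition holds."""
    if is_shutdown:
        return False
    if qemu_version < MIN_QEMU_VERSION:
        return False
    if libvirt_version < MIN_LIBVIRT_VERSION:
        return False
    return not encryption_enabled and not livesnapshot_disabled
```

Versions are compared as tuples, so (1, 2, 9) sorts below (1, 3, 0) without any string parsing.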


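Doesn't this look odd? If nova and glance are both backed by ceph, taking a VM snapshot pulls the disk out of ceph onto the local host and then uploads it back into ceph. Could the whole thing be done directly inside the ceph cluster instead?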



Implementation

The master branch already implements this feature; for details see: https://blueprints.launchpad.net/nova/+spec/rbd-instance-snapshots


The rough steps are as follows:

# Add the following setting to the glance conf file
[DEFAULT]
# Expose the image's backend URL
show_image_direct_url = True
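
With this option enabled, glance exposes image locations, and for a ceph backend they take the form rbd://<fsid>/<pool>/<image>/<snapshot>, the same shape that the nova code in this post builds and parses. The following is a minimal sketch of splitting such a URL into its four parts. The function name is hypothetical; it approximates what the parse_url helper called in the code does rather than reproducing it.

```python
from urllib.parse import unquote


def parse_rbd_url(url):
    """Split 'rbd://fsid/pool/image/snap' into a 4-tuple.

    Raises ValueError for anything that is not a complete RBD location.
    """
    prefix = "rbd://"
    if not url.startswith(prefix):
        raise ValueError("not an rbd URL: %s" % url)
    pieces = [unquote(piece) for piece in url[len(prefix):].split("/")]
    if len(pieces) != 4 or "" in pieces:
        raise ValueError("expected rbd://fsid/pool/image/snap, got: %s" % url)
    fsid, pool, image, snap = pieces
    return fsid, pool, image, snap
```

Comparing the returned fsid with the local cluster's fsid is how the code decides whether a glance location lives in the same ceph cluster.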

# Part of the code logic
virt/libvirt/driver.py
        try:
            update_task_state(task_state=task_states.IMAGE_UPLOADING,
                              expected_state=task_states.IMAGE_PENDING_UPLOAD)
            # Try the rbd direct-snapshot path first; if it does not work,
            # fall back to the original snapshot logic in the except branch.
            metadata['location'] = snapshot_backend.direct_snapshot(
                context, snapshot_name, image_format, image_id,
                instance.image_ref)
            self._snapshot_domain(context, live_snapshot, virt_dom, state,
                                  instance)
            self._image_api.update(context, image_id, metadata,
                                   purge_props=False)
        except (NotImplementedError, exception.ImageUnacceptable,
                exception.Forbidden) as e:
            if type(e) != NotImplementedError:
                LOG.warning(_LW('Performing standard snapshot because direct '
                                'snapshot failed: %(error)s'), {'error': e})
            failed_snap = metadata.pop('location', None)
            if failed_snap:
                failed_snap = {'url': str(failed_snap)}
            snapshot_backend.cleanup_direct_snapshot(failed_snap,
                                                     also_destroy_volume=True,
                                                     ignore_errors=True)
            update_task_state(task_state=task_states.IMAGE_PENDING_UPLOAD,
                              expected_state=task_states.IMAGE_UPLOADING)
            snapshot_directory = CONF.libvirt.snapshots_directory
            fileutils.ensure_tree(snapshot_directory)
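
The control flow in the excerpt above is a "try the fast path, fall back on failure" pattern. The self-contained sketch below shows only that pattern. Every class and function in it is a hypothetical stand-in (including the ImageUnacceptable exception, which here is a local class and not nova.exception.ImageUnacceptable), and the returned URL is a made-up example.

```python
class ImageUnacceptable(Exception):
    """Local stand-in for the nova exception of the same name."""


class FileBackend:
    """A backend with no direct snapshot support."""

    def direct_snapshot(self, name):
        raise NotImplementedError()


class RbdBackend:
    """A backend that can snapshot inside the cluster if it may write."""

    def __init__(self, can_write_glance_pool=True):
        self.can_write_glance_pool = can_write_glance_pool

    def direct_snapshot(self, name):
        if not self.can_write_glance_pool:
            raise ImageUnacceptable("no write access to the glance pool")
        return "rbd://example-fsid/images/%s/snap" % name


def take_snapshot(backend, name, warnings):
    """Return ('direct', url) or ('standard', None)."""
    try:
        return "direct", backend.direct_snapshot(name)
    except (NotImplementedError, ImageUnacceptable) as e:
        # An unsupported backend is expected; only real failures are logged.
        if not isinstance(e, NotImplementedError):
            warnings.append("Performing standard snapshot because direct "
                            "snapshot failed: %s" % e)
        return "standard", None
```

A backend that simply does not support direct snapshots falls back silently, while an rbd backend that fails (for example on missing pool permissions) falls back with a warning.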
            
# Use ceph's snapshot mechanism to implement the VM snapshot
virt/libvirt/imagebackend.py
    def _get_parent_pool(self, context, base_image_id, fsid):
        parent_pool = None
        try:
            # The easy way -- the image is an RBD clone, so use the parent
            # images' storage pool
            parent_pool, _im, _snap = self.driver.parent_info(self.rbd_name)
        except exception.ImageUnacceptable:
            # The hard way -- the image is itself a parent, so ask Glance
            # where it came from
            LOG.debug('No parent info for %s; asking the Image API where its '
                      'store is', base_image_id)
            try:
                image_meta = IMAGE_API.get(context, base_image_id,
                                           include_locations=True)
            except Exception as e:
                LOG.debug('Unable to get image %(image_id)s; error: %(error)s',
                          {'image_id': base_image_id, 'error': e})
                image_meta = {}

            # Find the first location that is in the same RBD cluster
            for location in image_meta.get('locations', []):
                try:
                    parent_fsid, parent_pool, _im, _snap = \
                        self.driver.parse_url(location['url'])
                    if parent_fsid == fsid:
                        break
                    else:
                        parent_pool = None
                except exception.ImageUnacceptable:
                    continue

        if not parent_pool:
            raise exception.ImageUnacceptable(
                    _('Cannot determine the parent storage pool for %s; '
                      'cannot determine where to store images') %
                    base_image_id)

        return parent_pool

    def direct_snapshot(self, context, snapshot_name, image_format,
                        image_id, base_image_id):
        """Creates an RBD snapshot directly.
        """
        fsid = self.driver.get_fsid()
        # NOTE(nic): Nova has zero comprehension of how Glance's image store
        # is configured, but we can infer what storage pool Glance is using
        # by looking at the parent image.  If using authx, write access should
        # be enabled on that pool for the Nova user
        parent_pool = self._get_parent_pool(context, base_image_id, fsid)

        # Snapshot the disk and clone it into Glance's storage pool.  librbd
        # requires that snapshots be set to "protected" in order to clone them
        self.driver.create_snap(self.rbd_name, snapshot_name, protect=True)
        location = {'url': 'rbd://%(fsid)s/%(pool)s/%(image)s/%(snap)s' %
                           dict(fsid=fsid,
                                pool=self.pool,
                                image=self.rbd_name,
                                snap=snapshot_name)}
        try:
            self.driver.clone(location, image_id, dest_pool=parent_pool)
            # Flatten the image, which detaches it from the source snapshot
            # (after flattening, the clone is a full, independent copy)
            self.driver.flatten(image_id, pool=parent_pool)
        finally:
            # all done with the source snapshot, clean it up
            self.cleanup_direct_snapshot(location)

        # Glance makes a protected snapshot called 'snap' on uploaded
        # images and hands it out, so we'll do that too.  The name of
        # the snapshot doesn't really matter, this just uses what the
        # glance-store rbd backend sets (which is not configurable).
        self.driver.create_snap(image_id, 'snap', pool=parent_pool,
                                protect=True)
        return ('rbd://%(fsid)s/%(pool)s/%(image)s/snap' %
                dict(fsid=fsid, pool=parent_pool, image=image_id))
                
    def cleanup_direct_snapshot(self, location, also_destroy_volume=False,
                                ignore_errors=False):
        """Unprotects and destroys the name snapshot.

        With also_destroy_volume=True, it will also cleanup/destroy the parent
        volume.  This is useful for cleaning up when the target volume fails
        to snapshot properly.
        """
        if location:
            _fsid, _pool, _im, _snap = self.driver.parse_url(location['url'])
            self.driver.remove_snap(_im, _snap, pool=_pool, force=True,
                                    ignore_errors=ignore_errors)
            if also_destroy_volume:
                self.driver.destroy_volume(_im, pool=_pool)


Possible errors

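1. The cinder user that nova uses for its ceph pool has no write permission on the glance ceph pool (this is the case when the setup follows the official ceph documentation).


Fix: grant the cinder user write permission on the images pool; this can be done online.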

ceph auth caps client.cinder mon 'allow r' \
          osd 'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=vms, allow rwx pool=images'
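
If the list of pools varies between deployments, the capability string can be generated instead of typed by hand. The following is a small sketch with hypothetical helper names; it only builds the argument list for the command above and leaves running it to the operator.

```python
def build_osd_caps(pools):
    """Build the osd capability string for a list of rwx pools."""
    parts = ["allow class-read object_prefix rbd_children"]
    parts += ["allow rwx pool=%s" % pool for pool in pools]
    return ", ".join(parts)


def auth_caps_argv(client, pools):
    """Return the argv for 'ceph auth caps', suitable for subprocess.run."""
    return ["ceph", "auth", "caps", client,
            "mon", "allow r",
            "osd", build_osd_caps(pools)]
```

Passing the result as a list to subprocess.run avoids shell quoting issues with the commas and spaces inside the capability string.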