Quickly Building a Serverless AI Lab in the Cloud

Date: 2019-12-09



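Both Serverless Kubernetes and ACK virtual nodes now provide GPU container instances based on ECI, letting users quickly build a serverless AI lab in the cloud at low cost. Users do not need to maintain servers or the underlying GPU runtime environment, which greatly reduces the operational burden of an AI platform and significantly improves overall computing efficiency.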

How to use GPU container instances

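To create a GPU container instance, specify the required GPU type (P4, P100, V100, etc.) in the pod's annotations and the number of GPUs in resources.limits. Each pod has exclusive use of its GPUs; vGPU is not yet supported. GPU instances are billed at the same rate as the corresponding ECS GPU instance types, with no additional charge. ECI currently offers several GPU specifications (see https://help.aliyun.com/document_detail/114581.html).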
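As a minimal sketch of the two settings described above, the following Python helper builds a pod manifest as a plain dictionary. The function name `gpu_pod_manifest` and its parameters are hypothetical, chosen only for illustration; the annotation key and the `nvidia.com/gpu` limit are the ones used in this article's example.

```python
import json


def gpu_pod_manifest(name, image, command, gpu_type="P100", gpu_count=1):
    """Build a pod manifest (plain dict) that requests an ECI GPU instance.

    Hypothetical helper for illustration; not part of any Alibaba Cloud SDK.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            # The GPU type is selected through a pod annotation.
            "annotations": {"k8s.aliyun.com/eci-gpu-type": gpu_type},
        },
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "command": ["sh", "-c", command],
                    # The number of GPUs is set in resources.limits.
                    "resources": {"limits": {"nvidia.com/gpu": str(gpu_count)}},
                }
            ],
            "restartPolicy": "OnFailure",
        },
    }


if __name__ == "__main__":
    manifest = gpu_pod_manifest(
        name="tensorflow",
        image="registry-vpc.cn-shenzhen.aliyuncs.com/ack-serverless/tensorflow",
        command="python models/tutorials/image/imagenet/classify_image.py",
    )
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and passed to `kubectl apply -f`, which makes it easy to generate variants for other GPU types or counts.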

Example

1. Create a Serverless Kubernetes cluster

Select the Shenzhen region, availability zone D.


2. Create a GPU container instance

We use a TensorFlow model to classify a test image.

Create the pod from a template, selecting the P100 GPU specification. The script in the pod downloads the image file and runs the model to classify it.

apiVersion: v1
kind: Pod
metadata:
  name: tensorflow
  annotations:
    # GPU type for the ECI instance (P4, P100, V100, ...)
    k8s.aliyun.com/eci-gpu-type: "P100"
spec:
  containers:
  - image: registry-vpc.cn-shenzhen.aliyuncs.com/ack-serverless/tensorflow
    name: tensorflow
    # Run the ImageNet classification script
    command:
    - "sh"
    - "-c"
    - "python models/tutorials/image/imagenet/classify_image.py"
    resources:
      limits:
        # Number of GPUs, used exclusively by this pod
        nvidia.com/gpu: "1"
  restartPolicy: OnFailure

After deployment, the pod is first in the Pending state.

After a few tens of seconds the pod status changes to Running, and once the computation finishes it changes to Terminated.
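The status changes described above can also be followed from a script rather than the console. The sketch below polls the pod's phase with `kubectl get pod ... -o jsonpath={.status.phase}` until the pod finishes. It assumes `kubectl` is installed and configured for the cluster; the function names and the timeout and interval defaults are illustrative choices. Note that the Kubernetes API reports a finished pod as phase `Succeeded` or `Failed`, which is presumably what the console displays as Terminated.

```python
import subprocess
import time


def kubectl_phase(pod_name):
    """Return the pod's status.phase by calling kubectl (assumed to be configured)."""
    result = subprocess.run(
        ["kubectl", "get", "pod", pod_name, "-o", "jsonpath={.status.phase}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def wait_until_finished(pod_name, get_phase=kubectl_phase,
                        timeout=600, interval=5, sleep=time.sleep):
    """Poll until the pod is Succeeded or Failed.

    Returns the list of distinct phases observed, in order.
    Raises TimeoutError if the pod does not finish within `timeout` seconds.
    """
    seen = []
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        phase = get_phase(pod_name)
        # Record a phase only when it differs from the previous one.
        if not seen or seen[-1] != phase:
            seen.append(phase)
        if phase in ("Succeeded", "Failed"):
            return seen
        sleep(interval)
    raise TimeoutError(f"pod {pod_name} did not finish within {timeout} seconds")


if __name__ == "__main__":
    print(wait_until_finished("tensorflow"))
```

The `get_phase` and `sleep` parameters are injectable so the polling logic can be exercised without a cluster.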

From the pod's logs we can see that the pod detects the P100 GPU hardware and correctly identifies the image as a panda.
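The same check can be done programmatically. The following sketch fetches the pod's logs with `kubectl logs` and does a simple case-insensitive search for the GPU name and the expected label. It makes no assumption about the exact log format, only that both strings appear somewhere in the output; the function names are hypothetical.

```python
import subprocess


def fetch_logs(pod_name):
    """Return the pod's logs as text by calling kubectl (assumed to be configured)."""
    result = subprocess.run(
        ["kubectl", "logs", pod_name],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def check_log(log_text, gpu_name="P100", label="panda"):
    """Report whether the GPU name and the expected label occur in the log text."""
    lowered = log_text.lower()
    return {
        "gpu_seen": gpu_name.lower() in lowered,
        "label_seen": label.lower() in lowered,
    }


if __name__ == "__main__":
    print(check_log(fetch_logs("tensorflow")))
```

A substring match is deliberately crude; it only confirms that the GPU and the label are mentioned, not the classification score.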