Deploying .NET Machine Learning Models with ML.NET + ASP.NET Core + Docker + Azure Container Instances

Date: 2019-12-08

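In this article, we will build a machine learning classification model with ML.NET, expose it through an ASP.NET Core Web API, package it into a Docker container, and deploy it to the cloud via Azure Container Instances.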

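Prerequisites

This article assumes some familiarity with Docker. Building and deploying the sample application also requires the tools used in the steps below: the .NET Core SDK, Docker with a Docker Hub account, and the Azure CLI. Note that the application was built on an Ubuntu 16.04 PC, but all of the software is cross-platform and should work in any environment.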

Setting Up the Project

The first thing we want to do is create a folder for our solution.

mkdir mlnetacidemo

Then, we want to create a solution inside the newly created folder.

cd mlnetacidemo
dotnet new sln

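Building the Model

Inside our solution folder, we want to create a new console application, which is where we will build and test our machine learning model.

Setting Up the Model Project

First, we create the project. From the solution folder, enter: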

dotnet new console -o model

Now we add this new project to our solution.

dotnet sln mlnetacidemo.sln add model/model.csproj

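Adding Dependencies

Because we will use the ML.NET framework, we need to add it to our model project. Note that the code in this article uses the early LearningPipeline API from ML.NET's preview releases, which recent versions of the Microsoft.ML package no longer include, so you may need to pin an early preview version with the --version option of dotnet add package.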

cd model
dotnet add package Microsoft.ML
dotnet restore

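Before we start training the model, we need to download the data we will use for training. We do so by creating a directory named data and downloading the data file into it.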

mkdir data
curl -o data/iris.txt https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data

If we take a look at the data file, it should look something like this:

5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
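
Before training, it can be worth checking that the download produced the expected file. The following is a minimal sketch, not part of the original walkthrough: count_labels is a hypothetical helper name, and it assumes the file keeps the five-column comma-separated layout shown above, with the class label in the last column.

```shell
# Hypothetical helper (not from the original article): count how many rows
# belong to each class label, which is the 5th comma-separated field.
# Rows that do not have exactly 5 fields (such as trailing blank lines)
# are skipped.
count_labels() {
  awk -F',' 'NF == 5 { counts[$5]++ }
             END { for (label in counts) print label, counts[label] }' "$1" | sort
}

# Example usage from inside the model directory:
#   count_labels data/iris.txt
```

For the complete iris dataset, each of the three classes should be reported with 50 rows. A different result suggests a truncated or corrupted download.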

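Training the Model

Now that all of our dependencies are set up, it's time to build the model. I drew on the demo used on the ML.NET Getting Started site.

Defining the Data Structures

In the root directory of our model project, let's create two classes called IrisData and IrisPrediction, which will define our features and predicted attributes respectively. Both of them use Microsoft.ML.Runtime.Api to add the column attributes.

This is what IrisData looks like: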

using Microsoft.ML.Runtime.Api;

namespace model
{
    public class IrisData
    {
        [Column("0")]
        public float SepalLength;

        [Column("1")]
        public float SepalWidth;

        [Column("2")]
        public float PetalLength;
        
        [Column("3")]
        public float PetalWidth;

        [Column("4")]
        [ColumnName("Label")]
        public string Label;
    }       
}

Similarly, here is IrisPrediction:

using Microsoft.ML.Runtime.Api;

namespace model
{
    public class IrisPrediction
    {
        [ColumnName("PredictedLabel")]
        public string PredictedLabels;
    }
}

Building the LearningPipeline

using Microsoft.ML.Data;
using Microsoft.ML;
using Microsoft.ML.Runtime.Api;
using Microsoft.ML.Trainers;
using Microsoft.ML.Transforms;
using Microsoft.ML.Models;
using System;
using System.Threading.Tasks;

namespace model
{
    class Model
    {
        
        public static async Task<PredictionModel<IrisData,IrisPrediction>> Train(LearningPipeline pipeline, string dataPath, string modelPath)
        {
            // Load Data
            pipeline.Add(new TextLoader(dataPath).CreateFrom<IrisData>(separator:',')); 

            // Transform Data
            // Assign numeric values to text in the "Label" column, because 
            // only numbers can be processed during model training   
            pipeline.Add(new Dictionarizer("Label"));

            // Vectorize Features
            pipeline.Add(new ColumnConcatenator("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth"));

            // Add Learner
            pipeline.Add(new StochasticDualCoordinateAscentClassifier());

            // Convert Label back to text 
            pipeline.Add(new PredictedLabelColumnOriginalValueConverter() {PredictedLabelColumn = "PredictedLabel"});

            // Train Model
            var model = pipeline.Train<IrisData,IrisPrediction>();

            // Persist Model
            await model.WriteAsync(modelPath);

            return model;
        }
    }
}

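In addition to building the LearningPipeline and training our machine learning model, the Train method also serializes the model and saves it to a file named model.zip for future use.

Testing Our Model

Now it's time to test everything to make sure it works.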

using System;
using Microsoft.ML;

namespace model
{
    class Program
    {
        static void Main(string[] args)
        {

            string dataPath = "model/data/iris.txt";

            string modelPath = "model/model.zip";

            var model = Model.Train(new LearningPipeline(),dataPath,modelPath).Result;

            // Test data for prediction
            var prediction = model.Predict(new IrisData() 
            {
                SepalLength = 3.3f,
                SepalWidth = 1.6f,
                PetalLength = 0.2f,
                PetalWidth = 5.1f
            });

            Console.WriteLine($"Predicted flower type is: {prediction.PredictedLabels}");
        }
    }
}

Everything is set to run. We can do so by entering the following command from the solution directory:

dotnet run -p model/model.csproj

After running the application, the following output should appear on the console.

Automatically adding a MinMax normalization transform, use 'norm=Warn' or
'norm=No' to turn this behavior off.
Using 2 threads to train.
Automatically choosing a check frequency of 2.
Auto-tuning parameters: maxIterations = 9998.
Auto-tuning parameters: L2 = 2.667734E-05.
Auto-tuning parameters: L1Threshold (L1/L2) = 0.
Using best model from iteration 882.
Not training a calibrator because it is not needed.
Predicted flower type is: Iris-virginica

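In addition, you will notice that a file named model.zip was created in the root directory of our model project. This persisted model can now be used outside of our application to make predictions, which is what we will do next through an API.

Exposing the Model

Once a machine learning model is built, you want to deploy it so that it can start making predictions. One way to do that is through a REST API. At its core, all the API needs to do is accept data input from the client and respond with a prediction. To help us do that, we will use an ASP.NET Core API.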

Setting Up the API Project

The first thing we want to do is create the project. From the solution directory, enter:

dotnet new webapi -o api

Then we add this new project to our solution.

dotnet sln mlnetacidemo.sln add api/api.csproj

Adding Dependencies

Because we will load our model and make predictions through our API, we need to add the ML.NET package to our api project.

cd api
dotnet add package Microsoft.ML
dotnet restore

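Referencing the Model

In the previous step, when we built our machine learning model, it was saved to a file named model.zip. This is the file we will reference in our API to help us make predictions. To reference it in our API, simply copy it from the model project directory into our api project directory.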
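
The copy step can be sketched as a small shell helper. This is an illustrative sketch only: copy_model is a hypothetical name, and the paths in the usage comment assume the layout used in this article, with model.zip written to the model directory and commands run from the solution root.

```shell
# Hypothetical helper (not from the original article): copy the trained
# model file into a target directory, failing with a message if the
# model file does not exist yet.
copy_model() {
  src="$1"
  dest_dir="$2"
  if [ ! -f "$src" ]; then
    echo "model file not found: $src" >&2
    return 1
  fi
  cp "$src" "$dest_dir/"
}

# Example usage from the solution root:
#   copy_model model/model.zip api
```

Checking for the file first gives a clearer error if the console application has not been run yet, since model.zip only exists after training.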

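Creating the Data Models

Our model was built using the data structures IrisData and IrisPrediction to define the features as well as the predicted attributes. Therefore, when our model makes predictions through our API, it needs to reference these data types as well. As a result, we need to define the IrisData and IrisPrediction classes inside our api project. The contents of the classes are nearly identical to those in the model project, the only exception being that the namespace changes from model to api.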

using Microsoft.ML.Runtime.Api;

namespace api
{
    public class IrisData
    {
        [Column("0")]
        public float SepalLength;

        [Column("1")]
        public float SepalWidth;

        [Column("2")]
        public float PetalLength;
        
        [Column("3")]
        public float PetalWidth;

        [Column("4")]
        [ColumnName("Label")]
        public string Label;
    }    
}

using Microsoft.ML.Runtime.Api;

namespace api
{
    public class IrisPrediction
    {
        [ColumnName("PredictedLabel")]
        public string PredictedLabels;
    }
}

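Building the Controller

Now that our project is set up, it's time to add a controller to handle prediction requests from clients. In the Controllers directory of our api project, we can create a new class called PredictController with a single POST endpoint. The contents of the file should look like this: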

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.ML;

namespace api.Controllers
{
    [Route("api/[controller]")]
    public class PredictController : Controller
    {
        // POST api/predict
        [HttpPost]
        public string Post([FromBody] IrisData instance)
        {
            var model = PredictionModel.ReadAsync<IrisData,IrisPrediction>("model.zip").Result;
            var prediction = model.Predict(instance);
            return prediction.PredictedLabels;
        }
    }
}

Testing the API

With our predict controller finished, it's time to test it. From the root directory of our mlnetacidemo solution, enter the following command.

dotnet run -p api/api.csproj

In a client such as Postman or Insomnia, send an HTTP POST request to the endpoint http://localhost:5000/api/predict. The body of our request should look similar to the snippet below:

{
    "SepalLength": 3.3,
    "SepalWidth": 1.6,
    "PetalLength": 0.2,
    "PetalWidth": 5.1
}
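
For readers who prefer the command line to a GUI client, the same request can be sketched with curl. This is a minimal sketch under stated assumptions: iris_body, predict, and the BASE_URL variable are hypothetical names introduced here, and the endpoint path is the api/predict route defined by the controller.

```shell
# Hypothetical helpers (not from the original article).

# Build the JSON request body from four measurements, in the order
# SepalLength, SepalWidth, PetalLength, PetalWidth.
iris_body() {
  printf '{"SepalLength": %s, "SepalWidth": %s, "PetalLength": %s, "PetalWidth": %s}' \
    "$1" "$2" "$3" "$4"
}

# POST the body to the prediction endpoint. BASE_URL defaults to the
# locally running API and can be overridden for other hosts.
predict() {
  base_url="${BASE_URL:-http://localhost:5000}"
  curl -s -X POST -H "Content-Type: application/json" \
    -d "$(iris_body "$@")" "$base_url/api/predict"
}

# Example usage while the API is running locally:
#   predict 3.3 1.6 0.2 5.1
```

Keeping the base URL in a variable means the same call can later be pointed at the Docker container or the deployed container instance without changing the request itself.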

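If successful, the returned output should be Iris-virginica, the same as in our console application.

Packaging the Application

Great! Now that our application runs successfully locally, it's time to package it into a Docker container and push it to Docker Hub.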

Creating the Dockerfile

In our mlnetacidemo solution directory, create a Dockerfile with the following contents:

FROM microsoft/dotnet:2.0-sdk AS build
WORKDIR /app

# copy csproj and restore as distinct layers
COPY *.sln .
COPY api/*.csproj ./api/
RUN dotnet restore

# copy everything else and build app
COPY api/. ./api/
WORKDIR /app/api
RUN dotnet publish -c release -o out


FROM microsoft/aspnetcore:2.0 AS runtime
WORKDIR /app
COPY api/model.zip .
COPY --from=build /app/api/out ./
ENTRYPOINT ["dotnet", "api.dll"]

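Building the Image

To build the image, we need to enter the following command at the command prompt. This takes a while, because it needs to download the .NET Core SDK and ASP.NET Core runtime Docker images.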

docker build -t <DOCKERUSERNAME>/<IMAGENAME>:latest .

Testing the Image Locally

We need to test our image locally to make sure it can run in the cloud. To do so, we can use the docker run command.

docker run -d -p 5000:80 <DOCKERUSERNAME>/<IMAGENAME>:latest

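Although the API exposes port 80, we bind it to local port 5000 just to keep our earlier API request unchanged. When a POST request with the appropriate body is sent to http://localhost:5000/api/predict, the response should again be Iris-virginica. Because the -d flag runs the container in the background, Ctrl + C will not stop it; instead, use docker ps to find the container ID and then docker stop <CONTAINER_ID>.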

Pushing to Docker Hub

Now that the Docker image runs successfully locally, it's time to push it to Docker Hub. Again, we use the Docker CLI to do this.

docker login
docker push <DOCKERUSERNAME>/<IMAGENAME>:latest

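Deploying to the Cloud

Now comes the final step: deploying our machine learning model and API and exposing them to the world. Our deployment will go through Azure Container Instances, because it requires almost no server provisioning or management.

Preparing the Deployment Manifest

Although the deployment can be performed entirely on the command line, it is usually best to put all of the configuration in a file, both for documentation and to save time by not having to enter the parameters each time. With Azure, we can do that with a JSON file.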

{
  "$schema":
    "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "containerGroupName": {
      "type": "string",
      "defaultValue": "mlnetacicontainergroup",
      "metadata": {
        "description": "Container Group name."
      }
    }
  },
  "variables": {
    "containername": "mlnetacidemo",
    "containerimage": "<DOCKERUSERNAME>/<IMAGENAME>:latest"
  },
  "resources": [
    {
      "name": "[parameters('containerGroupName')]",
      "type": "Microsoft.ContainerInstance/containerGroups",
      "apiVersion": "2018-04-01",
      "location": "[resourceGroup().location]",
      "properties": {
        "containers": [
          {
            "name": "[variables('containername')]",
            "properties": {
              "image": "[variables('containerimage')]",
              "resources": {
                "requests": {
                  "cpu": 1,
                  "memoryInGb": 1.5
                }
              },
              "ports": [
                {
                  "port": 80
                }
              ]
            }
          }
        ],
        "osType": "Linux",
        "ipAddress": {
          "type": "Public",
          "ports": [
            {
              "protocol": "tcp",
              "port": "80"
            }
          ]
        }
      }
    }
  ],
  "outputs": {
    "containerIPv4Address": {
      "type": "string",
      "value":
        "[reference(resourceId('Microsoft.ContainerInstance/containerGroups/', parameters('containerGroupName'))).ipAddress.ip]"
    }
  }
}

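We can take this template and save it to a file named azuredeploy.json in the root directory of our mlnetacidemo solution. The only thing that needs to change is the containerimage value: replace it with your Docker Hub username and the name of the image you just pushed to Docker Hub.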
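
The placeholder substitution can also be scripted rather than edited by hand. The following is a rough sketch: set_image is a hypothetical helper, alice and irisapi in the usage comment are made-up example values, and it assumes the username and image name contain no characters that are special to sed (such as | or &).

```shell
# Hypothetical helper (not from the original article): print the template
# with the <DOCKERUSERNAME>/<IMAGENAME> placeholder replaced by real values.
# The result goes to standard output, so the original file is left untouched.
set_image() {
  template="$1"
  user="$2"
  image="$3"
  sed "s|<DOCKERUSERNAME>/<IMAGENAME>|$user/$image|" "$template"
}

# Example usage with made-up values, writing to a temporary file first
# and then replacing the template:
#   set_image azuredeploy.json alice irisapi > azuredeploy.tmp.json
#   mv azuredeploy.tmp.json azuredeploy.json
```

Writing to a separate file first avoids truncating the template, which would happen if the output were redirected straight back onto the file being read.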

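Deployment

To deploy our application, we need to make sure we are logged in to our Azure account. To do so via the Azure CLI, type the following at the command prompt: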

az login

Follow the prompts to log in. Once logged in, it's time to create a resource group for our container.

az group create --name mlnetacidemogroup --location eastus

Once the group has been created successfully, we can deploy our application.

az group deployment create --resource-group mlnetacidemogroup --template-file azuredeploy.json
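
A freshly deployed container may not answer requests right away. Rather than re-sending a request by hand, a small retry helper can be used. This is only a sketch: retry is a hypothetical function, and the values in the usage comment (30 attempts, 10 seconds apart) are arbitrary examples, with <ContainerIPv4Address> standing for the address reported by the deployment.

```shell
# Hypothetical helper (not from the original article): run a command
# repeatedly until it succeeds or the attempt limit is reached.
#   $1 = maximum number of attempts
#   $2 = seconds to wait between attempts
#   remaining arguments = the command to run
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    n=$((n + 1))
    # Only wait if another attempt is still to come.
    [ "$n" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Example usage (placeholder address; -f makes curl fail on HTTP errors):
#   retry 30 10 curl -sf -X POST -H "Content-Type: application/json" \
#     -d '{"SepalLength": 3.3, "SepalWidth": 1.6, "PetalLength": 0.2, "PetalWidth": 5.1}' \
#     "http://<ContainerIPv4Address>/api/predict"
```

The helper returns success as soon as the command succeeds and a non-zero status if every attempt fails, so it can also be used in scripts that should stop when the service never comes up.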

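Give the deployment a few minutes to initialize. If the deployment succeeded, you should see some output on the command line. Look for the containerIPv4Address output, which is the IP address at which the container can be reached. Then make another POST request, this time to http://<ContainerIPv4Address>/api/predict, where ContainerIPv4Address is the value found in the command-line output after the deployment. If successful, the response should be Iris-virginica, just as with the earlier requests.

Once finished, you can clean up the resources with the following command:

az group delete --name mlnetacidemogroup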

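Summary

In this article, we built a classification machine learning model with ML.NET that predicts the class of an iris flower given four measurements as features, exposed it through an ASP.NET Core REST API, packaged it into a container, and deployed it to the cloud using Azure Container Instances. Although these steps can become more complex as the model changes, the process is standardized enough that extending this example requires only minimal modifications.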