Using the Media Services Java SDK to Call Azure Media Services Index V2 for Automatic Video Caption Recognition

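The new Index V2 in Azure Media Services can automatically recognize the speech in a video file and turn it into a WebVTT caption file. The result integrates easily with Azure Media Player, so a video that originally had no captions can automatically get captions in the matching language. Its speech recognition supports several languages, including: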

  • English [EnUs]
  • Spanish [EsEs]
  • Chinese [ZhCn]
  • French [FrFr]
  • German [DeDe]
  • Italian [ItIt]
  • Portuguese [PtBr]
  • Arabic (Egyptian) [ArEg]

 

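As the list above shows, Index V2 supports Chinese recognition. If your company already has a large number of talk or course videos that have no captions, the Index V2 feature of Media Services can help a great deal.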

Let's try calling this Media Services feature from Java code:

To reference the Media Services SDK, add the following dependencies to pom.xml:

<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure</artifactId>
    <version>1.0.0-beta2</version>
</dependency>
<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-media</artifactId>
    <version>0.9.4</version>
</dependency>

First, prepare the basic information needed to access Media Services, such as the account name and the login key:

// Media Services account credentials configuration
    private static String mediaServiceUri = "https://media.windows.net/API/";
    private static String oAuthUri = "https://wamsprodglobal001acs.accesscontrol.windows.net/v2/OAuth2-13";
    private static String clientId = "wingsample";
    private static String clientSecret = "p8BDkk+kLYZzpnvP0B5KFy98uLTv7ALGuSX7F9LmHtk=";
    private static String scope = "urn:WindowsAzureMediaServices";
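
Hard-coding the account key in source is convenient for a demo, but the same values could instead be read from the environment. The following is a minimal sketch of that idea, not part of the SDK: the class name `MediaCredentials` and the variable names `AMS_CLIENT_ID` and `AMS_CLIENT_SECRET` are hypothetical and chosen only for illustration.

```java
import java.util.Map;

/** Hypothetical holder for the account name and key used when creating the context. */
final class MediaCredentials {
    final String clientId;
    final String clientSecret;

    private MediaCredentials(String clientId, String clientSecret) {
        this.clientId = clientId;
        this.clientSecret = clientSecret;
    }

    /** Reads both values from the given environment map; the variable names are assumptions. */
    static MediaCredentials fromEnv(Map<String, String> env) {
        return new MediaCredentials(
                require(env, "AMS_CLIENT_ID"),
                require(env, "AMS_CLIENT_SECRET"));
    }

    /** Fails fast with a clear message when a variable is missing or blank. */
    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        // Throws IllegalStateException if the variables are not set in the real environment
        MediaCredentials credentials = fromEnv(System.getenv());
        System.out.println("Loaded credentials for account " + credentials.clientId);
    }
}
```

The resulting `clientId` and `clientSecret` values would then take the place of the static fields above.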

Next, create a context for accessing Media Services:

public static MediaContract getMediaService() {
    Configuration configuration = MediaConfiguration.configureWithOAuthAuthentication(
            mediaServiceUri, oAuthUri, clientId, clientSecret, scope);
    MediaContract mediaService = MediaService.create(configuration);

    return mediaService;
}

To invoke a processor that runs the indexing, we need a method that looks up the media processor:

public static MediaProcessorInfo getLatestProcessorByName(MediaContract mediaService, String processorName) {
    try {
        ListResult<MediaProcessorInfo> mediaProcessors = mediaService
                .list(MediaProcessor.list().set("$filter", String.format("Name eq '%s'", processorName)));

        // Use the latest version of the Media Processor:
        // scan every match and return only after the loop has finished
        MediaProcessorInfo mediaProcessor = null;
        for (MediaProcessorInfo info : mediaProcessors) {
            if (null == mediaProcessor || info.getVersion().compareTo(mediaProcessor.getVersion()) > 0) {
                mediaProcessor = info;
            }
        }
        return mediaProcessor;
    } catch (ServiceException e) {
        e.printStackTrace();
    }
    // No processor found, or the lookup failed
    return null;
}
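
One caveat about picking the "latest" processor: the `compareTo` call in the method above suggests the version is compared as a string, and a lexicographic comparison ranks "2.9" above "2.10". The sketch below shows a numeric, segment-by-segment comparison instead. It assumes versions are purely numeric and dot-separated, which this article does not confirm; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class VersionPicker {
    /** Compares dotted versions such as "2.10" and "2.9" segment by segment as integers. */
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            // A missing segment counts as 0, so "2" equals "2.0"
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Returns the highest version in the list, or null for an empty list. */
    static String latest(List<String> versions) {
        String best = null;
        for (String v : versions) {
            if (best == null || compareVersions(v, best) > 0) {
                best = v;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(latest(Arrays.asList("2.9", "2.10", "1.0"))); // prints 2.10
    }
}
```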

Of course, we also need to fetch the specific Asset to process, by its ID:

public static AssetInfo getAssetById(MediaContract mediaService, String assetId) throws ServiceException {
    // The argument is the Asset ID (for example "nb:cid:UUID:..."), not the display name
    AssetInfo resultAsset = mediaService.get(Asset.get(assetId));
    return resultAsset;
}

With the AssetInfo, the processor, and the Media Services context in hand, we can run Index V2:

public static String index2(MediaContract mediaService, AssetInfo assetInfo) {
    try {
        // logger is assumed to be a class-level logger field declared elsewhere
        logger.info("start index2: " + assetInfo.getName());

        // JSON preset: output formats and recognition language
        String config = "{"
                + "\"version\":\"1.0\","
                + "\"Features\":["
                + "{"
                + "\"Options\":{"
                + "\"Formats\":[\"WebVtt\",\"ttml\"],"
                + "\"Language\":\"enUs\","
                + "\"Type\":\"RecoOptions\""
                + "},"
                + "\"Type\":\"SpReco\""
                + "}]"
                + "}";

        // Task body: read the job's first input asset, write to a new output asset
        String taskXml = "<taskBody><inputAsset>JobInputAsset(0)</inputAsset>"
                + "<outputAsset assetCreationOptions=\"0\"" // AssetCreationOptions.None
                + " assetName=\"" + assetInfo.getName() + "index 2" + "\">JobOutputAsset(0)</outputAsset></taskBody>";

        System.out.println("config: " + config);
        MediaProcessorInfo indexerMP =
                getLatestProcessorByName(mediaService, "Azure Media Indexer 2 Preview");

        // Create a task with the Indexer Media Processor
        Task.CreateBatchOperation task =
                Task.create(indexerMP.getId(), taskXml)
                        .setConfiguration(config)
                        .setName(assetInfo.getName() + "_Indexing");

        Job.Creator jobCreator = Job.create()
                .setName(assetInfo.getName() + "_Indexing")
                .addInputMediaAsset(assetInfo.getId())
                .setPriority(2)
                .addTaskCreator(task);

        final JobInfo jobInfo;
        final String jobId;
        synchronized (mediaService) {
            jobInfo = mediaService.create(jobCreator);
            jobId = jobInfo.getId();
        }
        // checkJobStatus(jobId, assetInfo.getName());

        return jobId; // downloadAssetFilesFromJob(jobInfo);
    } catch (Exception e) {
        logger.error("Exception occurred while running indexing job: "
                + e.getMessage());
    }
    // An empty string signals that the job could not be submitted
    return "";
}
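
The method above returns as soon as the job is submitted; the `checkJobStatus` call is commented out and its body is not shown. A minimal sketch of what such a wait loop could look like follows. It is deliberately independent of the SDK: the job state arrives through a `Supplier<String>`, and the terminal state names ("Finished", "Error", "Canceled") are assumptions about how the job state would be reported. In real use the supplier would wrap whatever call fetches the job's current state by its ID.

```java
import java.util.function.Supplier;

final class JobPoller {
    /** True for states after which the job will not change any more (names are assumptions). */
    static boolean isTerminal(String state) {
        return "Finished".equals(state) || "Error".equals(state) || "Canceled".equals(state);
    }

    /**
     * Polls the supplier until it reports a terminal state or maxAttempts is reached.
     * Returns the last observed state, or null if no attempt was made.
     */
    static String waitForTerminalState(Supplier<String> stateSupplier, int maxAttempts, long sleepMillis) {
        String state = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            state = stateSupplier.get();
            if (isTerminal(state)) {
                return state;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting
                Thread.currentThread().interrupt();
                return state;
            }
        }
        return state;
    }
}
```

Bounding the loop with `maxAttempts` keeps a stuck job from blocking the caller forever; the caller can then decide whether a non-terminal result means "keep waiting" or "give up".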

With these methods written, we can call them directly from the main function:

public static void main(String[] args) {
    try {
        MediaContract mediaService = getMediaService();
        AssetInfo asset = getAssetById(mediaService, "nb:cid:UUID:13144339-d09b-4e6f-a86b-3113a64dbabe");

        String result = index2(mediaService, asset);
        System.out.println("Job:" + result);
    } catch (ServiceException e) {
        e.printStackTrace();
    }
}

Two things in the index2 function are especially important. The first is the JSON preset. To change the recognition language or the generated formats, only this JSON needs to change.

{
  "version": "1.0",
  "Features": [
    {
      "Options": {
        "Formats": ["WebVtt", "ttml"],
        "Language": "enUs",
        "Type": "RecoOptions"
      },
      "Type": "SpReco"
    }
  ]
}
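
Since only the language and the formats vary, the preset can be generated rather than hand-edited. The sketch below builds the same JSON from parameters and rejects language codes that are not in the list at the top of this article. The class `IndexerPreset` is hypothetical. The article lists the code as `EnUs` while the preset uses `enUs`, so the check here ignores case; whether the service itself is case-sensitive is not something this article establishes.

```java
import java.util.Arrays;
import java.util.List;

final class IndexerPreset {
    // Language codes as listed in this article
    private static final List<String> LANGUAGES =
            Arrays.asList("EnUs", "EsEs", "ZhCn", "FrFr", "DeDe", "ItIt", "PtBr", "ArEg");

    /** Builds the preset JSON for one language and one or more output formats. */
    static String build(String language, String... formats) {
        boolean known = false;
        for (String candidate : LANGUAGES) {
            if (candidate.equalsIgnoreCase(language)) {
                known = true;
            }
        }
        if (!known) {
            throw new IllegalArgumentException("Unsupported language code: " + language);
        }

        // Join the formats as a JSON string array body: "WebVtt","ttml"
        StringBuilder formatList = new StringBuilder();
        for (int i = 0; i < formats.length; i++) {
            if (i > 0) {
                formatList.append(',');
            }
            formatList.append('"').append(formats[i]).append('"');
        }

        return "{\"version\":\"1.0\",\"Features\":[{\"Options\":{"
                + "\"Formats\":[" + formatList + "],"
                + "\"Language\":\"" + language + "\","
                + "\"Type\":\"RecoOptions\"},"
                + "\"Type\":\"SpReco\"}]}";
    }

    public static void main(String[] args) {
        System.out.println(build("ZhCn", "WebVtt", "ttml"));
    }
}
```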

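The second is the task description XML, which states which Asset the task processes and which Asset receives the output. Essentially every encoding or recognition job in Media Services needs this task XML together with the matching preset.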

<?xml version="1.0" encoding="utf-16"?>
<taskBody>
  <inputAsset>JobInputAsset(0)</inputAsset>
  <outputAsset assetCreationOptions="0" assetName="ep48_mid.mp4index 2">JobOutputAsset(0)</outputAsset>
</taskBody>
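
Because the output asset name is concatenated into an XML attribute, a name containing `&`, `<`, `>` or `"` would produce a malformed task body. A minimal sketch of building the same XML with the attribute value escaped follows; the class `TaskBody` and its methods are hypothetical helpers, not SDK calls.

```java
final class TaskBody {
    /** Escapes the characters that are not allowed verbatim inside a double-quoted XML attribute. */
    static String escapeAttr(String value) {
        // '&' must be replaced first so the entities added afterwards are not escaped again
        return value.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Builds a task body with one input asset and one named output asset. */
    static String build(String outputAssetName) {
        return "<taskBody><inputAsset>JobInputAsset(0)</inputAsset>"
                + "<outputAsset assetCreationOptions=\"0\" assetName=\""
                + escapeAttr(outputAssetName)
                + "\">JobOutputAsset(0)</outputAsset></taskBody>";
    }

    public static void main(String[] args) {
        System.out.println(build("ep48_mid.mp4index 2"));
    }
}
```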

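The recognition and analytics services in Media Services are powerful; they also include motion detection, face detection, emotion detection, and more.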

https://azure.microsoft.com/zh-cn/documentation/articles/media-services-analytics-overview/
