iOS Development Study Notes (2): Speech Recognition

    Last time I walked through speech synthesis with the iFlytek (科大訊飛) SDK; this time I'll write up speech recognition as well. The preparation needed before writing any code (applying for an appid, importing the framework) was covered in the previous post on speech synthesis, so have a look there if you haven't done it yet. Here we'll jump straight into the code.

Detailed steps:

1. After importing the framework, add the required headers to the project. The view contains just one UITextField to display the recognized text and two UIButtons (one to start listening, one to stop listening). Then import the classes and set the delegate, exactly as with speech synthesis.

MainViewController.h

#import <UIKit/UIKit.h>
#import "iflyMSC/IFlySpeechRecognizerDelegate.h"
// Forward-declare the speech recognition classes
@class IFlyDataUploader;
@class IFlySpeechUnderstander;
// Note: the controller must adopt the speech recognition delegate protocol
@interface MainViewController : UIViewController <IFlySpeechRecognizerDelegate>
@property (nonatomic, strong) IFlySpeechUnderstander *iFlySpeechUnderstander;
@property (strong, nonatomic) IBOutlet UITextField *content;  // shows the recognized text
@property (nonatomic, strong) NSString *result;
@property (nonatomic, strong) NSString *str_result;
@property (nonatomic)         BOOL      isCanceled;

- (IBAction)understand:(id)sender;  // start listening
- (IBAction)finish:(id)sender;      // stop listening

@end

MainViewController.m

#import "MainViewController.h"
#import <QuartzCore/QuartzCore.h>
#import <AVFoundation/AVAudioSession.h>
#import <AudioToolbox/AudioSession.h>

#import "iflyMSC/IFlyContact.h"
#import "iflyMSC/IFlyDataUploader.h"
#import "iflyMSC/IFlyUserWords.h"
#import "iflyMSC/IFlySpeechUtility.h"
#import "iflyMSC/IFlySpeechUnderstander.h"

@interface MainViewController ()

@end

@implementation MainViewController

- (void)viewDidLoad
{
    [super viewDidLoad];
    // Build the engine configuration string (appid and timeout)
    NSString *initString = [[NSString alloc] initWithFormat:@"appid=%@,timeout=%@", @"53b5560a", @"20000"];

    // createUtility must be executed before any service is started
    [IFlySpeechUtility createUtility:initString];
    // Create the recognizer object and register ourselves as its delegate
    _iFlySpeechUnderstander = [IFlySpeechUnderstander sharedInstance];
    _iFlySpeechUnderstander.delegate = self;
}
 30 
- (void)viewWillDisappear:(BOOL)animated
{
    [_iFlySpeechUnderstander cancel];
    _iFlySpeechUnderstander.delegate = nil;
    // Revert to plain (non-semantic) recognition and tear the understander down
    [_iFlySpeechUnderstander destroy];
    [super viewWillDisappear:animated];
}

- (void)didReceiveMemoryWarning
{
    [super didReceiveMemoryWarning];
}
 44 
- (IBAction)understand:(id)sender {
    BOOL ret = [_iFlySpeechUnderstander startListening];  // start listening
    if (ret) {
        self.isCanceled = NO;
    }
    else {
        NSLog(@"Failed to start recognition!");
    }
}

- (IBAction)finish:(id)sender {
    [_iFlySpeechUnderstander stopListening];  // stop listening and start recognizing
}
 58 
#pragma mark - IFlySpeechRecognizerDelegate
/**
 * @fn      onVolumeChanged
 * @brief   Volume-changed callback
 * @param   volume      -[in] recording volume, in the range 1~100
 * @see
 */
- (void) onVolumeChanged:(int)volume
{

}

/**
 * @fn      onBeginOfSpeech
 * @brief   Recognition-started callback
 * @see
 */
- (void) onBeginOfSpeech
{

}

/**
 * @fn      onEndOfSpeech
 * @brief   Recording-stopped callback
 * @see
 */
- (void) onEndOfSpeech
{

}
 90 
/**
 * @fn      onError
 * @brief   Recognition-finished callback (errorCode 0 means success)
 * @param   error   -[out] error object; see IFlySpeechError for details
 */
- (void) onError:(IFlySpeechError *)error
{
    NSString *text;
    if (self.isCanceled) {
        text = @"Recognition cancelled";
    }
    else if (error.errorCode == 0) {
        if (_result.length == 0) {
            text = @"No recognition result";
        }
        else {
            text = @"Recognition succeeded";
        }
    }
    else {
        text = [NSString stringWithFormat:@"Error occurred: %d %@", error.errorCode, error.errorDesc];
    }
    NSLog(@"%@", text);
}
115 
/**
 * @fn      onResults
 * @brief   Recognition-result callback
 * @param   results     -[out] the first element of the NSArray is an NSDictionary whose keys are result strings and whose values are confidence scores
 * @see
 */
- (void) onResults:(NSArray *)results isLast:(BOOL)isLast
{
    NSString *str = @"";
    NSMutableString *result = [[NSMutableString alloc] init];
    // Concatenate the keys of the first dictionary to recover the raw JSON string
    NSDictionary *dic = results[0];
    for (NSString *key in dic) {
        [result appendFormat:@"%@", key];
    }
    NSLog(@"Transcription result: %@", result);
    //--------- Parsing the iFlytek recognition JSON ---------//
    NSError *error;
    NSData *data = [result dataUsingEncoding:NSUTF8StringEncoding];
    NSLog(@"data: %@", data);
    NSDictionary *dic_result = [NSJSONSerialization JSONObjectWithData:data options:NSJSONReadingMutableLeaves error:&error];
    NSArray *array_ws = [dic_result objectForKey:@"ws"];
    // Walk every word segment of the recognition result
    for (int i = 0; i < array_ws.count; i++) {
        NSArray *temp = [[array_ws objectAtIndex:i] objectForKey:@"cw"];
        NSDictionary *dic_cw = [temp objectAtIndex:0];
        str = [str stringByAppendingString:[dic_cw objectForKey:@"w"]];
        NSLog(@"Segment: %@", [dic_cw objectForKey:@"w"]);
    }
    NSLog(@"Final recognition result: %@", str);
    // Drop a segment that is nothing but sentence-ending punctuation (。?!)
    if ([str isEqualToString:@"。"] || [str isEqualToString:@"?"] || [str isEqualToString:@"!"]) {
        NSLog(@"Trailing punctuation: %@", str);
    }
    else {
        self.content.text = str;
    }
    _result = str;
}

@end

2. Speech recognition and speech synthesis use the API in largely the same way, but recognition returns its result as JSON. Looking through the official SDK samples, I noticed that the sample with the built-in UI gets the recognition result back as a plain string. I considered switching to that recognition method, but I really didn't want to use the official UI, so I analyzed the result myself and wrote a small JSON parser. In testing it works well enough: it walks the segments of the result, concatenates them, and assembles the complete sentence. A standalone sketch of this parsing step follows the sample below.

Sample recognition result (JSON):

{"sn":1,"ls":true,"bg":0,"ed":0,"ws":[{"bg":0,"cw":[{"w":" 今天 ","sc":0}]},{"bg":0,"cw":[{"w":"
","sc":0}]},{"bg":0,"cw":[{"w":" 天氣 ","sc":0}]},{"bg":0,"cw":[{"w":" 怎麼
樣 ","sc":0}]},{"bg":0,"cw":[{"w":"","sc":0}]}]}

That's it for this post. If anything is still unclear, leave me a comment and I'll reply as soon as I can!
