TensorFlow provides mobile support for Android, iOS, and the Raspberry Pi.
How mobile deployment works. There are two ways to apply deep learning on mobile and embedded devices: (1) run the model on a cloud server, with the device sending requests and receiving the server's responses; (2) run the model locally, training it on a PC and shipping it to the device for inference. Because mobile devices have scarce resources and a round trip to a server is often impractical, local inference gives better real-time behavior. The work is then to accelerate computation and optimize memory use and speed: slim down the model to save memory and speed up computation, and speed up framework execution by optimizing both model complexity and the cost of each step.
Model slimming uses lower-precision weights: quantization and weight pruning (pruning small-weight connections, removing every weight connection below a threshold from the network). Framework acceleration focuses on optimizing general matrix multiplication (GEMM), which dominates convolutional layers (the data goes through im2col first, then GEMM) and fully connected layers. im2col rearranges indexed image patches into the columns of a matrix: the large input is divided into overlapping sub-matrices, each sub-matrix is serialized into a vector, and together the vectors form a new matrix.
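To make the im2col step concrete, here is a minimal NumPy sketch for a single-channel image with stride 1 and no padding; it only illustrates the idea and is not TensorFlow's implementation.

import numpy as np

def im2col(image, k):
    # Rearrange every k x k patch of `image` into one column of the output matrix.
    h, w = image.shape
    out_h, out_w = h - k + 1, w - k + 1
    cols = np.empty((k * k, out_h * out_w), dtype=image.dtype)
    idx = 0
    for y in range(out_h):
        for x in range(out_w):
            cols[:, idx] = image[y:y + k, x:x + k].ravel()
            idx += 1
    return cols

image = np.arange(16, dtype=np.float32).reshape(4, 4)
kernel = np.ones((3, 3), dtype=np.float32)
# The convolution becomes a single GEMM: (1, k*k) x (k*k, out_h*out_w).
result = kernel.ravel()[None, :] @ im2col(image, 3)
print(result.reshape(2, 2))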
Quantization. See "How to Quantize Neural Networks with TensorFlow", https://www.tensorflow.org/pe... . Quantization discretizes values so the model can be stored and run in less space than 32-bit floats allow; TensorFlow's quantization implementation hides the storage and execution details. In inference, floating-point math limits speed; quantization speeds it up while keeping accuracy high, and it shrinks the model file: the model is stored with 8-bit integers and converted back to 32-bit floats when loaded for computation, lowering the compute resources needed for prediction. Neural networks are robust to noise, so the precision lost to quantization does not noticeably hurt overall accuracy. Training, however, needs gradients from backpropagation and cannot be done directly in a low-precision format: train a floating-point model on a PC, convert it to 8-bit, and use the 8-bit model for inference on the device.
Quantization example: converting the GoogleNet model to an 8-bit model. Download the pre-trained GoogleNet model from http://download.tensorflow.or... .
bazel build tensorflow/tools/quantization:quantize_graph
bazel-bin/tensorflow/tools/quantization/quantize_graph \
  --input=/tmp/classify_image_graph_def.pb \
  --output_node_names="softmax" \
  --output=/tmp/quantized_graph.pb \
  --mode=eightbit
The quantized model is only about a quarter of the original size. Run it with:
bazel build tensorflow/examples/label_image:label_image
bazel-bin/tensorflow/examples/label_image/label_image \
  --image=/tmp/cropped_panda.jpg \
  --graph=/tmp/quantized_graph.pb \
  --labels=/tmp/imagenet_synset_to_human_label_map.txt \
  --input_width=299 \
  --input_height=299 \
  --input_mean=128 \
  --input_std=128 \
  --input_layer="Mul:0" \
  --output_layer="softmax:0"
How quantization is implemented. Each inference operation is converted to an equivalent 8-bit implementation. Take the original Relu op, whose inputs and outputs are floats: in the quantized version, the minimum and maximum of the float input are computed, a Quantize op converts the input data to 8-bit, and the op itself runs on 8-bit values. To keep the data entering the output layer accurate, a Dequantize op converts the results back to 32-bit precision, preserving prediction accuracy. The whole forward pass runs on 8-bit integers, with a dequantize layer added at the end to turn the 8-bit results back into the 32-bit input the output layer expects; a dequantize op follows each quantized op.
How quantized data is represented. Converting floats to an 8-bit representation is a compression problem. Weights, and the outputs of the previous layer after the activation function, are values distributed within a limited range. Quantization finds the minimum and maximum of that range and maps the floats onto it with a linear scaling.
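The min/max linear scheme described above can be sketched in a few lines of NumPy; the function names are illustrative, not TensorFlow internals.

import numpy as np

def quantize(values):
    lo, hi = float(values.min()), float(values.max())
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    q = np.round((values - lo) / scale).astype(np.uint8)  # 8-bit codes in [0, 255]
    return q, lo, scale

def dequantize(q, lo, scale):
    return q.astype(np.float32) * scale + lo              # back to float32

weights = np.random.randn(5).astype(np.float32)
q, lo, scale = quantize(weights)
print(weights)
print(dequantize(q, lo, scale))  # close to the originals, up to small rounding error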
Optimizing matrix multiplication. Google open-sourced gemmlowp, a small, self-contained low-precision general matrix multiplication (General Matrix to Matrix Multiplication, GEMM) library: https://github.com/google/gem... .
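Conceptually, a low-precision GEMM multiplies 8-bit operands while accumulating in 32-bit integers. The rough NumPy sketch below shows only that idea; the offsets and rescaling that gemmlowp actually applies are omitted.

import numpy as np

a = np.random.randint(0, 256, size=(4, 8), dtype=np.uint8)
b = np.random.randint(0, 256, size=(8, 3), dtype=np.uint8)

acc = a.astype(np.int32) @ b.astype(np.int32)  # int32 accumulator avoids overflow
print(acc)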
Practice on iOS.
Environment setup. Operating system: Mac OS X, with Xcode 7.3 or later as the IDE. Build the TensorFlow core static library. Run tensorflow/contrib/makefile/download_dependencies.sh; the dependencies are downloaded into tensorflow/contrib/makefile/downloads: eigen (an open-source C++ matrix math library), gemmlowp (the small, self-contained low-precision GEMM library), googletest (Google's open-source C++ test framework), protobuf (Google's open-source data interchange format), and re2 (Google's open-source regular-expression library).
Build the demo and run it. Run tensorflow/contrib/makefile/build_all_ios.sh. The build generates static libraries under tensorflow/contrib/makefile/gen/lib: ios_ARM64, ios_ARMV7, ios_ARMV7S, ios_I386, ios_X86_64, and libtensorflow-core.a. The prediction demo app can then run in the Xcode simulator or on an iOS device. The TensorFlow iOS examples (https://github.com/tensorflow...) contain three directories: benchmark, a prediction benchmark example; simple, an image prediction example; and camera, a real-time prediction example on the video stream. Download the Inception V1 model, which recognizes 1000 image classes, from https://storage.googleapis.co... , unpack it, and copy it into the data directories of benchmark, simple, and camera. Open the xcodeproj file in each directory, choose the iPhone 7 Plus simulator, and click the run button; when the build finishes, click the Run Model button. The prediction results appear in the Xcode console.
Build and run a custom model. https://github.com/tensorflow... . Download the flower data from http://download.tensorflow.or... . It contains one directory per flower type (tulips, roses, dandelion, sunflowers, daisy), with about 800 images each.
Train the original model. Download the pre-trained Inception V3 model: http://download.tensorflow.or... .
python tensorflow/examples/image_retraining/retrain.py \
  --bottleneck_dir=/tmp/bottlenecks/ \
  --how_many_training_steps 10 \
  --model_dir=/tmp/inception \
  --output_graph=/tmp/retrained_graph.pb \
  --output_labels=/tmp/retrained_labels.txt \
  --image_dir /tmp/flower_photos
After training, /tmp contains the model file retrained_graph.pb and the label file retrained_labels.txt. The "bottleneck" files describe the layer just before the final classification layer (the second-to-last layer). Because that layer has already been trained well, its bottleneck values are a meaningful, compact summary of each image, with enough information for the classifier to make its choice. On the first run, retrain.py analyzes every image, computes its bottleneck value, and stores it; since each image is used many times during training, this avoids recomputing it, as sketched below.
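A minimal sketch of that caching idea follows. It assumes a hypothetical bottleneck_values(image_path) function that runs the penultimate layer of the network; this is not the actual retrain.py code.

import os
import numpy as np

CACHE_DIR = "/tmp/bottlenecks"

def get_or_create_bottleneck(image_path, bottleneck_values):
    os.makedirs(CACHE_DIR, exist_ok=True)
    cache_file = os.path.join(CACHE_DIR, os.path.basename(image_path) + ".npy")
    if os.path.exists(cache_file):           # reuse the cached summary
        return np.load(cache_file)
    values = bottleneck_values(image_path)   # expensive: one forward pass
    np.save(cache_file, values)
    return values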
Build a model iOS can use. https://petewarden.com/2016/0... . Going from the original model to an iOS model takes several steps: remove the operations iOS does not support and optimize the model, then quantize it so the weights become 8-bit constants and the model shrinks, and finally make the model memory-mappable.
Removing unsupported operations and optimizing the model. The iOS build of TensorFlow supports only the operations commonly needed for inference that have no large external dependencies. The list of supported operations: https://github.com/tensorflow... . DecodeJpeg is not supported, because decoding JPEG images depends on libjpeg. For real-time flower recognition from the camera, the app processes the camera's image buffer directly instead of saving a JPEG file and decoding it again. The pre-trained Inception V3 model was trained from an image dataset and contains a DecodeJpeg op; we bypass it by feeding the input data directly to the Mul op that follows the decode step (sketched below). To speed up inference further, the explicit batch normalization ops are folded into the convolution weights, reducing the number of computations.
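A minimal sketch of that bypass, using the TensorFlow 1.x Python API and the retrained graph from the steps above; the random array stands in for the decoded camera buffer an iOS app would supply, and the node names Mul and final_result follow the text.

import numpy as np
import tensorflow as tf

graph_def = tf.GraphDef()
with open("/tmp/retrained_graph.pb", "rb") as f:
    graph_def.ParseFromString(f.read())
tf.import_graph_def(graph_def, name="")

with tf.Session() as sess:
    # A 1x299x299x3 image scaled with mean 128 and std 128, matching the label_image flags.
    image = (np.random.rand(1, 299, 299, 3).astype(np.float32) * 255.0 - 128.0) / 128.0
    probs = sess.run("final_result:0", feed_dict={"Mul:0": image})  # DecodeJpeg never runs
    print(probs)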
bazel build tensorflow/python/tools:optimize_for_inference
bazel-bin/tensorflow/python/tools/optimize_for_inference \
  --input=/tmp/retrained_graph.pb \
  --output=/tmp/optimized_graph.pb \
  --input_names=Mul \
  --output_names=final_result
Check the result with the label_image command:
bazel-bin/tensorflow/examples/label_image/label_image \
  --output_layer=final_result \
  --labels=/tmp/retrained_labels.txt \
  --image=/tmp/flower_photos/daisy/5547758_eea9edfd54_n.jpg \
  --graph=/tmp/optimized_graph.pb \
  --input_layer=Mul \
  --input_mean=128 \
  --input_std=128
Quantize the model. Apple distributes apps as .ipa packages, and all app resources are zip-compressed, so rounding the model weights from floats to integer levels (in the range 0 to 255) makes them compress much better. The accuracy loss is under 1%.
bazel build tensorflow/tools/quantization:quantize_graph
bazel-bin/tensorflow/tools/quantization/quantize_graph \
  --input=/tmp/optimized_graph.pb \
  --output=/tmp/rounded_graph.pb \
  --output_node_names=final_result \
  --mode=weights_rounded
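As a rough illustration of why rounding helps, the sketch below compresses raw float weights and weights rounded to 256 levels with zlib; the rounded buffer repeats the same byte patterns and compresses noticeably better. The numbers are illustrative only, not output of the quantization tool.

import zlib
import numpy as np

w = np.random.randn(100000).astype(np.float32)
lo, hi = w.min(), w.max()
rounded = (np.round((w - lo) / (hi - lo) * 255) / 255 * (hi - lo) + lo).astype(np.float32)

print(len(zlib.compress(w.tobytes())))        # close to the raw size
print(len(zlib.compress(rounded.tobytes())))  # noticeably smaller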
Memory mapping. Mapping physical memory into the process address space lets an application use the input/output address space directly, which improves read/write efficiency. Loading the whole model into an in-memory buffer at once would put too much pressure on iOS RAM, and the operating system kills programs that use too much memory. Because the model's weight buffers are read-only, they can be mapped into memory instead. The model is rearranged so the weights are loaded from the main GraphDef in sections, block by block.
bazel build tensorflow/contrib/util:convert_graphdef_memmapped_format
bazel-bin/tensorflow/contrib/util/convert_graphdef_memmapped_format \
  --in_graph=/tmp/rounded_graph.pb \
  --out_graph=/tmp/mmapped_graph.pb
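For intuition, here is a tiny Python sketch of the memory-mapping idea itself: the file's bytes are paged in on demand rather than copied into a heap buffer all at once. It only demonstrates mmap, not how TensorFlow consumes the converted graph.

import mmap

with open("/tmp/mmapped_graph.pb", "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)  # read-only mapping
    print(mm[:4])  # touches only the first page
    mm.close()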
Generate the iOS project and run it. Start from the real-time video-stream prediction demo: https://github.com/tensorflow... . Copy the model file and label file into its data directory, then edit CameraExampleViewController.mm to change the model file name to load, the input image size, the operation node names, and the pixel scaling values.
#import <AssertMacros.h> #import <AssetsLibrary/AssetsLibrary.h> #import <CoreImage/CoreImage.h> #import <ImageIO/ImageIO.h> #import "CameraExampleViewController.h" #include <sys/time.h> #include "tensorflow_utils.h" // If you have your own model, modify this to the file name, and make sure // you've added the file to your app resources too. static NSString* model_file_name = @"tensorflow_inception_graph"; static NSString* model_file_type = @"pb"; // This controls whether we'll be loading a plain GraphDef proto, or a // file created by the convert_graphdef_memmapped_format utility that wraps a // GraphDef and parameter file that can be mapped into memory from file to // reduce overall memory usage. const bool model_uses_memory_mapping = false; // If you have your own model, point this to the labels file. static NSString* labels_file_name = @"imagenet_comp_graph_label_strings"; static NSString* labels_file_type = @"txt"; // These dimensions need to match those the model was trained with. // 如下尺寸須要和模型訓練時相匹配 const int wanted_input_width =299;// 224; const int wanted_input_height = 299;//224; const int wanted_input_channels = 3; const float input_mean = 128.0f;//117.0f; const float input_std = 128.0f;//1.0f; const std::string input_layer_name = "Mul";//"input"; const std::string output_layer_name = "final_result";//"softmax1"; static void *AVCaptureStillImageIsCapturingStillImageContext = &AVCaptureStillImageIsCapturingStillImageContext; @interface CameraExampleViewController (InternalMethods) - (void)setupAVCapture; - (void)teardownAVCapture; @end @implementation CameraExampleViewController - (void)setupAVCapture { NSError *error = nil; session = [AVCaptureSession new]; if ([[UIDevice currentDevice] userInterfaceIdiom] == UIUserInterfaceIdiomPhone) [session setSessionPreset:AVCaptureSessionPreset640x480]; else [session setSessionPreset:AVCaptureSessionPresetPhoto]; AVCaptureDevice *device = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo]; AVCaptureDeviceInput *deviceInput = [AVCaptureDeviceInput deviceInputWithDevice:device error:&error]; assert(error == nil); isUsingFrontFacingCamera = NO; if ([session canAddInput:deviceInput]) [session addInput:deviceInput]; stillImageOutput = [AVCaptureStillImageOutput new]; [stillImageOutput addObserver:self forKeyPath:@"capturingStillImage" options:NSKeyValueObservingOptionNew context:(void *)(AVCaptureStillImageIsCapturingStillImageContext)]; if ([session canAddOutput:stillImageOutput]) [session addOutput:stillImageOutput]; videoDataOutput = [AVCaptureVideoDataOutput new]; NSDictionary *rgbOutputSettings = [NSDictionary dictionaryWithObject:[NSNumber numberWithInt:kCMPixelFormat_32BGRA] forKey:(id)kCVPixelBufferPixelFormatTypeKey]; [videoDataOutput setVideoSettings:rgbOutputSettings]; [videoDataOutput setAlwaysDiscardsLateVideoFrames:YES]; videoDataOutputQueue = dispatch_queue_create("VideoDataOutputQueue", DISPATCH_QUEUE_SERIAL); [videoDataOutput setSampleBufferDelegate:self queue:videoDataOutputQueue]; if ([session canAddOutput:videoDataOutput]) [session addOutput:videoDataOutput]; [[videoDataOutput connectionWithMediaType:AVMediaTypeVideo] setEnabled:YES]; previewLayer = [[AVCaptureVideoPreviewLayer alloc] initWithSession:session]; [previewLayer setBackgroundColor:[[UIColor blackColor] CGColor]]; [previewLayer setVideoGravity:AVLayerVideoGravityResizeAspect]; CALayer *rootLayer = [previewView layer]; [rootLayer setMasksToBounds:YES]; [previewLayer setFrame:[rootLayer bounds]]; [rootLayer addSublayer:previewLayer]; [session 
startRunning]; if (error) { NSString *title = [NSString stringWithFormat:@"Failed with error %d", (int)[error code]]; UIAlertController *alertController = [UIAlertController alertControllerWithTitle:title message:[error localizedDescription] preferredStyle:UIAlertControllerStyleAlert]; UIAlertAction *dismiss = [UIAlertAction actionWithTitle:@"Dismiss" style:UIAlertActionStyleDefault handler:nil]; [alertController addAction:dismiss]; [self presentViewController:alertController animated:YES completion:nil]; [self teardownAVCapture]; } } - (void)teardownAVCapture { [stillImageOutput removeObserver:self forKeyPath:@"isCapturingStillImage"]; [previewLayer removeFromSuperlayer]; } - (void)observeValueForKeyPath:(NSString *)keyPath ofObject:(id)object change:(NSDictionary *)change context:(void *)context { if (context == AVCaptureStillImageIsCapturingStillImageContext) { BOOL isCapturingStillImage = [[change objectForKey:NSKeyValueChangeNewKey] boolValue]; if (isCapturingStillImage) { // do flash bulb like animation flashView = [[UIView alloc] initWithFrame:[previewView frame]]; [flashView setBackgroundColor:[UIColor whiteColor]]; [flashView setAlpha:0.f]; [[[self view] window] addSubview:flashView]; [UIView animateWithDuration:.4f animations:^{ [flashView setAlpha:1.f]; }]; } else { [UIView animateWithDuration:.4f animations:^{ [flashView setAlpha:0.f]; } completion:^(BOOL finished) { [flashView removeFromSuperview]; flashView = nil; }]; } } } - (AVCaptureVideoOrientation)avOrientationForDeviceOrientation: (UIDeviceOrientation)deviceOrientation { AVCaptureVideoOrientation result = (AVCaptureVideoOrientation)(deviceOrientation); if (deviceOrientation == UIDeviceOrientationLandscapeLeft) result = AVCaptureVideoOrientationLandscapeRight; else if (deviceOrientation == UIDeviceOrientationLandscapeRight) result = AVCaptureVideoOrientationLandscapeLeft; return result; } - (IBAction)takePicture:(id)sender { if ([session isRunning]) { [session stopRunning]; [sender setTitle:@"Continue" forState:UIControlStateNormal]; flashView = [[UIView alloc] initWithFrame:[previewView frame]]; [flashView setBackgroundColor:[UIColor whiteColor]]; [flashView setAlpha:0.f]; [[[self view] window] addSubview:flashView]; [UIView animateWithDuration:.2f animations:^{ [flashView setAlpha:1.f]; } completion:^(BOOL finished) { [UIView animateWithDuration:.2f animations:^{ [flashView setAlpha:0.f]; } completion:^(BOOL finished) { [flashView removeFromSuperview]; flashView = nil; }]; }]; } else { [session startRunning]; [sender setTitle:@"Freeze Frame" forState:UIControlStateNormal]; } } + (CGRect)videoPreviewBoxForGravity:(NSString *)gravity frameSize:(CGSize)frameSize apertureSize:(CGSize)apertureSize { CGFloat apertureRatio = apertureSize.height / apertureSize.width; CGFloat viewRatio = frameSize.width / frameSize.height; CGSize size = CGSizeZero; if ([gravity isEqualToString:AVLayerVideoGravityResizeAspectFill]) { if (viewRatio > apertureRatio) { size.width = frameSize.width; size.height = apertureSize.width * (frameSize.width / apertureSize.height); } else { size.width = apertureSize.height * (frameSize.height / apertureSize.width); size.height = frameSize.height; } } else if ([gravity isEqualToString:AVLayerVideoGravityResizeAspect]) { if (viewRatio > apertureRatio) { size.width = apertureSize.height * (frameSize.height / apertureSize.width); size.height = frameSize.height; } else { size.width = frameSize.width; size.height = apertureSize.width * (frameSize.width / apertureSize.height); } } else if ([gravity 
isEqualToString:AVLayerVideoGravityResize]) { size.width = frameSize.width; size.height = frameSize.height; } CGRect videoBox; videoBox.size = size; if (size.width < frameSize.width) videoBox.origin.x = (frameSize.width - size.width) / 2; else videoBox.origin.x = (size.width - frameSize.width) / 2; if (size.height < frameSize.height) videoBox.origin.y = (frameSize.height - size.height) / 2; else videoBox.origin.y = (size.height - frameSize.height) / 2; return videoBox; } - (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection { CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer); CFRetain(pixelBuffer); [self runCNNOnFrame:pixelBuffer]; CFRelease(pixelBuffer); } - (void)runCNNOnFrame:(CVPixelBufferRef)pixelBuffer { assert(pixelBuffer != NULL); OSType sourcePixelFormat = CVPixelBufferGetPixelFormatType(pixelBuffer); int doReverseChannels; if (kCVPixelFormatType_32ARGB == sourcePixelFormat) { doReverseChannels = 1; } else if (kCVPixelFormatType_32BGRA == sourcePixelFormat) { doReverseChannels = 0; } else { assert(false); // Unknown source format } const int sourceRowBytes = (int)CVPixelBufferGetBytesPerRow(pixelBuffer); const int image_width = (int)CVPixelBufferGetWidth(pixelBuffer); const int fullHeight = (int)CVPixelBufferGetHeight(pixelBuffer); CVPixelBufferLockFlags unlockFlags = kNilOptions; CVPixelBufferLockBaseAddress(pixelBuffer, unlockFlags); unsigned char *sourceBaseAddr = (unsigned char *)(CVPixelBufferGetBaseAddress(pixelBuffer)); int image_height; unsigned char *sourceStartAddr; if (fullHeight <= image_width) { image_height = fullHeight; sourceStartAddr = sourceBaseAddr; } else { image_height = image_width; const int marginY = ((fullHeight - image_width) / 2); sourceStartAddr = (sourceBaseAddr + (marginY * sourceRowBytes)); } const int image_channels = 4; assert(image_channels >= wanted_input_channels); tensorflow::Tensor image_tensor( tensorflow::DT_FLOAT, tensorflow::TensorShape( {1, wanted_input_height, wanted_input_width, wanted_input_channels})); auto image_tensor_mapped = image_tensor.tensor<float, 4>(); tensorflow::uint8 *in = sourceStartAddr; float *out = image_tensor_mapped.data(); for (int y = 0; y < wanted_input_height; ++y) { float *out_row = out + (y * wanted_input_width * wanted_input_channels); for (int x = 0; x < wanted_input_width; ++x) { const int in_x = (y * image_width) / wanted_input_width; const int in_y = (x * image_height) / wanted_input_height; tensorflow::uint8 *in_pixel = in + (in_y * image_width * image_channels) + (in_x * image_channels); float *out_pixel = out_row + (x * wanted_input_channels); for (int c = 0; c < wanted_input_channels; ++c) { out_pixel[c] = (in_pixel[c] - input_mean) / input_std; } } } CVPixelBufferUnlockBaseAddress(pixelBuffer, unlockFlags); if (tf_session.get()) { std::vector<tensorflow::Tensor> outputs; tensorflow::Status run_status = tf_session->Run( {{input_layer_name, image_tensor}}, {output_layer_name}, {}, &outputs); if (!run_status.ok()) { LOG(ERROR) << "Running model failed:" << run_status; } else { tensorflow::Tensor *output = &outputs[0]; auto predictions = output->flat<float>(); NSMutableDictionary *newValues = [NSMutableDictionary dictionary]; for (int index = 0; index < predictions.size(); index += 1) { const float predictionValue = predictions(index); if (predictionValue > 0.05f) { std::string label = labels[index % predictions.size()]; NSString *labelObject = [NSString 
stringWithUTF8String:label.c_str()]; NSNumber *valueObject = [NSNumber numberWithFloat:predictionValue]; [newValues setObject:valueObject forKey:labelObject]; } } dispatch_async(dispatch_get_main_queue(), ^(void) { [self setPredictionValues:newValues]; }); } } CVPixelBufferUnlockBaseAddress(pixelBuffer, 0); } - (void)dealloc { [self teardownAVCapture]; } // use front/back camera - (IBAction)switchCameras:(id)sender { AVCaptureDevicePosition desiredPosition; if (isUsingFrontFacingCamera) desiredPosition = AVCaptureDevicePositionBack; else desiredPosition = AVCaptureDevicePositionFront; for (AVCaptureDevice *d in [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo]) { if ([d position] == desiredPosition) { [[previewLayer session] beginConfiguration]; AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:d error:nil]; for (AVCaptureInput *oldInput in [[previewLayer session] inputs]) { [[previewLayer session] removeInput:oldInput]; } [[previewLayer session] addInput:input]; [[previewLayer session] commitConfiguration]; break; } } isUsingFrontFacingCamera = !isUsingFrontFacingCamera; } - (void)didReceiveMemoryWarning { [super didReceiveMemoryWarning]; } - (void)viewDidLoad { [super viewDidLoad]; square = [UIImage imageNamed:@"squarePNG"]; synth = [[AVSpeechSynthesizer alloc] init]; labelLayers = [[NSMutableArray alloc] init]; oldPredictionValues = [[NSMutableDictionary alloc] init]; tensorflow::Status load_status; if (model_uses_memory_mapping) { load_status = LoadMemoryMappedModel( model_file_name, model_file_type, &tf_session, &tf_memmapped_env); } else { load_status = LoadModel(model_file_name, model_file_type, &tf_session); } if (!load_status.ok()) { LOG(FATAL) << "Couldn't load model: " << load_status; } tensorflow::Status labels_status = LoadLabels(labels_file_name, labels_file_type, &labels); if (!labels_status.ok()) { LOG(FATAL) << "Couldn't load labels: " << labels_status; } [self setupAVCapture]; } - (void)viewDidUnload { [super viewDidUnload]; } - (void)viewWillAppear:(BOOL)animated { [super viewWillAppear:animated]; } - (void)viewDidAppear:(BOOL)animated { [super viewDidAppear:animated]; } - (void)viewWillDisappear:(BOOL)animated { [super viewWillDisappear:animated]; } - (void)viewDidDisappear:(BOOL)animated { [super viewDidDisappear:animated]; } - (BOOL)shouldAutorotateToInterfaceOrientation: (UIInterfaceOrientation)interfaceOrientation { return (interfaceOrientation == UIInterfaceOrientationPortrait); } - (BOOL)prefersStatusBarHidden { return YES; } - (void)setPredictionValues:(NSDictionary *)newValues { const float decayValue = 0.75f; const float updateValue = 0.25f; const float minimumThreshold = 0.01f; NSMutableDictionary *decayedPredictionValues = [[NSMutableDictionary alloc] init]; for (NSString *label in oldPredictionValues) { NSNumber *oldPredictionValueObject = [oldPredictionValues objectForKey:label]; const float oldPredictionValue = [oldPredictionValueObject floatValue]; const float decayedPredictionValue = (oldPredictionValue * decayValue); if (decayedPredictionValue > minimumThreshold) { NSNumber *decayedPredictionValueObject = [NSNumber numberWithFloat:decayedPredictionValue]; [decayedPredictionValues setObject:decayedPredictionValueObject forKey:label]; } } oldPredictionValues = decayedPredictionValues; for (NSString *label in newValues) { NSNumber *newPredictionValueObject = [newValues objectForKey:label]; NSNumber *oldPredictionValueObject = [oldPredictionValues objectForKey:label]; if (!oldPredictionValueObject) { oldPredictionValueObject = 
[NSNumber numberWithFloat:0.0f]; } const float newPredictionValue = [newPredictionValueObject floatValue]; const float oldPredictionValue = [oldPredictionValueObject floatValue]; const float updatedPredictionValue = (oldPredictionValue + (newPredictionValue * updateValue)); NSNumber *updatedPredictionValueObject = [NSNumber numberWithFloat:updatedPredictionValue]; [oldPredictionValues setObject:updatedPredictionValueObject forKey:label]; } NSArray *candidateLabels = [NSMutableArray array]; for (NSString *label in oldPredictionValues) { NSNumber *oldPredictionValueObject = [oldPredictionValues objectForKey:label]; const float oldPredictionValue = [oldPredictionValueObject floatValue]; if (oldPredictionValue > 0.05f) { NSDictionary *entry = @{ @"label" : label, @"value" : oldPredictionValueObject }; candidateLabels = [candidateLabels arrayByAddingObject:entry]; } } NSSortDescriptor *sort = [NSSortDescriptor sortDescriptorWithKey:@"value" ascending:NO]; NSArray *sortedLabels = [candidateLabels sortedArrayUsingDescriptors:[NSArray arrayWithObject:sort]]; const float leftMargin = 10.0f; const float topMargin = 10.0f; const float valueWidth = 48.0f; const float valueHeight = 26.0f; const float labelWidth = 246.0f; const float labelHeight = 26.0f; const float labelMarginX = 5.0f; const float labelMarginY = 5.0f; [self removeAllLabelLayers]; int labelCount = 0; for (NSDictionary *entry in sortedLabels) { NSString *label = [entry objectForKey:@"label"]; NSNumber *valueObject = [entry objectForKey:@"value"]; const float value = [valueObject floatValue]; const float originY = (topMargin + ((labelHeight + labelMarginY) * labelCount)); const int valuePercentage = (int)roundf(value * 100.0f); const float valueOriginX = leftMargin; NSString *valueText = [NSString stringWithFormat:@"%d%%", valuePercentage]; [self addLabelLayerWithText:valueText originX:valueOriginX originY:originY width:valueWidth height:valueHeight alignment:kCAAlignmentRight]; const float labelOriginX = (leftMargin + valueWidth + labelMarginX); [self addLabelLayerWithText:[label capitalizedString] originX:labelOriginX originY:originY width:labelWidth height:labelHeight alignment:kCAAlignmentLeft]; if ((labelCount == 0) && (value > 0.5f)) { [self speak:[label capitalizedString]]; } labelCount += 1; if (labelCount > 4) { break; } } } - (void)removeAllLabelLayers { for (CATextLayer *layer in labelLayers) { [layer removeFromSuperlayer]; } [labelLayers removeAllObjects]; } - (void)addLabelLayerWithText:(NSString *)text originX:(float)originX originY:(float)originY width:(float)width height:(float)height alignment:(NSString *)alignment { CFTypeRef font = (CFTypeRef) @"Menlo-Regular"; const float fontSize = 20.0f; const float marginSizeX = 5.0f; const float marginSizeY = 2.0f; const CGRect backgroundBounds = CGRectMake(originX, originY, width, height); const CGRect textBounds = CGRectMake((originX + marginSizeX), (originY + marginSizeY), (width - (marginSizeX * 2)), (height - (marginSizeY * 2))); CATextLayer *background = [CATextLayer layer]; [background setBackgroundColor:[UIColor blackColor].CGColor]; [background setOpacity:0.5f]; [background setFrame:backgroundBounds]; background.cornerRadius = 5.0f; [[self.view layer] addSublayer:background]; [labelLayers addObject:background]; CATextLayer *layer = [CATextLayer layer]; [layer setForegroundColor:[UIColor whiteColor].CGColor]; [layer setFrame:textBounds]; [layer setAlignmentMode:alignment]; [layer setWrapped:YES]; [layer setFont:font]; [layer setFontSize:fontSize]; layer.contentsScale = 
[[UIScreen mainScreen] scale]; [layer setString:text]; [[self.view layer] addSublayer:layer]; [labelLayers addObject:layer]; } - (void)setPredictionText:(NSString *)text withDuration:(float)duration { if (duration > 0.0) { CABasicAnimation *colorAnimation = [CABasicAnimation animationWithKeyPath:@"foregroundColor"]; colorAnimation.duration = duration; colorAnimation.fillMode = kCAFillModeForwards; colorAnimation.removedOnCompletion = NO; colorAnimation.fromValue = (id)[UIColor darkGrayColor].CGColor; colorAnimation.toValue = (id)[UIColor whiteColor].CGColor; colorAnimation.timingFunction = [CAMediaTimingFunction functionWithName:kCAMediaTimingFunctionLinear]; [self.predictionTextLayer addAnimation:colorAnimation forKey:@"colorAnimation"]; } else { self.predictionTextLayer.foregroundColor = [UIColor whiteColor].CGColor; } [self.predictionTextLayer removeFromSuperlayer]; [[self.view layer] addSublayer:self.predictionTextLayer]; [self.predictionTextLayer setString:text]; } - (void)speak:(NSString *)words { if ([synth isSpeaking]) { return; } AVSpeechUtterance *utterance = [AVSpeechUtterance speechUtteranceWithString:words]; utterance.voice = [AVSpeechSynthesisVoice voiceWithLanguage:@"en-US"]; utterance.rate = 0.75 * AVSpeechUtteranceDefaultSpeechRate; [synth speakUtterance:utterance]; } @end
Connect an iPhone, double-click tensorflow/contrib/ios_examples/camera/camera_example.xcodeproj, and build and run. With the app installed on the phone, open it and point it at a rose to recognize it; after 10,000 training iterations the recognition rate is above 99%. When packaging for the simulator, the generated build products are under /Users/libinggen/Library/Developer/Xcode/DerivedData/camera_example-dhfdsdfesfmrwtfb1fpfkfjsdfhdskf/Build/Products/Debug-iphoneos. Opening CameraExample.app shows the executable CameraExample and the resources: the model file mmapped_graph.pb and the label file retrained_labels.txt.
Practice on Android.
Environment setup. On a MacBook Pro, download JDK 1.8 from the Oracle website, http://www.oracle.com/technet... (jdk-8u111-macosx-x64.dmg), and double-click to install. Set the Java environment variable:
export JAVA_HOME=$(/usr/libexec/java_home)
Set up the Android SDK. Download the Android SDK from the Android website, https://developer.android.com , version 25.0.2 (android-sdk_r25.0.2-macosx.zip), and unpack it to ~/Library/Android/sdk. It contains: build-tools, extras, patcher, platform-tools, platforms (one SDK per version, divided by API level), sources, system-images, temp (a temporary folder used while updating and installing the SDK), and tools (SDK tools shared across versions, including adb, aapt, aidl, and dx).
Set up the Android NDK. Download the Mac OS X version of the Android NDK from the Android website, https://developer.android.com... (android-ndk-r13b-darwin-x86_64.zip), and unpack it; it contains CHANGELOG.md, build, ndk-build, ndk-depends, ndk-gdb, ndk-stack, ndk-which, platforms, prebuilt, python-packages, shader-tools, simpleperf, source.properties, sources, and toolchains. Set up Bazel, installing it with brew:
brew install bazel
Upgrade Bazel:
brew upgrade bazel
Build the demo and run it. Edit the WORKSPACE file in the tensorflow-1.1.0 root directory and change the android_sdk_repository and android_ndk_repository settings to your own install paths and versions.
android_sdk_repository(
    name = "androidsdk",
    api_level = 25,
    build_tools_version = "25.0.2",
    # Replace with path to Android SDK on your system
    path = "~/Library/Android/sdk",
)

android_ndk_repository(
    name = "androidndk",
    api_level = 23,
    path = "~/Downloads/android-ndk-r13b",
)
Build from the root directory with bazel:
bazel build //tensorflow/examples/android:tensorflow_demo
When the build succeeds, the TensorFlow demo app is generated by default under tensorflow-1.1.0/bazel-bin/tensorflow/examples/android.
Run it. Copy the generated apk to a phone and check the result through the phone camera (tested on Android 6.0.1). Enable Developer Mode on the phone, connect it to the computer with a USB cable, go into the SDK directory and then the platform-tools folder, find the adb command, and run:
./adb install tensorflow-1.1.0/bazel-bin/tensorflow/examples/android/tensorflow_demo.apk
tensorflow_demo.apk installs onto the phone automatically. Open the TF Detect app; it starts the phone camera and runs detection in real time on the returned video stream.
Build and run a custom model: train the original model, build a model Android supports, then generate the Android apk and run it.
Train the original model and build a model Android supports. Use tensorflow/python/tools/optimize_for_inference.py, tensorflow/tools/quantization/quantize_graph.py, and tensorflow/contrib/util/convert_graphdef_memmapped_format.cc from the project root to optimize the model, as in the iOS section. Put the original model file retrained_graph.pb and the label file retrained_labels.txt generated in the first step into tensorflow/examples/android/assets. Then edit tensorflow/examples/android/src/org/tensorflow/demo/TensorFlowImageClassifier.java to change the model file name to load, the input image size, the operation node names, and the pixel scaling values.
package org.tensorflow.demo; import android.content.res.AssetManager; import android.graphics.Bitmap; import android.os.Trace; import android.util.Log; import java.io.BufferedReader; import java.io.IOException; import java.io.InputStreamReader; import java.util.ArrayList; import java.util.Comparator; import java.util.List; import java.util.PriorityQueue; import java.util.Vector; import org.tensorflow.Operation; import org.tensorflow.contrib.android.TensorFlowInferenceInterface; /** A classifier specialized to label images using TensorFlow. */ public class TensorFlowImageClassifier implements Classifier { private static final String TAG = "TensorFlowImageClassifier"; // Only return this many results with at least this confidence. private static final int MAX_RESULTS = 3; private static final float THRESHOLD = 0.1f; // Config values. private String inputName; private String outputName; private int inputSize; private int imageMean; private float imageStd; // Pre-allocated buffers. private Vector<String> labels = new Vector<String>(); private int[] intValues; private float[] floatValues; private float[] outputs; private String[] outputNames; private boolean logStats = false; private TensorFlowInferenceInterface inferenceInterface; private TensorFlowImageClassifier() {} /** * Initializes a native TensorFlow session for classifying images. * * @param assetManager The asset manager to be used to load assets. * @param modelFilename The filepath of the model GraphDef protocol buffer. * @param labelFilename The filepath of label file for classes. * @param inputSize The input size. A square image of inputSize x inputSize is assumed. * @param imageMean The assumed mean of the image values. * @param imageStd The assumed std of the image values. * @param inputName The label of the image input node. * @param outputName The label of the output node. * @throws IOException */ public static Classifier create( AssetManager assetManager, String modelFilename, String labelFilename, int inputSize, int imageMean, float imageStd, String inputName, String outputName) { TensorFlowImageClassifier c = new TensorFlowImageClassifier(); c.inputName = inputName; c.outputName = outputName; // Read the label names into memory. // TODO(andrewharp): make this handle non-assets. String actualFilename = labelFilename.split("file:///android_asset/")[1]; Log.i(TAG, "Reading labels from: " + actualFilename); BufferedReader br = null; try { br = new BufferedReader(new InputStreamReader(assetManager.open(actualFilename))); String line; while ((line = br.readLine()) != null) { c.labels.add(line); } br.close(); } catch (IOException e) { throw new RuntimeException("Problem reading label file!" , e); } c.inferenceInterface = new TensorFlowInferenceInterface(assetManager, modelFilename); // The shape of the output is [N, NUM_CLASSES], where N is the batch size. final Operation operation = c.inferenceInterface.graphOperation(outputName); final int numClasses = (int) operation.output(0).shape().size(1); Log.i(TAG, "Read " + c.labels.size() + " labels, output layer size is " + numClasses); // Ideally, inputSize could have been retrieved from the shape of the input operation. Alas, // the placeholder node for input in the graphdef typically used does not specify a shape, so it // must be passed in as a parameter. c.inputSize = inputSize; c.imageMean = imageMean; c.imageStd = imageStd; // Pre-allocate buffers. 
c.outputNames = new String[] {outputName}; c.intValues = new int[inputSize * inputSize]; c.floatValues = new float[inputSize * inputSize * 3]; c.outputs = new float[numClasses]; return c; } @Override public List<Recognition> recognizeImage(final Bitmap bitmap) { // Log this method so that it can be analyzed with systrace. Trace.beginSection("recognizeImage"); Trace.beginSection("preprocessBitmap"); // Preprocess the image data from 0-255 int to normalized float based // on the provided parameters. bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight()); for (int i = 0; i < intValues.length; ++i) { final int val = intValues[i]; floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd; } Trace.endSection(); // Copy the input data into TensorFlow. Trace.beginSection("feed"); inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3); Trace.endSection(); // Run the inference call. Trace.beginSection("run"); inferenceInterface.run(outputNames, logStats); Trace.endSection(); // Copy the output Tensor back into the output array. Trace.beginSection("fetch"); inferenceInterface.fetch(outputName, outputs); Trace.endSection(); // Find the best classifications. PriorityQueue<Recognition> pq = new PriorityQueue<Recognition>( 3, new Comparator<Recognition>() { @Override public int compare(Recognition lhs, Recognition rhs) { // Intentionally reversed to put high confidence at the head of the queue. return Float.compare(rhs.getConfidence(), lhs.getConfidence()); } }); for (int i = 0; i < outputs.length; ++i) { if (outputs[i] > THRESHOLD) { pq.add( new Recognition( "" + i, labels.size() > i ? labels.get(i) : "unknown", outputs[i], null)); } } final ArrayList<Recognition> recognitions = new ArrayList<Recognition>(); int recognitionsSize = Math.min(pq.size(), MAX_RESULTS); for (int i = 0; i < recognitionsSize; ++i) { recognitions.add(pq.poll()); } Trace.endSection(); // "recognizeImage" return recognitions; } @Override public void enableStatLogging(boolean logStats) { this.logStats = logStats; } @Override public String getStatString() { return inferenceInterface.getStatString(); } @Override public void close() { inferenceInterface.close(); } }
Rebuild the apk, connect the Android phone, and install it:
bazel build //tensorflow/examples/android:tensorflow_demo
adb install -r -g bazel-bin/tensorflow/examples/android/tensorflow_demo.apk
Practice on the Raspberry Pi.
TensorFlow can run on the Raspberry Pi, a microcomputer only the size of a credit card, with a Linux-based system and audio and video capabilities. One application: feed it 10,000 pictures of your own face and train a face-recognition model on the Pi so it learns to recognize you; when you walk in the door it can turn on the lights, play music, and so on. Building on the Raspberry Pi is similar to building directly in a Linux environment.
References:
《TensorFlow技術解析與實戰》
Recommendations for machine learning jobs in Shanghai are welcome; my WeChat: qingxingfengzi