Capturing Part of the Camera Frame in iOS Development: Cropping the sampleBuffer (Crop sample buffer)

Date: 2019-11-06

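Requirement for this example: in feature screens such as live streaming, QR-code scanning, or face recognition, you sometimes need to extract a single region from the frame captured by the camera.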

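How it works: to extract part of the full frame captured by the camera, we need the data for that part of the frame. The sampleBuffer delivered to AVCaptureVideoDataOutputSampleBufferDelegate is an opaque system data structure that cannot be manipulated directly, so it must first be converted into a structure that can be cropped. One approach found online converts the sampleBuffer indirectly into a UIImage and crops the image, which is cumbersome and slow. This example instead converts the sampleBuffer into a Core Image CIImage, which performs better and needs less code.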

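In the final result, the area inside the green frame is the cropped picture, and a long press lets you move the frame.

GitHub repository (with code): Crop sample buffer

Note: the code differs between ARC and MRC, and the differences are marked in the project. They mainly concern managing the global CIContext object: under MRC the object returned by its initializer has not been retained, so using it later raises an error unless you retain it yourself.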

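Usage notes

In this project the camera capture resolution defaults to 2K (that is, 1920*1080) and can be switched to 4K, which requires an iPhone 6s or later.
This example can crop on either the CPU or the GPU. In the view controller, set isOpenGPU before cropView is initialized: when it is on, the GPU is used; otherwise the CPU.
This example implements cropping only in landscape. It is always kept in landscape orientation, and portrait is not handled.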

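Basic setup

1. Set up the basic camera environment (initialize the AVCaptureSession, set the delegate, start the session). This is in the sample code and is not repeated here.

2. Take the raw frame data (CMSampleBufferRef) from the AVCaptureVideoDataOutputSampleBufferDelegate callback and process it.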

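Approaches

1. Software cropping on the CPU (the CPU computes and performs the crop, which costs more performance)

- (CMSampleBufferRef)cropSampleBufferBySoftware:(CMSampleBufferRef)sampleBuffer;

2. Hardware cropping (uses Apple's public APIs to crop in hardware; performs better, but some issues remain unresolved)

- (CMSampleBufferRef)cropSampleBufferByHardware:(CMSampleBufferRef)buffer;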

Walkthrough

// Called whenever an AVCaptureVideoDataOutput instance outputs a new video frame.
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
    CMSampleBufferRef cropSampleBuffer = NULL;
    
#warning Choose one of the two cropping methods. GPU cropping performs better. CPU cropping depends on the device and usually drops frames when run for a long time.
    if (self.isOpenGPU) {
         cropSampleBuffer = [self.cropView cropSampleBufferByHardware:sampleBuffer];
    }else {
         cropSampleBuffer = [self.cropView cropSampleBufferBySoftware:sampleBuffer];
    }
    
    // The cropped buffer is not managed by ARC, so it must be released explicitly.
    // Both crop methods can return NULL on failure, and CFRelease(NULL) crashes.
    if (cropSampleBuffer != NULL) {
        CFRelease(cropSampleBuffer);
    }
}


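The method above is the camera delegate callback, invoked once for every video frame produced. sampleBuffer holds the raw data of each frame, and that raw data must be cropped to meet this example's requirement. Always release cropSampleBuffer at the end; otherwise memory leaks on every frame and the app eventually crashes.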

Cropping on the CPU

- (CMSampleBufferRef)cropSampleBufferBySoftware:(CMSampleBufferRef)sampleBuffer {
    OSStatus status;
    
    //    CVPixelBufferRef pixelBuffer = [self modifyImage:buffer];
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    // Lock the image buffer
    CVPixelBufferLockBaseAddress(imageBuffer,0);
    // Get information about the image
    uint8_t *baseAddress     = (uint8_t *)CVPixelBufferGetBaseAddress(imageBuffer);
    size_t  bytesPerRow      = CVPixelBufferGetBytesPerRow(imageBuffer);
    size_t  width            = CVPixelBufferGetWidth(imageBuffer);
    // size_t  height           = CVPixelBufferGetHeight(imageBuffer);
    NSInteger bytesPerPixel  =  bytesPerRow/width;
    
    // YUV 420 Rule
    if (_cropX % 2 != 0) _cropX += 1;
    NSInteger baseAddressStart = _cropY*bytesPerRow+bytesPerPixel*_cropX;
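    // Offset used on the previous call. It is updated only at the end of this method,
    // so the comparison below can detect that the crop position has moved.
    static NSInteger lastAddressStart = 0;
    
    // pixbuffer and videoInfo only need to be rebuilt when the crop position changes, the resolution is switched, or the camera restarts. The demo handles only the position change; add the other cases yourself.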
    // NSLog(@"demon pix first : %zu - %zu - %@ - %d - %d - %d -%d",width, height, self.currentResolution,_cropX,_cropY,self.currentResolutionW,self.currentResolutionH);
    static CVPixelBufferRef            pixbuffer = NULL;
    static CMVideoFormatDescriptionRef videoInfo = NULL;
    
    // x,y changed need to reset pixbuffer and videoinfo
    if (lastAddressStart != baseAddressStart) {
        if (pixbuffer != NULL) {
            CVPixelBufferRelease(pixbuffer);
            pixbuffer = NULL;
        }
        
        if (videoInfo != NULL) {
            CFRelease(videoInfo);
            videoInfo = NULL;
        }
    }
    
    if (pixbuffer == NULL) {
        NSDictionary *options = [NSDictionary dictionaryWithObjectsAndKeys:
                                 [NSNumber numberWithBool : YES],           kCVPixelBufferCGImageCompatibilityKey,
                                 [NSNumber numberWithBool : YES],           kCVPixelBufferCGBitmapContextCompatibilityKey,
                                 [NSNumber numberWithInt  : g_width_size],  kCVPixelBufferWidthKey,
                                 [NSNumber numberWithInt  : g_height_size], kCVPixelBufferHeightKey,
                                 nil];
        
        status = CVPixelBufferCreateWithBytes(kCFAllocatorDefault, g_width_size, g_height_size, kCVPixelFormatType_32BGRA, &baseAddress[baseAddressStart], bytesPerRow, NULL, NULL, (__bridge CFDictionaryRef)options, &pixbuffer);
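        if (status != 0) {
            NSLog(@"Crop CVPixelBufferCreateWithBytes error %d",(int)status);
            // Unlock before the early return so the lock taken above stays balanced.
            CVPixelBufferUnlockBaseAddress(imageBuffer,0);
            return NULL;
        }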
    }
    
    CVPixelBufferUnlockBaseAddress(imageBuffer,0);
    
    CMSampleTimingInfo sampleTime = {
        .duration               = CMSampleBufferGetDuration(sampleBuffer),
        .presentationTimeStamp  = CMSampleBufferGetPresentationTimeStamp(sampleBuffer),
        .decodeTimeStamp        = CMSampleBufferGetDecodeTimeStamp(sampleBuffer)
    };
    
    if (videoInfo == NULL) {
        status = CMVideoFormatDescriptionCreateForImageBuffer(kCFAllocatorDefault, pixbuffer, &videoInfo);
        if (status != 0) NSLog(@"Crop CMVideoFormatDescriptionCreateForImageBuffer error %d",(int)status);
    }
    
    CMSampleBufferRef cropBuffer = NULL;
    status = CMSampleBufferCreateForImageBuffer(kCFAllocatorDefault, pixbuffer, true, NULL, NULL, videoInfo, &sampleTime, &cropBuffer);
    if (status != 0) NSLog(@"Crop CMSampleBufferCreateForImageBuffer error %d",(int)status);
    
    lastAddressStart = baseAddressStart;
    
    return cropBuffer;
}


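The method above is the instance method that crops the sampleBuffer. It first extracts the CVImageBufferRef from the CMSampleBufferRef and then locks the CVImageBufferRef. To render to a texture you need an image that is compatible with the OpenGL texture cache. Images created by the camera API are already compatible and can be mapped as inputs immediately. If you crop a new image out of an existing frame for further processing, however, you must create that image with specific attributes. The attribute dictionary must contain the crop width and height as keys, so the steps that build the dictionary cannot be omitted.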

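Computing the position

In software cropping we have the data of one frame and need to locate where the crop actually starts inside it. The formula below gives that position. The cropping principle is covered in [Introduction to YUV], and every variable the calculation needs is available in the code above.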

NSInteger baseAddressStart = _cropY*bytesPerRow+bytesPerPixel*_cropX;
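The offset formula can be checked in isolation. The following is a minimal sketch in plain C (which also compiles inside an Objective-C file). The function name crop_byte_offset and its parameter names are illustrative and not part of the project, and the sketch assumes a packed single-plane format such as 32BGRA.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of pixel (crop_x, crop_y) inside a packed, single-plane
 * buffer such as 32BGRA. bytes_per_row is passed in separately because a
 * row may carry padding beyond width * bytes_per_pixel. */
size_t crop_byte_offset(size_t crop_x, size_t crop_y,
                        size_t bytes_per_row, size_t bytes_per_pixel)
{
    /* The crop origin must lie inside the row it starts on. */
    assert(crop_x * bytes_per_pixel < bytes_per_row);
    return crop_y * bytes_per_row + crop_x * bytes_per_pixel;
}
```

For a 1920-pixel-wide BGRA frame with no row padding (7680 bytes per row), a crop origin of (100, 50) gives 50 * 7680 + 100 * 4 = 384400. If the buffer does have row padding, deriving bytesPerPixel as bytesPerRow / width would no longer be exact, which is why the sketch takes both values as inputs.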

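Notes:

1. Correct the X and Y coordinates. CVPixelBufferCreateWithBytes crops in pixels, so points must be converted to pixels and the current position computed proportionally. In code: int cropX = (int)(currentResolutionW / kScreenWidth * self.cropView.frame.origin.x); where currentResolutionW is the width of the current resolution and kScreenWidth is the actual screen width in points.
2. Under the YUV 420 rule, every 4 Y samples share 1 UV pair, and each such block has 2 Y samples per row, so the starting coordinate must be even. The CPU crop uses the YUV separation approach; see the introduction to YUV for the details.
3. In this example pixbuffer and videoInfo are declared as static variables so that they are not recreated, wasting memory, on every frame. They must be reset in three situations: the position changes, the resolution changes, or the camera restarts. The notes at the end of the article cover this in more detail.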
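Notes 1 and 2 can be combined into two small helpers. This is a sketch in plain C under stated assumptions: the names point_to_pixel and align_even_up are hypothetical, and the multiplication is done before the division (mathematically the same as the article's expression) to limit floating-point rounding.

```c
#include <assert.h>

/* Convert a view coordinate in points to a pixel coordinate in the video
 * frame by scaling with resolution / screen size, truncating to int. */
int point_to_pixel(double point, double screen_points, int resolution_pixels)
{
    assert(screen_points > 0.0);
    return (int)(point * (double)resolution_pixels / screen_points);
}

/* Round a non-negative pixel coordinate up to the next even value, the
 * same adjustment as `if (_cropX % 2 != 0) _cropX += 1;`. */
int align_even_up(int value)
{
    assert(value >= 0);
    return (value % 2 != 0) ? value + 1 : value;
}
```

With a landscape screen 667 points wide and a 1920-pixel-wide frame, an x of 100 points maps to pixel 287, which align_even_up then moves to 288.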

Cropping on the GPU

// hardware crop
- (CMSampleBufferRef)cropSampleBufferByHardware:(CMSampleBufferRef)buffer {
    // a CMSampleBuffer CVImageBuffer of media data.
    
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(buffer);
    CGRect           cropRect    = CGRectMake(_cropX, _cropY, g_width_size, g_height_size);
    //        log4cplus_debug("Crop", "dropRect x: %f - y : %f - width : %zu - height : %zu", cropViewX, cropViewY, width, height);
    
    /*
     First, to render to a texture, you need an image that is compatible with the OpenGL texture cache. Images created with the camera API are already compatible, and you can map them for input immediately. If you want to create an image to render into and later read back for other processing, you have to create it with a special property: the image attributes must include kCVPixelBufferIOSurfacePropertiesKey as one of the dictionary keys.
     */
    
    OSStatus status;
    
    /* pixbuffer and videoInfo only need to be reset when the resolution changes, which avoids recreating them on every frame. */
    static CVPixelBufferRef            pixbuffer = NULL;
    static CMVideoFormatDescriptionRef videoInfo = NULL;
    
    if (pixbuffer == NULL) {
        NSDictionary *options = [NSDictionary dictionaryWithObjectsAndKeys:
                                 [NSNumber numberWithInt:g_width_size],     kCVPixelBufferWidthKey,
                                 [NSNumber numberWithInt:g_height_size],    kCVPixelBufferHeightKey, nil];
        status = CVPixelBufferCreate(kCFAllocatorSystemDefault, g_width_size, g_height_size, kCVPixelFormatType_420YpCbCr8BiPlanarFullRange, (__bridge CFDictionaryRef)options, &pixbuffer);
        // ensures that the CVPixelBuffer is accessible in system memory. This should only be called if the base address is going to be used and the pixel data will be accessed by the CPU
        if (status != noErr) {
            NSLog(@"Crop CVPixelBufferCreate error %d",(int)status);
            return NULL;
        }
    }
    
    CIImage *ciImage = [CIImage imageWithCVImageBuffer:imageBuffer];
    ciImage = [ciImage imageByCroppingToRect:cropRect];
    // After cropping, the CIImage's extent still starts at the crop origin rather than at (0, 0), so translate it back to the origin.
    ciImage = [ciImage imageByApplyingTransform:CGAffineTransformMakeTranslation(-_cropX, -_cropY)];
    
    static CIContext *ciContext = nil;
    if (ciContext == nil) {
        //        NSMutableDictionary *options = [[NSMutableDictionary alloc] init];
        //        [options setObject:[NSNull null] forKey:kCIContextWorkingColorSpace];
        //        [options setObject:@0            forKey:kCIContextUseSoftwareRenderer];
        EAGLContext *eaglContext = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES3];
        ciContext = [CIContext contextWithEAGLContext:eaglContext options:nil];
    }
    [ciContext render:ciImage toCVPixelBuffer:pixbuffer];
    //    [ciContext render:ciImage toCVPixelBuffer:pixbuffer bounds:cropRect colorSpace:nil];
    
    CMSampleTimingInfo sampleTime = {
        .duration               = CMSampleBufferGetDuration(buffer),
        .presentationTimeStamp  = CMSampleBufferGetPresentationTimeStamp(buffer),
        .decodeTimeStamp        = CMSampleBufferGetDecodeTimeStamp(buffer)
    };
    
    if (videoInfo == NULL) {
        status = CMVideoFormatDescriptionCreateForImageBuffer(kCFAllocatorDefault, pixbuffer, &videoInfo);
        if (status != 0) NSLog(@"Crop CMVideoFormatDescriptionCreateForImageBuffer error %d",(int)status);
    }
    
    CMSampleBufferRef cropBuffer = NULL;
    status = CMSampleBufferCreateForImageBuffer(kCFAllocatorDefault, pixbuffer, true, NULL, NULL, videoInfo, &sampleTime, &cropBuffer);
    if (status != 0) NSLog(@"Crop CMSampleBufferCreateForImageBuffer error %d",(int)status);
    
    return cropBuffer;
}


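The method above is the hardware crop. It crops on the GPU, mainly by rendering through a CoreImage CIContext object.
CoreImage and UIKit coordinates: at first I cropped the image at the configured position as usual, but the cropped region came out in the wrong place. It turns out that CoreImage and UIKit use different coordinate systems. UIKit places the origin at the top-left corner.

CoreImage places the origin at the bottom-left corner. (In CoreImage, each image's coordinate space is device-independent.)

So when cropping, be sure to convert the position: X is already correct, but Y is inverted.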

As noted earlier, an image cropped out of an existing frame for further processing must be created with an attribute dictionary that includes its width and height, so the steps that build that dictionary cannot be omitted.
Either of two methods can be used to crop with CoreImage:

ciImage = [ciImage imageByCroppingToRect:cropRect]; if you use this line, render with [ciContext render:ciImage toCVPixelBuffer:pixelBuffer];
Or render directly with: [ciContext render:ciImage toCVPixelBuffer:pixelBuffer bounds:cropRect colorSpace:nil];

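Note: a CIContext holds a large amount of image context information and should not be created repeatedly inside the callback. Apple recommends initializing it only once. Mind the difference between ARC and MRC, though.

Notes:

1. The code differs between ARC and MRC, and the differences are marked in the project. They mainly concern managing the global CIContext object: under MRC the object returned by its initializer has not been retained, so using it later raises an error unless you retain it yourself.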

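2. Switching between the front and rear cameras: the front and rear cameras differ considerably across models. One option is to add the resolutions supported by the front and rear cameras to the plist file that records crop settings per iPhone model, and then look them up in code through the model mapped from the plist. Another option is automatic downgrading: for example, if the rear camera supports 2K and the front camera supports only 720P, then after switching, when the front camera is found not to support 2K, the resolution is lowered one level at a time until a supported level is found. This adds a fair amount of logic that is hard to follow at first glance, and cropping on the front camera has limited use, so for now only the rear camera is supported.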
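The automatic downgrade described in note 2 is not implemented in the project, but its core loop is small. The sketch below is plain C with hypothetical names (Resolution, downgrade_resolution) and an assumed tier list; it only illustrates stepping down one level at a time until a supported level is found.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed resolution tiers, ordered from highest to lowest. */
typedef enum { RES_4K = 0, RES_1080P, RES_720P, RES_480P, RES_COUNT } Resolution;

/* Starting at `wanted`, step down one tier at a time until a tier the
 * current camera supports is found. Returns RES_COUNT if none is. */
Resolution downgrade_resolution(Resolution wanted, const bool supported[RES_COUNT])
{
    assert(wanted < RES_COUNT);
    for (int tier = (int)wanted; tier < RES_COUNT; tier++) {
        if (supported[tier]) {
            return (Resolution)tier;
        }
    }
    return RES_COUNT;
}
```

In an app, the supported flags could plausibly be filled by asking the capture session whether it can use each preset (for example with canSetSessionPreset:) after switching cameras; that wiring is left out here.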

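Additional notes

Screen logical resolution vs. video resolution

Point vs. pixel: this is explained in many places online, so only a brief summary is given here. Points are the device's logical resolution, that is, the screen size obtained from [UIScreen mainScreen].bounds.size. Points can be thought of as the coordinate system of iOS development, convenient for describing interface elements.
Pixel: a pixel is a finer unit than a point. On a non-Retina screen 1 point = 1 pixel; on a Retina screen 1 point = 2 pixels per side (3 on @3x screens).
Resolution: set the resolution according to the maximum each model supports; for example, iPhone 6S and later can record 4K (3840 * 2160) video. The APIs called for the crop operation work in pixels, so the unit we operate in is the pixel, not the point. This is described in detail below.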

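What differs between ARC and MRC

Initializing the CIContext

First, declare the CIContext as a global or static variable. A CIContext carries a lot of internal state once initialized and uses considerable memory, and it is needed only for rendering, so there is no need to create it every time. Under MRC, the object returned by contextWithOptions: has not been sent retain, so you must retain it manually; under ARC the system does this automatically.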

ARC:

static CIContext *ciContext = nil;
if (ciContext == nil) {
    ciContext = [CIContext contextWithOptions:nil];
}

MRC:

static CIContext *ciContext = nil;
if (ciContext == nil) {
    // contextWithOptions: returns an autoreleased object, so keep it alive manually.
    ciContext = [[CIContext contextWithOptions:nil] retain];
}

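Coordinate issues

1. Understanding how points map to pixels

The CropView has to be displayed on the phone, so its coordinates are still in UIKit's system: the origin is at the top-left corner, and the width and height are those of the particular phone in points (for example iPhone 8: 375 * 667, iPhone 8 Plus: 414 * 736, iPhone X: 375 * 812). What we need, however, is the CropView's coordinates at the actual video resolution, so we convert the x, y point position of the cropView into the corresponding pixel position.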

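// This computes the X pixel coordinate. Take iPhone 8 as an example: 375 * 667 points in portrait, so 667 points wide in landscape, with a resolution of 1920 * 1080.
_cropX  = (int)(_currentResolutionW / _screenWidth  * cropView.frame.origin.x);
// that is
_cropX  = (int)(1920.0 / 667.0 * cropView.frame.origin.x);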

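2. The coordinate origin differs between CPU and GPU cropping

Origin position

CPU: UIKit coordinate system, origin at the top-left corner

GPU: CoreImage coordinate system, origin at the bottom-left corner

So when cropping on the GPU the y coordinate is inverted. The formula below converts the view's position from UIKit's top-left-origin system into the bottom-left-origin system that CoreImage expects, and scales it to pixels.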

_cropY  = (int)(_currentResolutionH / _screenHeight * (_screenHeight - self.frame.origin.y  -  self.frame.size.height)); 
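The Y flip can be tried out on its own. This is a minimal sketch in plain C; flipped_crop_y is a hypothetical name, and the multiplication is done before the division (equivalent to the formula above) to limit floating-point rounding.

```c
#include <assert.h>

/* Convert the top edge of a view (UIKit, top-left origin, in points) into
 * the y pixel coordinate of its bottom edge measured from the bottom of
 * the frame, which is what a bottom-left-origin system expects. */
int flipped_crop_y(double view_y, double view_height,
                   double screen_height, int resolution_height)
{
    assert(screen_height > 0.0);
    double distance_from_bottom = screen_height - view_y - view_height;
    return (int)(distance_from_bottom * (double)resolution_height / screen_height);
}
```

On a landscape screen 375 points high with a 1080-pixel-high frame, a view at y = 50 with height 100 sits 225 points above the bottom edge, which scales to 648 pixels. A view touching the bottom edge maps to 0.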

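3. When the screen is not 16:9, filling the screen with the video introduces an offset

Note that some phone and iPad screens are not 16:9 (iPhone X; all iPads, which are 4:3). If, at 2K (1920 * 1080) or 4K (3840 * 2160), the preview is configured with captureVideoPreviewLayer.videoGravity = AVLayerVideoGravityResizeAspectFill; then part of the video is sacrificed to fill the view, so the video data captured by the camera is not shown in full on screen. When the crop feature is then used, because we work in UIKit coordinates, the origin (0,0) is not the real pixel (0,0) of the frame, and compensating for that takes a lot of extra code. In crop mode we can instead set captureVideoPreviewLayer.videoGravity = AVLayerVideoGravityResizeAspect; so that the video view is scaled to show the whole frame. With this setting, on iPhone X (wider than 16:9) the video shrinks along the x axis and is padded with black bars, and on iPad (narrower than 16:9) it shrinks along the y axis and is padded with black bars.

As a result, the points computed earlier are off: the video is shrunk along x or y, while the cropView coordinates we read are still relative to the whole parent view.

Constantly adjusting the cropView to compensate would take a lot of code, so I define a videoRect property that records the video's real rect. The screen's aspect ratio is known at run time, so the real video rect can be derived from it. Later code only needs to use videoRect's size in its calculations, and even on ordinary 16:9 phones the downstream API calls need no changes.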
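One way to derive such a rect is an aspect-fit calculation. The sketch below is plain C under stated assumptions: the VideoRect struct and aspect_fit_rect function are hypothetical stand-ins for the article's videoRect property, and the video is assumed to be centered in the view, as an aspect-fit preview normally is.

```c
#include <assert.h>

typedef struct {
    double x, y, width, height;
} VideoRect;

/* Largest rect with the video's aspect ratio that fits inside a
 * screen_w x screen_h view, centered in that view. */
VideoRect aspect_fit_rect(double screen_w, double screen_h,
                          double video_w, double video_h)
{
    assert(screen_w > 0 && screen_h > 0 && video_w > 0 && video_h > 0);
    VideoRect r;
    if (screen_w * video_h > screen_h * video_w) {
        /* Screen is wider than the video: full height, bars left and right. */
        r.height = screen_h;
        r.width  = screen_h * video_w / video_h;
    } else {
        /* Screen is narrower or equal: full width, bars top and bottom. */
        r.width  = screen_w;
        r.height = screen_w * video_h / video_w;
    }
    r.x = (screen_w - r.width) / 2.0;
    r.y = (screen_h - r.height) / 2.0;
    return r;
}
```

For a 4:3 iPad in landscape (1024 * 768 points) showing 1920 * 1080 video, this gives a rect 1024 wide and 576 high with 96-point bars above and below. Crop coordinates would then be taken relative to this rect's origin and size instead of the full screen.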

4. Why int is used

In software cropping, creating the pixelBuffer requires

CV_EXPORT CVReturn CVPixelBufferCreateWithBytes(
   CFAllocatorRef CV_NULLABLE allocator,
   size_t width,
   size_t height,
   OSType pixelFormatType,
   void * CV_NONNULL baseAddress,
   size_t bytesPerRow,
   CVPixelBufferReleaseBytesCallback CV_NULLABLE releaseCallback,
   void * CV_NULLABLE releaseRefCon,
   CFDictionaryRef CV_NULLABLE pixelBufferAttributes,
   CV_RETURNS_RETAINED_PARAMETER CVPixelBufferRef CV_NULLABLE * CV_NONNULL pixelBufferOut)

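With this API, the x, y position is applied through baseAddress, using the formula NSInteger baseAddressStart = _cropY*bytesPerRow+bytesPerPixel*_cropX;. Under the YUV 420 rule the X value passed in must not be odd, so we need if (_cropX % 2 != 0) _cropX += 1;. Only integer types support the remainder operator, so these coordinates are all declared as int, and the sub-pixel error is ignored for display purposes.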