iOS: Hardware H.264 Encoding with VideoToolbox

Preface

VideoToolbox is an Apple API, available since iOS 8, for hardware decoding and encoding of H.264/H.265 (H.265 is supported from iOS 11 on).

If you are not yet familiar with H.264, be sure to read this H.264 primer first:

A Brief Introduction to H.264

The Encoding Workflow

We will build a simple demo that grabs video frames from the camera, encodes them into raw H.264 data, and saves the result in the app sandbox.

1. Creating and Initializing the VideoToolbox Data Structures

The core code is as follows:

- (void)initVideoToolBox {
    dispatch_sync(encodeQueue, ^{
        frameNO = 0;
        int width = 480, height = 640;
        OSStatus status = VTCompressionSessionCreate(NULL, width, height, kCMVideoCodecType_H264, NULL, NULL, NULL, didCompressH264, (__bridge void *)(self),  &encodingSession);
        NSLog(@"H264: VTCompressionSessionCreate %d", (int)status);
        if (status != 0)
        {
            NSLog(@"H264: Unable to create a H264 session");
            return ;
        }
        
        // Real-time encoding output (avoids latency)
        VTSessionSetProperty(encodingSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
        VTSessionSetProperty(encodingSession, kVTCompressionPropertyKey_ProfileLevel, kVTProfileLevel_H264_Baseline_AutoLevel);
        
        // Keyframe (GOP size) interval
        int frameInterval = 24;
        CFNumberRef  frameIntervalRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &frameInterval);
        VTSessionSetProperty(encodingSession, kVTCompressionPropertyKey_MaxKeyFrameInterval, frameIntervalRef);
        
        // Expected frame rate
        int fps = 24;
        CFNumberRef  fpsRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &fps);
        VTSessionSetProperty(encodingSession, kVTCompressionPropertyKey_ExpectedFrameRate, fpsRef);
        
        
        // Average bit rate, in bits per second
        int bitRate = width * height * 3 * 4 * 8;
        CFNumberRef bitRateRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitRate);
        VTSessionSetProperty(encodingSession, kVTCompressionPropertyKey_AverageBitRate, bitRateRef);
        
        // Hard data-rate cap. DataRateLimits expects a CFArray of alternating
        // byte-count / duration-in-seconds pairs, not a bare CFNumber.
        int bytesLimit = width * height * 3 * 4; // bytes per second
        int oneSecond = 1;
        CFNumberRef bytesRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bytesLimit);
        CFNumberRef secondsRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &oneSecond);
        const void *limitValues[] = { bytesRef, secondsRef };
        CFArrayRef dataRateLimits = CFArrayCreate(kCFAllocatorDefault, limitValues, 2, &kCFTypeArrayCallBacks);
        VTSessionSetProperty(encodingSession, kVTCompressionPropertyKey_DataRateLimits, dataRateLimits);
        
        // Get ready to encode frames
        VTCompressionSessionPrepareToEncodeFrames(encodingSession);
    });
}


The initialization sets the codec type kCMVideoCodecType_H264, the resolution (480 × 640, matching the portrait capture below), the frame rate, the GOP size, and the bit rate.
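The snippet refers to some state that isn't shown: the encodingSession and frameNO variables and the serial queues encodeQueue and captureQueue. A minimal sketch of that setup might look like the following; the queue labels and the viewDidLoad wiring are my own assumptions, not part of the original demo.

@implementation ViewController
{
    VTCompressionSessionRef encodingSession; // the VideoToolbox session
    dispatch_queue_t encodeQueue;            // serial queue for encoding work
    dispatch_queue_t captureQueue;           // serial queue for capture callbacks
    int frameNO;                             // frame counter used as the PTS value
}

- (void)viewDidLoad {
    [super viewDidLoad];
    // Serial queues keep frames in order (the labels are hypothetical)
    encodeQueue = dispatch_queue_create("encodeQueue", DISPATCH_QUEUE_SERIAL);
    captureQueue = dispatch_queue_create("captureQueue", DISPATCH_QUEUE_SERIAL);
    [self initVideoToolBox];
    [self initCapture];
    [self.captureSession startRunning];
}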

2. Capturing Video from the Camera and Handing It to VideoToolbox for H.264 Encoding


The core code for initializing the capture side is as follows:

// Initialize the camera capture side
- (void)initCapture{
    
    self.captureSession = [[AVCaptureSession alloc]init];
    
    // Capture at 640 × 480
    self.captureSession.sessionPreset = AVCaptureSessionPreset640x480;
    
    AVCaptureDevice *inputCamera = [self cameraWithPostion:AVCaptureDevicePositionBack];
  
    self.captureDeviceInput = [[AVCaptureDeviceInput alloc] initWithDevice:inputCamera error:nil];
    
    if ([self.captureSession canAddInput:self.captureDeviceInput]) {
        [self.captureSession addInput:self.captureDeviceInput];
    }
    
    self.captureDeviceOutput = [[AVCaptureVideoDataOutput alloc] init];
    [self.captureDeviceOutput setAlwaysDiscardsLateVideoFrames:NO];
    
    // Output bi-planar full-range 4:2:0 YUV
    [self.captureDeviceOutput setVideoSettings:[NSDictionary dictionaryWithObject:[NSNumber numberWithInt:kCVPixelFormatType_420YpCbCr8BiPlanarFullRange] forKey:(id)kCVPixelBufferPixelFormatTypeKey]];
    
    [self.captureDeviceOutput setSampleBufferDelegate:self queue:captureQueue];
    
    if ([self.captureSession canAddOutput:self.captureDeviceOutput]) {
        [self.captureSession addOutput:self.captureDeviceOutput];
    }
    
    // Set up the connection
    AVCaptureConnection *connection = [self.captureDeviceOutput connectionWithMediaType:AVMediaTypeVideo];
    [connection setVideoOrientation:AVCaptureVideoOrientationPortrait];
}

Note that the capture resolution must match the encoder: with the connection set to portrait orientation, the 640 × 480 preset delivers 480 × 640 buffers, which is exactly what the compression session was created with. The AVCaptureVideoDataOutput is configured for bi-planar 4:2:0 YUV (NV12) output.
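The capture code also calls a cameraWithPostion: helper that isn't listed above. A minimal sketch, using the device-enumeration API of that era, could be:

// Hypothetical implementation of the helper referenced in initCapture
- (AVCaptureDevice *)cameraWithPostion:(AVCaptureDevicePosition)position {
    NSArray *devices = [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo];
    for (AVCaptureDevice *device in devices) {
        if (device.position == position) {
            return device;
        }
    }
    return nil;
}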

Encoding in the camera data callback:

- (void)captureOutput:(AVCaptureOutput *)output didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection{
    dispatch_sync(encodeQueue, ^{
        [self encode:sampleBuffer];
    });
}

// Encode a sample buffer
- (void) encode:(CMSampleBufferRef )sampleBuffer
{
    CVImageBufferRef imageBuffer = (CVImageBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);
    // Frame timestamp; if it is not set, the timeline ends up stretched.
    CMTime presentationTimeStamp = CMTimeMake(frameNO++, 1000);
    VTEncodeInfoFlags flags;
    OSStatus statusCode = VTCompressionSessionEncodeFrame(encodingSession,
                                                          imageBuffer,
                                                          presentationTimeStamp,
                                                          kCMTimeInvalid,
                                                          NULL, NULL, &flags);
    if (statusCode != noErr) {
        NSLog(@"H264: VTCompressionSessionEncodeFrame failed with %d", (int)statusCode);
        
        VTCompressionSessionInvalidate(encodingSession);
        CFRelease(encodingSession);
        encodingSession = NULL;
        return;
    }
    NSLog(@"H264: VTCompressionSessionEncodeFrame Success");
}

3. Data Structures Used by the Framework

CMSampleBufferRef holds one or more compressed or uncompressed media samples. There are two flavors of CMSampleBuffer: an uncompressed sample wraps a CVPixelBuffer, while a compressed one wraps a CMBlockBuffer.

CMTime: a 64-bit value with a 32-bit timescale; Core Media's time format.

CMBlockBuffer: what we can call the raw data here.

CVPixelBuffer: uncompressed pixel data, plus the image width, height, and so on.

pixelBufferAttributes: a CFDictionary covering width/height, pixel format (RGBA, YUV), and usage (OpenGL ES, Core Animation).

CVPixelBufferPool: a pool of CVPixelBuffers, used because creating and destroying CVPixelBuffers is expensive.

CMVideoFormatDescription: the video format, including width/height, color space, and codec information; for H.264 it also carries the SPS and PPS.
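As a small illustration of how a couple of these types are used (a sketch, not code from the demo):

// CMTime: a 64-bit value over a 32-bit timescale; this represents 1/24 s
CMTime frameDuration = CMTimeMake(1, 24);
CMTimeShow(frameDuration);

// pixelBufferAttributes describing the buffers we want, handed to a pool
NSDictionary *attributes = @{
    (id)kCVPixelBufferWidthKey : @480,
    (id)kCVPixelBufferHeightKey : @640,
    (id)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange)
};
CVPixelBufferPoolRef pool = NULL;
CVPixelBufferPoolCreate(kCFAllocatorDefault, NULL,
                        (__bridge CFDictionaryRef)attributes, &pool);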

4. Writing the Encoded Data Out as H.264


When a frame finishes encoding, we first check whether it is an I-frame; if it is, we need to read out the SPS and PPS parameter sets. Why?

Let's first look at how the NALUs of a raw H.264 (elementary stream) are laid out.

In a raw H.264 stream there are no standalone SPS or PPS packets or frames; they are attached in front of each I-frame, and the typical stored form is

00 00 00 01 SPS 00 00 00 01 PPS 00 00 00 01 I-frame

The leading 00 00 00 01 bytes are the start codes; they are not part of the SPS or PPS payload.

SPS (Sequence Parameter Set) and PPS (Picture Parameter Set): these carry the parameters needed to initialize an H.264 decoder, including the encoding profile and level, the image width and height, the deblocking filter settings, and so on.
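Incidentally, the NAL unit type lives in the low five bits of the first byte after each start code, so a stream can be sanity-checked with something like the snippet below; the nalu variable here is hypothetical:

// Inspect the NAL unit type of the byte following the start code
const uint8_t *bytes = (const uint8_t *)nalu.bytes; // nalu: NSData holding one NAL unit, start code stripped
int nalType = bytes[0] & 0x1F;
NSLog(@"NAL unit type: %d", nalType); // 7 = SPS, 8 = PPS, 5 = IDR slice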

As mentioned above, the SPS and PPS live inside the CMVideoFormatDescription, so we first have to pull them out of the CMFormatDescriptionRef and write them into the raw H.264 stream.

With that, the H.264 write flow is easy to follow.

The code is as follows:

// Encoding-completion callback
void didCompressH264(void *outputCallbackRefCon, void *sourceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer) {
    NSLog(@"didCompressH264 called with status %d infoFlags %d", (int)status, (int)infoFlags);
    if (status != 0) {
        return;
    }
    if (!CMSampleBufferDataIsReady(sampleBuffer)) {
        NSLog(@"didCompressH264 data is not ready ");
        return;
    }
    ViewController* encoder = (__bridge ViewController*)outputCallbackRefCon;
    bool keyframe = !CFDictionaryContainsKey((CFDictionaryRef)CFArrayGetValueAtIndex(CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, true), 0), kCMSampleAttachmentKey_NotSync);
    
    // If the current frame is a keyframe,
    // fetch the SPS & PPS data
    if (keyframe)
    {
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
        size_t sparameterSetSize, sparameterSetCount;
        const uint8_t *sparameterSet;
        OSStatus statusCode = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sparameterSet, &sparameterSetSize, &sparameterSetCount, 0 );
        if (statusCode == noErr)
        {
            // Got the SPS; now fetch the PPS
            size_t pparameterSetSize, pparameterSetCount;
            const uint8_t *pparameterSet;
            OSStatus statusCode = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pparameterSet, &pparameterSetSize, &pparameterSetCount, 0 );
            if (statusCode == noErr)
            {
                // Wrap the SPS and PPS bytes in NSData
                NSData *sps = [NSData dataWithBytes:sparameterSet length:sparameterSetSize];
                NSData *pps = [NSData dataWithBytes:pparameterSet length:pparameterSetSize];
                if (encoder)
                {
                    [encoder gotSpsPps:sps pps:pps];
                }
            }
        }
    }
    
    CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t length, totalLength;
    char *dataPointer;
    
    // Get the data pointer and the total length; each NALU's length is stored in its first four bytes
    OSStatus statusCodeRet = CMBlockBufferGetDataPointer(dataBuffer, 0, &length, &totalLength, &dataPointer);
    if (statusCodeRet == noErr) {
        size_t bufferOffset = 0;
        static const int AVCCHeaderLength = 4; // the first 4 bytes of each returned NALU are a big-endian length field, not a 00 00 00 01 start code
        
        // Loop over the NALUs in the block buffer
        while (bufferOffset < totalLength - AVCCHeaderLength) {
            uint32_t NALUnitLength = 0;
            // Read the NALU length field
            memcpy(&NALUnitLength, dataPointer + bufferOffset, AVCCHeaderLength);
            
            // Convert from big-endian to host byte order
            NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);
            
            NSData* data = [[NSData alloc] initWithBytes:(dataPointer + bufferOffset + AVCCHeaderLength) length:NALUnitLength];
            [encoder gotEncodedData:data];
            
            // Move on to the next NALU
            bufferOffset += AVCCHeaderLength + NALUnitLength;
        }
    }
    
}

// Write the SPS and PPS
- (void)gotSpsPps:(NSData*)sps pps:(NSData*)pps
{
    NSLog(@"gotSpsPps %d %d", (int)[sps length], (int)[pps length]);
    const char bytes[] = "\x00\x00\x00\x01";
    size_t length = (sizeof bytes) - 1; //string literals have implicit trailing '\0'
    NSData *ByteHeader = [NSData dataWithBytes:bytes length:length];
    // Write the start code
    [self.h264FileHandle writeData:ByteHeader];
    [self.h264FileHandle writeData:sps];
    // Write the start code
    [self.h264FileHandle writeData:ByteHeader];
    [self.h264FileHandle writeData:pps];
    
}

// Write a NALU
- (void)gotEncodedData:(NSData*)data
{
    NSLog(@"gotEncodedData %d", (int)[data length]);
    if (self.h264FileHandle != NULL)
    {
        const char bytes[] = "\x00\x00\x00\x01";
        size_t length = (sizeof bytes) - 1; //string literals have implicit trailing '\0'
        NSData *ByteHeader = [NSData dataWithBytes:bytes length:length];
        // Write the start code
        [self.h264FileHandle writeData:ByteHeader];
        // Write the NALU data
        [self.h264FileHandle writeData:data];
    }
}


When you are done encoding, tear down the session:

- (void)EndVideoToolBox
{
    VTCompressionSessionCompleteFrames(encodingSession, kCMTimeInvalid);
    VTCompressionSessionInvalidate(encodingSession);
    CFRelease(encodingSession);
    encodingSession = NULL;
}

That completes H.264 encoding with VideoToolbox. The encoded H.264 file can be pulled out of the app sandbox.
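One last piece the snippets above assume is self.h264FileHandle. A sketch of creating it in the sandbox, with a file name of my own choosing, might be:

// Create an empty .h264 file in Documents and open it for writing
NSString *documents = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES).firstObject;
NSString *path = [documents stringByAppendingPathComponent:@"demo.h264"]; // hypothetical file name
[[NSFileManager defaultManager] removeItemAtPath:path error:nil];
[[NSFileManager defaultManager] createFileAtPath:path contents:nil attributes:nil];
self.h264FileHandle = [NSFileHandle fileHandleForWritingAtPath:path];

Because the output is a plain Annex B stream, a player such as ffplay can open the file directly for a quick check.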

Summary

Just reading through the flow without looking at the code won't teach you the framework, so roll up your sleeves and try it yourself! Demo download: iOS-VideoToolBox-demo
