Video Encoding on iOS in Practice: VideoToolbox

Requirements

When encoding video on iOS, a project usually needs only one encoder, but special requirements sometimes call for two encoders working at the same time. This article implements an encoder class: you can quickly create the encoder you need just by specifying a different enum value, and two encoders can work side by side.


How it works:

On iOS, hardware video encoding is done with the VideoToolbox framework, which supports both H.264 and H.265 encoders.

Software encoding: encoding with the CPU.

Hardware encoding: encoding without the CPU, using the GPU or dedicated hardware such as DSP, FPGA, or ASIC chips.


Prerequisites:


GitHub (with code): Video Encoder

Juejin: Video Encoder

Jianshu: Video Encoder

Blog: Video Encoder


Test results

To compare h264 and h265 encoding efficiency, this example writes the encoded output to a .mov file. With the same recording time and essentially the same scene, h265 needs only about half the space of h264 for the same picture quality. Note that the recorded files can only be played with ffmpeg-based tools.

(Screenshots: sizes of the recorded .mov files for h.264 and h.265)

Implementation steps

1. Initialize the encoder parameters

The encoder class in this example is not a singleton, because we may create an h264 encoder, an h265 encoder, or two encoder objects of different types working at the same time. The width, height, and frame rate specified here must match the camera's. The bitrate is the average bitrate during playback; you also choose whether to enable real-time encoding, in which case the bitrate cannot be controlled. Finally, the encoder type enum decides whether an h264 or an h265 encoder is created.

  • Check whether the encoder is supported

Not every device supports the h265 encoder; that is determined by the hardware, and there is no direct API to query it. Here we use the AVAssetExportPresetHEVCHighestQuality export preset as an indirect test for h265 encoding support.

Note: the h265 encoding APIs require iOS 11 or later. All current mainstream iPhones support the h264 encoder.

// You could select h264 / h265 encoder.
    self.videoEncoder = [[XDXVideoEncoder alloc] initWithWidth:1280
                                                        height:720
                                                           fps:30
                                                       bitrate:2048
                                       isSupportRealTimeEncode:NO
                                                   encoderType:XDXH265Encoder]; // XDXH264Encoder
                                                  
-(instancetype)initWithWidth:(int)width height:(int)height fps:(int)fps bitrate:(int)bitrate isSupportRealTimeEncode:(BOOL)isSupportRealTimeEncode encoderType:(XDXVideoEncoderType)encoderType {
    if (self = [super init]) {
        mSession              = NULL;
        mVideoFile            = NULL;
        _width                = width;
        _height               = height;
        _fps                  = fps;
        _bitrate              = bitrate << 10;  // convert kb/s to b/s (x1024)
        _errorCount           = 0;
        _isSupportEncoder     = NO;
        _encoderType          = encoderType;
        _lock                 = [[NSLock alloc] init];
        _isSupportRealTimeEncode = isSupportRealTimeEncode;
        _needResetKeyParamSetBuffer = YES;
        if (encoderType == XDXH265Encoder) {
            if (@available(iOS 11.0, *)) {
                if ([[AVAssetExportSession allExportPresets] containsObject:AVAssetExportPresetHEVCHighestQuality]) {
                    _isSupportEncoder = YES;
                }
            }
        }else if (encoderType == XDXH264Encoder){
            _isSupportEncoder = YES;
        }
        
        log4cplus_info("Video Encoder:","Init encoder width:%d, height:%d, fps:%d, bitrate:%d, is support realtime encode:%d, encoder type:%lu", width, height, fps, bitrate, isSupportRealTimeEncode, (unsigned long)encoderType);
    }
    
    return self;
}

2. Initialize the encoder

Initializing an encoder takes three steps: first create a VTCompressionSessionRef that manages the encoder, then set all the encoder properties on that object, and finally pre-allocate some resources (i.e. memory for the frames to be encoded) before encoding starts.

- (void)configureEncoderWithWidth:(int)width height:(int)height {
    log4cplus_info("Video Encoder:", "configure encoder with width and height for init, width = %d, height = %d",width, height);
    
    if(width == 0 || height == 0) {
        log4cplus_error("Video Encoder:", "encoder params can't be zero. width:%d, height:%d",width, height);
        return;
    }
    
    self.width   = width;
    self.height  = height;
    
    mSession = [self configureEncoderWithEncoderType:self.encoderType
                                            callback:EncodeCallBack
                                               width:self.width
                                              height:self.height
                                                 fps:self.fps
                                             bitrate:self.bitrate
                             isSupportRealtimeEncode:self.isSupportRealTimeEncode
                                      iFrameDuration:30
                                                lock:self.lock];
}

- (VTCompressionSessionRef)configureEncoderWithEncoderType:(XDXVideoEncoderType)encoderType callback:(VTCompressionOutputCallback)callback width:(int)width height:(int)height fps:(int)fps bitrate:(int)bitrate isSupportRealtimeEncode:(BOOL)isSupportRealtimeEncode iFrameDuration:(int)iFrameDuration lock:(NSLock *)lock {
    log4cplus_info("Video Encoder:","configure encoder width:%d, height:%d, fps:%d, bitrate:%d, is support realtime encode:%d, I frame duration:%d", width, height, fps, bitrate, isSupportRealtimeEncode, iFrameDuration);
    
    [lock lock];
    // Create compression session
    VTCompressionSessionRef session = [self createCompressionSessionWithEncoderType:encoderType
                                                                              width:width
                                                                             height:height
                                                                           callback:callback];
    
    // Set compression property
    [self setCompressionSessionPropertyWithSession:session
                                               fps:fps
                                           bitrate:bitrate
                           isSupportRealtimeEncode:isSupportRealtimeEncode
                                    iFrameDuration:iFrameDuration
                                       EncoderType:encoderType];
    
    // Prepare to encode
    OSStatus status = VTCompressionSessionPrepareToEncodeFrames(session);
    [lock unlock];
    if(status != noErr) {
        log4cplus_error("Video Encoder:", "create encoder failed, status: %d",(int)status);
        return NULL;
    }else {
        log4cplus_info("Video Encoder:","create encoder success");
        return session;
    }
}

2.1. Create the VTCompressionSessionRef object

VTCompressionSessionCreate: creates the video encoder session, i.e. the object that manages the encoder context.

  • allocator: The memory allocator for the session. Pass NULL for the default allocator.
  • width, height: The encoder's width and height in pixels; keep them consistent with the captured video resolution.
  • codecType: The codec type. h264 and h265 are the two mainstream choices, with h264 the most widely deployed. h265 is h264's successor with better compression, but it was only opened up in iOS 11 and still has some bugs.
  • encoderSpecification: Forces a particular encoder to be used. Usually pass NULL and let VideoToolbox choose by itself.
  • sourceImageBufferAttributes: Required attributes for the source frames, mainly used to create a pixel buffer pool.
  • compressedDataAllocator: The allocator for the compressed data. Pass NULL for the default.
  • outputCallback: The callback that receives the compressed data. It may be called synchronously or asynchronously: synchronously it runs on the same thread as VTCompressionSessionEncodeFrame; asynchronously a new thread is created to receive the data. This parameter may be NULL only when encoding with VTCompressionSessionEncodeFrameWithOutputHandler.
  • outputCallbackRefCon: User-defined data, mainly used for interaction between the callback function and the owning class.
  • compressionSessionOut: The address at which to receive the created session. Note that it must not be NULL.
VT_EXPORT OSStatus 
VTCompressionSessionCreate(
	CM_NULLABLE CFAllocatorRef							allocator,
	int32_t												width,
	int32_t												height,
	CMVideoCodecType									codecType,
	CM_NULLABLE CFDictionaryRef							encoderSpecification,
	CM_NULLABLE CFDictionaryRef							sourceImageBufferAttributes,
	CM_NULLABLE CFAllocatorRef							compressedDataAllocator,
	CM_NULLABLE VTCompressionOutputCallback				outputCallback,
	void * CM_NULLABLE									outputCallbackRefCon,
	CM_RETURNS_RETAINED_PARAMETER CM_NULLABLE VTCompressionSessionRef * CM_NONNULL compressionSessionOut) API_AVAILABLE(macosx(10.8), ios(8.0), tvos(10.2));

Concrete usage is shown below. Note that if the camera's capture resolution changes, the current session must be destroyed and a new one created.

- (VTCompressionSessionRef)createCompressionSessionWithEncoderType:(XDXVideoEncoderType)encoderType width:(int)width height:(int)height callback:(VTCompressionOutputCallback)callback {
    CMVideoCodecType codecType;
    if (encoderType == XDXH264Encoder) {
        codecType = kCMVideoCodecType_H264;
    }else if (encoderType == XDXH265Encoder) {
        codecType = kCMVideoCodecType_HEVC;
    }else {
        return NULL; // VTCompressionSessionRef is a CF type: use NULL, not nil
    }
    
    VTCompressionSessionRef session;
    OSStatus status = VTCompressionSessionCreate(NULL,
                                                 width,
                                                 height,
                                                 codecType,
                                                 NULL,
                                                 NULL,
                                                 NULL,
                                                 callback,
                                                 (__bridge void *)self,
                                                 &session);
    
    if (status != noErr) {
        log4cplus_error("Video Encoder:", "%s: Create session failed:%d",__func__,(int)status);
        return NULL;
    }else {
        return session;
    }
}
2.2. Set the session properties
  • Check whether the session supports a property

After the session is created, VTSessionCopySupportedPropertyDictionary copies all the properties the current session supports into a dictionary; from then on, before setting a property, we simply check the dictionary to see whether it is supported.

- (BOOL)isSupportPropertyWithSession:(VTCompressionSessionRef)session key:(CFStringRef)key {
    OSStatus status;
    static CFDictionaryRef supportedPropertyDictionary;
    if (!supportedPropertyDictionary) {
        status = VTSessionCopySupportedPropertyDictionary(session, &supportedPropertyDictionary);
        if (status != noErr) {
            return NO;
        }
    }
    
    BOOL isSupport = CFDictionaryContainsKey(supportedPropertyDictionary, key) ? YES : NO;
    return isSupport;
}
  • Set a property on the session

VTSessionSetProperty sets a property given a key and a value.

- (OSStatus)setSessionPropertyWithSession:(VTCompressionSessionRef)session key:(CFStringRef)key value:(CFTypeRef)value {
    if (value == NULL) {
        return noErr;
    }
    
    OSStatus status = VTSessionSetProperty(session, key, value);
    if (status != noErr)  {
        log4cplus_error("Video Encoder:", "Set session of %s Failed, status = %d",CFStringGetCStringPtr(key, kCFStringEncodingUTF8),status);
    }
    return status;
}
  • kVTCompressionPropertyKey_MaxFrameDelayCount: The maximum number of frames the compressor may hold before outputting a compressed frame. Defaults to kVTUnlimitedFrameDelayCount, i.e. no limit. For example, with a maximum delay of 3 (M), by the time frame 10 (N) is submitted for encoding, frame N-M must already have been delivered to the output callback; in other words N-M frames are already encoded and at most M unencoded frames are held.
  • kVTCompressionPropertyKey_ExpectedFrameRate: The expected frame rate, in frames per second. This property does not control the frame rate; it is only a hint that lets the encoder set up its internal configuration before encoding. The real rate depends on frame durations and may differ. Defaults to 0, meaning unknown.
  • kVTCompressionPropertyKey_AverageBitRate: The long-term average bitrate. This is not a hard setting; the actual bitrate may be higher. Defaults to 0, which lets the encoder decide the size of the compressed data. Note that bitrate settings only take effect when timing information is provided for the source frames, and some codecs do not support limiting to a specified bitrate.
  • kVTCompressionPropertyKey_DataRateLimits: Up to two hard limits on the data rate. Each limit is described by a data size in bytes and a duration in seconds, and requires that the total size of compressed data over any contiguous segment of that duration (in decode time) not exceed the data size. By default no limits are set. The property is a CFArray with an even number of CFNumbers, alternating bytes and seconds. As with the average bitrate, the limits only take effect when timing information is provided, and some codecs do not support them.
  • kVTCompressionPropertyKey_RealTime: Whether compression is performed in real time. false lets the encoder work slower than real time to produce better results; true keeps encoding timely. Defaults to NULL, meaning unknown.
  • kVTCompressionPropertyKey_AllowFrameReordering: If the encoder emits B-frames, frames come out of presentation order and must be reordered. Defaults to true; set it to false to prevent frame reordering. Note: camera capture on iOS generally does not use B-frames.
  • kVTCompressionPropertyKey_ProfileLevel: The profile and level of the encoded bitstream. Available profiles and levels vary by format and video encoder. Use the standard keys where available rather than ad-hoc modes.
  • kVTCompressionPropertyKey_H264EntropyMode: The entropy coding mode for H.264 compression. If the encoder supports it, this property selects context-adaptive variable-length coding (CAVLC) or context-adaptive binary arithmetic coding (CABAC). CABAC generally compresses better at a higher computational cost. The default is encoder-specific and may change with other settings. Use with care: a change may make the configuration incompatible with the requested profile and level, and the result is then undefined, possibly including encoding errors or a non-conformant output stream.
  • kVTCompressionPropertyKey_MaxKeyFrameIntervalDuration: The maximum duration, in seconds, from one keyframe to the next. Defaults to zero, meaning no limit. Particularly useful when the frame rate is variable. May be set together with kVTCompressionPropertyKey_MaxKeyFrameInterval; both limits are enforced, requiring a keyframe every X frames or every Y seconds, whichever comes first.
  • kVTCompressionPropertyKey_MaxKeyFrameInterval: The maximum interval between keyframes, in number of frames. Keyframes, also known as I-frames, reset inter-frame dependencies; decoding a keyframe is enough to prepare the decoder to decode the subsequent difference frames correctly. The encoder may generate keyframes more often when that gives more efficient compression. The default of 0 lets the encoder choose where to place all keyframes. An interval of 1 means every frame must be a keyframe, 2 means at least every other frame, and so on. May be set together with kVTCompressionPropertyKey_MaxKeyFrameIntervalDuration; both limits are enforced, requiring a keyframe every X frames or every Y seconds, whichever comes first.
// Set compression property
    [self setCompressionSessionPropertyWithSession:session
                                               fps:fps
                                           bitrate:bitrate
                           isSupportRealtimeEncode:isSupportRealtimeEncode
                                    iFrameDuration:iFrameDuration
                                       EncoderType:encoderType];

- (void)setCompressionSessionPropertyWithSession:(VTCompressionSessionRef)session fps:(int)fps bitrate:(int)bitrate isSupportRealtimeEncode:(BOOL)isSupportRealtimeEncode iFrameDuration:(int)iFrameDuration EncoderType:(XDXVideoEncoderType)encoderType {
    
    int maxCount = 3;
    if (!isSupportRealtimeEncode) {
        if([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_MaxFrameDelayCount]) {
            CFNumberRef ref   = CFNumberCreate(NULL, kCFNumberSInt32Type, &maxCount);
            [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_MaxFrameDelayCount value:ref];
            CFRelease(ref);
        }
    }
    
    if(fps) {
        if([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_ExpectedFrameRate]) {
            int         value = fps;
            CFNumberRef ref   = CFNumberCreate(NULL, kCFNumberSInt32Type, &value);
            [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_ExpectedFrameRate value:ref];
            CFRelease(ref);
        }
    }else {
        log4cplus_error("Video Encoder:", "Current fps is 0");
        return;
    }
    
    if(bitrate) {
        if([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_AverageBitRate]) {
            int value = bitrate;  // already converted to b/s in init; don't shift again
            CFNumberRef ref = CFNumberCreate(NULL, kCFNumberSInt32Type, &value);
            [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_AverageBitRate value:ref];
            CFRelease(ref);
        }
    }else {
        log4cplus_error("Video Encoder:", "Current bitrate is 0");
        return;
    }
    
    
    if([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_RealTime]) {
        log4cplus_info("Video Encoder:", "use realTimeEncoder");
        [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_RealTime value:isSupportRealtimeEncode ? kCFBooleanTrue : kCFBooleanFalse];
    }
    
    // Ban B frame.
    if([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_AllowFrameReordering]) {
        [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_AllowFrameReordering value:kCFBooleanFalse];
    }
    
    if (encoderType == XDXH264Encoder) {
        if (isSupportRealtimeEncode) {
            if([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_ProfileLevel]) {
                [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_ProfileLevel value:kVTProfileLevel_H264_Main_AutoLevel];
            }
        }else {
            if([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_ProfileLevel]) {
                [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_ProfileLevel value:kVTProfileLevel_H264_Baseline_AutoLevel];
            }
            
            if([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_H264EntropyMode]) {
                [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_H264EntropyMode value:kVTH264EntropyMode_CAVLC];
            }
        }
    }else if (encoderType == XDXH265Encoder) {
        if([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_ProfileLevel]) {
            [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_ProfileLevel value:kVTProfileLevel_HEVC_Main_AutoLevel];
        }
    }
    
    
    if([self isSupportPropertyWithSession:session key:kVTCompressionPropertyKey_MaxKeyFrameIntervalDuration]) {
        int         value   = iFrameDuration;
        CFNumberRef ref     = CFNumberCreate(NULL, kCFNumberSInt32Type, &value);
        [self setSessionPropertyWithSession:session key:kVTCompressionPropertyKey_MaxKeyFrameIntervalDuration value:ref];
        CFRelease(ref);
    }
    
    log4cplus_info("Video Encoder:", "The compression session max frame delay count = %d, expected frame rate = %d, average bitrate = %d, is support realtime encode = %d, I frame duration = %d",maxCount, fps, bitrate, isSupportRealtimeEncode,iFrameDuration);
}

2.3. Allocate resources before encoding

You may optionally call this function to give the encoder a chance to perform any necessary resource allocation before it begins encoding frames. If it is not called, any necessary resources are allocated on the first VTCompressionSessionEncodeFrame call. Calling it more than once has no effect.

// Prepare to encode
    OSStatus status = VTCompressionSessionPrepareToEncodeFrames(session);
    [lock unlock];
    if(status != noErr) {
        log4cplus_error("Video Encoder:", "create encoder failed, status: %d",(int)status);
        return NULL;
    }else {
        log4cplus_info("Video Encoder:","create encoder success");
        return session;
    }

At this point the encoder initialization is done; next we need to encode video frames. This example captures frames with AVCaptureSession and feeds them to the encoder.

3. Encoding

Note: because the encoding thread runs asynchronously with respect to creating and destroying the encoder, locking is required.

  • Timestamp synchronization

We take the first video frame as the reference point and record the current system time as the base time for encoding the first frame. This is mainly for later audio/video synchronization and is not discussed in depth here; note also that real timestamp generation schemes are not as simple as in this example, and you can define your own rules.

  • Timestamp correction

Check whether the timestamp of the frame being encoded is greater than the previous frame's. Video is played strictly in timestamp order, so timestamps must keep increasing. But the frames fed to the encoder may not all come from one source: capture might start from the camera and later switch to raw frames decoded from a network stream, in which case the timestamps will not be in sync, and forcing such frames into the encoder makes the picture stutter.

  • Encode a video frame
    • session: The session configured earlier.
    • imageBuffer: The raw video frame.
    • presentationTimeStamp: The frame's pts.
    • duration: The frame's duration, which will be attached to the sample buffer. Pass kCMTimeInvalid if there is no duration information.
    • frameProperties: Extra properties for this frame; here we use it to force a keyframe.
    • sourceFrameRefcon: A reference to the source frame that is passed through to the callback.
    • infoFlagsOut: Points to a VTEncodeInfoFlags that receives information about the encode operation. The kVTEncodeInfo_Asynchronous bit may be set if the encode ran (or is running) asynchronously; the kVTEncodeInfo_FrameDropped bit may be set if a frame was dropped (synchronously). Pass NULL if you don't want this information.
VT_EXPORT OSStatus
VTCompressionSessionEncodeFrame(
	CM_NONNULL VTCompressionSessionRef	session,
	CM_NONNULL CVImageBufferRef			imageBuffer,
	CMTime								presentationTimeStamp,
	CMTime								duration, // may be kCMTimeInvalid
	CM_NULLABLE CFDictionaryRef			frameProperties,
	void * CM_NULLABLE					sourceFrameRefcon,
	VTEncodeInfoFlags * CM_NULLABLE		infoFlagsOut ) API_AVAILABLE(macosx(10.8), ios(8.0), tvos(10.2));
-(void)startEncodeWithBuffer:(CMSampleBufferRef)sampleBuffer session:(VTCompressionSessionRef)session isNeedFreeBuffer:(BOOL)isNeedFreeBuffer isDrop:(BOOL)isDrop  needForceInsertKeyFrame:(BOOL)needForceInsertKeyFrame lock:(NSLock *)lock {
    [lock lock];
    
    if(session == NULL) {
        log4cplus_error("Video Encoder:", "%s,session is empty",__func__);
        [lock unlock];  // don't return with the lock still held
        [self handleEncodeFailedWithIsNeedFreeBuffer:isNeedFreeBuffer sampleBuffer:sampleBuffer];
        return;
    }
    
    // The first frame must be an I-frame; it establishes the reference timestamp.
    static BOOL isFirstFrame = YES;
    if(isFirstFrame && g_capture_base_time == 0) {
        CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
        g_capture_base_time = CMTimeGetSeconds(pts); // system absolute time (s)
        //        g_capture_base_time = g_tvustartcaptureTime - (ntp_time_offset/1000);
        isFirstFrame = NO;
        log4cplus_error("Video Encoder:","start capture time = %f",g_capture_base_time);
    }
    
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CMTime presentationTimeStamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
    
    // Switch different source data will show mosaic because timestamp not sync.
    static int64_t lastPts = 0;
    int64_t currentPts = (int64_t)(CMTimeGetSeconds(CMSampleBufferGetPresentationTimeStamp(sampleBuffer)) * 1000);
    if (currentPts - lastPts < 0) {
        log4cplus_error("Video Encoder:","Switch different source data the timestamp < last timestamp, currentPts = %lld, lastPts = %lld, duration = %lld",currentPts, lastPts, currentPts - lastPts);
        [lock unlock];  // don't return with the lock still held
        [self handleEncodeFailedWithIsNeedFreeBuffer:isNeedFreeBuffer sampleBuffer:sampleBuffer];
        return;
    }
    lastPts = currentPts;
    
    OSStatus status = noErr;
    NSDictionary *properties = @{(__bridge NSString *)kVTEncodeFrameOptionKey_ForceKeyFrame:@(needForceInsertKeyFrame)};
    status = VTCompressionSessionEncodeFrame(session,
                                             imageBuffer,
                                             presentationTimeStamp,
                                             kCMTimeInvalid,
                                             (__bridge CFDictionaryRef)properties,
                                             NULL,
                                             NULL);
    
    if(status != noErr) {
        log4cplus_error("Video Encoder:", "encode frame failed");
        [self handleEncodeFailedWithIsNeedFreeBuffer:isNeedFreeBuffer sampleBuffer:sampleBuffer];
    }
    
    [lock unlock];
    if (isNeedFreeBuffer) {
        if (sampleBuffer != NULL) {
            CFRelease(sampleBuffer);
            log4cplus_debug("Video Encoder:", "release the sample buffer");
        }
    }
}


4. The h264 bitstream - H264, H265 hardware codec basics and bitstream analysis

If the bitstream-related code below is hard to follow, first read the article linked in the heading above; it covers codec fundamentals and how to parse the data structures used by iOS's VideoToolbox framework.

5. The callback function

  • Error checking

If status carries an error, the encode failed, and you can add your own handling here.

  • Timestamp correction

We need to fill in timestamps for the encoded data. You can define your own timestamp generation rules; here we simply use an offset: the system time before the first frame was encoded is the base, and each encoded frame's timestamp is its capture timestamp minus that base time.

  • Finding I-frames

Encoded video frames are divided into I-frames, B-frames, and P-frames. iOS generally does not enable B-frames, which would require reordering. For each encoded frame we first use the kCMSampleAttachmentKey_DependsOnOthers attachment to decide whether it is an I-frame. If so, we must read the key NALU header information from it: vps, sps, and pps (vps exists only with the h265 encoder). Without these, the video can neither be played on the other end nor recorded to a file.

  • Reading the encoder's key information

The actual vps, sps, and pps data can be read from an I-frame: call CMVideoFormatDescriptionGetH264ParameterSetAtIndex for an h264 encoder, or CMVideoFormatDescriptionGetHEVCParameterSetAtIndex for h265; index values 0, 1, 2 in the second parameter select the individual parameter sets.

Once found, these must be concatenated: each is an independent NALU, with 0x00 0x00 0x00 0x01 used as the separator that distinguishes sps from pps.

So, following that rule, we join the vps, sps, and pps into one complete contiguous buffer with 00 00 00 01 between them. This example writes the stream to a file, so the NALU header information must be written first, i.e. the I-frame goes in first: an I-frame is a complete image, while a P-frame needs an I-frame to produce an image, so the file must start with I-frame data.

  • How a frame relates to NALUs:

After passing through an H.264 encoder, a frame is encoded into one or more slices, and the carrier of those slices is the NALU.

Note that a slice is not the same thing as a frame: a frame describes a picture, one frame per picture, while a slice is a concept introduced by H.264, produced by partitioning the encoded picture for efficient packaging. A picture has at least one slice. Slices are carried by NALUs for network transport, but that does not mean a NALU always contains a slice (carrying a slice is sufficient but not necessary), because a NALU may also carry other information that describes the video.

  • Splitting the NALUs in the stream

First obtain the frame data with CMBlockBufferGetDataPointer. The frame is a chunk of H264/H265 stream that may contain several NALUs; we need to find each NALU and separate them with 00 00 00 01. That is what the while loop does: the raw output contains no start codes (each NALU is prefixed with its length), so we copy the start codes in ourselves.

CFSwapInt32BigToHost: converts the big-endian (byte order) length field of the encoded data to the host byte order.

static void EncodeCallBack(void *outputCallbackRefCon,void *souceFrameRefCon,OSStatus status,VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer) {
    XDXVideoEncoder *encoder = (__bridge XDXVideoEncoder*)outputCallbackRefCon;
    
    if(status != noErr) {
        NSError *error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
        NSLog(@"H264: vtCallBack failed with %@", error);
        log4cplus_error("Video Encoder:", "encode frame failed! %s" ,error.debugDescription.UTF8String);
        return;
    }
    
    if (!encoder.isSupportEncoder) {
        return;
    }
    
    CMBlockBufferRef block = CMSampleBufferGetDataBuffer(sampleBuffer);
    CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
    CMTime dts = CMSampleBufferGetDecodeTimeStamp(sampleBuffer);
    
    // Use our define time. (the time is used to sync audio and video)
    int64_t ptsAfter = (int64_t)((CMTimeGetSeconds(pts) - g_capture_base_time) * 1000);
    int64_t dtsAfter = (int64_t)((CMTimeGetSeconds(dts) - g_capture_base_time) * 1000);
    dtsAfter = ptsAfter;
    
    /* Sometimes the relative dts is zero; provide a workaround to restore the dts. */
    static int64_t last_dts = 0;
    if(dtsAfter == 0){
        dtsAfter = last_dts +33;
    }else if (dtsAfter == last_dts){
        dtsAfter = dtsAfter + 1;
    }
    
    BOOL isKeyFrame = NO;
    CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, false);
    if(attachments != NULL) {
        CFDictionaryRef attachment =(CFDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
        CFBooleanRef dependsOnOthers = (CFBooleanRef)CFDictionaryGetValue(attachment, kCMSampleAttachmentKey_DependsOnOthers);
        isKeyFrame = (dependsOnOthers == kCFBooleanFalse);
    }
    
    if(isKeyFrame) {
        static uint8_t *keyParameterSetBuffer    = NULL;
        static size_t  keyParameterSetBufferSize = 0;
        
        // Note: the parameter sets (NALU header info) do not change unless the video resolution changes.
        if (keyParameterSetBufferSize == 0 || YES == encoder.needResetKeyParamSetBuffer) {
            const uint8_t  *vps, *sps, *pps;
            size_t         vpsSize, spsSize, ppsSize;
            int            NALUnitHeaderLengthOut;
            size_t         parmCount;
            
            if (keyParameterSetBuffer != NULL) {
                free(keyParameterSetBuffer);
            }
            
            CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
            if (encoder.encoderType == XDXH264Encoder) {
                CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sps, &spsSize, &parmCount, &NALUnitHeaderLengthOut);
                CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pps, &ppsSize, &parmCount, &NALUnitHeaderLengthOut);
                
                keyParameterSetBufferSize = spsSize+4+ppsSize+4;
                keyParameterSetBuffer = (uint8_t*)malloc(keyParameterSetBufferSize);
                memcpy(keyParameterSetBuffer, "\x00\x00\x00\x01", 4);
                memcpy(&keyParameterSetBuffer[4], sps, spsSize);
                memcpy(&keyParameterSetBuffer[4+spsSize], "\x00\x00\x00\x01", 4);
                memcpy(&keyParameterSetBuffer[4+spsSize+4], pps, ppsSize);
                
                log4cplus_info("Video Encoder:", "H264 find IDR frame, spsSize : %zu, ppsSize : %zu",spsSize, ppsSize);
            }else if (encoder.encoderType == XDXH265Encoder) {
                CMVideoFormatDescriptionGetHEVCParameterSetAtIndex(format, 0, &vps, &vpsSize, &parmCount, &NALUnitHeaderLengthOut);
                CMVideoFormatDescriptionGetHEVCParameterSetAtIndex(format, 1, &sps, &spsSize, &parmCount, &NALUnitHeaderLengthOut);
                CMVideoFormatDescriptionGetHEVCParameterSetAtIndex(format, 2, &pps, &ppsSize, &parmCount, &NALUnitHeaderLengthOut);
                
                keyParameterSetBufferSize = vpsSize+4+spsSize+4+ppsSize+4;
                keyParameterSetBuffer = (uint8_t*)malloc(keyParameterSetBufferSize);
                memcpy(keyParameterSetBuffer, "\x00\x00\x00\x01", 4);
                memcpy(&keyParameterSetBuffer[4], vps, vpsSize);
                memcpy(&keyParameterSetBuffer[4+vpsSize], "\x00\x00\x00\x01", 4);
                memcpy(&keyParameterSetBuffer[4+vpsSize+4], sps, spsSize);
                memcpy(&keyParameterSetBuffer[4+vpsSize+4+spsSize], "\x00\x00\x00\x01", 4);
                memcpy(&keyParameterSetBuffer[4+vpsSize+4+spsSize+4], pps, ppsSize);
                log4cplus_info("Video Encoder:", "H265 find IDR frame, vpsSize : %zu, spsSize : %zu, ppsSize : %zu",vpsSize,spsSize, ppsSize);
            }
            
            encoder.needResetKeyParamSetBuffer = NO;
        }
        
        if (encoder.isNeedRecord) {
            if (encoder->mVideoFile == NULL) {
                [encoder initSaveVideoFile];
                log4cplus_info("Video Encoder:", "Start video record.");
            }
            
            fwrite(keyParameterSetBuffer, 1, keyParameterSetBufferSize, encoder->mVideoFile);
        }
        
        log4cplus_info("Video Encoder:", "Load a I frame.");
    }
    
    size_t   blockBufferLength;
    uint8_t  *bufferDataPointer = NULL;
    CMBlockBufferGetDataPointer(block, 0, NULL, &blockBufferLength, (char **)&bufferDataPointer);
    
    size_t bufferOffset = 0;
    while (bufferOffset < blockBufferLength - kStartCodeLength)
    {
        uint32_t NALUnitLength = 0;
        memcpy(&NALUnitLength, bufferDataPointer+bufferOffset, kStartCodeLength);
        NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);
        memcpy(bufferDataPointer+bufferOffset, kStartCode, kStartCodeLength);
        bufferOffset += kStartCodeLength + NALUnitLength;
    }
    
    if (encoder.isNeedRecord && encoder->mVideoFile != NULL) {
        fwrite(bufferDataPointer, 1, blockBufferLength, encoder->mVideoFile);
    }else {
        if (encoder->mVideoFile != NULL) {
            fclose(encoder->mVideoFile);
            encoder->mVideoFile = NULL;
            log4cplus_info("Video Encoder:", "Stop video record.");
        }
    }
    
//    log4cplus_debug("Video Encoder:","H265 encoded video:%lld, size:%lu, interval:%lld", dtsAfter,blockBufferLength, dtsAfter - last_dts);
    
    last_dts = dtsAfter;
}
