Audio and Video Learning - H.264 Encoding

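Building on the earlier articles "Audio and Video Learning - Basic Concepts" and "Audio and Video Learning - H.264 Structure and Bitstream Parsing", this article starts writing code. The capture pipeline built on the AVFoundation framework is not repeated here; we start directly in the capture delegate method **captureOutput:didOutputSampleBuffer:fromConnection:** and encode the video frames there. The overall flow has four steps: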

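  1. Prepare the encoder: create the session with VTCompressionSessionCreate and set the encoder properties.
  2. Start encoding: VTCompressionSessionEncodeFrame.
  3. Process the data in the encode-completion callback: add the start code **"\x00\x00\x00\x01"**, write the **SPS and PPS**, and so on.
  4. Finish encoding, clear the data, and release resources.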

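Preparing the encoder

  1. Create the session: VTCompressionSessionCreate
  2. Set properties with VTSessionSetProperty: real-time output, whether to produce B-frames, keyframe interval, expected frame rate, average bitrate, data rate limit, and so on
  3. Prepare to encode: VTCompressionSessionPrepareToEncodeFrames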
-(void)initVideoToolBox
{
    // cEncodeQueue is a serial queue
    dispatch_sync(cEncodeQueue, ^{

        frameID = 0;
        int width = 480,height = 640;
        
        // Create the encoding session
        OSStatus status = VTCompressionSessionCreate(NULL, width, height, kCMVideoCodecType_H264, NULL, NULL, NULL, didCompressH264, (__bridge void *)(self), &cEncodeingSession);
        NSLog(@"H264:VTCompressionSessionCreate:%d",(int)status);
        
        if (status != 0) {
            NSLog(@"H264:Unable to create a H264 session");
            return ;
        }
        
        // Enable real-time encoding output (avoids latency)
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_ProfileLevel,kVTProfileLevel_H264_Baseline_AutoLevel);
        
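        // Whether to produce B-frames (B-frames are not required for decoding and can be dropped; disabling frame reordering turns them off)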
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_AllowFrameReordering, kCFBooleanFalse);
        
        // Set the keyframe interval (GOP size). If the GOP is too small, the image becomes blurry.
        int frameInterval = 10;
        CFNumberRef frameIntervalRaf = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &frameInterval);
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_MaxKeyFrameInterval, frameIntervalRaf);
        CFRelease(frameIntervalRaf); // CFNumberCreate returns an owned reference
        
        // Set the expected frame rate; this is a hint, not the actual frame rate
        int fps = 10;
        CFNumberRef fpsRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &fps);
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_ExpectedFrameRate, fpsRef);
        CFRelease(fpsRef);
        
        // About bitrate: a high bitrate gives a very sharp image but a larger file; a low bitrate can look blurry at times but is still watchable.
        // For the bitrate formula, see the author's Evernote notes
        // Average bitrate, in bits per second (bps)
        int bitRate = width * height * 3 * 4 * 8;
        CFNumberRef bitRateRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitRate);
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_AverageBitRate, bitRateRef);
        CFRelease(bitRateRef);
        
        // Hard data rate limit: a CFArray of [bytes, seconds] pairs, here bigRateLimit bytes per 1 second
        int bigRateLimit = width * height * 3 * 4;
        NSArray *dataRateLimits = @[@(bigRateLimit), @1];
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_DataRateLimits, (__bridge CFArrayRef)dataRateLimits);
        
        // Prepare to start encoding
        VTCompressionSessionPrepareToEncodeFrames(cEncodeingSession);

    });
    
}
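
To make the two rate settings above concrete, here is a small sketch in plain C that only reproduces the article's arithmetic. The function names are hypothetical helpers, not part of VideoToolbox, and the formula itself is the author's rule of thumb rather than an official recommendation.

```c
#include <assert.h>
#include <stdint.h>

/* Average bitrate in bits per second, using the article's rule of thumb
   width * height * 3 * 4 * 8. Hypothetical helper, not a VideoToolbox API. */
static int64_t average_bit_rate_bps(int width, int height) {
    assert(width > 0 && height > 0);
    return (int64_t)width * height * 3 * 4 * 8;
}

/* Data rate cap in bytes per second: the same product without the
   final factor of 8 (bits per byte). Hypothetical helper. */
static int64_t data_rate_limit_bytes_per_second(int width, int height) {
    assert(width > 0 && height > 0);
    return (int64_t)width * height * 3 * 4;
}
```

For the 480 x 640 frames used in this article, the first function gives 29,491,200 bps and the second gives 3,686,400 bytes per second, so the two values describe the same rate in different units. The 64-bit return type is a precaution for larger frame sizes.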

VTCompressionSessionCreate parameters in detail:

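  • allocator: NULL. The allocator for the session; pass NULL to use the default allocator.
  • width: the frame width in pixels
  • height: the frame height in pixels
  • codecType: the codec type, e.g. kCMVideoCodecType_H264
  • encoderSpecification: NULL. Specifies a particular encoder to use; pass NULL to let VideoToolbox choose one.
  • sourceImageBufferAttributes: NULL. Attributes required of the source pixel buffers; pass NULL so that VideoToolbox does not create a pixel buffer pool and you supply the buffers yourself.
  • compressedDataAllocator: the allocator for the compressed data; pass NULL to use the default allocator.
  • outputCallback: the encode callback, invoked asynchronously after a call to VTCompressionSessionEncodeFrame has compressed a frame. Here the function is didCompressH264.
  • outputCallbackRefCon: a client-defined reference value passed to the callback. Here self is passed, because the callback is a C function that needs to call methods on self and cannot reference self directly.
  • compressionSessionOut: receives the new compression session.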
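
The outputCallbackRefCon idea is a general C pattern: an opaque pointer is registered together with a callback and handed back on every call. The sketch below shows that pattern in plain C without VideoToolbox; EncoderContext, FakeSession and the fake_session_* functions are hypothetical stand-ins invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the Objective-C object (self) that owns the encoder. */
typedef struct {
    int frames_received;
    size_t bytes_received;
} EncoderContext;

/* Same shape as a C output callback: the first argument is the opaque
   reference value that was registered together with the callback. */
typedef void (*OutputCallback)(void *refCon, const unsigned char *data, size_t size);

/* Hypothetical stand-in for a session that remembers callback and refCon. */
typedef struct {
    OutputCallback callback;
    void *refCon;
} FakeSession;

static void did_compress(void *refCon, const unsigned char *data, size_t size) {
    /* Cast the opaque pointer back to the owning object, as the article's
       callback does with (__bridge ViewController *)outputCallbackRefCon. */
    EncoderContext *ctx = (EncoderContext *)refCon;
    assert(ctx != NULL);
    (void)data; /* the payload itself is not inspected in this sketch */
    ctx->frames_received += 1;
    ctx->bytes_received += size;
}

static FakeSession fake_session_create(OutputCallback cb, void *refCon) {
    FakeSession s = { cb, refCon };
    return s;
}

/* Simulates the session delivering one encoded frame to its callback. */
static void fake_session_emit(const FakeSession *s, const unsigned char *data, size_t size) {
    s->callback(s->refCon, data, size);
}
```

Because the callback is a free function, the context pointer is the only route back to per-encoder state, which is why the article passes self at session creation.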

Start encoding

  1. Get the unencoded video frame: CVImageBufferRef imageBuffer = (CVImageBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);
  2. Set the frame time: CMTime presentationTimeStamp = CMTimeMake(frameID++, 1000);
  3. Encode: call VTCompressionSessionEncodeFrame
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    // Recording has started: each video frame from the camera is passed to the encode: method
    dispatch_sync(cEncodeQueue, ^{
        [self encode:sampleBuffer];
    });
}
- (void) encode:(CMSampleBufferRef )sampleBuffer
{
  // The session is set to NULL after a failure, so skip further frames
  if (cEncodeingSession == NULL) {
        return;
    }

  // Get the unencoded data for this frame
  CVImageBufferRef imageBuffer = (CVImageBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);

  // Set the frame time
  CMTime presentationTimeStamp = CMTimeMake(frameID++, 1000);

  // Start encoding; flags receives information about the encode operation
  VTEncodeInfoFlags flags;
  OSStatus statusCode = VTCompressionSessionEncodeFrame(cEncodeingSession, imageBuffer, presentationTimeStamp, kCMTimeInvalid, NULL, NULL, &flags);

  if (statusCode != noErr) {
        // Encoding failed
        NSLog(@"H.264:VTCompressionSessionEncodeFrame failed with %d",(int)statusCode);
        
        // Release resources
        VTCompressionSessionInvalidate(cEncodeingSession);
        CFRelease(cEncodeingSession);
        cEncodeingSession = NULL;
        return;
    }
}


VTCompressionSessionEncodeFrame parameters in detail:

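  • session: the compression session
  • imageBuffer: the unencoded frame data
  • presentationTimeStamp: the presentation timestamp of this sample buffer. Each timestamp passed to the session must be greater than the previous one.
  • duration: the presentation duration of this frame. If there is no duration information, pass kCMTimeInvalid.
  • frameProperties: properties for encoding this frame. Session properties changed between frames affect the frames encoded afterwards.
  • sourceFrameRefcon: your reference value for this frame, which is passed to the output callback.
  • infoFlagsOut: a pointer to a VTEncodeInfoFlags that receives information about the encode operation. kVTEncodeInfo_Asynchronous may be set if the encode ran asynchronously; kVTEncodeInfo_FrameDropped may be set if the frame was dropped. Pass NULL if you do not need this information.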
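
A presentation timestamp is a rational number: value divided by timescale gives seconds. The sketch below models that in plain C to show what a given value and timescale pair means and how the "strictly increasing" rule can be checked. RationalTime and its helpers are hypothetical and are not the CoreMedia CMTime type.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of a rational timestamp: seconds = value / timescale.
   This mirrors the idea behind CMTime but is NOT the CoreMedia type. */
typedef struct {
    int64_t value;
    int32_t timescale;
} RationalTime;

static RationalTime rational_time_make(int64_t value, int32_t timescale) {
    assert(timescale > 0);
    RationalTime t = { value, timescale };
    return t;
}

static double rational_time_seconds(RationalTime t) {
    return (double)t.value / (double)t.timescale;
}

/* Timestamp of frame N at a constant frame rate: using the frame rate as
   the timescale makes consecutive frames exactly 1/fps seconds apart. */
static RationalTime pts_for_frame(int64_t frameIndex, int32_t fps) {
    return rational_time_make(frameIndex, fps);
}

/* Returns 1 if next is strictly later than prev. Cross-multiplication
   avoids floating point rounding. */
static int pts_is_increasing(RationalTime prev, RationalTime next) {
    return prev.value * (int64_t)next.timescale < next.value * (int64_t)prev.timescale;
}
```

Under this model, the article's CMTimeMake(frameID++, 1000) places consecutive frames 1 ms apart, while pts_for_frame(n, 10) would place them 100 ms apart to match the 10 fps hint. A raw .h264 file does not store timestamps, so either choice produces a playable file; the spacing may still influence the encoder's rate control, which is an assumption worth verifying on a device.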

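Processing the data after encoding

  1. Check whether the frame is a keyframe. If it is, get the SPS and PPS with CMVideoFormatDescriptionGetH264ParameterSetAtIndex and write them to the file (or upload them) as binary data.
  2. Assemble the NALU data. Get the encoded H.264 stream with CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer), then call OSStatus statusCodeRet = CMBlockBufferGetDataPointer(dataBuffer, 0, &length, &totalLength, &dataPointer); to get the start address, the length available at that offset, and the total length, and walk the buffer by offsetting dataPointer. Byte order matters when reading: each NALU length prefix is big-endian (network transmission generally uses big-endian) and must be converted to host byte order.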
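
The length-prefix to start-code conversion described in step 2 can be sketched in plain C, independent of CoreMedia. The version below assumes 4-byte big-endian length prefixes, as in this article; avcc_to_annexb and read_be32 are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AVCC_LENGTH_SIZE 4u  /* bytes in each big-endian length prefix */
#define START_CODE_SIZE  4u  /* bytes in the 00 00 00 01 start code */

/* Reads a 4-byte big-endian length without depending on host byte order. */
static uint32_t read_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Converts length-prefixed NAL units in `in` to start-code-prefixed units
   in `out`. Returns the number of bytes written, or 0 if the input is
   truncated or `out` is too small (empty input also yields 0). */
static size_t avcc_to_annexb(const uint8_t *in, size_t inSize,
                             uint8_t *out, size_t outCapacity) {
    static const uint8_t startCode[START_CODE_SIZE] = { 0x00, 0x00, 0x00, 0x01 };
    size_t offset = 0, written = 0;
    assert(in != NULL && out != NULL);

    while (offset < inSize) {
        if (inSize - offset < AVCC_LENGTH_SIZE) return 0;   /* truncated prefix */
        uint32_t naluLength = read_be32(in + offset);
        offset += AVCC_LENGTH_SIZE;
        if (naluLength > inSize - offset) return 0;         /* truncated payload */
        if (outCapacity - written < START_CODE_SIZE + (size_t)naluLength) return 0;

        memcpy(out + written, startCode, START_CODE_SIZE);  /* replace the prefix */
        written += START_CODE_SIZE;
        memcpy(out + written, in + offset, naluLength);     /* copy the payload */
        written += naluLength;
        offset += naluLength;
    }
    return written;
}
```

Reading the length byte by byte sidesteps the endianness question entirely. Because both markers are 4 bytes here, the output is exactly as long as the input, so an output buffer of the same size is sufficient.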
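/*
    1. When H.264 hardware encoding finishes, the VTCompressionOutputCallback is invoked.
    2. Convert the successfully encoded CMSampleBuffer into an H.264 bitstream that can be sent over the network.
    3. Parse out the SPS and PPS parameter sets and prepend start codes to form NALUs. Extract the video data, replace each length prefix with a start code to form a NALU, and send the NALU out.
 */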
void didCompressH264(void *outputCallbackRefCon, void *sourceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer)
{
    NSLog(@"didCompressH264 called with status %d infoFlags %d",(int)status,(int)infoFlags);
    // Error status
    if (status != 0) {
        return;
    }
    
    // Data is not ready
    if (!CMSampleBufferDataIsReady(sampleBuffer)) {
        NSLog(@"didCompressH264 data is not ready");
        return;
    }
    
    ViewController *encoder = (__bridge ViewController *)outputCallbackRefCon;
    
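    // Check whether the current frame is a keyframe
    CFArrayRef array = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, true);
    CFDictionaryRef dic = CFArrayGetValueAtIndex(array, 0);
    bool keyFrame = !CFDictionaryContainsKey(dic, kCMSampleAttachmentKey_NotSync);
    
    // On a keyframe, fetch the SPS and PPS and write them ahead of the frame data, so the first pair sits at the start of the .h264 file
    // SPS: Sequence Parameter Set, parameters shared by a whole coded video sequence (profile, level, resolution, ...)
    // PPS: Picture Parameter Set, parameters that apply to one or more pictures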
    if (keyFrame) {
        // Format description: image storage layout, codec, and so on
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
        
        // Get the SPS (parameter set index 0)
        size_t sparameterSetSize,sparameterSetCount;
        const uint8_t *sparameterSet;
        OSStatus statusCode = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sparameterSet, &sparameterSetSize, &sparameterSetCount, 0);
        
        if (statusCode == noErr) {
            
            // Get the PPS
            size_t pparameterSetSize,pparameterSetCount;
            const uint8_t *pparameterSet;
            
            // The PPS is parameter set index 1 in the keyframe's format description
            OSStatus statusCode = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pparameterSet, &pparameterSetSize, &pparameterSetCount, 0);
            
            // Both the SPS and the PPS were retrieved from the H.264 parameter sets
            if (statusCode == noErr)
            {
                NSData *sps = [NSData dataWithBytes:sparameterSet length:sparameterSetSize];
                NSData *pps = [NSData dataWithBytes:pparameterSet length:pparameterSetSize];
                
                if(encoder)
                {
                    [encoder gotSpsPps:sps pps:pps];
                }
            }
        }
    }
    
    CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t length,totalLength;
    char *dataPointer;
    OSStatus statusCodeRet = CMBlockBufferGetDataPointer(dataBuffer, 0, &length, &totalLength, &dataPointer);
    if (statusCodeRet == noErr) {
        size_t bufferOffset = 0;
        static const int AVCCHeaderLength = 4; // The first 4 bytes of each returned NALU are not a 00 00 00 01 start code but the NALU length in big-endian order
        
        // Loop over the NALUs (written as an addition so the unsigned comparison cannot underflow)
        while (bufferOffset + AVCCHeaderLength < totalLength) {
            
            uint32_t NALUnitLength = 0;
            
            // Read the length of this NAL unit
            memcpy(&NALUnitLength, dataPointer + bufferOffset, AVCCHeaderLength);
            
            // Convert from big-endian to host byte order
            NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);
            
            // Get the NALU payload
            NSData *data = [[NSData alloc]initWithBytes:(dataPointer + bufferOffset + AVCCHeaderLength) length:NALUnitLength];
            
            // Write the NALU to the file
            [encoder gotEncodedData:data isKeyFrame:keyFrame];
            
            // Move to the next NAL unit in the block buffer;
            // one callback may contain several NALUs
            bufferOffset += AVCCHeaderLength + NALUnitLength;
        }
    }
}

// Write the SPS and PPS ahead of the first frame
- (void)gotSpsPps:(NSData*)sps pps:(NSData*)pps
{
    const char bytes[] = "\x00\x00\x00\x01";
    
    size_t length = (sizeof bytes) - 1;    // The last byte is the '\0' terminator
    
    NSData *ByteHeader = [NSData dataWithBytes:bytes length:length];
    
    [fileHandele writeData:ByteHeader];
    [fileHandele writeData:sps];
    [fileHandele writeData:ByteHeader];
    [fileHandele writeData:pps];
}

- (void)gotEncodedData:(NSData*)data isKeyFrame:(BOOL)isKeyFrame
{
    if (fileHandele != NULL) {
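        // Prepend the 4-byte H.264 start code delimiter
        // Typically the first data the encoder outputs is the SPS and PPS
        // In an H.264 byte stream each NAL unit is preceded by the start code 0x00000001; when the decoder detects the next start code, the current NAL unit has ended.
        const char bytes[] ="\x00\x00\x00\x01";
        // Length, excluding the '\0' terminator
        size_t length = (sizeof bytes) - 1;
        
        // Start code bytes
        NSData *ByteHeader = [NSData dataWithBytes:bytes length:length];
        // Write the start code
        [fileHandele writeData:ByteHeader];
        
        // Write the H.264 data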
        [fileHandele writeData:data];
    }
}
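
One way to sanity-check the written file is to scan it for start codes and look at the type of each NAL unit. The sketch below assumes the 4-byte start codes this article writes; in H.264 the low 5 bits of the first NAL unit byte hold nal_unit_type, where 7 is an SPS, 8 is a PPS and 5 is an IDR slice. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* nal_unit_type values from the H.264 specification that this article deals with. */
enum { NAL_TYPE_IDR = 5, NAL_TYPE_SPS = 7, NAL_TYPE_PPS = 8 };

/* The low 5 bits of the first NAL unit byte carry nal_unit_type. */
static int nal_unit_type(uint8_t header) {
    return header & 0x1F;
}

/* Scans a buffer for 4-byte start codes (00 00 00 01) and records the type
   of the NAL unit that follows each one. Stores at most maxTypes entries
   and returns the total number of NAL units found. */
static size_t list_nal_types(const uint8_t *buf, size_t size,
                             int *types, size_t maxTypes) {
    size_t count = 0;
    assert(buf != NULL && types != NULL);

    /* i + 4 < size guarantees that the header byte buf[i + 4] exists. */
    for (size_t i = 0; i + 4 < size; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 0 && buf[i + 3] == 1) {
            if (count < maxTypes) {
                types[count] = nal_unit_type(buf[i + 4]);
            }
            count++;
            i += 3; /* skip the rest of the start code; the loop adds 1 more */
        }
    }
    return count;
}
```

For a file produced by the code above, the first three types would be expected to be 7, 8 and 5: the SPS, the PPS, then the first keyframe. Streams that use 3-byte start codes are not handled by this sketch.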

Finishing encoding

-(void)endVideoToolBox
{
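    // The session may already have been released after an encode failure
    if (cEncodeingSession == NULL) {
        return;
    }
    // Flush any pending frames, then invalidate and release the session
    VTCompressionSessionCompleteFrames(cEncodeingSession, kCMTimeInvalid);
    VTCompressionSessionInvalidate(cEncodeingSession);
    CFRelease(cEncodeingSession);
    cEncodeingSession = NULL;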
}