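H.264 Hardware and Software Video Encoding on iOS (with a Complete Demo)

Overview

The previous article covered the basic concepts of video and the H.264 encoding and decoding flow. This article moves on to the implementation. The outline is as follows:

Hardware-encoding live video to H.264 with VideoToolbox
ffmpeg
- Installing ffmpeg on macOS
- Common ffmpeg commands
- Building the ffmpeg libraries for iOS development on macOS and setting up the environment
- A brief introduction to the ffmpeg libraries
Software-encoding live video to H.264 with ffmpeg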

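Sample code:

H.264 hardware encoding
H.264 software encoding
Stars and forks are welcome.

[Figure: code structure of the demo project]

[Figure: screenshot of the running app]

If the concepts behind video encoding and decoding are unclear, see the previous article, "Basic Video Parameters and H.264 Encoding/Decoding Concepts".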

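Hardware-encoding live video to H.264 with VideoToolbox

The overall steps are as follows:

Initializing the encoder

Step 1: set the video width, height, and frame rate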

- (void)settingEncoderParametersWithWidth:(int)width height:(int)height fps:(int)fps
{
   self.width  = width;
   self.height = height;
   self.fps    = fps;
}

Step 2: set the codec type to kCMVideoCodecType_H264, then use VTSessionSetProperty with keys such as kVTCompressionPropertyKey_ExpectedFrameRate and kVTCompressionPropertyKey_AverageBitRate to set the frame rate, bit rate, and other parameters.

- (void)prepareForEncoder
{
    if (self.width == 0 || self.height == 0) {
        NSLog(@"AppHWH264Encoder, VTSession need width and height for init, width = %d, height = %d",self.width,self.height);
        return;
    }
    
    [m_lock lock];
    // Create the compression session; miEncoderVideoCallBack receives every encoded frame
    OSStatus status = VTCompressionSessionCreate(NULL, self.width, self.height, kCMVideoCodecType_H264, NULL, NULL, NULL, miEncoderVideoCallBack, (__bridge void *)self, &compressionSession);
    if (status != noErr) {
        NSLog(@"AppHWH264Encoder, create encoder session failed, res=%d",(int)status);
        [m_lock unlock];   // release the lock on every exit path
        return;
    }
    
    // Expected frame rate
    if (self.fps) {
        int v = self.fps;
        CFNumberRef ref = CFNumberCreate(NULL, kCFNumberSInt32Type, &v);
        status = VTSessionSetProperty(compressionSession, kVTCompressionPropertyKey_ExpectedFrameRate, ref);
        CFRelease(ref);
        if (status != noErr) {
            NSLog(@"AppHWH264Encoder, set expected frame rate failed, fps=%d, res=%d",self.fps,(int)status);
            [m_lock unlock];
            return;
        }
    }
    
    // Average bit rate
    if (self.bitrate) {
        int v = self.bitrate;
        CFNumberRef ref = CFNumberCreate(NULL, kCFNumberSInt32Type, &v);
        status = VTSessionSetProperty(compressionSession, kVTCompressionPropertyKey_AverageBitRate,ref);
        CFRelease(ref);
        if (status != noErr) {
            NSLog(@"AppHWH264Encoder, set average bit rate failed, bitrate=%d, res=%d",self.bitrate,(int)status);
            [m_lock unlock];
            return;
        }
    }
    
    // Let the session allocate its resources before the first frame arrives
    status  = VTCompressionSessionPrepareToEncodeFrames(compressionSession);
    if (status != noErr) {
        NSLog(@"AppHWH264Encoder, prepare to encode frames failed, res=%d",(int)status);
    }
    [m_lock unlock];
}

Hardware-encoding a CMSampleBufferRef with VTCompressionSession

Pass the CMSampleBuffer captured by the camera directly to the following encoding method:

- (void)encoder:(CMSampleBufferRef)sampleBuffer
{
    if (!self.isInitHWH264Encoder) {
        [self prepareForEncoder];
        // Mark the encoder as ready only if the session was actually created
        if (compressionSession == NULL) {
            return;
        }
        self.isInitHWH264Encoder = YES;
    }
    CVImageBufferRef imageBuffer  = CMSampleBufferGetImageBuffer(sampleBuffer);
    CMTime presentationTime       = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
    
    OSStatus status = VTCompressionSessionEncodeFrame(compressionSession, imageBuffer, presentationTime, kCMTimeInvalid, NULL, NULL, NULL);
    if (status != noErr) {
        // Finish pending frames first; a session must not be used after it has been invalidated
        VTCompressionSessionCompleteFrames(compressionSession, kCMTimeInvalid);
        VTCompressionSessionInvalidate(compressionSession);
        CFRelease(compressionSession);
        compressionSession = NULL;
        self.isInitHWH264Encoder = NO;   // the next frame recreates the session
        NSLog(@"AppHWH264Encoder, encoder failed");
        return;
    }
    
}

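Converting the encoded output into an H.264 byte stream in the callback

The main work here is to extract the SPS and PPS, prepend start codes, and assemble NAL units. You need a solid understanding of the H.264 bitstream structure for this part; if it is unclear, see the earlier article.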

// VTCompressionOutputCallback registered in VTCompressionSessionCreate above
void miEncoderVideoCallBack(void *outputCallbackRefCon,
                            void *sourceFrameRefCon,
                            OSStatus status,
                            VTEncodeInfoFlags infoFlags,
                            CMSampleBufferRef sampleBuffer)
{
    NSLog(@"%s",__func__);
    if (status != noErr) {
        NSLog(@"AppHWH264Encoder, encoder failed, res=%d",status);
        return;
    }
    if (!CMSampleBufferDataIsReady(sampleBuffer)) {
        NSLog(@"AppHWH264Encoder, samplebuffer is not ready");
        return;
    }
    
    MIHWH264Encoder *encoder = (__bridge MIHWH264Encoder*)outputCallbackRefCon;
    
    CMBlockBufferRef block = CMSampleBufferGetDataBuffer(sampleBuffer);
    
    // A frame that does not depend on other frames is treated as a key frame
    BOOL isKeyframe = false;
    CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, false);
    if(attachments != NULL)
    {
        CFDictionaryRef attachment =(CFDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
        CFBooleanRef dependsOnOthers = (CFBooleanRef)CFDictionaryGetValue(attachment, kCMSampleAttachmentKey_DependsOnOthers);
        isKeyframe = (dependsOnOthers == kCFBooleanFalse);
    }
    
    if(isKeyframe)
    {
        // Key frames carry the parameter sets: index 0 is the SPS, index 1 the PPS
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
        size_t spsSize, ppsSize;
        size_t parmCount;
        const uint8_t*sps, *pps;
        
        int NALUnitHeaderLengthOut;
        
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sps, &spsSize, &parmCount, &NALUnitHeaderLengthOut );
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pps, &ppsSize, &parmCount, &NALUnitHeaderLengthOut );
        
        // Layout: start code + SPS + start code + PPS
        uint8_t *spsppsNALBuff = (uint8_t*)malloc(spsSize+4+ppsSize+4);
        memcpy(spsppsNALBuff, "\x00\x00\x00\x01", 4);
        memcpy(&spsppsNALBuff[4], sps, spsSize);
        memcpy(&spsppsNALBuff[4+spsSize], "\x00\x00\x00\x01", 4);
        memcpy(&spsppsNALBuff[4+spsSize+4], pps, ppsSize);
        NSLog(@"AppHWH264Encoder, encoder video ,find IDR frame");
        //        AVFormatControl::GetInstance()->addH264Data(spsppsNALBuff, (int)(spsSize+ppsSize+8), dtsAfter, YES, NO);
        
        [encoder.delegate acceptEncoderData:spsppsNALBuff length:(int)(spsSize+ppsSize + 8) naluType:H264Data_NALU_TYPE_IDR];
        // Assumption: the delegate consumes or copies the bytes before returning,
        // so the temporary buffer can be released here instead of leaking
        free(spsppsNALBuff);
    }
    
    size_t blockBufferLength;
    uint8_t *bufferDataPointer = NULL;
    CMBlockBufferGetDataPointer(block, 0, NULL, &blockBufferLength, (char **)&bufferDataPointer);
    
    const size_t startCodeLength = 4;
    static const uint8_t startCode[] = {0x00, 0x00, 0x00, 0x01};
    
    // The encoder emits AVCC data: each NAL unit is prefixed with a 4-byte big-endian length.
    // Overwrite every length prefix in place with an Annex B start code.
    size_t bufferOffset = 0;
    static const int AVCCHeaderLength = 4;
    while (bufferOffset < blockBufferLength - AVCCHeaderLength)
    {
        uint32_t NALUnitLength = 0;
        memcpy(&NALUnitLength, bufferDataPointer+bufferOffset, AVCCHeaderLength);
        NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);
        memcpy(bufferDataPointer+bufferOffset, startCode, startCodeLength);
        bufferOffset += AVCCHeaderLength + NALUnitLength;
    }
    
    //    AVFormatControl::GetInstance()->addH264Data(bufferDataPointer, (int)blockBufferLength,dtsAfter, NO, isKeyframe);
    
    [encoder.delegate acceptEncoderData:bufferDataPointer length:(int)blockBufferLength naluType:H264Data_NALU_TYPE_NOIDR];
}
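
The length-prefix rewrite at the end of the callback is easier to reason about in isolation. The following is a minimal sketch in plain C, not the demo's code: the names avcc_to_annexb and nal_unit_type are hypothetical, and it assumes 4-byte big-endian length prefixes, the same assumption the listing above makes. Unlike the loop above, it checks every length against the bytes that remain.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Rewrites 4-byte big-endian NAL length prefixes (AVCC layout) into
 * 4-byte Annex B start codes, in place. Returns the number of NAL units
 * converted, or -1 if a length field does not fit the buffer (the part
 * of the buffer before the bad field has already been rewritten). */
int avcc_to_annexb(uint8_t *buf, size_t len)
{
    static const uint8_t start_code[4] = {0x00, 0x00, 0x00, 0x01};
    size_t offset = 0;
    int count = 0;

    while (offset + 4 <= len) {
        /* Assemble the length byte by byte so host endianness does not matter. */
        uint32_t nal_len = ((uint32_t)buf[offset] << 24) |
                           ((uint32_t)buf[offset + 1] << 16) |
                           ((uint32_t)buf[offset + 2] << 8) |
                           (uint32_t)buf[offset + 3];
        if (nal_len > len - offset - 4)
            return -1;                      /* truncated or corrupt input */
        memcpy(buf + offset, start_code, 4);
        offset += 4 + (size_t)nal_len;
        count++;
    }
    return offset == len ? count : -1;      /* trailing bytes are an error */
}

/* nal_unit_type is the low 5 bits of the first byte of a NAL unit
 * (5 = IDR slice, 7 = SPS, 8 = PPS). */
int nal_unit_type(uint8_t first_byte)
{
    return first_byte & 0x1F;
}
```

After conversion, the byte that follows each start code identifies the NAL unit, so a caller could use nal_unit_type to tell SPS, PPS, and IDR slices apart.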

Go to the sandbox directory and play the H.264 file:

ffplay hwEncoder.h264 

ffmpeg

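Installing ffmpeg on macOS

ffmpeg can be installed either from source or from the command line. There are plenty of tutorials online, so only the simpler command-line route is covered here.

We use brew to install the ffmpeg and ffplay commands.

For brew and its usage, see "Installing and Using Homebrew".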

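Clean install:

If ffmpeg has never been installed on your machine, you can install it directly with the following command. The --with-ffplay option reflects the Homebrew version in use when this was written; newer Homebrew releases have dropped formula options, in which case a plain brew install ffmpeg should be enough.

brew install ffmpeg --with-ffplay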

After installation, run ffplay --help to check that it succeeded.

Reinstall:

If ffmpeg was installed previously, uninstall it completely and then install it again with the command above.

To uninstall:

brew uninstall ffmpeg

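You can also download ffmpeg, ffplay, and the other command-line tools from the ffmpeg website, copy them into your bin directory, and run them directly. The only goal of this section is to be able to inspect audio and video files on a Mac with the ffmpeg command-line tools.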

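Common ffmpeg commands

The best reference for ffmpeg commands is the official website. There are also many examples online, and the simpler commands stick in memory once you have used them a few times. The commands below are the most common ones, taken from a book: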

ffprobe

ffprobe is a tool for inspecting the header information of media files.

Inspect the header of an audio file
```
ffprobe 黃昏裏.mp3
```

Output:

```
Input #0, mp3, from '黃昏裏.mp3':
  Metadata:
    title           : 黃昏裏
    artist          : 鄧麗君
    album           : 愛的箴言
    Tagging time    : 2012-08-08T02:48:38
    TYER            : 1998-01-01
  Duration: 00:02:45.75, start: 0.025056, bitrate: 131 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, stereo, s16p, 128 kb/s
    Metadata:
      encoder         : LAME3.97 
    Stream #0:1: Video: mjpeg, yuvj420p(pc, bt470bg/unknown/unknown), 240x240 [SAR 1:1 DAR 1:1], 90k tbr, 90k tbn, 90k tbc
    Metadata:
      title           : e
      comment         : Cover (front)
```

Inspect the header of a video file
```
ffprobe test.mp4
```

Those are the basic ways to inspect audio and video file headers. A few more advanced uses follow.

Print the container-level information: format_name, duration, size, bit_rate, the number of streams (nb_streams), and so on
```
ffprobe -show_format test.mp4
```

Print the most detailed information for every stream in JSON format

ffprobe -print_format json -show_streams test.mp4

Show frame information

ffprobe -show_frames test.mp4

Show packet information

ffprobe -show_packets test.mp4

ffplay

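ffplay is a media player built on the ffmpeg framework plus libSDL, the library that renders audio and video. At the time of writing it depended on libSDL 1.2; newer FFmpeg releases build ffplay against SDL 2.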

Play an audio or video file

ffplay test.mp4
ffplay 黃昏裏.mp3

Play a video, looping 10 times
```
ffplay test.mp4 -loop 10
```

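ffplay can be told which audio or video stream to play

ffplay test.mkv -ast 1  // play the first audio stream; with -ast 2 the second audio stream is played, and if there is no second audio stream the playback is silent.

ffplay test.mkv -vst 1
// play the first video stream; with -vst 2 the second video stream is played, and if there is no second video stream the screen stays black.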

Play a YUV file (replace WIDTHxHEIGHT with the real frame size, for example 480x480)

ffplay -f rawvideo -video_size WIDTHxHEIGHT testVideo.yuv

Play a PCM file

ffplay song.pcm -f s16le -channels 2 -ar 44100

Or

ffplay -f s16le -ar 44100 -ac 1 song.pcm

-f specifies the audio format. Use ffmpeg -formats to list the supported formats:

```
qis-Mac-mini:tvuDebug qi$ ffmpeg -formats | grep PCM
 DE alaw            PCM A-law
 DE f32be           PCM 32-bit floating-point big-endian
 DE f32le           PCM 32-bit floating-point little-endian
 DE f64be           PCM 64-bit floating-point big-endian
 DE f64le           PCM 64-bit floating-point little-endian
 DE mulaw           PCM mu-law
 DE s16be           PCM signed 16-bit big-endian
 DE s16le           PCM signed 16-bit little-endian
 DE s24be           PCM signed 24-bit big-endian
 DE s24le           PCM signed 24-bit little-endian
 DE s32be           PCM signed 32-bit big-endian
 DE s32le           PCM signed 32-bit little-endian
 DE s8              PCM signed 8-bit
 DE u16be           PCM unsigned 16-bit big-endian
 DE u16le           PCM unsigned 16-bit little-endian
 DE u24be           PCM unsigned 24-bit big-endian
 DE u24le           PCM unsigned 24-bit little-endian
 DE u32be           PCM unsigned 32-bit big-endian
 DE u32le           PCM unsigned 32-bit little-endian
 DE u8              PCM unsigned 8-bit

```

Play video frames in YUV420P format

ffplay -f rawvideo -pixel_format yuv420p -s 480x480 texture.yuv

Play an RGB image

ffplay -f rawvideo -pixel_format rgb24 -s 480x480 texture.rgb

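Any discussion of video players has to touch on audio/video synchronization. ffplay implements three synchronization modes: using the audio clock as the master timeline, using the video clock as the master timeline, or using an external clock as the master timeline.

By default ffplay synchronizes to audio. How does audio-master synchronization work?

Every video and audio frame the player receives carries a timestamp (PTS) that says when it should be presented. The strategy is to compare the current video playback time with the current audio playback time. If video is running ahead, slow it down by increasing the delay or repeating the frame; if video is lagging, catch up with audio by reducing the delay or dropping frames. The key parts are the comparison of the two clocks and the computation of the delay. A threshold is used in the comparison, and an adjustment (dropping or repeating a frame) is made only when the difference exceeds it.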
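
The decision described above can be sketched in a few lines of C. This is only an illustration of the strategy, not ffplay's actual code: the SyncAction enum, the function name sync_video_to_audio, and the idea of passing the threshold as a parameter are all assumptions made for the example. Times are in seconds.

```c
/* What to do with the next video frame (hypothetical names). */
typedef enum {
    SYNC_SHOW,  /* within the threshold: display the frame now        */
    SYNC_WAIT,  /* video is ahead: delay, or repeat the previous frame */
    SYNC_DROP   /* video is behind: drop the frame to catch up         */
} SyncAction;

/* Compares the frame's PTS with the audio clock, the master timeline. */
SyncAction sync_video_to_audio(double video_pts, double audio_clock,
                               double threshold)
{
    double diff = video_pts - audio_clock;  /* > 0 means video is ahead */

    if (diff > threshold)
        return SYNC_WAIT;
    if (diff < -threshold)
        return SYNC_DROP;
    return SYNC_SHOW;
}
```

A real player would also turn the size of the difference into a concrete delay; the sketch only shows which of the three branches is taken.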

ffplay lets you choose the synchronization mode explicitly:

Synchronize to audio

ffplay test.mp4 -sync audio

Synchronize to video

ffplay test.mp4 -sync video

Synchronize to an external clock
```
ffplay test.mp4 -sync ext
```

ffmpeg

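ffmpeg is a powerful media conversion tool. It converts between a very wide range of media formats and can also process and edit media with its built-in audio and video filters. In short, it covers almost anything you might want to do when processing video offline.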

List all formats supported by ffmpeg
```
ffmpeg -formats
```

Cut a segment out of a media file (audio or video)

ffmpeg -i input.mp4 -ss 00:00:50.0 -codec copy -t 20 output.mp4 // cut 20 s from input.mp4 starting at 50 s and write it to output.mp4; -ss sets the start offset and -t the duration

If a long video recorded on a phone cannot be shared on WeChat, ffmpeg can split it into several files
```
ffmpeg -i input.mp4 -t 00:00:50 -c copy small-1.mp4 -ss 00:00:50 -codec copy small-2.mp4
```

Extract the audio track from a video file

ffmpeg -i input.mp4 -vn -acodec copy output.m4a

Mute a video, that is, keep only the video stream

ffmpeg -i input.mp4 -an -vcodec copy output.mp4

Extract the video stream from an MP4 file as raw H.264 data

ffmpeg -i output.mp4 -an -vcodec copy -bsf:v h264_mp4toannexb output.h264

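Build an MP4 file from AAC audio and H.264 video
```
ffmpeg -i test.aac -i test.h264 -acodec copy -bsf:a aac_adtstoasc -vcodec copy -f mp4 output.mp4
```

The command above uses a bitstream filter named aac_adtstoasc. Like H.264, AAC has two packaging formats: ADTS, where every frame carries its own header, and the raw form used inside MP4, where the configuration is stored once as an AudioSpecificConfig. The filter converts the former into the latter.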

Convert an audio file to a different codec

ffmpeg -i input.wav -acodec libfdk_aac output.aac

Export raw PCM data from a WAV file

ffmpeg -i input.wav -acodec pcm_s16le -f s16le output.pcm

This produces PCM data with 16 bits per sample, each sample stored in little-endian byte order. The channel count and sample rate are those of the source WAV file.
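
To make the s16le layout concrete, here is a small C sketch. The function names pcm_s16_bytes and read_s16le are made up for this example; the sketch only assumes what the text above states, namely 16-bit samples with the low byte stored first, plus interleaved channels.

```c
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of interleaved 16-bit PCM for the given duration. */
size_t pcm_s16_bytes(unsigned sample_rate, unsigned channels, unsigned seconds)
{
    return (size_t)sample_rate * channels * seconds * 2u;  /* 2 bytes per sample */
}

/* Decodes one signed 16-bit little-endian sample: low byte first. */
int16_t read_s16le(const uint8_t *p)
{
    unsigned u = (unsigned)p[0] | ((unsigned)p[1] << 8);

    /* Map 0x8000..0xFFFF onto the negative range without relying on
     * implementation-defined narrowing conversions. */
    return (int16_t)(u >= 0x8000u ? (int)u - 0x10000 : (int)u);
}
```

One second of 44100 Hz stereo audio therefore takes 44100 * 2 * 2 = 176400 bytes, which is a quick way to sanity-check the size of an exported .pcm file.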

Re-encode the video stream, copy the audio stream, and mux both into an MP4 file
```
ffmpeg -i input.flv -vcodec libx264 -acodec copy output.mp4
```

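Convert an MP4 video to an animated GIF

ffmpeg -i input.mp4 -vf scale=100:-1 -t 5 -r 10 image.gif

The command above sets the width to 100 while keeping the aspect ratio (the scale video filter), sets the frame rate to 10 (-r), processes only the first 5 seconds (-t), and writes a GIF.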

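Export frames of a video as images, for example when you need to analyze what each frame contains
```
ffmpeg -i output.mp4 -r 0.25 frames_%04d.png
```

This command captures one frame every four seconds and writes it as an image; the output files are numbered upward from frames_0001.png.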
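
The %04d in the output name is a printf-style pattern, and the relation between an image number and its position in the video follows from the output rate. The C sketch below illustrates both; frame_filename and frame_time_seconds are hypothetical helpers, and the timestamp is only the nominal value implied by the rate, not something read back from ffmpeg.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats the name of the index-th image (1-based) the same way the
 * frames_%04d.png pattern does. Returns the snprintf result. */
int frame_filename(char *out, size_t cap, int index)
{
    return snprintf(out, cap, "frames_%04d.png", index);
}

/* Nominal timestamp, in seconds, of the index-th image (1-based) when
 * images are produced at `rate` frames per second. rate must be > 0. */
double frame_time_seconds(int index, double rate)
{
    return (double)(index - 1) / rate;
}
```

With -r 0.25, image 1 corresponds to roughly 0 s, image 2 to 4 s, image 3 to 8 s, and so on.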

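A set of images can be combined into a GIF. If you have shot a burst of photos, the following command turns them into a GIF
```
ffmpeg -i frames_%04d.png -r 5 output.gif
```

Use the volume filter to change the volume of an audio file
```
ffmpeg -i input.wav -af 'volume=0.5' output.wav
```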

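The command above halves the volume of input.wav and writes the result to output.wav. You can play it back to hear the difference, or open it in an audio editor to see the change in waveform amplitude.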
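
As a rough illustration of what a volume factor means for 16-bit PCM, the C sketch below multiplies every sample by a gain and clips the result. It is an assumption-laden simplification, not the volume filter's implementation: the name apply_gain_s16 is made up, and truncation toward zero is simply the rounding chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Scales 16-bit samples in place by `gain`, clipping to the int16 range. */
void apply_gain_s16(int16_t *samples, size_t count, double gain)
{
    for (size_t i = 0; i < count; i++) {
        double v = samples[i] * gain;

        if (v > 32767.0)
            v = 32767.0;       /* clip positive overflow */
        if (v < -32768.0)
            v = -32768.0;      /* clip negative overflow */
        samples[i] = (int16_t)v;  /* truncates toward zero */
    }
}
```

A gain of 0.5 halves the waveform amplitude, which is the effect visible in an audio editor; a gain above 1.0 can push loud samples into the clipping branches.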

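Using the fade-in filter

ffmpeg -i input.wav -filter_complex afade=t=in:ss=0:d=5 output.wav

The command above applies a fade-in to the first 5 s of input.wav and writes the result to output.wav. Drag the files from before and after processing into Audacity to compare the waveforms.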

Using the fade-out filter

ffmpeg -i input.wav -filter_complex afade=t=out:st=200:d=5 output.wav

The command above applies a 5 s fade-out to input.wav starting at 200 s and writes the result to output.wav.
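
A fade is a gain that changes over time. The sketch below assumes a straight linear ramp, which is the simplest curve to reason about; the function name fade_gain and its parameters are hypothetical, and the real filter supports other curve shapes that this example does not model.

```c
/* Gain in [0, 1] at time t (seconds) for a fade that starts at `start`
 * and lasts `duration` seconds. fade_out == 0: ramp 0 -> 1 (fade-in);
 * fade_out != 0: ramp 1 -> 0 (fade-out). */
double fade_gain(double t, double start, double duration, int fade_out)
{
    double x;

    if (duration <= 0.0)
        x = (t < start) ? 0.0 : 1.0;   /* degenerate case: an instant step */
    else
        x = (t - start) / duration;    /* progress through the fade */

    if (x < 0.0)
        x = 0.0;                       /* before the fade starts */
    if (x > 1.0)
        x = 1.0;                       /* after the fade has finished */

    return fade_out ? 1.0 - x : x;
}
```

For the fade-out command above (start 200 s, duration 5 s) this gives full volume up to 200 s, half volume at 202.5 s, and silence from 205 s on.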

Mix two audio tracks, for example to add background music to a vocal

ffmpeg -i vocal.wav -i accompany.wav -filter_complex amix=inputs=2:duration=shortest output.wav

The command above mixes vocal.wav and accompany.wav; the duration of output.wav is that of the shorter input.
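
The "shortest input wins" rule is easy to show on raw samples. The sketch below mixes two mono 16-bit buffers by averaging them and stops at the end of the shorter one. It is a simplified stand-in for the idea, under the assumption of equal weights; the name mix_shortest_s16 is made up, and the filter's own weighting and normalization may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Mixes two mono s16 buffers into `out` by averaging sample pairs.
 * Writes min(a_len, b_len) samples and returns that count. */
size_t mix_shortest_s16(const int16_t *a, size_t a_len,
                        const int16_t *b, size_t b_len,
                        int16_t *out)
{
    size_t n = a_len < b_len ? a_len : b_len;  /* duration=shortest */

    for (size_t i = 0; i < n; i++) {
        /* The average of two int16 values always fits in int16. */
        out[i] = (int16_t)(((int)a[i] + (int)b[i]) / 2);
    }
    return n;
}
```

Averaging keeps the result inside the 16-bit range without clipping, at the cost of lowering each input's level.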

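Change tempo without changing pitch

ffmpeg -i vocal.wav -filter_complex atempo=0.5 output.wav

The command above processes vocal.wav at 0.5x speed to produce output.wav, so the duration becomes twice that of the input while the pitch stays the same. This is what is commonly called changing tempo without changing pitch.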

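Add a watermark to a video

ffmpeg -i input.mp4 -i changeba_icon.png -filter_complex '[0:v][1:v]overlay=main_w-overlay_w-10:10:1[out]' -map '[out]' output.mp4

The command above uses several built-in variables: main_w is the width of the main video, overlay_w is the width of the watermark, main_h is the height of the main video, and overlay_h is the height of the watermark.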

Using the video brightness filter

ffmpeg -i input.flv -c:v libx264 -b:v 800k -c:a libfdk_aac -vf eq=brightness=0.25 -f mp4 output.mp4

The brightness parameter is brightness; it ranges from -1.0 to 1.0 and defaults to 0.

Increase the contrast of a video

ffmpeg -i input.flv -c:v libx264 -b:v 800k -c:a libfdk_aac -vf eq=contrast=1.5 -f mp4 output.mp4

The contrast parameter is contrast; it ranges from -2.0 to 2.0 and defaults to 1.0.

Using the video rotation filter

ffmpeg -i input.mp4 -vf "transpose=1" -b:v 600k output.mp4

Using the video crop filter

ffmpeg -i input.mp4 -an -vf "crop=240:480:120:0" -vcodec libx264 -b:v 600k output.mp4
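
The four numbers passed to crop are the output width, the output height, and the x and y of the window's top-left corner, so crop=240:480:120:0 keeps a 240x480 window that starts 120 pixels from the left edge. The C sketch below applies the same idea to a single 8-bit plane; crop_plane is a hypothetical helper written for this example, and it assumes a tightly packed source (stride equal to width).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies a crop_w x crop_h window with top-left corner (x, y) out of a
 * src_w x src_h plane into dst (crop_w * crop_h bytes).
 * Returns 0 on success, -1 if the window does not fit inside the plane. */
int crop_plane(const uint8_t *src, int src_w, int src_h,
               uint8_t *dst, int crop_w, int crop_h, int x, int y)
{
    if (crop_w <= 0 || crop_h <= 0 || x < 0 || y < 0 ||
        crop_w > src_w || crop_h > src_h ||
        x > src_w - crop_w || y > src_h - crop_h)
        return -1;

    for (int row = 0; row < crop_h; row++) {
        /* One memcpy per output row, skipping x pixels on the left. */
        memcpy(dst + (size_t)row * (size_t)crop_w,
               src + (size_t)(y + row) * (size_t)src_w + (size_t)x,
               (size_t)crop_w);
    }
    return 0;
}
```

For a planar YUV 4:2:0 picture the same routine would be run once per plane, with the chroma planes using halved sizes and offsets.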

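Building the ffmpeg libraries for iOS development on macOS

Install Homebrew

Homebrew is a free and open-source package manager that simplifies installing software on Mac OS X. It was originally written by Max Howell. It is well regarded for its extensibility and is widely known in the Ruby on Rails community.

How to install Homebrew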

Download the build helper script

The ffmpeg build script relies on gas-preprocessor.

After downloading, copy gas-preprocessor.pl to /usr/local/bin/ and make it executable:

chmod 777 /usr/local/bin/gas-preprocessor.pl

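Install yasm

Yasm is an assembler and disassembler for the Intel x86 architecture. It can be used to write 16-bit, 32-bit (IA-32), and 64-bit (x86-64) programs. Yasm is a complete rewrite of the Netwide Assembler (NASM); it can usually be used interchangeably with NASM and supports both x86 and x86-64.

Install yasm:

brew install yasm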

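The ffmpeg iOS build script

FFmpeg iOS build script

This is the build script for ffmpeg; it produces libraries for the iOS platform. See its wiki for details.

Run the following command to build the libraries we need (you can choose which CPU architectures to build):

./build-ffmpeg-iOS-framework.sh 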

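After a successful build, the directory structure looks like this:

[Figure: directory structure of the build output]

Adding ffmpeg to the project

Drag the whole FFmpeg-iOS directory built above into the project, then add the following libraries:

libiconv.tbd
libbz2.tbd
libz.tbd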

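A brief introduction to ffmpeg

The ffmpeg libraries

ffmpeg consists of 8 libraries:

avcodec : encoding and decoding (the most important library)
avformat : container format handling
avfilter : filter effects
avdevice : input and output for various devices
avutil : utilities (most of the other libraries depend on it)
postproc : post-processing
swresample : audio sample format conversion
swscale : video pixel format conversion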

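ffmpeg data structures

AVFormatContext: the container format context, the top-level structure that ties everything together; it stores information about the container format of the media file
- iformat: the AVInputFormat of the input
- nb_streams : the number of AVStreams in the input
- streams : the array of AVStreams in the input
- duration : the duration of the input (in microseconds)
- bit_rate : the bit rate of the input
AVInputFormat: one structure per container format (for example FLV, MKV, MP4, AVI)
- name : short name of the container format
- long_name : long name of the container format
- extensions : file extensions of the container format
- id : container format ID
AVStream: one structure per video (or audio) stream in the file
- id : stream ID
- codec : the AVCodecContext of the stream
- time_base : the time base of the stream
- r_frame_rate : the frame rate of the stream
AVCodecContext: the codec context, which stores information about the video (or audio) codec
- codec : the AVCodec in use
- width,height : picture width and height (video only)
- pix_fmt : pixel format (video only)
- sample_rate : sample rate (audio only)
- channels : number of channels (audio only)
- sample_fmt : sample format (audio only)
AVCodec: one structure per video (or audio) codec (for example the H.264 decoder)
- name : short name of the codec
- long_name : long name of the codec
- type : codec type
- id : codec ID
AVPacket : holds one frame of compressed data
- pts : presentation timestamp
- dts : decoding timestamp
- data : the compressed data
- size : size of the compressed data
- stream_index : the AVStream the packet belongs to
AVFrame : holds one frame of decoded pixel (or sample) data
- data : decoded pixel data (or audio samples)
- linesize : for video, the size in bytes of one row of the picture; for audio, the size of the whole audio frame
- width,height : picture width and height (video only)
- key_frame : whether the frame is a key frame (video only)
- pict_type : picture type (video only), for example I, B, P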
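
The pts and dts fields only make sense together with the stream's time_base: a timestamp counts ticks, and time_base says how long one tick is. The C sketch below shows the arithmetic with a stand-in type so that it compiles without ffmpeg. TimeBase, pts_to_seconds, and rescale_pts are hypothetical names for this example; in real code the library's AVRational type together with av_q2d and av_rescale_q plays this role, and the library versions also handle rounding and overflow, which the sketch does not.

```c
#include <stdint.h>

/* Stand-in for a rational time base: one tick lasts num/den seconds.
 * Both fields are assumed to be non-zero. */
typedef struct {
    int num;
    int den;
} TimeBase;

/* Converts a timestamp in ticks to seconds. */
double pts_to_seconds(int64_t pts, TimeBase tb)
{
    return (double)pts * tb.num / tb.den;
}

/* Re-expresses a timestamp in another time base (truncating division). */
int64_t rescale_pts(int64_t pts, TimeBase from, TimeBase to)
{
    return pts * from.num * to.den / ((int64_t)from.den * to.num);
}
```

With the 1/30 time base used by the encoder below, frame number 90 is presented at 3 seconds; in a 1/90000 time base the same instant is tick 270000.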

Software-encoding live video to H.264 with ffmpeg

The overall flow is as follows:

#import "MISoftH264Encoder.h"



@implementation MISoftH264Encoder
{
    AVFormatContext             *pFormatCtx;
    AVOutputFormat              *out_fmt;
    AVStream                    *video_stream;
    AVCodecContext              *pCodecCtx;
    AVCodec                     *pCodec;
    AVPacket                    pkt;
    uint8_t                     *picture_buf;
    AVFrame                     *pFrame;
    int                         picture_size;
    int                         y_size;
    int                         framecnt;
    char                        *out_file;
    
    int                         encoder_h264_frame_width;
    int                         encoder_h264_frame_height;
}

- (instancetype)init
{
    if (self = [super init]) {

    }
    return self;
}

static MISoftH264Encoder *miSoftEncoder_Instance = nil;
+ (instancetype)getInstance
{
    if (miSoftEncoder_Instance == NULL) {
        miSoftEncoder_Instance = [[MISoftH264Encoder alloc] init];
    }
    return miSoftEncoder_Instance;
}

- (void)setFileSavedPath:(NSString *)path
{
    NSUInteger len = [path length];
    char *filepath = (char*)malloc(sizeof(char) * (len + 1));
    [path getCString:filepath maxLength:len + 1 encoding:[NSString defaultCStringEncoding]];
    out_file = filepath;
}

- (int)setEncoderVideoWidth:(int)width height:(int)height bitrate:(int)bitrate
{
    framecnt = 0;
    encoder_h264_frame_width = width;
    encoder_h264_frame_height = height;
    av_register_all();
    pFormatCtx = avformat_alloc_context();
    
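    // Guess the output format from the output file path
    out_fmt = av_guess_format(NULL, out_file, NULL);
    if (out_fmt == NULL) {
        printf("Failed to guess output format! \n");
        return -1;
    }
    pFormatCtx->oformat = out_fmt;
    
    // Open the output file for buffered I/O; AVIO_FLAG_READ_WRITE opens it for reading and writing
    if (avio_open(&pFormatCtx->pb, out_file, AVIO_FLAG_READ_WRITE) < 0){
        printf("Failed to open output file! \n");
        return -1;
    }
    
    // Create a new output stream that will be written to the file
    video_stream = avformat_new_stream(pFormatCtx, 0);
    if (video_stream==NULL){
        return -1;
    }
    
    // Set the stream time base: one tick is 1/30 s, i.e. 30 fps
    video_stream->time_base.num = 1;
    video_stream->time_base.den = 30;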
    
    // Get the codec context from the stream; each AVStream has exactly one AVCodecContext
    pCodecCtx = video_stream->codec;
    
    // Set the codec ID. Every codec has its own ID; for H.264 it is AV_CODEC_ID_H264
    pCodecCtx->codec_id = out_fmt->video_codec;
    pCodecCtx->codec_type = AVMEDIA_TYPE_VIDEO;
    pCodecCtx->pix_fmt = AV_PIX_FMT_YUV420P; // planar YUV 4:2:0
    pCodecCtx->width = encoder_h264_frame_width;
    pCodecCtx->height = encoder_h264_frame_height;
    pCodecCtx->time_base.num = 1;
    pCodecCtx->time_base.den = 30;
    pCodecCtx->bit_rate = bitrate;
    
    // Quantizer range, which bounds the video quality (commonly qmin=10, qmax=51)
    pCodecCtx->qmin = 10;
    pCodecCtx->qmax = 51;
    
//    // GOP size (the interval between two I-frames)
//    pCodecCtx->gop_size = 30;
//
//    // Maximum number of B-frames. B-frames are bidirectionally predicted and compress better than I- and P-frames,
//    // so at the same bit rate more B-frames generally means a sharper picture; many large video sites use
//    // extra B-frames for their HD streams.
//    // The cost is higher encoding and decoding complexity, which takes more CPU and time
//    pCodecCtx->max_b_frames = 5;
//
//    // Optional settings
    AVDictionary *param = 0;
    // H.264
    if(pCodecCtx->codec_id == AV_CODEC_ID_H264) {
        // preset balances encoding speed against quality.
        av_dict_set(&param, "preset", "slow", 0);

        // tune adapts the encoder to the type of content or to special requirements.
        // zerolatency: for cases that need very low latency, such as live streaming
        av_dict_set(&param, "tune", "zerolatency", 0);
    }
    
    // Print the format details (uses printf internally; comment this line out if the output is not needed)
//    av_dump_format(pFormatCtx, 0, out_file, 1);
    
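    // Find the encoder that matches codec_id
    pCodec = avcodec_find_encoder(pCodecCtx->codec_id);
    if (!pCodec) {
        printf("Can not find encoder! \n");
        return -1;
    }
    
    // Open the encoder with the options in param
    if (avcodec_open2(pCodecCtx, pCodec,&param) < 0) {
        printf("Failed to open encoder! \n");
        return -1;
    }
    
    // Allocate the AVFrame that holds the raw picture
    pFrame = av_frame_alloc();
    
    // Compute the byte size of one picture for this pixel format and resolution
    // (for YUV420P this is width * height * 3 / 2)
    picture_size = avpicture_get_size(pCodecCtx->pix_fmt, pCodecCtx->width, pCodecCtx->height);
    // Fill in the line sizes; picture_buf is still NULL here, the data pointers are set for every frame
    avpicture_fill((AVPicture *)pFrame, picture_buf, pCodecCtx->pix_fmt, pCodecCtx->width, pCodecCtx->height);
    
    // Write the header of the output format. Most formats have their own header;
    // see the H.264 muxer implementation for the details
    avformat_write_header(pFormatCtx, NULL);
    
    // Allocate the AVPacket that receives the data produced by encoding an AVFrame
    av_new_packet(&pkt, picture_size);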
    return 0;
}

/*
 * Encode a CMSampleBufferRef to H.264 and write it to the file
 *
 */
- (void)encoderToH264:(CMSampleBufferRef)sampleBuffer
{
    // Get the CVPixelBufferRef from the CMSampleBufferRef
    CVPixelBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    
    // Lock the base address of imageBuffer before reading its pixels
    if (CVPixelBufferLockBaseAddress(imageBuffer, 0) == kCVReturnSuccess) {
        // Read the Y plane (plane 0) and the interleaved UV plane (plane 1) from the CVPixelBufferRef
        UInt8 *bufferPtr = (UInt8 *)CVPixelBufferGetBaseAddressOfPlane(imageBuffer,0);
        UInt8 *bufferPtr1 = (UInt8 *)CVPixelBufferGetBaseAddressOfPlane(imageBuffer,1);
        
        size_t width = CVPixelBufferGetWidth(imageBuffer);
        size_t height = CVPixelBufferGetHeight(imageBuffer);
        size_t bytesrow0 = CVPixelBufferGetBytesPerRowOfPlane(imageBuffer,0);
        size_t bytesrow1  = CVPixelBufferGetBytesPerRowOfPlane(imageBuffer,1);
        UInt8 *yuv420_data = (UInt8 *)malloc(width * height *3/2);
        
        UInt8 *pY = bufferPtr ;
        UInt8 *pUV = bufferPtr1;
        UInt8 *pU = yuv420_data + width*height;
        UInt8 *pV = pU + width*height/4;
        for(int i =0;i<height;i++)
        {
            memcpy(yuv420_data+i*width,pY+i*bytesrow0,width);
        }
        for(int j = 0;j<height/2;j++)
        {
            for(int i =0;i<width/2;i++)
            {
                *(pU++) = pUV[i<<1];
                *(pV++) = pUV[(i<<1) + 1];
            }
            pUV+=bytesrow1;
        }
        
        
        
        // Point the frame's planes at the converted Y, U and V data
        picture_buf = yuv420_data;
        y_size = pCodecCtx->width * pCodecCtx->height;
        pFrame->data[0] = picture_buf;              // Y
        pFrame->data[1] = picture_buf+ y_size;      // U
        pFrame->data[2] = picture_buf+ y_size*5/4;  // V
        
        // Set the presentation timestamp of the current frame
        pFrame->pts = framecnt;
        int got_picture = 0;
        
        // Set the width, height and pixel format
        pFrame->width = encoder_h264_frame_width;
        pFrame->height = encoder_h264_frame_height;
        pFrame->format = AV_PIX_FMT_YUV420P;
        
        // Encode the raw frame (AVFrame); the compressed data is returned in pkt
        int ret = avcodec_encode_video2(pCodecCtx, &pkt, pFrame, &got_picture);
        if(ret < 0) {
            printf("Failed to encode! \n");
        }else if (ret == 0){
            if (pkt.buf) {
                printf("encode success, data length: %d \n",pkt.buf->size);
            }
            
        }
        
        // On success, write the AVPacket to the output file
        if (got_picture == 1) {  // the encoder produced a packet, so write it out
            framecnt++;
            pkt.stream_index = video_stream->index;
            ret = av_write_frame(pFormatCtx, &pkt);
            
            av_free_packet(&pkt);
        }
        free(yuv420_data);
        
        // Unlock only a buffer that was locked successfully
        CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
    }
}

/*
 * Release resources
 */
- (void)freeH264Resource
{
    // Flush the frames still buffered inside the encoder
    int ret = flush_encoder(pFormatCtx,0);
    if (ret < 0) {
        printf("Flushing encoder failed\n");
    }
    
    // Write out any pending packets and the stream trailer
    av_write_trailer(pFormatCtx);
    
    // Release the resources
    if (video_stream){
        avcodec_close(video_stream->codec);
        av_frame_free(&pFrame);   // frames from av_frame_alloc are released with av_frame_free
    }
    avio_close(pFormatCtx->pb);
    avformat_free_context(pFormatCtx);
}

int flush_encoder(AVFormatContext *fmt_ctx,unsigned int stream_index)
{
    int ret;
    int got_frame;
    AVPacket enc_pkt;
    if (!(fmt_ctx->streams[stream_index]->codec->codec->capabilities &
          CODEC_CAP_DELAY))
        return 0;
    
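    while (1) {
        av_init_packet(&enc_pkt);
        enc_pkt.data = NULL;   // let the encoder allocate the payload
        enc_pkt.size = 0;
        // A NULL frame asks the encoder to emit the frames it is still holding
        ret = avcodec_encode_video2 (fmt_ctx->streams[stream_index]->codec, &enc_pkt,
                                     NULL, &got_frame);
        if (ret < 0)
            break;
        if (!got_frame){
            ret=0;
            break;
        }
        enc_pkt.stream_index = stream_index;
        ret = av_write_frame(fmt_ctx, &enc_pkt);
        av_free_packet(&enc_pkt);   // release the packet once it has been written
        if (ret < 0)
            break;
    }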
    return ret;
}


@end

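
The pixel shuffling inside encoderToH264: is a conversion from a bi-planar layout (a Y plane followed by an interleaved UV plane, often called NV12) to the planar YUV420P layout the encoder was configured with. The sketch below isolates that step in plain C. The function name nv12_to_i420 is made up for this example, and it assumes an even width and height and a camera that really delivers a bi-planar 4:2:0 buffer, as the method above also assumes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Converts one bi-planar 4:2:0 picture (Y plane + interleaved UV plane)
 * into planar YUV420P. dst must hold width * height * 3 / 2 bytes.
 * Strides are in bytes and may be larger than the visible width. */
void nv12_to_i420(const uint8_t *y_plane, size_t y_stride,
                  const uint8_t *uv_plane, size_t uv_stride,
                  size_t width, size_t height, uint8_t *dst)
{
    uint8_t *dst_u = dst + width * height;                /* U follows Y */
    uint8_t *dst_v = dst_u + (width / 2) * (height / 2);  /* V follows U */

    /* Copy Y row by row, dropping any stride padding. */
    for (size_t row = 0; row < height; row++)
        memcpy(dst + row * width, y_plane + row * y_stride, width);

    /* Split the interleaved UVUV... rows into separate U and V planes. */
    for (size_t row = 0; row < height / 2; row++) {
        const uint8_t *uv = uv_plane + row * uv_stride;
        for (size_t col = 0; col < width / 2; col++) {
            *dst_u++ = uv[2 * col];
            *dst_v++ = uv[2 * col + 1];
        }
    }
}
```

Keeping the conversion in its own function makes it possible to test it without a camera or an encoder, and to swap it for an optimized routine later.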

Go to the sandbox directory and play the H.264 file with ffplay:

ffplay softEncoder.h264

The next article will cover how to mux audio and video.