iOS: Pulling, Parsing, Decoding, Synchronizing, and Rendering Audio/Video Streams from a Complete File

Date: 2019-11-06

Tags: iOS, full-file parsing, decoding, synchronization, audio/video rendering

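Requirements

Parse the audio and video streams in a file, decode and synchronize them, render the video to the screen, and play the audio through the speaker. If you only need to play a video file, the high-level player in AVFoundation will do. This article instead takes the lowest-level route, which gives access to the raw audio and video frame data.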

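How It Works

This article is divided into three main parts. The parsing module uses FFmpeg to parse the audio and video streams in the file. The decoding module decodes audio and video with either FFmpeg or Apple's native decoders. The rendering module uses OpenGL to render the video to the screen and an Audio Queue player to output the audio through the speaker.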

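Prerequisites

Note: the detailed implementation of every module mentioned in this article is covered in the linked articles below; consult the relevant walkthroughs as needed.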

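Overall Architecture

This article uses decoding a .MOV media file as its example. The file contains H.264-encoded video and AAC-encoded audio. FFmpeg first parses the stream information in the file, and the parsed results are stored in AVPacket structures. The audio and video frame data are then extracted separately: audio frames are decoded by the FFmpeg decoder or by Audio Converter from Apple's native frameworks, and video frames are decoded by FFmpeg or by the decoder in Apple's native VideoToolbox framework. Decoded audio is PCM and decoded video is raw YUV. The audio and video data are synchronized by timestamp. Finally, the PCM audio is passed to Audio Queue for playback, and the raw YUV video is wrapped in a CMSampleBufferRef and passed to OpenGL to be rendered on screen. That completes the full pipeline for pulling an audio/video stream from a file.

Note: pulling an RTMP stream from a URL and then decoding and playing it follows essentially the same flow as pulling a stream from a file, except that the audio and video data must first be received over a socket before decoding and the later stages.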

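Simplified Workflow

Parse

Create the AVFormatContext context object: AVFormatContext *avformat_alloc_context(void);
Open the file and fill in the context object: int avformat_open_input(AVFormatContext **ps, const char *url, AVInputFormat *fmt, AVDictionary **options);
Read the stream information in the file: int avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options);
Get the audio and video streams in the file: m_formatContext->streams[audio/video index]
Start parsing to read the audio and video frames in the file: int av_read_frame(AVFormatContext *s, AVPacket *pkt);
For video packets, use av_bitstream_filter_filter to produce key information such as the SPS and PPS.
The AVPacket values read this way contain all the compressed audio and video data in the file.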
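
Each AVPacket read this way carries its timestamps in units of its stream's time base, so they have to be rescaled before audio and video can be compared. The following is a minimal sketch of that conversion in plain C. The TimeBase struct and pts_to_ms are hypothetical names that only mirror the num/den fields of FFmpeg's AVRational; FFmpeg itself offers av_rescale_q, which also guards against overflow and controls rounding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for FFmpeg's AVRational: one tick lasts num/den seconds. */
typedef struct {
    int num;
    int den;
} TimeBase;

/* Converts a timestamp expressed in time-base ticks to whole milliseconds.
 * The result is truncated toward zero; very large pts values could overflow,
 * which this sketch does not handle. */
static int64_t pts_to_ms(int64_t pts, TimeBase tb) {
    assert(tb.den > 0);
    return pts * 1000 * tb.num / tb.den;
}
```

For example, with an assumed time base of 1/600, a pts of 1200 corresponds to 2000 ms.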

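Decoding

Decoding with FFmpeg

Get the decoder context of the stream: formatContext->streams[a/v index]->codec;
Find the decoder from the decoder context: AVCodec *avcodec_find_decoder(enum AVCodecID id);
Open the decoder: int avcodec_open2(AVCodecContext *avctx, const AVCodec *codec, AVDictionary **options);
Send the compressed audio/video data from the file to the decoder: int avcodec_send_packet(AVCodecContext *avctx, const AVPacket *avpkt);
Receive the decoded audio/video data in a loop: int avcodec_receive_frame(AVCodecContext *avctx, AVFrame *frame);
Audio data may need to be resampled into a format the device supports before it can be played (using SwrContext).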
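
To make the last step more concrete, here is a minimal sketch of one conversion that resampling commonly involves: turning planar float samples into interleaved signed 16-bit samples. It assumes the decoder produced planar float (AV_SAMPLE_FMT_FLTP in FFmpeg terms) and that the device wants interleaved S16. The function names are hypothetical, and the sketch covers only the sample-format change, not sample-rate conversion, which swr_convert also handles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamps a float sample to [-1, 1] and scales it to the signed 16-bit range. */
static int16_t float_to_s16(float s) {
    if (s > 1.0f) s = 1.0f;
    if (s < -1.0f) s = -1.0f;
    return (int16_t)(s * 32767.0f);
}

/* planes[c][f] is sample f of channel c (planar layout).
 * out receives frames * channels samples as L R L R ... (interleaved layout). */
static void planar_float_to_interleaved_s16(const float *const *planes,
                                            int channels,
                                            size_t frames,
                                            int16_t *out) {
    assert(planes != NULL && out != NULL && channels > 0);
    for (size_t f = 0; f < frames; f++) {
        for (int c = 0; c < channels; c++) {
            out[f * (size_t)channels + (size_t)c] = float_to_s16(planes[c][f]);
        }
    }
}
```

Interleaved 16-bit PCM is a layout that an Audio Queue can be configured to play directly, which is why a conversion of this kind usually sits between the decoder and the player.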

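Decoding video with VideoToolbox

Separate and extract the key NALU header information (SPS, PPS, and so on) from the extra data parsed by FFmpeg.
Use that information to create the video format description, a CMVideoFormatDescriptionRef: CMVideoFormatDescriptionCreateFromH264ParameterSets / CMVideoFormatDescriptionCreateFromHEVCParameterSets
Create the decoder with VTDecompressionSessionCreate, specifying the relevant parameters.
Put the compressed data into a CMBlockBufferRef: CMBlockBufferCreateWithMemoryBlock
Start decoding: VTDecompressionSessionDecodeFrame
Receive the decoded video data in the callback.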
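
The first step, pulling the SPS and PPS out of the extra data, can be sketched in plain C as follows. This assumes the extra data is in Annex B form, that is, NAL units separated by 00 00 01 or 00 00 00 01 start codes, and it handles H.264 only (NAL type 7 is SPS, type 8 is PPS). NalUnit and find_h264_nal are hypothetical names for illustration, not the author's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data; /* first byte of the NAL unit (its header byte) */
    size_t size;         /* NAL unit length in bytes, start code excluded */
} NalUnit;

/* Returns the length of the start code at p (3 or 4 bytes), or 0 if there is none. */
static size_t start_code_len(const uint8_t *p, size_t remaining) {
    if (remaining >= 4 && p[0] == 0 && p[1] == 0 && p[2] == 0 && p[3] == 1) return 4;
    if (remaining >= 3 && p[0] == 0 && p[1] == 0 && p[2] == 1) return 3;
    return 0;
}

/* Finds the first H.264 NAL unit of the wanted type in an Annex B buffer.
 * Returns 1 and fills *out when found, 0 otherwise. */
static int find_h264_nal(const uint8_t *buf, size_t len,
                         uint8_t wanted_type, NalUnit *out) {
    assert(buf != NULL && out != NULL);
    size_t i = 0;
    while (i < len) {
        size_t sc = start_code_len(buf + i, len - i);
        if (sc == 0) {
            i++;
            continue;
        }
        size_t begin = i + sc;
        size_t end = begin;
        /* The NAL unit runs until the next start code or the end of the buffer. */
        while (end < len && start_code_len(buf + end, len - end) == 0) end++;
        /* The low 5 bits of the header byte hold the H.264 NAL unit type. */
        if (begin < end && (buf[begin] & 0x1F) == wanted_type) {
            out->data = buf + begin;
            out->size = end - begin;
            return 1;
        }
        i = end;
    }
    return 0;
}
```

The data pointers and sizes found this way, with start codes stripped, are the shape of input that CMVideoFormatDescriptionCreateFromH264ParameterSets takes for its parameter set arrays.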

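Decoding audio with Audio Converter

Create the decoder from the ASBD structures that describe the source format and the decoded format: AudioConverterNewSpecific
Specify the decoder type: AudioClassDescription
Start decoding: AudioConverterFillComplexBuffer
Note: each decode operation needs 1024 samples to be available before it can complete.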
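
The 1024-sample requirement also fixes how much audio each decode operation yields. The sketch below works through that arithmetic in plain C. The function names are hypothetical, and the 48 kHz, stereo, 16-bit values in the example are assumptions chosen to match the ASBD used later in this article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Playback time covered by one packet, in milliseconds. */
static double aac_packet_duration_ms(uint32_t frames_per_packet, double sample_rate) {
    assert(sample_rate > 0.0);
    return 1000.0 * (double)frames_per_packet / sample_rate;
}

/* Size in bytes of the interleaved PCM produced by decoding one packet. */
static size_t pcm_bytes_per_packet(uint32_t frames_per_packet,
                                   uint32_t channels,
                                   uint32_t bytes_per_sample) {
    return (size_t)frames_per_packet * channels * bytes_per_sample;
}
```

With 1024 frames per packet at 48000 Hz, one packet covers about 21.3 ms of audio, and as 16-bit stereo PCM it decodes to 4096 bytes. That duration is a useful reference point when pacing how fast decoded audio is fed to the playback queue.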

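Synchronization

Because we are decoding audio and video from a local file, as long as the timestamps in the file are entirely correct, the decoded data can be played directly and will be in sync. All we have to do is make sure the decoded audio and video are rendered at the same time.

Note: a stream pulled from an RTMP address, for example, can lose data for some period because of network problems, which puts audio and video out of sync, so a mechanism is needed to correct the timestamps. Broadly, the mechanism is to have video chase audio. A later article will cover it in detail, so it is not discussed further here.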
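
As a rough illustration of what "video chases audio" can mean, the sketch below treats the audio clock as the reference and decides what to do with each video frame. This is an assumption about one common approach, not the mechanism the author describes elsewhere. SyncAction, sync_video_to_audio, and the tolerance parameter are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    SYNC_RENDER, /* close enough to the audio clock: show the frame now */
    SYNC_WAIT,   /* video is ahead of audio: hold the frame back */
    SYNC_DROP    /* video is behind audio: skip the frame to catch up */
} SyncAction;

/* Compares a video frame's pts with the current audio clock (both in ms). */
static SyncAction sync_video_to_audio(int64_t video_pts_ms,
                                      int64_t audio_clock_ms,
                                      int64_t tolerance_ms) {
    assert(tolerance_ms >= 0);
    int64_t diff = video_pts_ms - audio_clock_ms;
    if (diff > tolerance_ms) return SYNC_WAIT;
    if (diff < -tolerance_ms) return SYNC_DROP;
    return SYNC_RENDER;
}
```

With an assumed tolerance of 40 ms, a frame stamped 1000 ms is rendered while the audio clock reads 1010 ms, held back while it reads 900 ms, and dropped once it reads 1100 ms.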

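Rendering

The raw video data obtained in the steps above can be rendered directly to the screen through a wrapped OpenGL ES layer; Apple's native frameworks also provide GLKViewController for on-screen rendering. Audio is played here by handing the audio frame data to Audio Queue.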

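File Structure

Quick Start

Decoding with FFmpeg

First initialize FFmpeg with the file path so it can parse the audio and video streams. Then decode the audio and video data with FFmpeg's decoders. Note that decoding starts from the first I-frame read, which is what keeps audio and video in sync. Decoded audio must first be put into a transfer queue, because the audio queue player is designed to keep pulling data from that queue for playback. Video data can be rendered directly.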

- (void)startRenderAVByFFmpegWithFileName:(NSString *)fileName {
    NSString *path = [[NSBundle mainBundle] pathForResource:fileName ofType:@"MOV"];
    
    XDXAVParseHandler *parseHandler = [[XDXAVParseHandler alloc] initWithPath:path];
    
    XDXFFmpegVideoDecoder *videoDecoder = [[XDXFFmpegVideoDecoder alloc] initWithFormatContext:[parseHandler getFormatContext] videoStreamIndex:[parseHandler getVideoStreamIndex]];
    videoDecoder.delegate = self;
    
    XDXFFmpegAudioDecoder *audioDecoder = [[XDXFFmpegAudioDecoder alloc] initWithFormatContext:[parseHandler getFormatContext] audioStreamIndex:[parseHandler getAudioStreamIndex]];
    audioDecoder.delegate = self;
    
    static BOOL isFindIDR = NO;
    
    [parseHandler startParseGetAVPackeWithCompletionHandler:^(BOOL isVideoFrame, BOOL isFinish, AVPacket packet) {
        if (isFinish) {
            isFindIDR = NO;
            [videoDecoder stopDecoder];
            [audioDecoder stopDecoder];
            dispatch_async(dispatch_get_main_queue(), ^{
                self.startWorkBtn.hidden = NO;
            });
            return;
        }
        
        if (isVideoFrame) { // Video
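            // AV_PKT_FLAG_KEY marks a keyframe; test the bit instead of comparing the whole flags field.
            if ((packet.flags & AV_PKT_FLAG_KEY) && isFindIDR == NO) {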
                isFindIDR = YES;
            }
            
            if (!isFindIDR) {
                return;
            }
            
            [videoDecoder startDecodeVideoDataWithAVPacket:packet];
        }else {             // Audio
            [audioDecoder startDecodeAudioDataWithAVPacket:packet];
        }
    }];
}

-(void)getDecodeVideoDataByFFmpeg:(CMSampleBufferRef)sampleBuffer {
    CVPixelBufferRef pix = CMSampleBufferGetImageBuffer(sampleBuffer);
    [self.previewView displayPixelBuffer:pix];
}

- (void)getDecodeAudioDataByFFmpeg:(void *)data size:(int)size pts:(int64_t)pts isFirstFrame:(BOOL)isFirstFrame {
//    NSLog(@"demon test - %d",size);
    // Put audio data from audio file into audio data queue
    [self addBufferToWorkQueueWithAudioData:data size:size pts:pts];

    // control rate
    usleep(14.5*1000);
}

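Decoding with the native frameworks

First initialize FFmpeg with the file path so it can parse the audio and video streams. Next, build an ASBD structure from the actual audio stream data in the file to initialize the audio decoder, then play and render the decoded audio and video respectively. Note that if the file's video is H.265-encoded, the decoded frames contain B-frames, so their timestamps arrive out of order. A linked list is used to sort them, and the sorted frames are then rendered to the screen.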

- (void)startRenderAVByOriginWithFileName:(NSString *)fileName {
    NSString *path = [[NSBundle mainBundle] pathForResource:fileName ofType:@"MOV"];
    XDXAVParseHandler *parseHandler = [[XDXAVParseHandler alloc] initWithPath:path];
    
    XDXVideoDecoder *videoDecoder = [[XDXVideoDecoder alloc] init];
    videoDecoder.delegate = self;

    // Origin file aac format
    AudioStreamBasicDescription audioFormat = {
        .mSampleRate         = 48000,
        .mFormatID           = kAudioFormatMPEG4AAC,
        .mChannelsPerFrame   = 2,
        .mFramesPerPacket    = 1024,
    };
    
    XDXAduioDecoder *audioDecoder = [[XDXAduioDecoder alloc] initWithSourceFormat:audioFormat
                                                                     destFormatID:kAudioFormatLinearPCM
                                                                       sampleRate:48000
                                                              isUseHardwareDecode:YES];
    
    [parseHandler startParseWithCompletionHandler:^(BOOL isVideoFrame, BOOL isFinish, struct XDXParseVideoDataInfo *videoInfo, struct XDXParseAudioDataInfo *audioInfo) {
        if (isFinish) {
            [videoDecoder stopDecoder];
            [audioDecoder freeDecoder];
            
            dispatch_async(dispatch_get_main_queue(), ^{
                self.startWorkBtn.hidden = NO;
            });
            return;
        }
        
        if (isVideoFrame) {
            [videoDecoder startDecodeVideoData:videoInfo];
        }else {
            [audioDecoder decodeAudioWithSourceBuffer:audioInfo->data
                                     sourceBufferSize:audioInfo->dataSize
                                      completeHandler:^(AudioBufferList * _Nonnull destBufferList, UInt32 outputPackets, AudioStreamPacketDescription * _Nonnull outputPacketDescriptions) {
                                          // Put audio data from audio file into audio data queue
                                          [self addBufferToWorkQueueWithAudioData:destBufferList->mBuffers->mData size:destBufferList->mBuffers->mDataByteSize pts:audioInfo->pts];

                                          // control rate
                                          usleep(16.8*1000);
                                      }];
        }
    }];
}

- (void)getVideoDecodeDataCallback:(CMSampleBufferRef)sampleBuffer isFirstFrame:(BOOL)isFirstFrame {
    if (self.hasBFrame) {
        // Note: the first frame does not need to be sorted.
        if (isFirstFrame) {
            CVPixelBufferRef pix = CMSampleBufferGetImageBuffer(sampleBuffer);
            [self.previewView displayPixelBuffer:pix];
            return;
        }
        
        [self.sortHandler addDataToLinkList:sampleBuffer];
    }else {
        CVPixelBufferRef pix = CMSampleBufferGetImageBuffer(sampleBuffer);
        [self.previewView displayPixelBuffer:pix];
    }
}

#pragma mark - Sort Callback
- (void)getSortedVideoNode:(CMSampleBufferRef)sampleBuffer {
    int64_t pts = (int64_t)(CMTimeGetSeconds(CMSampleBufferGetPresentationTimeStamp(sampleBuffer)) * 1000);
    static int64_t lastpts = 0;
//    NSLog(@"Test marigin - %lld",pts - lastpts);
    lastpts = pts;
    
    [self.previewView displayPixelBuffer:CMSampleBufferGetImageBuffer(sampleBuffer)];
}
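
The sort handler used above is not shown in this article. As a minimal sketch of the idea, the plain C code below keeps decoded frames in a singly linked list ordered by pts, so frames that arrive in decode order come back out in display order. FrameNode, frame_list_insert, and frame_list_pop are hypothetical names, and the frame pointer is an opaque payload (for instance a retained CMSampleBufferRef); this is not the author's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct FrameNode {
    int64_t pts;            /* presentation timestamp used as the sort key */
    void *frame;            /* opaque payload owned by the caller */
    struct FrameNode *next;
} FrameNode;

/* Inserts a frame so the list stays in ascending pts order.
 * Returns 1 on success, 0 if memory could not be allocated. */
static int frame_list_insert(FrameNode **head, int64_t pts, void *frame) {
    assert(head != NULL);
    FrameNode *node = (FrameNode *)malloc(sizeof *node);
    if (node == NULL) return 0;
    node->pts = pts;
    node->frame = frame;
    /* Walk the links until the first node with a larger pts. */
    FrameNode **link = head;
    while (*link != NULL && (*link)->pts <= pts) link = &(*link)->next;
    node->next = *link;
    *link = node;
    return 1;
}

/* Removes the frame with the smallest pts. Returns 1 if one was popped, 0 if empty. */
static int frame_list_pop(FrameNode **head, int64_t *pts, void **frame) {
    assert(head != NULL && pts != NULL && frame != NULL);
    FrameNode *first = *head;
    if (first == NULL) return 0;
    *pts = first->pts;
    *frame = first->frame;
    *head = first->next;
    free(first);
    return 1;
}
```

Frames inserted with pts 0, 120, 40, 80 (a typical decode order when B-frames are present) pop back out as 0, 40, 80, 120. A real player would also need to buffer a few frames before popping, since a frame with a smaller pts may still be on its way.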
