FFmpeg Container Format Handling

Date: 2019-11-11


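This article is the author's original work; when reposting, please credit the source: http://www.javashuo.com/article/p-sulhdzuv-ge.html

The material on FFmpeg container format handling is split across the following articles:
[1]. FFmpeg Container Format Handling - Introduction
[2]. FFmpeg Container Format Handling - Demuxing Example
[3]. FFmpeg Container Format Handling - Muxing Example
[4]. FFmpeg Container Format Handling - Remuxing Example
These articles are closely related, but a single article would have been too long, so the material was split. Section numbers are left unchanged across the parts. Everything is based on FFmpeg 4.1.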

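1. Overview

1.1 Introduction to container formats

A container format can be viewed as an outer shell around encoded streams (audio, video, and so on): the encoded data is stored inside a file of that container format. The wrapper is also called a container, which is the more vivid of the two terms. A container is a vessel that holds contents: if the drink is the content, the bottle is the container.

Different container formats suit different scenarios and support different codecs. Some commonly used container formats are listed below.
The table is quoted from the article 「視音頻編解碼技術零基礎學習方法」 (a beginner's guide to learning audio/video codec technology).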

Name (file extension)	Developer	Streaming	Supported video codecs	Supported audio codecs	Typical use today
AVI(.avi)	Microsoft	Not supported	Almost all formats	Almost all formats	BitTorrent movie downloads
Flash Video(.flv)	Adobe	Supported	Sorenson/VP6/H.264	MP3/ADPCM/Linear PCM/AAC, etc.	Internet video sites
MP4(.mp4)	MPEG	Supported	MPEG-2/MPEG-4/H.264/H.263, etc.	AAC/MPEG-1 Layers I,II,III/AC-3, etc.	Internet video sites
MPEGTS(.ts)	MPEG	Supported	MPEG-1/MPEG-2/MPEG-4/H.264	MPEG-1 Layers I,II,III/AAC	IPTV, digital TV
Matroska(.mkv)	CoreCodec	Supported	Almost all formats	Almost all formats	Internet video sites
Real Video(.rmvb)	Real Networks	Supported	RealVideo 8,9,10	AAC/Cook Codec/RealAudio Lossless	BitTorrent movie downloads

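1.2 Container formats in FFmpeg

Container format handling in FFmpeg involves opening the input file, opening the output file, reading encoded frames from the input file, and writing encoded frames to the output file. None of these steps touches encoding or decoding.

In FFmpeg, mux is short for multiplex: combining several streams (video, audio, subtitles, and so on) into a single output (a regular file, a network stream, and so on). demux is the reverse of mux: separating the individual streams (video, audio, subtitles, and so on) from a single input. Demuxing therefore deals with the input format and muxing with the output format.

An input or output media format involves two notions, file format and container format. The file format is identified by the file extension and serves mainly as a hint: the extension suggests the file type (or container format). The container format is the actual container that stores the media content. Different container formats correspond to different file extensions, and the file format name is often used to stand for the container format, for example ts (file format) for mpegts (container format).

For example, if we rename test.ts to test.mkv, the mkv extension suggests that the container format is Matroska, but the file content has not changed at all, and ffprobe still correctly detects the container format as mpegts.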
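The renaming experiment works because detection looks at the bytes in the file, not at its name. As a rough illustration only, the C function below checks for the MPEG-TS sync byte 0x47 at the start of consecutive 188-byte packets. It is a simplified sketch and not FFmpeg's own probe logic; the name `looks_like_mpegts` and the `min_packets` parameter are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TS_PACKET_SIZE 188   /* fixed MPEG-TS packet length in bytes */
#define TS_SYNC_BYTE   0x47  /* first byte of every MPEG-TS packet */

/*
 * Hypothetical helper, not part of FFmpeg: returns 1 if the buffer starts
 * with at least min_packets whole TS packets that each begin with the sync
 * byte, and 0 otherwise. The file name is never consulted.
 */
int looks_like_mpegts(const uint8_t *buf, size_t size, size_t min_packets)
{
    assert(buf != NULL && min_packets > 0);

    if (size / TS_PACKET_SIZE < min_packets)
        return 0;                       /* not enough data to decide */

    for (size_t i = 0; i < min_packets; i++) {
        if (buf[i * TS_PACKET_SIZE] != TS_SYNC_BYTE)
            return 0;                   /* sync byte missing at a packet boundary */
    }
    return 1;
}
```

A caller would read the first few kilobytes of a file into a buffer and pass it in; a real prober has to cope with leading garbage and other packet sizes, which this sketch ignores.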

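1.2.1 Listing the container formats FFmpeg supports

The command ffmpeg -formats lists the container formats FFmpeg supports. There are a great many of them; only a few of the most common are shown below: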

think@opensuse> ffmpeg -formats
File formats:
 D. = Demuxing supported
 .E = Muxing supported
 --
 DE flv             FLV (Flash Video)
 D  aac             raw ADTS AAC (Advanced Audio Coding)
 DE h264            raw H.264 video
 DE hevc            raw HEVC video
  E mp2             MP2 (MPEG audio layer 2)
 DE mp3             MP3 (MPEG audio layer 3)
  E mpeg2video      raw MPEG-2 video
 DE mpegts          MPEG-TS (MPEG-2 Transport Stream)

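1.2.2 Raw h264/aac stream formats

The raw h264 and raw aac stream formats are used in the demuxing and muxing examples later on, so they are discussed here first.

h264 is properly a codec; used as a container format name it denotes the raw H.264 stream format. A raw stream is a stream that carries no container information, a stream with no clothes on, so to speak. aac and similar formats work the same way.

Let us look at how the FFmpeg source defines the h264 codec and the h264 container format.
FFmpeg includes an h264 decoder but no native h264 encoder (the third-party libx264 encoder is normally used for H.264 encoding), so only a decoder definition exists: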

AVCodec ff_h264_decoder = {
    .name                  = "h264",
    .long_name             = NULL_IF_CONFIG_SMALL("H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10"),
    .type                  = AVMEDIA_TYPE_VIDEO,
    .id                    = AV_CODEC_ID_H264,
    ......
};

The h264 container format (muxer) is defined as follows:

AVOutputFormat ff_h264_muxer = {
    .name              = "h264",
    .long_name         = NULL_IF_CONFIG_SMALL("raw H.264 video"),
    .extensions        = "h264,264",
    .audio_codec       = AV_CODEC_ID_NONE,
    .video_codec       = AV_CODEC_ID_H264,
    .write_header      = force_one_stream,
    .write_packet      = ff_raw_write_packet,
    .check_bitstream   = h264_check_bitstream,
    .flags             = AVFMT_NOTIMESTAMPS,
};

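1.2.3 The mpegts container format

Now look at the mpegts definitions. AVInputFormat defines an input container format (demuxer) and AVOutputFormat defines an output container format (muxer). The mpegts demuxer does not specify any file extensions, whereas the mpegts muxer specifies the extensions "ts,m2t,m2ts,mts".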

AVInputFormat ff_mpegts_demuxer = {
    .name           = "mpegts",
    .long_name      = NULL_IF_CONFIG_SMALL("MPEG-TS (MPEG-2 Transport Stream)"),
    .priv_data_size = sizeof(MpegTSContext),
    .read_probe     = mpegts_probe,
    .read_header    = mpegts_read_header,
    .read_packet    = mpegts_read_packet,
    .read_close     = mpegts_read_close,
    .read_timestamp = mpegts_get_dts,
    .flags          = AVFMT_SHOW_IDS | AVFMT_TS_DISCONT,
    .priv_class     = &mpegts_class,
};
AVOutputFormat ff_mpegts_muxer = {
    .name           = "mpegts",
    .long_name      = NULL_IF_CONFIG_SMALL("MPEG-TS (MPEG-2 Transport Stream)"),
    .mime_type      = "video/MP2T",
    .extensions     = "ts,m2t,m2ts,mts",
    .priv_data_size = sizeof(MpegTSWrite),
    .audio_codec    = AV_CODEC_ID_MP2,
    .video_codec    = AV_CODEC_ID_MPEG2VIDEO,
    .init           = mpegts_init,
    .write_packet   = mpegts_write_packet,
    .write_trailer  = mpegts_write_end,
    .deinit         = mpegts_deinit,
    .check_bitstream = mpegts_check_bitstream,
    .flags          = AVFMT_ALLOW_FLUSH | AVFMT_VARIABLE_FPS | AVFMT_NODIMENSIONS,
    .priv_class     = &mpegts_muxer_class,
};

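1.2.4 File extensions and container formats

On the FFmpeg command line, a wrong extension on the input file does no harm, because FFmpeg reads a small portion of the file to probe the real container format. For output, however, if the container format is not specified explicitly, it can only be determined from the output file extension, so the extension must be correct.

A few experiments show how file extensions and container formats relate in FFmpeg.

Test file: tnhaoxc.flv

The file information is as follows: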

think@opensuse> ffprobe tnhaoxc.flv 
ffprobe version 4.1 Copyright (c) 2007-2018 the FFmpeg developers
Input #0, flv, from 'tnhaoxc.flv':
  Metadata:
    encoder         : Lavf58.20.100
  Duration: 00:02:13.68, start: 0.000000, bitrate: 838 kb/s
    Stream #0:0: Video: h264 (High), yuv420p(progressive), 784x480, 25 fps, 25 tbr, 1k tbn, 50 tbc
    Stream #0:1: Audio: aac (LC), 44100 Hz, stereo, fltp

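Experiment 1: convert the flv container format to mpegts
Use the remuxing command to convert from flv to mpegts. Run the following two commands in a shell, one after the other: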

ffmpeg -i tnhaoxc.flv -map 0 -c copy tnhaoxc.ts
ffmpeg -i tnhaoxc.flv -map 0 -c copy tnhaoxc.m2t

This produces tnhaoxc.ts and tnhaoxc.m2t. Compare the two files to see whether they differ:

diff tnhaoxc.ts tnhaoxc.m2t

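The command prints nothing, so the two files are identical. They differ only in extension: both use the mpegts container format and their contents are the same.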

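Experiment 2: give the output file a wrong extension
Try again with a wrong extension (mistakenly using the container format name as the file extension):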

ffmpeg -i tnhaoxc.flv -map 0 -c copy tnhaoxc.mpegts

The command prints the following error:

ffmpeg version 4.1 Copyright (c) 2000-2018 the FFmpeg developers
Input #0, flv, from 'tnhaoxc.flv':
  Metadata:
    encoder         : Lavf58.20.100
  Duration: 00:02:13.68, start: 0.000000, bitrate: 838 kb/s
    Stream #0:0: Video: h264 (High), yuv420p(progressive), 784x480, 25 fps, 25 tbr, 1k tbn, 50 tbc
    Stream #0:1: Audio: aac (LC), 44100 Hz, stereo, fltp
[NULL @ 0x1d62e80] Unable to find a suitable output format for 'tnhaoxc.mpegts'
tnhaoxc.mpegts: Invalid argument

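FFmpeg reports that it cannot determine the output format: it has no way to infer the container format of the output file from this extension.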

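Experiment 3: give the output file a wrong extension but specify the container format explicitly
Use the -f mpegts option to set the container format to mpegts explicitly: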

ffmpeg -i tnhaoxc.flv -map 0 -c copy -f mpegts tnhaoxc.mpegts

The command succeeds. Check whether the file content is correct:

diff tnhaoxc.mpegts tnhaoxc.ts

tnhaoxc.mpegts and tnhaoxc.ts turn out to be identical: although tnhaoxc.mpegts has a wrong file extension, we still get the container format we wanted.

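The author did not know of a command that shows which extensions belong to a container format. One way is to search the FFmpeg source for the format name: searching for "mpegts", for example, shows that its extensions are "ts,m2t,m2ts,mts". The built-in help for a single muxer, ffmpeg -h muxer=mpegts, also prints a "Common extensions" line.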
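The behavior seen in experiments 1 to 3 can be pictured as a lookup of the output file's extension in each muxer's comma-separated extension list. The sketch below models only that idea; `ext_in_list` is a hypothetical helper written for this article, not FFmpeg's implementation, and the case-insensitive comparison is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not FFmpeg code: returns 1 if the extension of
 * filename appears in the comma-separated list extensions, else 0.
 * The comparison ignores ASCII case (an assumption of this sketch).
 */
int ext_in_list(const char *filename, const char *extensions)
{
    assert(filename != NULL && extensions != NULL);

    const char *dot = strrchr(filename, '.');
    if (dot == NULL || dot[1] == '\0')
        return 0;                        /* no extension to match */
    const char *ext = dot + 1;
    size_t ext_len = strlen(ext);

    const char *p = extensions;
    while (*p != '\0') {
        const char *comma = strchr(p, ',');
        size_t len = comma ? (size_t)(comma - p) : strlen(p);

        if (len == ext_len) {
            size_t i = 0;
            while (i < len &&
                   tolower((unsigned char)p[i]) == tolower((unsigned char)ext[i]))
                i++;
            if (i == len)
                return 1;                /* every character matched */
        }
        if (comma == NULL)
            break;
        p = comma + 1;                   /* move on to the next list entry */
    }
    return 0;
}
```

With the list "ts,m2t,m2ts,mts", the names tnhaoxc.ts and tnhaoxc.m2t match while tnhaoxc.mpegts does not, which mirrors why experiment 2 fails unless -f mpegts is given.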

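2. API overview

The most important API functions are listed below. FFmpeg uses the word frame for both encoded and unencoded frames; for convenience, this article calls an encoded frame a packet and an unencoded frame a frame.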

2.1 avformat_open_input()

/**
 * Open an input stream and read the header. The codecs are not opened.
 * The stream must be closed with avformat_close_input().
 *
 * @param ps Pointer to user-supplied AVFormatContext (allocated by avformat_alloc_context).
 *           May be a pointer to NULL, in which case an AVFormatContext is allocated by this
 *           function and written into ps.
 *           Note that a user-supplied AVFormatContext will be freed on failure.
 * @param url URL of the stream to open.
 * @param fmt If non-NULL, this parameter forces a specific input format.
 *            Otherwise the format is autodetected.
 * @param options  A dictionary filled with AVFormatContext and demuxer-private options.
 *                 On return this parameter will be destroyed and replaced with a dict containing
 *                 options that were not found. May be NULL.
 *
 * @return 0 on success, a negative AVERROR on failure.
 *
 * @note If you want to use custom IO, preallocate the format context and set its pb field.
 */
int avformat_open_input(AVFormatContext **ps, const char *url, AVInputFormat *fmt, AVDictionary **options);

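This function opens the input media file, reads the file header, and stores the file format information in the AVFormatContext passed as the first argument.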

2.2 avformat_find_stream_info()

/**
 * Read packets of a media file to get stream information. This
 * is useful for file formats with no headers such as MPEG. This
 * function also computes the real framerate in case of MPEG-2 repeat
 * frame mode.
 * The logical file position is not changed by this function;
 * examined packets may be buffered for later processing.
 *
 * @param ic media file handle
 * @param options  If non-NULL, an ic.nb_streams long array of pointers to
 *                 dictionaries, where i-th member contains options for
 *                 codec corresponding to i-th stream.
 *                 On return each dictionary will be filled with options that were not found.
 * @return >=0 if OK, AVERROR_xxx on error
 *
 * @note this function isn't guaranteed to open all the codecs, so
 *       options being non-empty at return is a perfectly normal behavior.
 *
 * @todo Let the user decide somehow what information is needed so that
 *       we do not waste time getting stuff the user does not need.
 */
int avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options);

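This function reads some data from the media file and tries to decode it, then fills AVFormatContext.streams with the stream information it obtains. AVFormatContext.streams is an array of pointers whose length is AVFormatContext.nb_streams.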

2.3 av_read_frame()

/**
 * Return the next frame of a stream.
 * This function returns what is stored in the file, and does not validate
 * that what is there are valid frames for the decoder. It will split what is
 * stored in the file into frames and return one for each call. It will not
 * omit invalid data between valid frames so as to give the decoder the maximum
 * information possible for decoding.
 *
 * If pkt->buf is NULL, then the packet is valid until the next
 * av_read_frame() or until avformat_close_input(). Otherwise the packet
 * is valid indefinitely. In both cases the packet must be freed with
 * av_packet_unref when it is no longer needed. For video, the packet contains
 * exactly one frame. For audio, it contains an integer number of frames if each
 * frame has a known fixed size (e.g. PCM or ADPCM data). If the audio frames
 * have a variable size (e.g. MPEG audio), then it contains one frame.
 *
 * pkt->pts, pkt->dts and pkt->duration are always set to correct
 * values in AVStream.time_base units (and guessed if the format cannot
 * provide them). pkt->pts can be AV_NOPTS_VALUE if the video format
 * has B-frames, so it is better to rely on pkt->dts if you do not
 * decompress the payload.
 *
 * @return 0 if OK, < 0 on error or end of file
 */
int av_read_frame(AVFormatContext *s, AVPacket *pkt);

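This function is used when demuxing.

It splits the data stored in the input file into packets and returns one packet per call. A packet may be a video frame, an audio frame, or other data. The decoder only decodes video or audio frames, but data that is not valid audio or video is not discarded, so that the decoder receives as much information as possible.

For video, a packet contains exactly one video frame. For audio, a packet may contain an integer number of audio frames if the format has a fixed frame size, and contains exactly one audio frame if the frame size is variable.

After each packet has been used, call av_packet_unref(AVPacket *pkt) to release it; otherwise memory will leak.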
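The doc comment above states that pkt->pts, pkt->dts and pkt->duration are expressed in AVStream.time_base units. The sketch below shows what that means arithmetically, using a small stand-in `Rational` type instead of FFmpeg's AVRational. The helper names are hypothetical, and the rescaling here does not guard against overflow or negative timestamps the way a library routine would. The 1/1000 time base corresponds to the "1k tbn" shown by ffprobe for the flv test file; the 1/90000 time base for MPEG-TS is an assumption brought in for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a rational time base; FFmpeg's own type is AVRational. */
typedef struct Rational {
    int num;
    int den;
} Rational;

/* Hypothetical helper: convert a timestamp in time-base ticks to seconds. */
double ticks_to_seconds(int64_t ts, Rational tb)
{
    assert(tb.den != 0);
    return (double)ts * tb.num / tb.den;
}

/*
 * Hypothetical helper: rescale non-negative ticks from one time base to
 * another, rounding to the nearest tick. No overflow protection.
 */
int64_t rescale_ticks(int64_t ts, Rational from, Rational to)
{
    assert(ts >= 0 && from.num > 0 && from.den > 0 && to.num > 0 && to.den > 0);

    int64_t mul = (int64_t)from.num * to.den;   /* numerator of from/to */
    int64_t div = (int64_t)from.den * to.num;   /* denominator of from/to */
    return (ts * mul + div / 2) / div;
}
```

For instance, a dts of 40 in a 1/1000 time base is 0.04 seconds, which is 3600 ticks in a 1/90000 time base. Remuxing code has to perform this kind of conversion whenever the input and output streams use different time bases.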

2.4 av_write_frame()

/**
 * Write a packet to an output media file.
 *
 * This function passes the packet directly to the muxer, without any buffering
 * or reordering. The caller is responsible for correctly interleaving the
 * packets if the format requires it. Callers that want libavformat to handle
 * the interleaving should call av_interleaved_write_frame() instead of this
 * function.
 *
 * @param s media file handle
 * @param pkt The packet containing the data to be written. Note that unlike
 *            av_interleaved_write_frame(), this function does not take
 *            ownership of the packet passed to it (though some muxers may make
 *            an internal reference to the input packet).
 *            <br>
 *            This parameter can be NULL (at any time, not just at the end), in
 *            order to immediately flush data buffered within the muxer, for
 *            muxers that buffer up data internally before writing it to the
 *            output.
 *            <br>
 *            Packet's @ref AVPacket.stream_index "stream_index" field must be
 *            set to the index of the corresponding stream in @ref
 *            AVFormatContext.streams "s->streams".
 *            <br>
 *            The timestamps (@ref AVPacket.pts "pts", @ref AVPacket.dts "dts")
 *            must be set to correct values in the stream's timebase (unless the
 *            output format is flagged with the AVFMT_NOTIMESTAMPS flag, then
 *            they can be set to AV_NOPTS_VALUE).
 *            The dts for subsequent packets passed to this function must be strictly
 *            increasing when compared in their respective timebases (unless the
 *            output format is flagged with the AVFMT_TS_NONSTRICT, then they
 *            merely have to be nondecreasing).  @ref AVPacket.duration
 *            "duration") should also be set if known.
 * @return < 0 on error, = 0 if OK, 1 if flushed and there is no more data to flush
 *
 * @see av_interleaved_write_frame()
 */
int av_write_frame(AVFormatContext *s, AVPacket *pkt);

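This function is used when muxing: it writes a packet to the output media.

Packet interleaving means that packets from different streams are stored in the output media file interleaved strictly in order of increasing dts.

This function passes the packet directly to the muxer without buffering or reordering it. It does not take care of interleaving packets from different streams; that is the caller's responsibility.

A caller that does not want to handle interleaving should call av_interleaved_write_frame() instead of this function.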
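What "the caller is responsible for interleaving" can look like is sketched below with mock types: two per-stream packet queues, each already sorted by dts, are merged by comparing dts values in their respective time bases. `Pkt`, `TimeBase`, `compare_dts` and `interleave_two` are hypothetical names for this illustration and are not libavformat types or functions; the cross-multiplication assumes positive time bases and values small enough not to overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct TimeBase { int num; int den; } TimeBase;  /* mock time base */

typedef struct Pkt {            /* mock packet: only the fields needed here */
    int     stream_index;
    int64_t dts;
} Pkt;

/* Returns -1, 0 or 1 as dts a (in tb_a) is before, equal to or after b (in tb_b). */
int compare_dts(int64_t a, TimeBase tb_a, int64_t b, TimeBase tb_b)
{
    assert(tb_a.num > 0 && tb_a.den > 0 && tb_b.num > 0 && tb_b.den > 0);

    /* Compare a*num_a/den_a with b*num_b/den_b by cross-multiplying. */
    int64_t lhs = a * tb_a.num * tb_b.den;
    int64_t rhs = b * tb_b.num * tb_a.den;
    return (lhs > rhs) - (lhs < rhs);
}

/*
 * Merge two dts-sorted queues into out in increasing dts order.
 * out must have room for nv + na packets. Returns the number written.
 */
size_t interleave_two(const Pkt *v, size_t nv, TimeBase tb_v,
                      const Pkt *a, size_t na, TimeBase tb_a,
                      Pkt *out)
{
    size_t i = 0, j = 0, n = 0;

    while (i < nv || j < na) {
        int take_v;
        if (j >= na)
            take_v = 1;                 /* second queue exhausted */
        else if (i >= nv)
            take_v = 0;                 /* first queue exhausted */
        else
            take_v = compare_dts(v[i].dts, tb_v, a[j].dts, tb_a) <= 0;

        /* In a real muxing loop this is where the packet would be written. */
        out[n++] = take_v ? v[i++] : a[j++];
    }
    return n;
}
```

On ties the first queue wins, which is an arbitrary choice made for this sketch.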

2.5 av_interleaved_write_frame()

/**
 * Write a packet to an output media file ensuring correct interleaving.
 *
 * This function will buffer the packets internally as needed to make sure the
 * packets in the output file are properly interleaved in the order of
 * increasing dts. Callers doing their own interleaving should call
 * av_write_frame() instead of this function.
 *
 * Using this function instead of av_write_frame() can give muxers advance
 * knowledge of future packets, improving e.g. the behaviour of the mp4
 * muxer for VFR content in fragmenting mode.
 *
 * @param s media file handle
 * @param pkt The packet containing the data to be written.
 *            <br>
 *            If the packet is reference-counted, this function will take
 *            ownership of this reference and unreference it later when it sees
 *            fit.
 *            The caller must not access the data through this reference after
 *            this function returns. If the packet is not reference-counted,
 *            libavformat will make a copy.
 *            <br>
 *            This parameter can be NULL (at any time, not just at the end), to
 *            flush the interleaving queues.
 *            <br>
 *            Packet's @ref AVPacket.stream_index "stream_index" field must be
 *            set to the index of the corresponding stream in @ref
 *            AVFormatContext.streams "s->streams".
 *            <br>
 *            The timestamps (@ref AVPacket.pts "pts", @ref AVPacket.dts "dts")
 *            must be set to correct values in the stream's timebase (unless the
 *            output format is flagged with the AVFMT_NOTIMESTAMPS flag, then
 *            they can be set to AV_NOPTS_VALUE).
 *            The dts for subsequent packets in one stream must be strictly
 *            increasing (unless the output format is flagged with the
 *            AVFMT_TS_NONSTRICT, then they merely have to be nondecreasing).
 *            @ref AVPacket.duration "duration") should also be set if known.
 *
 * @return 0 on success, a negative AVERROR on error. Libavformat will always
 *         take care of freeing the packet, even if this function fails.
 *
 * @see av_write_frame(), AVFormatContext.max_interleave_delta
 */
int av_interleaved_write_frame(AVFormatContext *s, AVPacket *pkt);

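This function is used when muxing: it writes a packet to the output media.

It buffers packets internally as needed, so that packets from different streams are correctly interleaved in the output media in order of increasing dts.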

2.6 avio_open()

/**
 * Create and initialize a AVIOContext for accessing the
 * resource indicated by url.
 * @note When the resource indicated by url has been opened in
 * read+write mode, the AVIOContext can be used only for writing.
 *
 * @param s Used to return the pointer to the created AVIOContext.
 * In case of failure the pointed to value is set to NULL.
 * @param url resource to access
 * @param flags flags which control how the resource indicated by url
 * is to be opened
 * @return >= 0 in case of success, a negative value corresponding to an
 * AVERROR code in case of failure
 */
int avio_open(AVIOContext **s, const char *url, int flags);

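Creates and initializes an AVIOContext for accessing the resource given by url. In this series it is used to access the output media file.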

2.7 avformat_write_header()

/**
 * Allocate the stream private data and write the stream header to
 * an output media file.
 *
 * @param s Media file handle, must be allocated with avformat_alloc_context().
 *          Its oformat field must be set to the desired output format;
 *          Its pb field must be set to an already opened AVIOContext.
 * @param options  An AVDictionary filled with AVFormatContext and muxer-private options.
 *                 On return this parameter will be destroyed and replaced with a dict containing
 *                 options that were not found. May be NULL.
 *
 * @return AVSTREAM_INIT_IN_WRITE_HEADER on success if the codec had not already been fully initialized in avformat_init,
 *         AVSTREAM_INIT_IN_INIT_OUTPUT  on success if the codec had already been fully initialized in avformat_init,
 *         negative AVERROR on failure.
 *
 * @see av_opt_find, av_dict_set, avio_open, av_oformat_next, avformat_init_output.
 */
av_warn_unused_result
int avformat_write_header(AVFormatContext *s, AVDictionary **options);

Writes the file header to the output file.

2.8 av_write_trailer()

/**
 * Write the stream trailer to an output media file and free the
 * file private data.
 *
 * May only be called after a successful call to avformat_write_header.
 *
 * @param s media file handle
 * @return 0 if OK, AVERROR_xxx on error
 */
int av_write_trailer(AVFormatContext *s);

Writes the file trailer to the output file.