This article is the author's original work. If you repost it, please credit the source: http://www.javashuo.com/article/p-cweaorzk-q.html
This article is based on FFmpeg 4.1.
struct AVFrame is defined in <libavutil/frame.h>.
struct AVFrame frame;
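An AVFrame stores raw data, that is, data after decoding. When decoding, AVFrame is the output of the decoder; when encoding, AVFrame is the input of the encoder. In the diagram below, the "decoded frames" have type AVFrame: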
 _______              ______________
|       |            |              |
| input |  demuxer   | encoded data |   decoder
| file  | ---------> | packets      | -----+
|_______|            |______________|      |
                                           v
                                       _________
                                      |         |
                                      | decoded |
                                      | frames  |
                                      |_________|
 ________             ______________       |
|        |           |              |      |
| output | <-------- | encoded data | <----+
| file   |   muxer   | packets      |   encoder
|________|           |______________|
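AVFrame is a very important data structure. It has a great many members, so its definition is long. The definition quoted below omits the lengthy per-member comments and most of the members. The overall usage of AVFrame is described first, and then some important members are excerpted and explained one by one: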
/**
 * This structure describes decoded (raw) audio or video data.
 *
 * AVFrame must be allocated using av_frame_alloc(). Note that this only
 * allocates the AVFrame itself, the buffers for the data must be managed
 * through other means (see below).
 * AVFrame must be freed with av_frame_free().
 *
 * AVFrame is typically allocated once and then reused multiple times to hold
 * different data (e.g. a single AVFrame to hold frames received from a
 * decoder). In such a case, av_frame_unref() will free any references held by
 * the frame and reset it to its original clean state before it
 * is reused again.
 *
 * The data described by an AVFrame is usually reference counted through the
 * AVBuffer API. The underlying buffer references are stored in AVFrame.buf /
 * AVFrame.extended_buf. An AVFrame is considered to be reference counted if at
 * least one reference is set, i.e. if AVFrame.buf[0] != NULL. In such a case,
 * every single data plane must be contained in one of the buffers in
 * AVFrame.buf or AVFrame.extended_buf.
 * There may be a single buffer for all the data, or one separate buffer for
 * each plane, or anything in between.
 *
 * sizeof(AVFrame) is not a part of the public ABI, so new fields may be added
 * to the end with a minor bump.
 *
 * Fields can be accessed through AVOptions, the name string used, matches the
 * C structure field name for fields accessible through AVOptions. The AVClass
 * for AVFrame can be obtained from avcodec_get_frame_class()
 */
typedef struct AVFrame {
    uint8_t *data[AV_NUM_DATA_POINTERS];
    int linesize[AV_NUM_DATA_POINTERS];
    uint8_t **extended_data;
    int width, height;
    int nb_samples;
    int format;
    int key_frame;
    enum AVPictureType pict_type;
    AVRational sample_aspect_ratio;
    int64_t pts;
    ......
} AVFrame;
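Usage of AVFrame, as stated in the header comment above:
1. An AVFrame must be allocated with av_frame_alloc(). This allocates only the AVFrame itself; the data buffers must be managed by other means.
2. An AVFrame must be freed with av_frame_free().
3. An AVFrame is typically allocated once and then reused many times to hold different data (for example, one AVFrame holding successive frames received from a decoder). Before each reuse, av_frame_unref() frees any references held by the frame and resets it to its original clean state.
4. The data described by an AVFrame is usually reference counted through the AVBuffer API. The buffer references are stored in AVFrame.buf / AVFrame.extended_buf, and a frame counts as reference counted if AVFrame.buf[0] != NULL.
Some important members are excerpted and explained below: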
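data
/**
 * pointer to the picture/channel planes.
 * This might be different from the first allocated byte
 *
 * Some decoders access areas outside 0,0 - width,height, please
 * see avcodec_align_dimensions2(). Some filters and swscale can read
 * up to 16 bytes beyond the planes, if these filters are to be used,
 * then 16 extra bytes must be allocated.
 *
 * NOTE: Except for hwaccel formats, pointers not needed by the format
 * MUST be set to NULL.
 */
uint8_t *data[AV_NUM_DATA_POINTERS];
Holds the raw frame data: an uncompressed picture or uncompressed audio samples, as output by a decoder or fed to an encoder.
data is an array of pointers. Each element points to one plane of the picture (video) or to the plane of one channel (audio).
For details on picture planes, see "Color Spaces and Pixel Formats"; for audio planes, see section 6.1.1 of "ffplay Source Code Analysis 6: Audio Resampling". In brief:
For packed formats, the Y, U and V samples of a YUV picture are interleaved in a single plane, as YUVYUV..., and data[0] points to that plane;
the left (L) and right (R) channels of a stereo audio frame are interleaved in a single plane, as LRLRLR..., and data[0] points to that plane.
For planar formats, a YUV picture has three planes, Y, U and V: data[0] points to the Y plane, data[1] to the U plane, and data[2] to the V plane;
a stereo audio frame has two planes, one per channel: data[0] points to the L plane and data[1] to the R plane.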
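The difference between the packed and planar audio layouts described above can be sketched in plain C. This is only an illustration with 16-bit samples: the function name deinterleave_stereo_s16 is hypothetical and not part of FFmpeg, and no FFmpeg headers are used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Split packed stereo samples (L R L R ...) into two separate planes.
 * In AVFrame terms, `packed` plays the role of data[0] of a packed frame,
 * while `left` and `right` play the roles of data[0] and data[1] of a
 * planar frame. nb_samples is the number of samples per channel.
 */
void deinterleave_stereo_s16(const int16_t *packed, size_t nb_samples,
                             int16_t *left, int16_t *right)
{
    assert(packed && left && right);
    for (size_t i = 0; i < nb_samples; i++) {
        left[i]  = packed[2 * i];      /* even positions: left channel  */
        right[i] = packed[2 * i + 1];  /* odd positions:  right channel */
    }
}
```

In a real program this kind of conversion is normally left to a resampling library rather than written by hand.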
linesize
/**
 * For video, size in bytes of each picture line.
 * For audio, size in bytes of each plane.
 *
 * For audio, only linesize[0] may be set. For planar audio, each channel
 * plane must be the same size.
 *
 * For video the linesizes should be multiples of the CPUs alignment
 * preference, this is 16 or 32 for modern desktop CPUs.
 * Some code requires such alignment other code can be slower without
 * correct alignment, for yet other it makes no difference.
 *
 * @note The linesize may be larger than the size of usable data -- there
 * may be extra padding present for performance reasons.
 */
int linesize[AV_NUM_DATA_POINTERS];
For video, linesize[i] is the size in bytes of one picture row in plane i. It should be a multiple of the CPU's preferred alignment (16 or 32 on modern desktop CPUs).
For audio, linesize is the size in bytes of each plane. Audio uses only linesize[0]. For planar audio, every plane must have the same size.
For performance reasons a row or plane may be followed by padding, so linesize can be larger than the size of the usable audio or video data.
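Because linesize can include padding, code that walks a video plane has to advance by linesize per row, not by the visible width. A minimal sketch, assuming positive linesizes; the name copy_plane_rows is hypothetical (libavutil ships av_image_copy_plane() in <libavutil/imgutils.h> for real use):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy `height` rows of `bytewidth` usable bytes from src to dst.
 * Each buffer has its own stride (linesize), which may exceed bytewidth
 * because of alignment padding; the padding bytes are never touched.
 */
void copy_plane_rows(uint8_t *dst, int dst_linesize,
                     const uint8_t *src, int src_linesize,
                     int bytewidth, int height)
{
    assert(dst && src);
    assert(bytewidth <= dst_linesize && bytewidth <= src_linesize);
    for (int y = 0; y < height; y++) {
        memcpy(dst, src, (size_t)bytewidth);
        dst += dst_linesize;  /* step to the start of the next row */
        src += src_linesize;
    }
}
```

Copying a whole plane with a single memcpy of width * height bytes is only correct when linesize happens to equal the row width.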
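extended_data
/**
 * pointers to the data planes/channels.
 *
 * For video, this should simply point to data[].
 *
 * For planar audio, each channel has a separate data pointer, and
 * linesize[0] contains the size of each channel buffer.
 * For packed audio, there is just one data pointer, and linesize[0]
 * contains the total size of the buffer for all channels.
 *
 * Note: Both data and extended_data should always be set in a valid frame,
 * but for planar audio with more channels that can fit in data,
 * extended_data must be used in order to access all channels.
 */
uint8_t **extended_data;
extended_data exists for frames that have more planes than data[] can hold.
For video, it simply points to data[].
For audio, a packed format has a single plane in which the samples of all channels are interleaved, while a planar format has one plane per channel. data[] has only AV_NUM_DATA_POINTERS elements, so for planar audio with more channels than that, data[] cannot hold a pointer for every channel. In that case extended_data points to a separately allocated array that holds the pointers to all channel planes, and it is the only way to reach every channel.
In a valid video or audio frame, both data and extended_data must be set.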
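The relationship between data and extended_data can be modelled with a toy structure. This is a sketch of the idea only, not FFmpeg code: ToyFrame, toy_set_planes and toy_release are hypothetical names, and TOY_NUM_DATA_POINTERS mirrors the value 8 that AV_NUM_DATA_POINTERS has in FFmpeg 4.1.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_NUM_DATA_POINTERS 8  /* same value as AV_NUM_DATA_POINTERS in 4.1 */

typedef struct ToyFrame {
    uint8_t  *data[TOY_NUM_DATA_POINTERS];
    uint8_t **extended_data;
} ToyFrame;

/* Attach `channels` plane pointers. Returns 0 on success, -1 on failure. */
int toy_set_planes(ToyFrame *f, uint8_t **planes, int channels)
{
    assert(f && planes && channels > 0);
    memset(f->data, 0, sizeof(f->data));
    if (channels <= TOY_NUM_DATA_POINTERS) {
        /* Everything fits: extended_data is just an alias for data[]. */
        f->extended_data = f->data;
    } else {
        /* Too many planes: a separate array holds ALL plane pointers. */
        f->extended_data = malloc(sizeof(*f->extended_data) * (size_t)channels);
        if (!f->extended_data)
            return -1;
    }
    for (int i = 0; i < channels; i++) {
        f->extended_data[i] = planes[i];
        if (i < TOY_NUM_DATA_POINTERS)
            f->data[i] = planes[i];  /* data[] mirrors the first entries */
    }
    return 0;
}

/* Free the separate pointer array, if any, and restore the alias. */
void toy_release(ToyFrame *f)
{
    if (f->extended_data != f->data)
        free(f->extended_data);
    f->extended_data = f->data;
}
```

Code that loops over audio channels can therefore always index extended_data[ch], whatever the channel count is.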
width, height
/**
 * @name Video dimensions
 * Video frames only. The coded dimensions (in pixels) of the video frame,
 * i.e. the size of the rectangle that contains some well-defined values.
 *
 * @note The part of the frame intended for display/presentation is further
 * restricted by the @ref cropping "Cropping rectangle".
 * @{
 */
int width, height;
Width and height of the video frame, in pixels. These are the coded dimensions; the part intended for display may be further restricted by the cropping rectangle.
nb_samples
/**
 * number of audio samples (per channel) described by this frame
 */
int nb_samples;
Number of samples per channel contained in the audio frame.
format
/**
 * format of the frame, -1 if unknown or unset
 * Values correspond to enum AVPixelFormat for video frames,
 * enum AVSampleFormat for audio)
 */
int format;
Format of the frame. The value is -1 if the format is unknown or unset.
For video frames, the value is one of enum AVPixelFormat:
enum AVPixelFormat {
    AV_PIX_FMT_NONE = -1,
    AV_PIX_FMT_YUV420P,   ///< planar YUV 4:2:0, 12bpp, (1 Cr & Cb sample per 2x2 Y samples)
    AV_PIX_FMT_YUYV422,   ///< packed YUV 4:2:2, 16bpp, Y0 Cb Y1 Cr
    AV_PIX_FMT_RGB24,     ///< packed RGB 8:8:8, 24bpp, RGBRGB...
    AV_PIX_FMT_BGR24,     ///< packed RGB 8:8:8, 24bpp, BGRBGR...
    ......
}
For audio frames, the value is one of enum AVSampleFormat:
enum AVSampleFormat {
    AV_SAMPLE_FMT_NONE = -1,
    AV_SAMPLE_FMT_U8,          ///< unsigned 8 bits
    AV_SAMPLE_FMT_S16,         ///< signed 16 bits
    AV_SAMPLE_FMT_S32,         ///< signed 32 bits
    AV_SAMPLE_FMT_FLT,         ///< float
    AV_SAMPLE_FMT_DBL,         ///< double

    AV_SAMPLE_FMT_U8P,         ///< unsigned 8 bits, planar
    AV_SAMPLE_FMT_S16P,        ///< signed 16 bits, planar
    AV_SAMPLE_FMT_S32P,        ///< signed 32 bits, planar
    AV_SAMPLE_FMT_FLTP,        ///< float, planar
    AV_SAMPLE_FMT_DBLP,        ///< double, planar
    AV_SAMPLE_FMT_S64,         ///< signed 64 bits
    AV_SAMPLE_FMT_S64P,        ///< signed 64 bits, planar

    AV_SAMPLE_FMT_NB           ///< Number of sample formats. DO NOT USE if linking dynamically
};
key_frame
/**
 * 1 -> keyframe, 0-> not
 */
int key_frame;
Flag indicating whether the video frame is a keyframe: 1 means keyframe, 0 means not a keyframe.
pict_type
/**
 * Picture type of the frame.
 */
enum AVPictureType pict_type;
Picture type of the video frame (I, B, P and so on). The possible values are:
/**
 * @}
 * @}
 * @defgroup lavu_picture Image related
 *
 * AVPicture types, pixel formats and basic image planes manipulation.
 *
 * @{
 */

enum AVPictureType {
    AV_PICTURE_TYPE_NONE = 0, ///< Undefined
    AV_PICTURE_TYPE_I,     ///< Intra
    AV_PICTURE_TYPE_P,     ///< Predicted
    AV_PICTURE_TYPE_B,     ///< Bi-dir predicted
    AV_PICTURE_TYPE_S,     ///< S(GMC)-VOP MPEG-4
    AV_PICTURE_TYPE_SI,    ///< Switching Intra
    AV_PICTURE_TYPE_SP,    ///< Switching Predicted
    AV_PICTURE_TYPE_BI,    ///< BI type
};
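sample_aspect_ratio
/**
 * Sample aspect ratio for the video frame, 0/1 if unknown/unspecified.
 */
AVRational sample_aspect_ratio;
Sample aspect ratio of the video frame, that is, the width-to-height ratio of a single pixel, not of the whole picture. The value is 0/1 if unknown or unspecified.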
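The aspect ratio of the displayed picture follows from the sample aspect ratio and the frame dimensions: DAR = (width * SAR.num) / (height * SAR.den). A minimal sketch, using a local Ratio type instead of AVRational; the names are hypothetical, and treating an unknown SAR (0/1) as square pixels is an assumption of this example.

```c
#include <assert.h>
#include <stdint.h>

typedef struct Ratio { int64_t num, den; } Ratio;

/* Greatest common divisor, used to reduce the resulting fraction. */
int64_t gcd64(int64_t a, int64_t b)
{
    while (b) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Display aspect ratio from frame dimensions and sample aspect ratio. */
Ratio display_aspect_ratio(int width, int height, Ratio sar)
{
    assert(width > 0 && height > 0);
    if (sar.num <= 0 || sar.den <= 0) {
        /* Assumption of this sketch: unknown SAR means square pixels. */
        sar.num = 1;
        sar.den = 1;
    }
    Ratio dar = { (int64_t)width * sar.num, (int64_t)height * sar.den };
    int64_t g = gcd64(dar.num, dar.den);
    dar.num /= g;
    dar.den /= g;
    return dar;
}
```

For example, a 720x576 frame with a sample aspect ratio of 16:15 is displayed at 4:3.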
pts
/**
 * Presentation timestamp in time_base units (time when frame should be shown to user).
 */
int64_t pts;
Presentation timestamp: the time at which the frame should be shown to the user, expressed in time_base units.
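Since pts is counted in time_base units, converting it to seconds is a multiplication by the time base. A minimal sketch with a local TimeBase type; pts_to_seconds is a hypothetical helper (with FFmpeg itself one would typically multiply by av_q2d() of the relevant AVRational).

```c
#include <assert.h>
#include <stdint.h>

/* A time base is a rational number of seconds per tick, e.g. 1/90000. */
typedef struct TimeBase { int num, den; } TimeBase;

/* Convert a timestamp expressed in time_base ticks to seconds. */
double pts_to_seconds(int64_t pts, TimeBase tb)
{
    assert(tb.den != 0);
    return (double)pts * tb.num / tb.den;
}
```

With a time base of 1/90000, a pts of 90000 corresponds to one second.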
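pkt_pts
#if FF_API_PKT_PTS
/**
 * PTS copied from the AVPacket that was decoded to produce this frame.
 * @deprecated use the pts field instead
 */
attribute_deprecated
int64_t pkt_pts;
#endif
Presentation timestamp of the packet that corresponds to this frame. The value is copied from the PTS of the packet that was decoded to produce the frame. This member is deprecated; use pts instead.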
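pkt_dts
/**
 * DTS copied from the AVPacket that triggered returning this frame. (if frame threading isn't used)
 * This is also the Presentation time of this AVFrame calculated from
 * only AVPacket.dts values without pts values.
 */
int64_t pkt_dts;
Decoding timestamp of the packet that corresponds to this frame. The value is copied from the DTS of the packet that triggered returning the frame (when frame threading is not used).
If the packets carry only dts values and no pts, this value also serves as the presentation time of the frame.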
coded_picture_number
/**
 * picture number in bitstream order
 */
int coded_picture_number;
Number of the current picture in bitstream (coding) order.
display_picture_number
/**
 * picture number in display order
 */
int display_picture_number;
Number of the current picture in display order.
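interlaced_frame
/**
 * The content of the picture is interlaced.
 */
int interlaced_frame;
Flag indicating whether the picture is interlaced: non-zero means the content is interlaced, 0 means progressive.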
sample_rate
/**
 * Sample rate of the audio data.
 */
int sample_rate;
Audio sample rate.
channel_layout
/**
 * Channel layout of the audio data.
 */
uint64_t channel_layout;
Audio channel layout. Each bit stands for one specific channel, as the definitions in channel_layout.h make clear:
/**
 * @defgroup channel_masks Audio channel masks
 *
 * A channel layout is a 64-bits integer with a bit set for every channel.
 * The number of bits set must be equal to the number of channels.
 * The value 0 means that the channel layout is not known.
 * @note this data structure is not powerful enough to handle channels
 * combinations that have the same channel multiple times, such as
 * dual-mono.
 *
 * @{
 */
#define AV_CH_FRONT_LEFT             0x00000001
#define AV_CH_FRONT_RIGHT            0x00000002
#define AV_CH_FRONT_CENTER           0x00000004
#define AV_CH_LOW_FREQUENCY          0x00000008
......

/**
 * @}
 * @defgroup channel_mask_c Audio channel layouts
 * @{
 * */
#define AV_CH_LAYOUT_MONO              (AV_CH_FRONT_CENTER)
#define AV_CH_LAYOUT_STEREO            (AV_CH_FRONT_LEFT|AV_CH_FRONT_RIGHT)
#define AV_CH_LAYOUT_2POINT1           (AV_CH_LAYOUT_STEREO|AV_CH_LOW_FREQUENCY)
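Because a channel layout sets one bit per channel, the number of channels equals the number of set bits. The sketch below shows this with local DEMO_ macros that copy the values quoted above; count_channels is a hypothetical helper (FFmpeg 4.1 itself provides av_get_channel_layout_nb_channels() for this purpose).

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the mask values quoted from channel_layout.h. */
#define DEMO_CH_FRONT_LEFT     0x00000001ULL
#define DEMO_CH_FRONT_RIGHT    0x00000002ULL
#define DEMO_CH_FRONT_CENTER   0x00000004ULL
#define DEMO_CH_LOW_FREQUENCY  0x00000008ULL

#define DEMO_LAYOUT_MONO     (DEMO_CH_FRONT_CENTER)
#define DEMO_LAYOUT_STEREO   (DEMO_CH_FRONT_LEFT | DEMO_CH_FRONT_RIGHT)
#define DEMO_LAYOUT_2POINT1  (DEMO_LAYOUT_STEREO | DEMO_CH_LOW_FREQUENCY)

/* Count the set bits of a layout mask; 0 means the layout is unknown. */
int count_channels(uint64_t layout)
{
    int n = 0;
    while (layout) {
        layout &= layout - 1;  /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

Testing for a particular channel is a single mask operation, for example (layout & DEMO_CH_LOW_FREQUENCY) != 0.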
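buf
/**
 * AVBuffer references backing the data for this frame. If all elements of
 * this array are NULL, then this frame is not reference counted. This array
 * must be filled contiguously -- if buf[i] is non-NULL then buf[j] must
 * also be non-NULL for all j < i.
 *
 * There may be at most one AVBuffer per data plane, so for video this array
 * always contains all the references. For planar audio with more than
 * AV_NUM_DATA_POINTERS channels, there may be more buffers than can fit in
 * this array. Then the extra AVBufferRef pointers are stored in the
 * extended_buf array.
 */
AVBufferRef *buf[AV_NUM_DATA_POINTERS];
The data of this frame can be managed through AVBufferRef, which provides references to AVBuffer. This involves the concept of buffer reference counting:
AVBuffer is a buffer type used widely in FFmpeg. It is reference counted.
AVBufferRef is a wrapper around AVBuffer whose main job is reference counting, which makes buffer sharing safe. Users should not access an AVBuffer directly; they should go through an AVBufferRef.
Many basic data structures in FFmpeg contain AVBufferRef members and use AVBuffer buffers indirectly through them.
For more on this topic, see "FFmpeg Data Structure AVBuffer".
The AVBuffers referenced by buf own the memory that the pointers in data (and extended_data) point into. Pixel and sample data are read and written through data / extended_data, while buf / extended_buf control the lifetime of that memory through reference counting. extended_data itself only holds plane pointers, for the case where there are more planes than data[] can hold.
If all elements of buf[] are NULL, the frame is not reference counted. buf[] must be filled contiguously: if buf[i] is non-NULL, then buf[j] must also be non-NULL for all j < i.
Each plane is backed by at most one AVBuffer. One AVBufferRef pointer refers to one AVBuffer, and "an AVBuffer reference" means exactly one such AVBufferRef pointer.
For video, buf[] always holds all the AVBufferRef pointers. For planar audio with more than AV_NUM_DATA_POINTERS channels, buf[] may be too small to hold all of them; the extra AVBufferRef pointers are stored in the extended_buf array.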
extended_buf & nb_extended_buf
/**
 * For planar audio which requires more than AV_NUM_DATA_POINTERS
 * AVBufferRef pointers, this array will hold all the references which
 * cannot fit into AVFrame.buf.
 *
 * Note that this is different from AVFrame.extended_data, which always
 * contains all the pointers. This array only contains the extra pointers,
 * which cannot fit into AVFrame.buf.
 *
 * This array is always allocated using av_malloc() by whoever constructs
 * the frame. It is freed in av_frame_unref().
 */
AVBufferRef **extended_buf;
/**
 * Number of elements in extended_buf.
 */
int        nb_extended_buf;
For planar audio with more than AV_NUM_DATA_POINTERS channels, buf[] may be too small to hold all the AVBufferRef pointers; the extra AVBufferRef pointers are stored in the extended_buf array.
Note the difference between extended_buf and AVFrame.extended_data: AVFrame.extended_data contains the pointers to all planes, whereas extended_buf contains only the references that do not fit into AVFrame.buf.
extended_buf is allocated with av_malloc() by whoever constructs the frame's buffers; av_frame_alloc() does not allocate it. It is freed by av_frame_unref().
nb_extended_buf is the number of elements in extended_buf.
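best_effort_timestamp
/**
 * frame timestamp estimated using various heuristics, in stream time base
 * - encoding: unused
 * - decoding: set by libavcodec, read by user.
 */
int64_t best_effort_timestamp;
Timestamp of the frame estimated with various heuristics, expressed in the stream time base. When decoding, it is set by libavcodec and read by the user; when encoding, it is unused.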
pkt_pos
/**
 * reordered pos from the last AVPacket that has been input into the decoder
 * - encoding: unused
 * - decoding: Read by user.
 */
int64_t pkt_pos;
Position (offset) in the input file of the last packet that was fed to the decoder.
pkt_duration
/**
 * duration of the corresponding packet, expressed in
 * AVStream->time_base units, 0 if unknown.
 * - encoding: unused
 * - decoding: Read by user.
 */
int64_t pkt_duration;
Duration of the corresponding packet, in AVStream->time_base units; 0 if unknown.
channels
/**
 * number of audio channels, only used for audio.
 * - encoding: unused
 * - decoding: Read by user.
 */
int channels;
Number of audio channels.
pkt_size
/**
 * size of the corresponding packet containing the compressed
 * frame.
 * It is set to a negative value if unknown.
 * - encoding: unused
 * - decoding: set by libavcodec, read by user.
 */
int pkt_size;
Size of the corresponding packet, which contains the compressed frame. The value is negative if unknown.
crop_*
/**
 * @anchor cropping
 * @name Cropping
 * Video frames only. The number of pixels to discard from the the
 * top/bottom/left/right border of the frame to obtain the sub-rectangle of
 * the frame intended for presentation.
 * @{
 */
size_t crop_top;
size_t crop_bottom;
size_t crop_left;
size_t crop_right;
/**
 * @}
 */
Used for cropping video frames. The four values are the number of pixels to discard from the top, bottom, left and right borders of the frame to obtain the rectangle intended for presentation.
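The size of the presented rectangle follows directly from the coded dimensions and the four crop values. A minimal sketch with hypothetical names (CropRect, cropped_size) and no FFmpeg dependency:

```c
#include <assert.h>
#include <stddef.h>

/* Same meaning as crop_top, crop_bottom, crop_left, crop_right. */
typedef struct CropRect { size_t top, bottom, left, right; } CropRect;

/*
 * Compute the presented size from the coded size and the crop values.
 * Returns 0 on success, -1 if the crop values leave no visible pixels.
 */
int cropped_size(int width, int height, CropRect c, int *out_w, int *out_h)
{
    assert(out_w && out_h);
    if (width <= 0 || height <= 0)
        return -1;
    /* Checked in this order so the size_t arithmetic cannot wrap around. */
    if (c.left >= (size_t)width  || c.right  >= (size_t)width  - c.left ||
        c.top  >= (size_t)height || c.bottom >= (size_t)height - c.top)
        return -1;
    *out_w = width  - (int)(c.left + c.right);
    *out_h = height - (int)(c.top + c.bottom);
    return 0;
}
```

A typical case is video coded as 1920x1088 with 8 rows cropped from the bottom, which is presented as 1920x1080.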
/**
 * Allocate an AVFrame and set its fields to default values.  The resulting
 * struct must be freed using av_frame_free().
 *
 * @return An AVFrame filled with default values or NULL on failure.
 *
 * @note this only allocates the AVFrame itself, not the data buffers. Those
 * must be allocated through other means, e.g. with av_frame_get_buffer() or
 * manually.
 */
AVFrame *av_frame_alloc(void);
Constructs a frame with every member set to its default value.
This function allocates only the AVFrame object itself, not the data buffers of the AVFrame.
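/**
 * Free the frame and any dynamically allocated objects in it,
 * e.g. extended_data. If the frame is reference counted, it will be
 * unreferenced first.
 *
 * @param frame frame to be freed. The pointer will be set to NULL.
 */
void av_frame_free(AVFrame **frame);
Frees a frame together with any dynamically allocated objects in it. A reference-counted frame is unreferenced first, and the caller's pointer is set to NULL.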
/**
 * Set up a new reference to the data described by the source frame.
 *
 * Copy frame properties from src to dst and create a new reference for each
 * AVBufferRef from src.
 *
 * If src is not reference counted, new buffers are allocated and the data is
 * copied.
 *
 * @warning: dst MUST have been either unreferenced with av_frame_unref(dst),
 *           or newly allocated with av_frame_alloc() before calling this
 *           function, or undefined behavior will occur.
 *
 * @return 0 on success, a negative AVERROR on error
 */
int av_frame_ref(AVFrame *dst, const AVFrame *src);
Sets up a new reference to the data in src.
The frame properties are copied from src to dst, and a new reference is created for each AVBufferRef in src.
If src is not reference counted, new data buffers are allocated in dst and the data in the buffers of src is copied into them.
/**
 * Create a new frame that references the same data as src.
 *
 * This is a shortcut for av_frame_alloc()+av_frame_ref().
 *
 * @return newly created AVFrame on success, NULL on error.
 */
AVFrame *av_frame_clone(const AVFrame *src);
Creates a new frame that uses the same data buffers as src; the buffers are managed through reference counting.
This function is equivalent to av_frame_alloc() followed by av_frame_ref().
/**
 * Unreference all the buffers referenced by frame and reset the frame fields.
 */
void av_frame_unref(AVFrame *frame);
Releases this frame's references to all the buffers it refers to, and resets the members of the frame.
/**
 * Move everything contained in src to dst and reset src.
 *
 * @warning: dst is not unreferenced, but directly overwritten without reading
 *           or deallocating its contents. Call av_frame_unref(dst) manually
 *           before calling this function to ensure that no memory is leaked.
 */
void av_frame_move_ref(AVFrame *dst, AVFrame *src);
Moves everything contained in src to dst and resets src. The references are handed over from src to dst rather than duplicated.
dst is overwritten without being unreferenced first, so to avoid a memory leak, call av_frame_unref(dst) before calling av_frame_move_ref(dst, src).
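The semantics of av_frame_ref(), av_frame_unref() and av_frame_move_ref() described above can be illustrated with a toy reference-counting model. It is a sketch of the idea only and not how FFmpeg implements AVBufferRef; every name in it (RcBuffer, RcFrame, rc_frame_*) is hypothetical, and it is single-threaded with a single buffer per frame.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct RcBuffer {
    unsigned char *data;
    int refcount;            /* number of frames sharing this buffer */
} RcBuffer;

typedef struct RcFrame {
    RcBuffer *buf;           /* NULL means a clean frame */
    unsigned char *data;     /* points into buf->data */
} RcFrame;

/* Give a clean frame a freshly allocated buffer holding one reference. */
int rc_frame_alloc_buffer(RcFrame *f, size_t size)
{
    assert(f && !f->buf && size > 0);
    RcBuffer *b = malloc(sizeof(*b));
    if (!b)
        return -1;
    b->data = calloc(1, size);
    if (!b->data) {
        free(b);
        return -1;
    }
    b->refcount = 1;
    f->buf = b;
    f->data = b->data;
    return 0;
}

/* Drop this frame's reference; the last owner frees the memory. */
void rc_frame_unref(RcFrame *f)
{
    if (f->buf && --f->buf->refcount == 0) {
        free(f->buf->data);
        free(f->buf);
    }
    f->buf = NULL;
    f->data = NULL;
}

/* Share src's buffer with a clean dst: no pixel data is copied. */
void rc_frame_ref(RcFrame *dst, const RcFrame *src)
{
    assert(dst && src && !dst->buf && src->buf);
    src->buf->refcount++;
    *dst = *src;
}

/* Hand src's reference over to a clean dst; the count is unchanged. */
void rc_frame_move_ref(RcFrame *dst, RcFrame *src)
{
    assert(dst && src && !dst->buf);
    *dst = *src;
    src->buf = NULL;
    src->data = NULL;
}
```

The model shows why dst must be clean before a ref or move: overwriting a dst that still holds a reference would lose that reference, and the buffer behind it could never be freed.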
/**
 * Allocate new buffer(s) for audio or video data.
 *
 * The following fields must be set on frame before calling this function:
 * - format (pixel format for video, sample format for audio)
 * - width and height for video
 * - nb_samples and channel_layout for audio
 *
 * This function will fill AVFrame.data and AVFrame.buf arrays and, if
 * necessary, allocate and fill AVFrame.extended_data and AVFrame.extended_buf.
 * For planar formats, one buffer will be allocated for each plane.
 *
 * @warning: if frame already has been allocated, calling this function will
 *           leak memory. In addition, undefined behavior can occur in certain
 *           cases.
 *
 * @param frame frame in which to store the new buffers.
 * @param align Required buffer size alignment. If equal to 0, alignment will be
 *              chosen automatically for the current CPU. It is highly
 *              recommended to pass 0 here unless you know what you are doing.
 *
 * @return 0 on success, a negative AVERROR on error.
 */
int av_frame_get_buffer(AVFrame *frame, int align);
Allocates new buffers for audio or video data.
Before calling this function, the following members of the frame must be set:
1. format (the pixel format for video, the sample format for audio)
2. width and height, for video
3. nb_samples and channel_layout, for audio
This function fills the AVFrame.data and AVFrame.buf arrays and, if necessary, also allocates and fills AVFrame.extended_data and AVFrame.extended_buf.
For planar formats, one buffer is allocated for each plane.
/**
 * Copy the frame data from src to dst.
 *
 * This function does not allocate anything, dst must be already initialized and
 * allocated with the same parameters as src.
 *
 * This function only copies the frame data (i.e. the contents of the data /
 * extended data arrays), not any other properties.
 *
 * @return >= 0 on success, a negative AVERROR on error.
 */
int av_frame_copy(AVFrame *dst, const AVFrame *src);
Copies the frame data from src to dst.
This function allocates nothing. Before it is called, dst must already be initialized and allocated with the same parameters as src.
This function copies only the contents of the data buffers (what the data / extended_data arrays point to), and none of the other properties of the frame.
[1] FFMPEG Structure Analysis: AVFrame, https://blog.csdn.net/leixiaohua1020/article/details/14214577
2019-01-13 V1.0 Initial draft