Source: http://blog.csdn.net/wangruihit/article/details/46550853
VideoToolbox is a framework opened up on iOS 8 and later that exposes hardware H264 encoding and decoding on the iOS platform.
I was essentially the only person putting this integration together, and it took four or five days. Along the way I mainly referred to the WWDC 2014 session 513 video on the hardware codec,
the vtenc/vtdec modules of OpenWebRTC,
and parts of the Chromium source:
https://src.chromium.org/svn/trunk/src/content/common/gpu/media/vt_video_decode_accelerator.cc
https://chromium.googlesource.com/chromium/src/media/+/cea1808de66191f7f1eb48b5579e602c0c781146/cast/sender/h264_vt_encoder.cc
There were also some Stack Overflow threads, such as:
http://stackoverflow.com/questions/29525000/how-to-use-videotoolbox-to-decompress-h-264-video-stream
http://stackoverflow.com/questions/24884827/possible-locations-for-sequence-picture-parameter-sets-for-h-264-stream
and posts on the Apple developer forums, such as:
https://devforums.apple.com/message/1063536#1063536
Points worth noting along the way:
1. YUV data format
WebRTC hands the encoder data in I420, which corresponds to kCVPixelFormatType_420YpCbCr8Planar in VT. If VT is configured for kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange (i.e. NV12), the I420 data must be converted to NV12 before encoding; the libyuv library can do the conversion.
I420 has 3 planes holding the Y, U, and V data separately, stored contiguously, like: YYYYYYYYYY...UUUUUU...VVVVVV...
NV12 has only 2 planes: first a contiguous Y plane, then an interleaved UV plane, like: YYYYYYYYY...UVUVUV...
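The repacking from I420 to NV12 described above can be sketched in a few lines of C. This is a hypothetical helper for illustration only; in practice libyuv's I420ToNV12 does the same job and also handles per-plane strides:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert one I420 frame (Y plane + separate U and V planes) into NV12
 * (Y plane + interleaved UV plane). Assumes tightly packed planes with
 * no padding; width and height must be even. */
static void i420_to_nv12(const uint8_t *src_y, const uint8_t *src_u,
                         const uint8_t *src_v, uint8_t *dst_y,
                         uint8_t *dst_uv, int width, int height) {
    size_t y_size = (size_t)width * height;
    size_t chroma_size = y_size / 4;   /* 4:2:0: one U and one V per 2x2 block */

    memcpy(dst_y, src_y, y_size);      /* the Y plane is identical in both formats */
    for (size_t i = 0; i < chroma_size; i++) {
        dst_uv[2 * i]     = src_u[i];  /* U first ... */
        dst_uv[2 * i + 1] = src_v[i];  /* ... then V, interleaved */
    }
}
```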
Whether VT encodes from I420 or from NV12 depends on how the session is initialized. The setup code looks like this:
CFMutableDictionaryRef source_attrs = CFDictionaryCreateMutable (NULL, 0, &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);

CFNumberRef number;

number = CFNumberCreate (NULL, kCFNumberSInt16Type, &codec_settings->width);
CFDictionarySetValue (source_attrs, kCVPixelBufferWidthKey, number);
CFRelease (number);

number = CFNumberCreate (NULL, kCFNumberSInt16Type, &codec_settings->height);
CFDictionarySetValue (source_attrs, kCVPixelBufferHeightKey, number);
CFRelease (number);

// Pick the input format here: 420YpCbCr8Planar for I420,
// 420YpCbCr8BiPlanarVideoRange for NV12.
OSType pixelFormat = kCVPixelFormatType_420YpCbCr8Planar;
number = CFNumberCreate (NULL, kCFNumberSInt32Type, &pixelFormat);
CFDictionarySetValue (source_attrs, kCVPixelBufferPixelFormatTypeKey, number);
CFRelease (number);

CFDictionarySetValue(source_attrs, kCVPixelBufferOpenGLESCompatibilityKey, kCFBooleanTrue);

OSStatus ret = VTCompressionSessionCreate(NULL, codec_settings->width, codec_settings->height, kCMVideoCodecType_H264, NULL, source_attrs, NULL, EncodedFrameCallback, this, &encoder_session_);
if (ret != 0) {
  CFRelease(source_attrs);  // don't leak the attributes on the error path
  WEBRTC_TRACE(webrtc::kTraceError, webrtc::kTraceVideoCoding, -1,
               "vt_encoder::InitEncode() fails to create encoder ret_val %d",
               ret);
  return WEBRTC_VIDEO_CODEC_ERROR;
}

CFRelease(source_attrs);
2. VT emits encoded data in AVCC format, which must be converted to Annex-B before being handed back to WebRTC. The main difference is whether each NAL unit is preceded by a length field or by a start code; see the Stack Overflow threads above for details.
Conversely, when decoding, the Annex-B data coming from WebRTC must be converted to AVCC before it is fed to VT.
Annex-B: StartCode + Nalu1 + StartCode + Nalu2 + ...
AVCC: Nalu1 length + Nalu1 + Nalu2 length + Nalu2 + ...
Note: the length field in AVCC must be big-endian. Its size is configurable, typically 1, 2, or 4 bytes, and must be communicated to the decoder through the API.
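As a minimal sketch, framing one NAL unit in AVCC form with a 4-byte big-endian length (assuming the decoder was configured with a 4-byte NAL unit header length) might look like this; `nalu_to_avcc` is a hypothetical helper name:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Prepend a 4-byte big-endian length field to one NAL unit payload,
 * producing AVCC framing. dst must have room for nalu_len + 4 bytes.
 * Returns the number of bytes written. */
static size_t nalu_to_avcc(const uint8_t *nalu, uint32_t nalu_len,
                           uint8_t *dst) {
    dst[0] = (uint8_t)(nalu_len >> 24);  /* most significant byte first */
    dst[1] = (uint8_t)(nalu_len >> 16);
    dst[2] = (uint8_t)(nalu_len >> 8);
    dst[3] = (uint8_t)(nalu_len);
    memcpy(dst + 4, nalu, nalu_len);     /* raw NAL unit payload follows */
    return 4 + (size_t)nalu_len;
}
```

Going the other way (AVCC to Annex-B) with a 4-byte length field can even be done in place, by overwriting each length field with a 00 00 00 01 start code.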
3. Creating a CMVideoFormatDescription
Decoding requires a VTDecompressionSession, which in turn needs a CMVideoFormatDescription parameter.
To build one, first extract the SPS and PPS from the bitstream, then create the CMVideoFormatDescription with the following call:
CM_EXPORT
OSStatus CMVideoFormatDescriptionCreateFromH264ParameterSets(
    CFAllocatorRef allocator,
    size_t parameterSetCount,
    const uint8_t * const * parameterSetPointers,
    const size_t * parameterSetSizes,
    int NALUnitHeaderLength,
    CMFormatDescriptionRef *formatDescriptionOut )
    __OSX_AVAILABLE_STARTING(__MAC_10_9,__IPHONE_7_0);
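Locating the SPS and PPS in an Annex-B stream just means scanning for start codes and checking the NAL type (the low 5 bits of the first NAL byte: 7 = SPS, 8 = PPS). A hypothetical scanner, as a sketch:

```c
#include <stddef.h>
#include <stdint.h>

/* Find the next Annex-B start code (00 00 01 or 00 00 00 01) in buf,
 * searching from offset. Returns the offset of the first NAL byte after
 * the start code, or -1 if no start code is found. */
static long next_nalu(const uint8_t *buf, size_t len, size_t offset) {
    for (size_t i = offset; i + 3 < len; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0) {
            if (buf[i + 2] == 1)                      /* 3-byte start code */
                return (long)(i + 3);
            if (buf[i + 2] == 0 && buf[i + 3] == 1)   /* 4-byte start code */
                return (long)(i + 4);
        }
    }
    return -1;
}
```

Each NAL unit found this way can be tested with `buf[pos] & 0x1F` to see whether it is the SPS or PPS to pass to CMVideoFormatDescriptionCreateFromH264ParameterSets.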
4. Determining whether a VT-encoded frame is a keyframe
This code is taken from Ericsson's OpenWebRTC:
static bool
vtenc_buffer_is_keyframe (CMSampleBufferRef sbuf)
{
  bool result = FALSE;
  CFArrayRef attachments_for_sample;

  attachments_for_sample = CMSampleBufferGetSampleAttachmentsArray (sbuf, 0);
  if (attachments_for_sample != NULL) {
    CFDictionaryRef attachments;
    CFBooleanRef depends_on_others;

    attachments = (CFDictionaryRef) CFArrayGetValueAtIndex (attachments_for_sample, 0);
    depends_on_others = (CFBooleanRef) CFDictionaryGetValue (attachments,
        kCMSampleAttachmentKey_DependsOnOthers);
    /* A frame that depends on no other frames is a keyframe. */
    result = (depends_on_others == kCFBooleanFalse);
  }

  return result;
}
5. Checking whether VT can still decode correctly after the SPS/PPS change
Use the following call to determine whether the VT session needs to be updated:
VT_EXPORT Boolean
VTDecompressionSessionCanAcceptFormatDescription(
    VTDecompressionSessionRef session,
    CMFormatDescriptionRef newFormatDesc ) __OSX_AVAILABLE_STARTING(__MAC_10_8,__IPHONE_8_0);
6. PTS
The PTS affects VT's encoding quality. Normally the duration parameter is the length of one frame, expressed in sample units; with the usual 90 kHz video clock and a frame rate of 30 fps, duration = sampleRate / frameRate = 90000 / 30 = 3000.
The pts is the display time of the current frame, also in sample units, i.e. frame_index * sampleRate / frameRate.
VT_EXPORT OSStatus
VTCompressionSessionEncodeFrame(
    VTCompressionSessionRef session,
    CVImageBufferRef imageBuffer,
    CMTime presentationTimeStamp,
    CMTime duration,
    CFDictionaryRef frameProperties,
    void * sourceFrameRefCon,
    VTEncodeInfoFlags *infoFlagsOut );
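The timestamp arithmetic above is simple enough to sketch directly, with plain integers standing in for the CMTime values (value = ticks, timescale = 90000):

```c
#include <stdint.h>

/* 90 kHz video clock, as in the text: duration = 90000 / fps,
 * and the pts of frame n is n * duration ticks. */
enum { kVideoClockHz = 90000 };

static int64_t frame_duration(int fps) {
    return kVideoClockHz / fps;            /* ticks per frame */
}

static int64_t frame_pts(int64_t frame_index, int fps) {
    return frame_index * frame_duration(fps);  /* display time in ticks */
}
```

These are the numbers to wrap in CMTimeMake(value, 90000) when calling VTCompressionSessionEncodeFrame.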
7. Encoding options
kVTCompressionPropertyKey_AllowTemporalCompression
kVTCompressionPropertyKey_AllowFrameReordering
AllowTemporalCompression controls whether P-frames are produced.
AllowFrameReordering controls whether B-frames are produced.
8. Use the built-in pixel buffer pool to improve performance.
Creating a VT session automatically creates a CVPixelBufferPool that serves as a pool of reusable buffers, avoiding the overhead of repeatedly allocating and freeing memory.
VT_EXPORT CVPixelBufferPoolRef
VTCompressionSessionGetPixelBufferPool(
    VTCompressionSessionRef session ) __OSX_AVAILABLE_STARTING(__MAC_10_8, __IPHONE_8_0);

CV_EXPORT CVReturn CVPixelBufferPoolCreatePixelBuffer(CFAllocatorRef allocator,
    CVPixelBufferPoolRef pixelBufferPool,
    CVPixelBufferRef *pixelBufferOut) __OSX_AVAILABLE_STARTING(__MAC_10_4,__IPHONE_4_0);
There are many, many more details beyond these; any single mistake can lead to all sorts of crashes or encode/decode failures.
Reading through the links I listed above will help a great deal.
In my tests, iOS 8 hardware encoding and decoding performs very well: the video is noticeably sharper than OpenH264's output, it easily sustains 30 fps, and rate control is more accurate.