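Audio Encoding with Audio Converter

Goal

On iOS, encode captured raw audio data (PCM) into a compressed format such as AAC.

This example captures PCM data with an Audio Unit, compresses it to AAC, and saves the result as a recording in the app sandbox. The encoded audio format, the sample rate, the encoder type, and other parameters are adjustable.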


How It Works

Audio Converter, part of the Audio Toolbox framework, can encode audio data, that is, convert PCM data into another, compressed format.


Prerequisites:


GitHub (with code): Audio Encoding

Jianshu: Audio Encoding

Juejin: Audio Encoding

Blog: Audio Encoding


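1. Initialization

1.1. Initializing the encoder

Create the encoder instance by specifying the source data format, the destination format ID, the sample rate, and whether to use hardware or software encoding. The steps are detailed below.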

- (instancetype)initWithSourceFormat:(AudioStreamBasicDescription)sourceFormat destFormatID:(AudioFormatID)destFormatID sampleRate:(float)sampleRate isUseHardwareEncode:(BOOL)isUseHardwareEncode {
    if (self = [super init]) {
        mSourceFormat   = sourceFormat;
        mAudioConverter = [self configureEncoderBySourceFormat:sourceFormat
                                                    destFormat:&mDestinationFormat
                                                  destFormatID:destFormatID
                                                    sampleRate:sampleRate
                                           isUseHardwareEncode:isUseHardwareEncode];
    }
    return self;
}


1.2. Configuring the ASBD for the encoded stream

AudioStreamBasicDescription destinationFormat = {};
    destinationFormat.mSampleRate = sampleRate;
    if (destFormatID == kAudioFormatLinearPCM) {
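        // Encoding must produce a compressed format, so PCM is rejected as a destination.
        NSLog(@"The destination format must not be PCM when encoding!");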
        return NULL;
    } else {
        destinationFormat.mFormatID = destFormatID;
        
        // For iLBC, the number of channels must be 1.
        destinationFormat.mChannelsPerFrame = (destFormatID == kAudioFormatiLBC ? 1 : sourceFormat.mChannelsPerFrame);
        
        // Use AudioFormat API to fill out the rest of the description.
        size = sizeof(destinationFormat);
        if (![self checkError:AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &size, &destinationFormat) withErrorString:@"AudioFormatGetProperty couldn't fill out the destination data format"]) {
            return NULL;
        }
    }
    memcpy(destFormat, &destinationFormat, sizeof(AudioStreamBasicDescription));

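Encoding audio means converting PCM into a compressed format such as AAC, whose packets are typically variable in size (VBR). Given the format ID, sample rate, and channel count set above, the kAudioFormatProperty_FormatInfo property fills in the remaining fields of the description for that format automatically.

Note: if the audio format is iLBC, the channel count must be 1.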
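The channel rule and the PCM check above can be isolated into a small pure function. The following is only a sketch in plain C: `dest_channel_count`, `FOURCC`, and the `FORMAT_*` constants are hypothetical stand-ins (the constants mirror the four-character codes of kAudioFormatLinearPCM, kAudioFormatiLBC, and kAudioFormatMPEG4AAC), not part of the original project.

```c
#include <assert.h>
#include <stdint.h>

/* Build a four-character code the way Core Audio format IDs are laid out. */
#define FOURCC(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical stand-ins for the Core Audio format ID constants. */
#define FORMAT_LINEAR_PCM FOURCC('l', 'p', 'c', 'm')
#define FORMAT_ILBC       FOURCC('i', 'l', 'b', 'c')
#define FORMAT_MPEG4_AAC  FOURCC('a', 'a', 'c', ' ')

/* Returns the channel count to use for the encoded stream, or 0 when the
   destination is PCM, which is not a valid target for encoding. */
uint32_t dest_channel_count(uint32_t dest_format_id, uint32_t source_channels) {
    assert(source_channels > 0);
    if (dest_format_id == FORMAT_LINEAR_PCM) {
        return 0;
    }
    /* iLBC only supports mono; every other format keeps the source layout. */
    return dest_format_id == FORMAT_ILBC ? 1 : source_channels;
}
```

With a stereo source, an AAC destination keeps 2 channels, an iLBC destination is forced to 1, and a PCM destination is rejected with 0.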

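1.3. Choosing the encoder type

The AudioClassDescription structure describes the audio codec the system should use; its most important field is the manufacturer, which selects hardware or software encoding. In this example the number of descriptions, that is, the length of the array, is set to the current channel count.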

// Encoder count by channels.
    AudioClassDescription requestedCodecs[destinationFormat.mChannelsPerFrame];
    const OSType subtype = destFormatID;
    for (int i = 0; i < destinationFormat.mChannelsPerFrame; i++) {
        AudioClassDescription codec = {
            kAudioEncoderComponentType,
            subtype,
            isUseHardwareEncode ? kAppleHardwareAudioCodecManufacturer : kAppleSoftwareAudioCodecManufacturer,
        };
        requestedCodecs[i] = codec;
    }

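Note: hardware encoding offloads the work to the device's dedicated audio codec hardware, which lowers CPU usage. Software encoding is the traditional approach of doing the computation on the CPU.

1.4. Creating the encoder

AudioConverterNewSpecific creates an audio converter instance that uses the specified codecs. Its third and fourth parameters are the number of codec descriptions and the array of descriptions; as above, this example keeps the count equal to the channel count.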

// Create the AudioConverterRef.
    AudioConverterRef converter = NULL;
    if (![self checkError:AudioConverterNewSpecific(&sourceFormat, &destinationFormat, destinationFormat.mChannelsPerFrame, requestedCodecs, &converter) withErrorString:@"AudioConverterNewSpecific failed"]) {
        return NULL;
    }else {
        printf("Audio converter created successfully\n");
    }

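1.5. Setting the bit rate

The bit rate can be set manually. Absent special requirements, a suggested value based on the sample rate generally works, as follows.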

/*
     When encoding to AAC, set the bit rate. kAudioConverterEncodeBitRate is a UInt32 value
     containing the number of bits per second to aim for when encoding data.
     When you explicitly set both the bit rate and the sample rate, the encoder is told to stick
     with both, but some combinations (also depending on the number of channels) are not allowed.
     If you do not explicitly set a bit rate, the encoder picks a suitable value for you depending
     on the sample rate and the number of channels.
     The bit rate also scales with the number of channels: one bit rate per sample rate can be used
     for mono, and for stereo or more you can multiply that number by the number of channels.
     */
    
    if (destinationFormat.mFormatID == kAudioFormatMPEG4AAC) {
        UInt32 outputBitRate = 64000;
        
        UInt32 propSize = sizeof(outputBitRate);
        
        if (destinationFormat.mSampleRate >= 44100) {
            outputBitRate = 192000;
        } else if (destinationFormat.mSampleRate < 22000) {
            outputBitRate = 32000;
        }
        outputBitRate *= destinationFormat.mChannelsPerFrame;
        
        // Set the bit rate depending on the sample rate chosen.
        if (![self checkError:AudioConverterSetProperty(converter, kAudioConverterEncodeBitRate, propSize, &outputBitRate) withErrorString:@"AudioConverterSetProperty kAudioConverterEncodeBitRate failed!"]) {
            return NULL;
        }
        
        // Get it back and print it out.
        AudioConverterGetProperty(converter, kAudioConverterEncodeBitRate, &propSize, &outputBitRate);
        printf ("AAC Encode Bitrate: %u\n", (unsigned int)outputBitRate);
    }
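The selection rule in the listing above is easy to check in isolation. Here is a minimal sketch in plain C that reproduces the same thresholds; the function name `suggested_aac_bit_rate` is an assumption made for illustration and does not appear in the original project.

```c
#include <assert.h>
#include <stdint.h>

/* Suggested AAC bit rate in bits per second, using the thresholds from the
   listing above: a per-channel rate picked by sample rate, then multiplied
   by the channel count. */
uint32_t suggested_aac_bit_rate(double sample_rate, uint32_t channels) {
    assert(sample_rate > 0.0 && channels > 0);

    uint32_t per_channel = 64000;          /* default for mid-range sample rates */
    if (sample_rate >= 44100.0) {
        per_channel = 192000;              /* 44.1 kHz and above */
    } else if (sample_rate < 22000.0) {
        per_channel = 32000;               /* below 22 kHz */
    }
    return per_channel * channels;
}
```

For 44.1 kHz stereo this gives 384000 bits per second, and for 16 kHz mono it gives 32000. As the comment in the listing warns, the encoder may still reject some combinations of bit rate, sample rate, and channel count.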

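1.6. Checking whether the converter can resume after an interruption

kAudioConverterPropertyCanResumeFromInterruption reports whether the converter can resume after an interruption.

If the property is not implemented, or querying it returns an error, the codec in use is not a hardware codec, and conversion can always resume. If the query succeeds and returns 1, the codec can resume after an interruption; if it returns 0, it cannot.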

/*
     Can the Audio Converter resume after an interruption?
     This property may be queried at any time after construction of the Audio Converter, after setting its output format.
     There's no clear reason to prefer construction time, interruption time, or potential resumption time, but we prefer
     construction time since it means less code to execute during or after interruption time.
     */
    BOOL canResumeFromInterruption = YES;
    UInt32 canResume = 0;
    size = sizeof(canResume);
    OSStatus error = AudioConverterGetProperty(converter, kAudioConverterPropertyCanResumeFromInterruption, &size, &canResume);
    if (error == noErr) {
        /*
         We received a valid return value from the GetProperty call.
         If the property's value is 1, then the codec CAN resume work following an interruption.
         If the property's value is 0, then interruptions destroy the codec's state and we're done.
         */
        if (canResume == 0) {
            canResumeFromInterruption = NO;
        }
        printf("Audio Converter %s continue after interruption!\n", (!canResumeFromInterruption ? "CANNOT" : "CAN"));
    } else {
        /*
         If the property is unimplemented (kAudioConverterErr_PropertyNotSupported, or paramErr returned in the case of PCM),
         then the codec being used is not a hardware codec, so we're not concerned about codec state.
         We are always going to be able to resume conversion after an interruption.
         */
        if (error == kAudioConverterErr_PropertyNotSupported) {
            printf("kAudioConverterPropertyCanResumeFromInterruption property not supported - see comments in source for more info.\n");
        } else {
            printf("AudioConverterGetProperty kAudioConverterPropertyCanResumeFromInterruption result %d, paramErr is OK if PCM\n", (int)error);
        }
        
        error = noErr;
    }

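2. Encoding

2.1. Estimating the encoded size

kAudioConverterPropertyMaximumOutputPacketSize returns the maximum size, in bytes, of one output packet. It is commonly used to estimate the upper bound of the encoded data and to size the buffer allocated for it.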

UInt32 outputSizePerPacket = destFormat.mBytesPerPacket;
    if (outputSizePerPacket == 0) {
        // if the destination format is VBR, we need to get max size per packet from the converter
        UInt32 size = sizeof(outputSizePerPacket);
        if (![self checkError:AudioConverterGetProperty(audioConverter, kAudioConverterPropertyMaximumOutputPacketSize, &size, &outputSizePerPacket) withErrorString:@"AudioConverterGetProperty kAudioConverterPropertyMaximumOutputPacketSize failed!"]) {
            return;
        }
    }

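2.2. Pre-allocating memory for the encoded data

The buffer list can be allocated using the maximum size obtained in 2.1, or using the size of the source audio data. The exact encoded size cannot be known in advance, so either the estimated maximum or the source size works: the converter writes the actual number of valid bytes back into mDataByteSize when it finishes.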

// Set up output buffer list.
    AudioBufferList fillBufferList = {};
    fillBufferList.mNumberBuffers = 1;
    fillBufferList.mBuffers[0].mNumberChannels  = destFormat.mChannelsPerFrame;
    fillBufferList.mBuffers[0].mDataByteSize    = theOutputBufferSize;
    fillBufferList.mBuffers[0].mData            = malloc(theOutputBufferSize * sizeof(char));
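As an illustration of the sizing choice discussed in 2.1 and 2.2, the sketch below computes how many bytes to reserve for a given number of output packets. It is plain C under stated assumptions: `encoded_buffer_size` is a hypothetical helper, `bytes_per_packet` stands for the destination format's mBytesPerPacket (0 for VBR), and `max_packet_size` stands for the value queried through kAudioConverterPropertyMaximumOutputPacketSize.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes to reserve for `packets` encoded packets.
   bytes_per_packet: fixed packet size of the destination format, or 0 for VBR.
   max_packet_size:  largest packet the converter may emit, used only for VBR. */
uint32_t encoded_buffer_size(uint32_t bytes_per_packet,
                             uint32_t max_packet_size,
                             uint32_t packets) {
    /* A constant-size format knows its packet size; VBR falls back to the maximum. */
    uint32_t per_packet = bytes_per_packet != 0 ? bytes_per_packet : max_packet_size;

    assert(per_packet > 0 && packets > 0);
    assert(packets <= UINT32_MAX / per_packet);   /* guard against overflow */

    return per_packet * packets;
}
```

For a VBR destination with a reported maximum of 768 bytes and one output packet per call, this reserves 768 bytes. The original project instead reuses the source buffer size, which is a looser bound for the same purpose.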

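2.3. Encoding the audio data

AudioConverterFillComplexBuffer encodes the audio data. It requires an input callback, which is a C function.

The second parameter is that callback. Its main job is to supply the data about to be encoded: the source audio data is assigned to the callback's ioData parameter, which is the last point at which the source data can be touched before encoding. Once AudioConverterFillComplexBuffer returns, the encoded data has been written to its fifth parameter, the fillBufferList defined above.

  • userInfo: a custom structure for passing data between the caller and the input callback. Here it carries the source audio data to the callback.
  • ioOutputDataPackets: on input, the capacity of the output buffer in packets of the destination format; on output, the number of packets actually converted.
  • outputPacketDescriptions: if non-NULL, receives the packet descriptions of the converter's output once conversion completes. Its memory must be allocated beforehand so the converter can fill it in.

Finally, the converted AAC data is handed to the caller through a completion handler.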

OSStatus EncodeConverterComplexInputDataProc(AudioConverterRef              inAudioConverter,
                                             UInt32                         *ioNumberDataPackets,
                                             AudioBufferList                *ioData,
                                             AudioStreamPacketDescription   **outDataPacketDescription,
                                             void                           *inUserData) {
    XDXConverterInfoType *info = (XDXConverterInfoType *)inUserData;
    ioData->mNumberBuffers              = 1;
    ioData->mBuffers[0].mData           = info->sourceBuffer;
    ioData->mBuffers[0].mNumberChannels = info->sourceChannelsPerFrame;
    ioData->mBuffers[0].mDataByteSize   = info->sourceDataSize;
    
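    // *ioNumberDataPackets is left at the count the converter requested, which assumes
    // sourceBuffer holds at least that many packets.
    return noErr;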
}

- (void)encodeFormatByConverter:(AudioConverterRef)audioConverter sourceBuffer:(void *)sourceBuffer sourceBufferSize:(UInt32)sourceBufferSize sourceFormat:(AudioStreamBasicDescription)sourceFormat dest:(AudioStreamBasicDescription)destFormat completeHandler:(void(^)(AudioBufferList *destBufferList, UInt32 outputPackets, AudioStreamPacketDescription *outputPacketDescriptions))completeHandler {
    ...
    
    XDXConverterInfoType userInfo   = {0};
    userInfo.sourceBuffer           = sourceBuffer;
    userInfo.sourceDataSize         = sourceBufferSize;
    userInfo.sourceChannelsPerFrame = sourceFormat.mChannelsPerFrame;
    
    UInt32 numberOutputPackets = 1;
    UInt32 theOutputBufferSize = sourceBufferSize;
    UInt32 ioOutputDataPackets = numberOutputPackets;
    AudioStreamPacketDescription outputPacketDescriptions;
    // Convert data
    OSStatus status = AudioConverterFillComplexBuffer(audioConverter,
                                                      EncodeConverterComplexInputDataProc,
                                                      &userInfo,
                                                      &ioOutputDataPackets,
                                                      &fillBufferList,
                                                      &outputPacketDescriptions);
    
    
    
    // if interrupted in the process of the conversion call, we must handle the error appropriately
    if (status != noErr) {
        if (status == kAudioConverterErr_HardwareInUse) {
            printf("Audio Converter returned kAudioConverterErr_HardwareInUse!\n");
        } else {
            if (![self checkError:status withErrorString:@"AudioConverterFillComplexBuffer error!"]) {
                return;
            }
        }
    } else {
        if (ioOutputDataPackets == 0) {
            // This is the EOF condition.
            status = noErr;
        }
        
        completeHandler(&fillBufferList, ioOutputDataPackets, &outputPacketDescriptions);
    }
}

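The input callback shown above hands over the same source buffer every time it is called and never reports how many packets it supplied. As far as the AudioConverter header describes the contract, the callback should write the number of packets it provides into ioNumberDataPackets, and when it has temporarily run out of data it may return zero packets together with an error code, which stops the current fill without marking the end of the stream. The following plain C sketch models that bookkeeping with stand-in types: `InputState`, `InputBuffer`, `supply_input`, and `STATUS_NO_DATA_YET` are all hypothetical names, not Core Audio API and not the author's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STATUS_OK          0   /* stand-in for noErr */
#define STATUS_NO_DATA_YET 1   /* caller-defined "input exhausted for now" code */

/* Hypothetical stand-in for the user-info struct passed to the callback. */
typedef struct {
    void    *source_buffer;     /* PCM bytes not yet handed to the converter */
    uint32_t source_size;       /* size of source_buffer in bytes */
    uint32_t bytes_per_packet;  /* for PCM, one packet is one frame */
} InputState;

/* Hypothetical stand-in for the single AudioBuffer the callback fills in. */
typedef struct {
    void    *data;
    uint32_t size;
} InputBuffer;

/* Hands all pending whole packets to the converter exactly once.
   On return, *io_packets holds the number of packets provided; any trailing
   partial packet is dropped. */
int supply_input(InputState *state, uint32_t *io_packets, InputBuffer *out) {
    assert(state != NULL && io_packets != NULL && out != NULL);
    assert(state->bytes_per_packet > 0);

    uint32_t available = state->source_size / state->bytes_per_packet;
    if (state->source_buffer == NULL || available == 0) {
        /* Nothing left: report zero packets and a non-zero status. */
        out->data   = NULL;
        out->size   = 0;
        *io_packets = 0;
        return STATUS_NO_DATA_YET;
    }

    out->data   = state->source_buffer;
    out->size   = available * state->bytes_per_packet;
    *io_packets = available;

    /* Mark the buffer as consumed so a second call does not resend it. */
    state->source_buffer = NULL;
    state->source_size   = 0;
    return STATUS_OK;
}
```

In the real callback the same logic would write into ioData->mBuffers[0] and *ioNumberDataPackets, and the caller of AudioConverterFillComplexBuffer would treat the custom status as "no more input for this call" rather than as a failure.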

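3. Integrating the module

Audio encoding depends on audio capture, so Audio Unit capture serves as the example here: PCM data is captured with an Audio Unit and then encoded to AAC with this module. For background on Audio Unit capture, refer to the link below.

3.1. Initializing the encoder

Declare an encoder property in the audio capture class and initialize it as shown below. Only the source data format, the destination format, the sample rate, and the choice of hardware or software encoding need to be set.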

@property (nonatomic, strong) XDXAduioEncoder *audioEncoder;

...

        self->_audioEncoder = [[XDXAduioEncoder alloc] initWithSourceFormat:m_audioDataFormat
                                                               destFormatID:kAudioFormatMPEG4AAC
                                                                 sampleRate:44100
                                                        isUseHardwareEncode:YES];

3.2. Encoding the audio data

In the Audio Unit capture callback, feed the PCM data to the encoder, then write the resulting AAC data to a file in the completion handler.

static OSStatus AudioCaptureCallback(void                       *inRefCon,
                                     AudioUnitRenderActionFlags *ioActionFlags,
                                     const AudioTimeStamp       *inTimeStamp,
                                     UInt32                     inBusNumber,
                                     UInt32                     inNumberFrames,
                                     AudioBufferList            *ioData) {
    AudioUnitRender(m_audioUnit, ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, m_buffList);
    
    XDXAudioCaptureManager *manager = (__bridge XDXAudioCaptureManager *)inRefCon;

    void    *bufferData = m_buffList->mBuffers[0].mData;
    UInt32   bufferSize = m_buffList->mBuffers[0].mDataByteSize;
    
    [manager.audioEncoder encodeAudioWithSourceBuffer:bufferData
                                       sourceBufferSize:bufferSize
                                        completeHandler:^(AudioBufferList * _Nonnull destBufferList, UInt32 outputPackets, AudioStreamPacketDescription * _Nonnull outputPacketDescriptions) {
                                            if (manager.isRecordVoice) {
                                                [[XDXAudioFileHandler getInstance] writeFileWithInNumBytes:destBufferList->mBuffers->mDataByteSize
                                                                                              ioNumPackets:outputPackets
                                                                                                  inBuffer:destBufferList->mBuffers->mData
                                                                                              inPacketDesc:outputPacketDescriptions];
                                            }
                                            
                                            free(destBufferList->mBuffers->mData);
                                        }];
    

    
    return noErr;
}

3.3. Freeing memory

Once the encoded audio data has been used, remember to free its memory.

free(destBufferList->mBuffers->mData);
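
4. Recording to a file

For this part, see the companion article: Audio File Recording

5. Releasing encoder resources

If the encoder needs to be released, make sure it has completely finished its work before disposing of it.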

- (void)freeEncoder {
    if (mAudioConverter) {
        AudioConverterDispose(mAudioConverter);
        mAudioConverter = NULL;
    }
}