Audio Decoding with Audio Converter

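Requirements

On iOS, decode compressed audio data (such as AAC) to obtain raw audio data: linear PCM.

This example captures AAC-compressed data through Audio Queue, decodes it to PCM, and saves the decoded PCM data to a file in the sandbox. Parameters such as the output sample rate and the decoder type can be adjusted.

The example can be extended: the source does not have to be an AAC stream; it can also be an audio file, the audio track of a video file, and so on.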


How it works

The Audio Converter in the Audio Toolbox framework can decode audio data, that is, convert AAC data into raw PCM audio data.


Prerequisites:


GitHub (with code): Audio Decoding

Jianshu: Audio Decoding

Juejin: Audio Decoding

Blog: Audio Decoding


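1. Initialization

1.1. Initialize the decoder

Initialize a decoder instance by specifying the source data format, the format after decoding, the sample rate, and whether to use hardware or software decoding. The steps are as follows.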

- (instancetype)initWithSourceFormat:(AudioStreamBasicDescription)sourceFormat destFormatID:(AudioFormatID)destFormatID sampleRate:(float)sampleRate isUseHardwareDecode:(BOOL)isUseHardwareDecode {
    if (self = [super init]) {
        mSourceFormat   = sourceFormat;
        mAudioConverter = [self configureDecoderBySourceFormat:sourceFormat
                                                    destFormat:&mDestinationFormat
                                                  destFormatID:destFormatID
                                                    sampleRate:sampleRate
                                           isUseHardwareDecode:isUseHardwareDecode];
    }
    return self;
}


1.2. Configure the ASBD for the decoded audio stream

    AudioStreamBasicDescription destinationFormat = {0};
    destinationFormat.mSampleRate = sampleRate;
    if (destFormatID != kAudioFormatLinearPCM) {
        // This decoder only produces linear PCM.
        NSLog(@"The destination format after decoding must be linear PCM!");
        return NULL;
    } else {
        destinationFormat.mFormatID          = kAudioFormatLinearPCM;
        destinationFormat.mChannelsPerFrame  = sourceFormat.mChannelsPerFrame;
        destinationFormat.mFormatFlags       = (kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked);
        destinationFormat.mFramesPerPacket   = kXDXAudioPCMFramesPerPacket;
        destinationFormat.mBitsPerChannel    = KXDXAudioBitsPerChannel;
        destinationFormat.mBytesPerFrame     = destinationFormat.mBitsPerChannel / 8 *destinationFormat.mChannelsPerFrame;
        destinationFormat.mBytesPerPacket    = destinationFormat.mBytesPerFrame * destinationFormat.mFramesPerPacket;
        destinationFormat.mReserved          =  0;
    }
    memcpy(destFormat, &destinationFormat, sizeof(AudioStreamBasicDescription));

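Decoding audio means converting a compressed format such as AAC into raw linear PCM audio data. The kAudioFormatProperty_FormatInfo property can be used to fill in the parameters of a given audio format automatically.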
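The derived size fields of the PCM description above are plain arithmetic. The following is a minimal sketch in portable C, not the author's code: PCMLayout and make_pcm_layout are hypothetical names that only mirror the size-related ASBD fields, and it assumes packed, interleaved integer PCM with one frame per packet (which is what kXDXAudioPCMFramesPerPacket is assumed to be).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the size-related fields of
   AudioStreamBasicDescription; not a Core Audio type. */
typedef struct {
    double   sampleRate;
    uint32_t channelsPerFrame;
    uint32_t bitsPerChannel;
    uint32_t framesPerPacket;
    uint32_t bytesPerFrame;
    uint32_t bytesPerPacket;
} PCMLayout;

/* Fill the derived fields for packed, interleaved integer PCM. */
static PCMLayout make_pcm_layout(double sampleRate,
                                 uint32_t channels,
                                 uint32_t bitsPerChannel) {
    assert(channels > 0 && bitsPerChannel % 8 == 0);
    PCMLayout layout;
    layout.sampleRate       = sampleRate;
    layout.channelsPerFrame = channels;
    layout.bitsPerChannel   = bitsPerChannel;
    layout.framesPerPacket  = 1; /* assumption: uncompressed PCM, one frame per packet */
    layout.bytesPerFrame    = bitsPerChannel / 8 * channels;
    layout.bytesPerPacket   = layout.bytesPerFrame * layout.framesPerPacket;
    return layout;
}
```

For 16-bit stereo, make_pcm_layout(48000.0, 2, 16) gives 4 bytes per frame and 4 bytes per packet. Note that the bytes per frame already include every channel.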

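1.3. Choose the decoder type

The AudioClassDescription struct describes an audio codec available on the system. Its most important field is the manufacturer, which selects a hardware or a software codec. In this example the number of descriptions passed to the converter, that is, the array length, is determined by the current channel count.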

    // Get the class description of the decoder
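    // Decoders are looked up by the compressed source format (e.g. AAC), not by the PCM output format.
    AudioClassDescription *audioClassDesc = [self getAudioCalssDescriptionWithType:sourceFormat.mFormatID fromManufacture:kAppleHardwareAudioCodecManufacturer];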
...

- (AudioClassDescription *)getAudioCalssDescriptionWithType:(AudioFormatID)type fromManufacture:(uint32_t)manufacture {
    static AudioClassDescription desc;
    UInt32 decoderSpecific = type;
    UInt32 size;
    OSStatus status = AudioFormatGetPropertyInfo(kAudioFormatProperty_Decoders,
                                                 sizeof(decoderSpecific),
                                                 &decoderSpecific,
                                                 &size);
    
    if (status != noErr) {
        NSLog(@"Error: failed to get hardware AAC decoder info, status = %d", (int)status);
        return nil;
    }
    
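    // Number of decoders available for this format
    unsigned int count = size / sizeof(AudioClassDescription);
    // Array that receives the descriptions of all count decoders
    AudioClassDescription description[count];
    // Fill the array with the decoders that can handle this format
    status = AudioFormatGetProperty(kAudioFormatProperty_Decoders,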
                                    sizeof(decoderSpecific),
                                    &decoderSpecific,
                                    &size,
                                    &description);
    
    if (status != noErr) {
        NSLog(@"Error: failed to get hardware AAC decoder property, status = %d", (int)status);
        return nil;
    }
    
    for (unsigned int i = 0; i < count; i++) {
        if (type == description[i].mSubType && manufacture == description[i].mManufacturer) {
            desc = description[i];
            return &desc;
        }
    }
    return nil;
}


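Note: hardware decoding uses dedicated codec hardware on the device to decode efficiently and reduce CPU usage. Software decoding is the traditional approach of doing the computation on the CPU.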
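The selection logic in 1.3 boils down to scanning a list for a matching subtype and manufacturer. Here is a rough sketch in portable C of one way to prefer a hardware decoder and fall back to a software one. CodecDescription, find_codec and pick_decoder are hypothetical names that mirror AudioClassDescription for illustration only; the four-character codes are the values of kAudioFormatMPEG4AAC, kAppleHardwareAudioCodecManufacturer and kAppleSoftwareAudioCodecManufacturer. The fallback itself is an assumption and is not part of the listing above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build a four-character code the way Core Audio constants are defined. */
#define FOURCC(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

#define kFormatAAC        FOURCC('a', 'a', 'c', ' ')  /* kAudioFormatMPEG4AAC */
#define kHardwareCodec    FOURCC('a', 'p', 'h', 'w')  /* kAppleHardwareAudioCodecManufacturer */
#define kSoftwareCodec    FOURCC('a', 'p', 'p', 'l')  /* kAppleSoftwareAudioCodecManufacturer */

/* Hypothetical mirror of AudioClassDescription. */
typedef struct {
    uint32_t type;
    uint32_t subType;
    uint32_t manufacturer;
} CodecDescription;

/* Return the first entry matching both subtype and manufacturer, or NULL. */
static const CodecDescription *find_codec(const CodecDescription *list, size_t count,
                                          uint32_t subType, uint32_t manufacturer) {
    assert(list != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (list[i].subType == subType && list[i].manufacturer == manufacturer) {
            return &list[i];
        }
    }
    return NULL;
}

/* Prefer the hardware codec when asked, otherwise (or on a miss) use software. */
static const CodecDescription *pick_decoder(const CodecDescription *list, size_t count,
                                            uint32_t subType, int preferHardware) {
    if (preferHardware) {
        const CodecDescription *hit = find_codec(list, count, subType, kHardwareCodec);
        if (hit != NULL) {
            return hit;
        }
    }
    return find_codec(list, count, subType, kSoftwareCodec);
}
```

With a list that only contains a software AAC decoder, pick_decoder(list, count, kFormatAAC, 1) still returns that entry instead of NULL, so converter creation does not have to fail just because no hardware codec is present.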

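1.4. Create the decoder

AudioConverterNewSpecific creates an audio converter instance using the specified codec. The third and fourth parameters are the number of codec descriptions and the descriptions themselves; as above, this example sets the count from the channel count.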

// Create the AudioConverterRef.
    AudioConverterRef converter = NULL;
    if (![self checkError:AudioConverterNewSpecific(&sourceFormat, &destinationFormat, destinationFormat.mChannelsPerFrame, audioClassDesc, &converter) withErrorString:@"Audio Converter New failed"]) {
        return NULL;
    }else {
        printf("Audio converter create successful \n");
    }
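The checkError:withErrorString: helper used above is not shown in the article. Many Core Audio status codes are four-character codes, for example kAudioConverterErr_HardwareInUse is 'hwiu', so a logging helper often prints them as text. The sketch below is one possible way to do that in portable C; format_status is a hypothetical name and is not the author's implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>

/* Write a readable form of an OSStatus-style code into out:
   'abcd' when all four bytes are printable, otherwise the decimal value.
   out must have room for at least 16 bytes. */
static void format_status(int32_t status, char *out, size_t outSize) {
    assert(out != NULL && outSize >= 16);
    uint32_t code = (uint32_t)status;
    unsigned char c[4] = {
        (unsigned char)(code >> 24),
        (unsigned char)(code >> 16),
        (unsigned char)(code >> 8),
        (unsigned char)code
    };
    if (isprint(c[0]) && isprint(c[1]) && isprint(c[2]) && isprint(c[3])) {
        snprintf(out, outSize, "'%c%c%c%c'", c[0], c[1], c[2], c[3]);
    } else {
        snprintf(out, outSize, "%d", (int)status);
    }
}
```

A status of 0x68776975 is rendered as 'hwiu', while a plain numeric error such as -50 stays in decimal form.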

2. Decoding

2.1. Compute the size of the decoded data

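Note: an AAC packet carries 1024 sample frames, so whether you encode or decode AAC with Audio Converter, each conversion works on 1024 frames. This value is fixed.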

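Compute the size of the decoded audio data from the number of frames the decoder produces. For linear PCM the size follows directly from a formula: number of packets * bytes per packet, where one PCM packet is one frame and the bytes per frame already include all channels.

    // Note: one AAC packet decodes to 1024 frames, so request 1024 PCM packets.
    UInt32 ioOutputDataPackets = kIOOutputDataPackets;
    UInt32 outputBufferSize = (UInt32)(ioOutputDataPackets * destFormat.mBytesPerPacket);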
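As a worked example of the size calculation, the sketch below computes the PCM byte count and the duration of one decoded AAC packet in portable C. The function names are hypothetical, and the 1024 frames per packet figure assumes AAC-LC as used throughout this article.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: AAC-LC, 1024 frames per compressed packet. */
enum { kFramesPerAACPacket = 1024 };

/* Bytes needed to hold `frames` frames of packed, interleaved integer PCM. */
static uint32_t pcm_bytes_for_frames(uint32_t frames, uint32_t channels,
                                     uint32_t bitsPerChannel) {
    assert(bitsPerChannel % 8 == 0);
    return frames * channels * (bitsPerChannel / 8);
}

/* Playback duration of `frames` frames at the given sample rate. */
static double frames_to_milliseconds(uint32_t frames, double sampleRate) {
    assert(sampleRate > 0.0);
    return 1000.0 * (double)frames / sampleRate;
}
```

For 16-bit stereo, pcm_bytes_for_frames(kFramesPerAACPacket, 2, 16) is 4096 bytes, and at 48000 Hz one packet covers roughly 21.3 ms of audio.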

2.2. Pre-allocate memory for the decoded audio data

We can use the size computed in 2.1 to allocate memory for this buffer list.

// Set up output buffer list.
    AudioBufferList fillBufferList = {0};
    fillBufferList.mNumberBuffers = 1;
    fillBufferList.mBuffers[0].mNumberChannels  = destFormat.mChannelsPerFrame;
    fillBufferList.mBuffers[0].mDataByteSize    = outputBufferSize;
    fillBufferList.mBuffers[0].mData            = malloc(outputBufferSize * sizeof(char));

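2.3. Decode the audio data

AudioConverterFillComplexBuffer decodes the audio data. It requires an input callback, which is a C function.

The second parameter specifies that callback. Its main job is to supply the data to be decoded: we assign the source audio data to the callback's ioData parameter. This is the last point at which we control the source audio data before decoding. Once the callback has run, the converter decodes the data and writes the result into the fifth parameter, which is the fillBufferList we set up above.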

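  • userInfo: a custom struct used to exchange data with the decode callback. Here it passes the source audio data to the callback.
  • ioOutputDataPackets: on input, the capacity of the output buffer expressed in packets of the output format; on return, the number of packets actually written. When decoding, the output is PCM, where one packet is one frame, so we request 1024 packets, which is what one AAC packet decodes to. When encoding, 1024 PCM frames are compressed into a single AAC packet, so only 1 output packet is needed.
  • outputPacketDescriptions: on return, if this parameter is non-NULL, it holds the packet descriptions of the converter's output. Its memory must be allocated in advance so that the converter can fill it in.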

Finally, we pass the decoded PCM data to the caller through a completion handler.

OSStatus DecodeConverterComplexInputDataProc(AudioConverterRef              inAudioConverter,
                                             UInt32                         *ioNumberDataPackets,
                                             AudioBufferList                *ioData,
                                             AudioStreamPacketDescription   **outDataPacketDescription,
                                             void                           *inUserData) {
    XDXConverterInfoType *info = (XDXConverterInfoType *)inUserData;
    
    if (info->sourceDataSize <= 0) {
        // No more input: report zero packets through the pointer.
        *ioNumberDataPackets = 0;
        return -1;
    }
    
    *outDataPacketDescription = &info->packetDesc;
    (*outDataPacketDescription)[0].mStartOffset             = 0;
    (*outDataPacketDescription)[0].mDataByteSize            = info->sourceDataSize;
    (*outDataPacketDescription)[0].mVariableFramesInPacket  = 0;
    
    ioData->mNumberBuffers              = 1;
    ioData->mBuffers[0].mData           = info->sourceBuffer;
    ioData->mBuffers[0].mNumberChannels = info->sourceChannelsPerFrame;
    ioData->mBuffers[0].mDataByteSize   = info->sourceDataSize;
    
    return noErr;
}


- (void)decodeFormatByConverter:(AudioConverterRef)audioConverter sourceBuffer:(void *)sourceBuffer sourceBufferSize:(UInt32)sourceBufferSize sourceFormat:(AudioStreamBasicDescription)sourceFormat dest:(AudioStreamBasicDescription)destFormat completeHandler:(void(^)(AudioBufferList *destBufferList, UInt32 outputPackets, AudioStreamPacketDescription *outputPacketDescriptions))completeHandler {
    ...
    
    XDXConverterInfoType userInfo        = {0};
    userInfo.sourceBuffer                = sourceBuffer;
    userInfo.sourceDataSize              = sourceBufferSize;
    userInfo.sourceChannelsPerFrame      = sourceFormat.mChannelsPerFrame;
    userInfo.packetDesc.mDataByteSize    = (UInt32)sourceBufferSize;
    userInfo.packetDesc.mStartOffset     = 0;
    userInfo.packetDesc.mVariableFramesInPacket = 0;
    
    AudioStreamPacketDescription outputPacketDesc;
    OSStatus status = AudioConverterFillComplexBuffer(audioConverter,
                                                      DecodeConverterComplexInputDataProc,
                                                      &userInfo,
                                                      &ioOutputDataPackets,
                                                      &fillBufferList,
                                                      &outputPacketDesc);
    
    // if interrupted in the process of the conversion call, we must handle the error appropriately
    if (status != noErr) {
        if (status == kAudioConverterErr_HardwareInUse) {
            printf("Audio Converter returned kAudioConverterErr_HardwareInUse!\n");
        } else {
            if (![self checkError:status withErrorString:@"AudioConverterFillComplexBuffer error!"]) {
                return;
            }
        }
    } else {
        if (ioOutputDataPackets == 0) {
            // This is the EOF condition.
            status = noErr;
        }
        
        if (completeHandler) {
            completeHandler(&fillBufferList, ioOutputDataPackets, &outputPacketDesc);
        }
    }
}

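The contract of the input callback can be illustrated without Core Audio. The sketch below is a simplified stand-in in portable C, not the callback above: SourceInfo and supply_input are hypothetical, and it assumes each source buffer holds exactly one compressed packet. Unlike the listing above, it also marks the buffer as consumed so that a second request reports "no more data" instead of handing out the same bytes twice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { kNoError = 0, kNoMoreData = -1 };

/* Hypothetical user-data struct shared with the callback. */
typedef struct {
    const uint8_t *sourceBuffer;
    uint32_t       sourceDataSize; /* set to 0 once the packet was handed out */
} SourceInfo;

/* Supply at most one packet per source buffer.
   ioNumberDataPackets is an in/out value: it must be written through
   the pointer, otherwise the caller never sees the update. */
static int supply_input(uint32_t *ioNumberDataPackets,
                        const uint8_t **outData,
                        uint32_t *outSize,
                        void *userData) {
    assert(ioNumberDataPackets && outData && outSize && userData);
    SourceInfo *info = (SourceInfo *)userData;

    if (info->sourceDataSize == 0) {
        *ioNumberDataPackets = 0;
        *outData = NULL;
        *outSize = 0;
        return kNoMoreData;
    }

    *outData = info->sourceBuffer;
    *outSize = info->sourceDataSize;
    *ioNumberDataPackets = 1;  /* assumption: one packet per buffer */
    info->sourceDataSize = 0;  /* mark the buffer as consumed */
    return kNoError;
}
```

Calling supply_input twice on the same SourceInfo returns the packet on the first call and kNoMoreData with a packet count of 0 on the second.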

3. Integration

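Because audio decoding depends on audio capture, we use Audio Queue capture as the example here: Audio Queue captures AAC data, and this module decodes it into PCM data. For background on audio capture, refer to the following link.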

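3.1. Initialize the decoder

As shown below, declare a decoder instance variable in the audio capture class and initialize it. You only need to set the source data format, the format after decoding, the sample rate, and whether to use hardware or software decoding.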

@property (nonatomic, strong) XDXAduioDecoder *audioDecoder;

...

        // audio decode: aac->pcm
        self.audioDecoder = [[XDXAduioDecoder alloc] initWithSourceFormat:m_audioInfo->mDataFormat
                                                             destFormatID:kAudioFormatLinearPCM
                                                               sampleRate:48000
                                                      isUseHardwareDecode:YES];

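3.2. Decode the audio data

In the Audio Queue callback that delivers the captured AAC data, feed the AAC data into the decoder, then write the resulting PCM data to a file in the completion handler.

Note: when Audio Queue is configured to capture AAC directly, the system performs a conversion internally. Capture itself only yields raw PCM data, so setting the Audio Queue format to AAC means the system does the PCM-to-AAC conversion for us.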

static void CaptureAudioDataCallback(void *                                 inUserData,
                                     AudioQueueRef                          inAQ,
                                     AudioQueueBufferRef                    inBuffer,
                                     const AudioTimeStamp *                 inStartTime,
                                     UInt32                                 inNumPackets,
                                     const AudioStreamPacketDescription*    inPacketDesc) {
    
    XDXAudioQueueCaptureManager *instance = (__bridge XDXAudioQueueCaptureManager *)inUserData;
    
    [instance.audioDecoder decodeAudioWithSourceBuffer:inBuffer->mAudioData
                                      sourceBufferSize:inBuffer->mAudioDataByteSize
                                       completeHandler:^(AudioBufferList * _Nonnull destBufferList, UInt32 outputPackets, AudioStreamPacketDescription * _Nonnull outputPacketDescriptions) {
                                           if (instance.isRecordVoice) {
                                               [[XDXAudioFileHandler getInstance] writeFileWithInNumBytes:destBufferList->mBuffers->mDataByteSize
                                                                                             ioNumPackets:outputPackets
                                                                                                 inBuffer:destBufferList->mBuffers->mData
                                                                                             inPacketDesc:outputPacketDescriptions];
                                           }
                                           
                                           free(destBufferList->mBuffers->mData);
                                       }];
    
    if (instance.isRunning) {
        AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, NULL);
    }
}

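4. File recording

For this part, see another article: Audio File Recording

5. Release the decoder resources

If you need to release the memory, make sure the decoder has completely finished its work before doing so.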

- (void)freeEncoder {
    if (mAudioConverter) {
        AudioConverterDispose(mAudioConverter);
        mAudioConverter = NULL;
    }
}