Streaming AAC Audio Live to FMS over RTMP

Date: 2019-11-18

Tags: rtmp, live streaming, fms, aac, audio


This article draws on the following posts by other authors:

http://sun3eyes.blog.163.com/blog/#m=0&t=3&c=rtmp

http://sun3eyes.blog.163.com/blog/static/1070797922012913337667/

http://sun3eyes.blog.163.com/blog/static/107079792201291112451996/

http://blog.csdn.net/helunlixing/article/details/7417778

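The live video uses H.264 and the audio uses AAC. A frame of audio data compressed by FAAC needs a little processing before it can be packaged and sent to FMS over RTMP, and the same applies when saving it to an FLV file. The main task is to strip the AAC frame header and extract the relevant information from it.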

For comparison, 1024 bytes of G.711 A-law data usually comes to only a little over 300 bytes as AAC.

The frames produced by FAAC can be saved directly to an AAC file, which the player built into Windows 7 can play. This is convenient for testing.

The AAC frame header (ADTS) is normally 7 bytes, or 9 bytes when it includes a CRC checksum. It carries the parameters of the audio.

The structure is as follows:

Structure

AAAAAAAA AAAABCCD EEFFFFGH HHIJKLMM MMMMMMMM MMMOOOOO OOOOOOPP (QQQQQQQQ QQQQQQQQ)

Header consists of 7 or 9 bytes (without or with CRC).


Letter	Length (bits)	Description
A	12	syncword 0xFFF, all bits must be 1
B	1	MPEG Version: 0 for MPEG-4, 1 for MPEG-2
C	2	Layer: always 0
D	1	protection absent, Warning, set to 1 if there is no CRC and 0 if there is CRC
E	2	profile, the MPEG-4 Audio Object Type minus 1
F	4	MPEG-4 Sampling Frequency Index (15 is forbidden)
G	1	private stream, set to 0 when encoding, ignore when decoding
H	3	MPEG-4 Channel Configuration (in the case of 0, the channel configuration is sent via an inband PCE)
I	1	originality, set to 0 when encoding, ignore when decoding
J	1	home, set to 0 when encoding, ignore when decoding
K	1	copyrighted stream, set to 0 when encoding, ignore when decoding
L	1	copyright start, set to 0 when encoding, ignore when decoding
M	13	frame length, this value must include 7 or 9 bytes of header length: FrameLength = (ProtectionAbsent == 1 ? 7 : 9) + size(AACFrame)
O	11	Buffer fullness
P	2	Number of AAC frames (RDBs) in ADTS frame minus 1, for maximum compatibility always use 1 AAC frame per ADTS frame
Q	16	CRC if protection absent is 0

http://wiki.multimedia.cx/index.php?title=ADTS
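The bit layout in the table above can be decoded with a few shifts and masks. The following is a minimal sketch, not the original author's code; the function name `parse_adts_header` and the dictionary keys are chosen here for illustration only.

```python
def parse_adts_header(data: bytes) -> dict:
    """Decode the fixed part of an ADTS header (letters refer to the table)."""
    if len(data) < 7:
        raise ValueError("an ADTS header needs at least 7 bytes")
    # A: 12-bit syncword, all bits must be 1
    if data[0] != 0xFF or (data[1] & 0xF0) != 0xF0:
        raise ValueError("syncword 0xFFF not found")
    protection_absent = data[1] & 0x01                      # D
    return {
        "mpeg_version": (data[1] >> 3) & 0x01,              # B
        "protection_absent": protection_absent,             # D
        "profile": (data[2] >> 6) & 0x03,                   # E (object type - 1)
        "sampling_frequency_index": (data[2] >> 2) & 0x0F,  # F
        "channel_configuration":
            ((data[2] & 0x01) << 2) | (data[3] >> 6),       # H
        "frame_length":
            ((data[3] & 0x03) << 11) | (data[4] << 3) | (data[5] >> 5),  # M
        "buffer_fullness":
            ((data[5] & 0x1F) << 6) | (data[6] >> 2),       # O
        "frames_in_packet": (data[6] & 0x03) + 1,           # P + 1
        "header_length": 7 if protection_absent else 9,
    }


if __name__ == "__main__":
    # Hand-built header: AAC-LC, index 4 (44100 Hz), 2 channels, 16-byte frame
    print(parse_adts_header(bytes.fromhex("fff15080021ffc")))
```

The raw AAC payload of a frame is then `data[header_length:frame_length]`.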

The most important fields are E, F, and H.

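E is the profile. The values below are MPEG-4 Audio Object Types; the ADTS field stores the object type minus 1: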

1: AAC Main
2: AAC LC (Low Complexity)
3: AAC SSR (Scalable Sample Rate)
4: AAC LTP (Long Term Prediction)

F is the sampling frequency index:

0: 96000 Hz

1: 88200 Hz
2: 64000 Hz
3: 48000 Hz
4: 44100 Hz
5: 32000 Hz
6: 24000 Hz
7: 22050 Hz
8: 16000 Hz
9: 12000 Hz
10: 11025 Hz
11: 8000 Hz
12: 7350 Hz
H is the channel configuration:

1: 1 channel: front-center
2: 2 channels: front-left, front-right
3: 3 channels: front-center, front-left, front-right
4: 4 channels: front-center, front-left, front-right, back-center
5: 5 channels: front-center, front-left, front-right, back-left, back-right
6: 6 channels: front-center, front-left, front-right, back-left, back-right, LFE-channel
7: 8 channels: front-center, front-left, front-right, side-left, side-right, back-left, back-right, LFE-channel
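With these three parameters you can send the first audio packet. For every frame after that, strip the 7-byte header (9 bytes when a CRC is present), wrap the remaining data in an RTMP message, and send it.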

http://wiki.multimedia.cx/index.php?title=MPEG-4_Audio#Audio_Object_Types

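Of course, each message needs an RTMP header when it is sent. In the data part, two bytes take the place of the AAC header: AF 00 for the first packet and AF 01 for the frames that follow. The length has to be recalculated. The timestamp can be the capture time of the samples, or you can compute it yourself.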
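If you compute the timestamp yourself, one common approach is to derive it from the frame counter. The sketch below assumes each AAC frame carries 1024 samples per channel, which is the usual AAC-LC frame size; the name `frame_timestamp_ms` is made up for this example.

```python
SAMPLES_PER_AAC_FRAME = 1024  # assumption: standard AAC-LC frame size


def frame_timestamp_ms(frame_index: int, sample_rate: int) -> int:
    """Millisecond timestamp of the frame with the given zero-based index."""
    # Computed from the index each time, so rounding errors do not accumulate.
    return frame_index * SAMPLES_PER_AAC_FRAME * 1000 // sample_rate


if __name__ == "__main__":
    for i in range(4):
        print(i, frame_timestamp_ms(i, 44100))
```

At 44100 Hz a frame lasts about 23.2 ms, so adding a fixed 23 ms per frame would drift over time; deriving each timestamp from the index avoids that.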
//--------------------------------------------------------------------------------------------------------------//

The first audio packet is the AAC sequence header.

For example: af 00 13 90. The packet is 4 bytes long and reads as follows.

1) The first byte is af. Its high four bits, a, equal 10, which means AAC:

Format of SoundData. The following values are defined:
0 = Linear PCM, platform endian
1 = ADPCM
2 = MP3
3 = Linear PCM, little endian
4 = Nellymoser 16 kHz mono
5 = Nellymoser 8 kHz mono
6 = Nellymoser
7 = G.711 A-law logarithmic PCM
8 = G.711 mu-law logarithmic PCM
9 = reserved
10 = AAC
11 = Speex
14 = MP3 8 kHz
15 = Device-specific sound
Formats 7, 8, 14, and 15 are reserved.
AAC is supported in Flash Player 9,0,115,0 and higher.
Speex is supported in Flash Player 10 and higher.

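2) The low four bits of the first byte, f, read as follows.

The first 2 of these bits give the sampling rate. Here they are binary 11, which means 44 kHz.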

Sampling rate. The following values are defined:
0 = 5.5 kHz
1 = 11 kHz
2 = 22 kHz
3 = 44 kHz

The third bit gives the sample size. Here it is 1, meaning 16-bit audio.

Size of each audio sample. This parameter only pertains to
uncompressed formats. Compressed formats always decode
to 16 bits internally.
0 = 8-bit samples
1 = 16-bit samples

The fourth bit gives the channel type. Here it is 1, meaning stereo.

Mono or stereo sound
0 = Mono sound
1 = Stereo sound
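The four fields packed into the first byte can be pulled apart as in this small sketch. The function name `decode_audio_tag_byte` and the key names are illustrative, not taken from any library.

```python
def decode_audio_tag_byte(b: int) -> dict:
    """Split the first byte of the audio data into its four fields."""
    return {
        "sound_format": (b >> 4) & 0x0F,  # high 4 bits, 10 = AAC
        "sound_rate": (b >> 2) & 0x03,    # next 2 bits, 3 = 44 kHz
        "sound_size": (b >> 1) & 0x01,    # next bit, 1 = 16-bit samples
        "sound_type": b & 0x01,           # last bit, 1 = stereo
    }


if __name__ == "__main__":
    print(decode_audio_tag_byte(0xAF))
```

For 0xAF (binary 1010 1111) this gives format 10, rate 3, size 1, and type 1, matching the walkthrough above.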

3) The second byte

AACPacketType. This field gives the type of the AACAUDIODATA: 0 = AAC sequence header, 1 = AAC raw. The first audio packet uses 0, and all later packets use 1.

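4) Bytes 3 and 4 contain the AudioSpecificConfig, as follows.

The AAC sequence header holds an AudioSpecificConfig structure, which is described in "ISO-14496-3 Audio". The full description of AudioSpecificConfig is quite complex, so I simplify it here by fixing the audio format to be encoded in advance: AAC-LC as the audio codec and 44100 as the sample rate. AudioSpecificConfig then reduces to the following: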

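0x13 0x90 (binary 00010 0111 0010 000) means ObjectProfile=2 (AAC-LC), SamplingFrequencyIndex=7, ChannelConfiguration=2 channels. Note that index 7 is 22050 Hz; for 44100 Hz (index 4) with the same profile and channels, the two bytes would be 0x12 0x10.
AudioSpecificConfig consists of ObjectProfile (5 bits), SamplingFrequencyIndex (4 bits), ChannelConfiguration (4 bits), and TFSpecificConfig (3 bits).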

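Of these, ObjectProfile is one of: AAC Main = 1, AAC LC = 2, AAC SSR = 3.

SamplingFrequencyIndex (0 = 96000, 1 = 88200, 2 = 64000, 3 = 48000, 4 = 44100, 5 = 32000, 6 = 24000, 7 = 22050, 8 = 16000...). AAC is usually fixed at 44100, which should correspond to 4. However, the experimental results reported in the source indicate that 3 should be chosen when the audio sample rate is 44100 or lower, and 2 when the sample rate is 48000.

ChannelConfiguration is the number of audio channels: 1 for mono, 2 for stereo, and so on.

TFSpecificConfig is explained in standard 14496-3 (1.2 T/F Audio Specific Configuration). Here it is always set to 0.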
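The two-byte layout described above (5 + 4 + 4 + 3 bits) can be packed and unpacked as in the sketch below. It covers only this simplified case, with the last 3 bits left at 0, and the function names are hypothetical.

```python
def pack_audio_specific_config(object_type: int, freq_index: int,
                               channel_config: int) -> bytes:
    """Build the simplified 2-byte AudioSpecificConfig (last 3 bits = 0)."""
    if not 0 < object_type < 31:
        raise ValueError("object_type must fit in 5 bits (1..30)")
    if not 0 <= freq_index < 15:
        raise ValueError("freq_index 15 (explicit frequency) is not handled")
    if not 0 <= channel_config < 16:
        raise ValueError("channel_config must fit in 4 bits")
    value = (object_type << 11) | (freq_index << 7) | (channel_config << 3)
    return value.to_bytes(2, "big")


def unpack_audio_specific_config(data: bytes) -> tuple:
    """Return (object_type, freq_index, channel_config) from 2 bytes."""
    value = int.from_bytes(data[:2], "big")
    return (value >> 11) & 0x1F, (value >> 7) & 0x0F, (value >> 3) & 0x0F


if __name__ == "__main__":
    print(unpack_audio_specific_config(bytes([0x13, 0x90])))
    print(pack_audio_specific_config(2, 4, 2).hex())
```

Decoding 0x13 0x90 this way yields object type 2, frequency index 7, and channel configuration 2.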

The index values have the following meanings:

There are 13 supported frequencies:
- 0: 96000 Hz
- 1: 88200 Hz
- 2: 64000 Hz
- 3: 48000 Hz
- 4: 44100 Hz
- 5: 32000 Hz
- 6: 24000 Hz
- 7: 22050 Hz
- 8: 16000 Hz
- 9: 12000 Hz
- 10: 11025 Hz
- 11: 8000 Hz
- 12: 7350 Hz
- 13: Reserved
- 14: Reserved
- 15: frequency is written explicitly
channel_configuration: the number of channels
- 0: Defined in AOT Specific Config
- 1: 1 channel: front-center
- 2: 2 channels: front-left, front-right
- 3: 3 channels: front-center, front-left, front-right
- 4: 4 channels: front-center, front-left, front-right, back-center
- 5: 5 channels: front-center, front-left, front-right, back-left, back-right
- 6: 6 channels: front-center, front-left, front-right, back-left, back-right, LFE-channel
- 7: 8 channels: front-center, front-left, front-right, side-left, side-right, back-left, back-right, LFE-channel
- 8-15: Reserved
Every audio packet after the first is AF 01 followed by the AAC audio data with its 7-byte header removed.
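Putting the pieces together, the sketch below walks a buffer of ADTS frames and produces the data part of each audio message: AF 00 plus the AudioSpecificConfig first, then AF 01 plus the raw AAC data of each frame. It assumes the first byte is always 0xAF, as in this article, and the name `adts_to_audio_payloads` is made up for illustration. The RTMP header and the network sending are not covered.

```python
def adts_to_audio_payloads(stream: bytes):
    """Yield the data part of each audio message for a buffer of ADTS frames."""
    pos = 0
    sent_sequence_header = False
    while pos + 7 <= len(stream):
        if stream[pos] != 0xFF or (stream[pos + 1] & 0xF0) != 0xF0:
            raise ValueError("ADTS syncword not found at offset %d" % pos)
        # 7-byte header without CRC, 9 bytes with CRC
        header_len = 7 if stream[pos + 1] & 0x01 else 9
        frame_len = (((stream[pos + 3] & 0x03) << 11)
                     | (stream[pos + 4] << 3)
                     | (stream[pos + 5] >> 5))
        if frame_len < header_len or pos + frame_len > len(stream):
            break  # truncated or invalid frame: stop here
        if not sent_sequence_header:
            object_type = ((stream[pos + 2] >> 6) & 0x03) + 1  # E + 1
            freq_index = (stream[pos + 2] >> 2) & 0x0F         # F
            channels = ((stream[pos + 2] & 0x01) << 2) | (stream[pos + 3] >> 6)
            asc = (object_type << 11) | (freq_index << 7) | (channels << 3)
            yield b"\xaf\x00" + asc.to_bytes(2, "big")
            sent_sequence_header = True
        yield b"\xaf\x01" + stream[pos + header_len:pos + frame_len]
        pos += frame_len


if __name__ == "__main__":
    # Two hand-built 16-byte frames: 7-byte header + 9 bytes of dummy payload
    frame = bytes.fromhex("fff15080021ffc") + bytes(range(9))
    for payload in adts_to_audio_payloads(frame * 2):
        print(payload.hex())
```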
