This article draws on the following posts by other users:
http://sun3eyes.blog.163.com/blog/#m=0&t=3&c=rtmp
http://sun3eyes.blog.163.com/blog/static/1070797922012913337667/
http://sun3eyes.blog.163.com/blog/static/107079792201291112451996/
http://blog.csdn.net/helunlixing/article/details/7417778
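The live stream uses H.264 for video and AAC for audio. A frame of audio data compressed by FAAC needs some simple processing before it can be packaged and sent to FMS over RTMP, and the same applies when saving it to an FLV file. The main step is to strip the AAC frame header and extract the relevant information from it.
For comparison, where G.711 A-law data takes 1024 bytes, the AAC output is usually only a little over 300 bytes.
Frames compressed by FAAC can be saved directly as an AAC file, which the player bundled with Windows 7 can play. This is handy for testing.
The AAC frame header (ADTS) is normally 7 bytes, or 9 bytes if it includes a CRC, and it carries the audio parameters.
The structure is as follows: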
Structure
AAAAAAAA AAAABCCD EEFFFFGH HHIJKLMM MMMMMMMM MMMOOOOO OOOOOOPP (QQQQQQQQ QQQQQQQQ)
Header consists of 7 or 9 bytes (without or with CRC).
Letter | Length (bits) | Description |
---|---|---|
A | 12 | syncword 0xFFF, all bits must be 1 |
B | 1 | MPEG Version: 0 for MPEG-4, 1 for MPEG-2 |
C | 2 | Layer: always 0 |
D | 1 | protection absent, Warning, set to 1 if there is no CRC and 0 if there is CRC |
E | 2 | profile, the MPEG-4 Audio Object Type minus 1 |
F | 4 | MPEG-4 Sampling Frequency Index (15 is forbidden) |
G | 1 | private stream, set to 0 when encoding, ignore when decoding |
H | 3 | MPEG-4 Channel Configuration (in the case of 0, the channel configuration is sent via an inband PCE) |
I | 1 | originality, set to 0 when encoding, ignore when decoding |
J | 1 | home, set to 0 when encoding, ignore when decoding |
K | 1 | copyrighted stream, set to 0 when encoding, ignore when decoding |
L | 1 | copyright start, set to 0 when encoding, ignore when decoding |
M | 13 | frame length, this value must include 7 or 9 bytes of header length: FrameLength = (ProtectionAbsent == 1 ? 7 : 9) + size(AACFrame) |
O | 11 | Buffer fullness |
P | 2 | Number of AAC frames (RDBs) in ADTS frame minus 1, for maximum compatibility always use 1 AAC frame per ADTS frame |
Q | 16 | CRC if protection absent is 0 |
http://wiki.multimedia.cx/index.php?title=ADTS
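The bit layout in the table above can be unpacked with a few shifts and masks. The following is a minimal sketch, not code from the referenced posts; the names `AdtsHeader` and `parse_adts_header` are made up for illustration.

```python
from typing import NamedTuple


class AdtsHeader(NamedTuple):
    mpeg_version: int         # B: 0 for MPEG-4, 1 for MPEG-2
    protection_absent: int    # D: 1 means no CRC
    profile: int              # E: Audio Object Type minus 1
    sampling_freq_index: int  # F
    channel_config: int       # H
    frame_length: int         # M: header plus payload, in bytes
    header_size: int          # 7 without CRC, 9 with CRC


def parse_adts_header(data: bytes) -> AdtsHeader:
    """Decode the fixed fields of an ADTS header (hypothetical helper)."""
    if len(data) < 7:
        raise ValueError("an ADTS header needs at least 7 bytes")
    # A: the 12-bit syncword must be all ones.
    if data[0] != 0xFF or (data[1] & 0xF0) != 0xF0:
        raise ValueError("ADTS syncword 0xFFF not found")
    protection_absent = data[1] & 0x01
    return AdtsHeader(
        mpeg_version=(data[1] >> 3) & 0x01,
        protection_absent=protection_absent,
        profile=(data[2] >> 6) & 0x03,
        sampling_freq_index=(data[2] >> 2) & 0x0F,
        # H spans the last bit of byte 2 and the top two bits of byte 3.
        channel_config=((data[2] & 0x01) << 2) | ((data[3] >> 6) & 0x03),
        # M spans the low 2 bits of byte 3, all of byte 4, the top 3 bits of byte 5.
        frame_length=((data[3] & 0x03) << 11) | (data[4] << 3) | ((data[5] >> 5) & 0x07),
        header_size=7 if protection_absent else 9,
    )
```

For instance, `parse_adts_header(bytes.fromhex("fff15080267ffc"))` describes an AAC LC frame (profile 1) at frequency index 4 with 2 channels and a total length of 307 bytes.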
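The most important fields are E, F, and H.
E gives the type. The field stores the Audio Object Type minus 1, so the object types below correspond to E + 1: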
1: AAC Main
2: AAC LC (Low Complexity)
3: AAC SSR (Scalable Sample Rate)
4: AAC LTP (Long Term Prediction)
F is the sampling frequency index:
0: 96000 Hz
H is the channel configuration:
1: 1 channel: front-center
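With these three parameters the first audio packet can be sent. For each subsequent frame, strip the 7-byte header (9 bytes if a CRC is present), wrap the rest in an RTMP packet, and send it.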
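Stripping the header is easiest when the frame length field (M) drives the loop, since it also tells you where the next frame starts. Below is a rough sketch of that idea for a buffer holding several ADTS frames back to back; `iter_raw_aac_frames` is a hypothetical name, and the function assumes one AAC frame per ADTS frame.

```python
from typing import Iterator


def iter_raw_aac_frames(stream: bytes) -> Iterator[bytes]:
    """Yield each AAC payload with its ADTS header removed (illustrative helper)."""
    pos = 0
    while pos + 7 <= len(stream):
        if stream[pos] != 0xFF or (stream[pos + 1] & 0xF0) != 0xF0:
            raise ValueError(f"no ADTS syncword at offset {pos}")
        # protection_absent == 1 means a 7-byte header, otherwise 9 bytes with CRC.
        header_size = 7 if stream[pos + 1] & 0x01 else 9
        frame_length = (
            ((stream[pos + 3] & 0x03) << 11)
            | (stream[pos + 4] << 3)
            | (stream[pos + 5] >> 5)
        )
        if frame_length < header_size or pos + frame_length > len(stream):
            raise ValueError(f"bad or truncated frame at offset {pos}")
        yield stream[pos + header_size : pos + frame_length]
        pos += frame_length
```

Reading the header size from the protection bit, rather than always dropping 7 bytes, keeps the sketch correct for streams that carry a CRC.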
http://wiki.multimedia.cx/index.php?title=MPEG-4_Audio#Audio_Object_Types
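Of course, each packet also needs an RTMP header when it is sent. In the data portion, two bytes take the place of the AAC frame header: AF 00 for the first packet (the sequence header) and AF 01 for raw frames. The length must be recomputed accordingly. The timestamp can be the capture time of the samples, or you can compute it yourself.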
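If you compute the timestamp yourself, one common approach is to derive it from the frame index. The sketch below assumes each AAC frame carries 1024 samples per channel (the usual AAC-LC frame size) and that timestamps are in milliseconds, as RTMP expects; `frame_timestamp_ms` is a hypothetical helper.

```python
SAMPLES_PER_AAC_FRAME = 1024  # assumption: AAC-LC frames of 1024 samples per channel


def frame_timestamp_ms(frame_index: int, sample_rate: int) -> int:
    """Timestamp in milliseconds of the frame with the given zero-based index."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    # Computing from the index each time avoids accumulating rounding error,
    # which adding a rounded per-frame duration repeatedly would introduce.
    return frame_index * SAMPLES_PER_AAC_FRAME * 1000 // sample_rate
```

At 44100 Hz a frame lasts about 23.2 ms, so consecutive timestamps advance by 23 or 24 ms while staying aligned with the true sample position.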
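//--------------------------------------------------------------------------------------------------------------//
The first audio packet is the AAC header.
For example: af 00 13 90. The packet is 4 bytes long and breaks down as follows.
1) The first byte is af. Its high four bits, a, equal 10, which means AAC: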
Format of SoundData. The following values are defined:
0 = Linear PCM, platform endian
1 = ADPCM
2 = MP3
3 = Linear PCM, little endian
4 = Nellymoser 16 kHz mono
5 = Nellymoser 8 kHz mono
6 = Nellymoser
7 = G.711 A-law logarithmic PCM
8 = G.711 mu-law logarithmic PCM
9 = reserved
10 = AAC
11 = Speex
14 = MP3 8 kHz
15 = Device-specific sound
Formats 7, 8, 14, and 15 are reserved.
AAC is supported in Flash Player 9,0,115,0 and higher.
Speex is supported in Flash Player 10 and higher.
2) The low four bits of the first byte, f, break down as follows.
The first 2 bits give the sampling rate. Here they are binary 11, meaning 44 kHz.
Sampling rate. The following values are defined:
0 = 5.5 kHz
1 = 11 kHz
2 = 22 kHz
3 = 44 kHz
The third bit gives the sample size. Here it is 1, meaning 16-bit audio.
Size of each audio sample. This parameter only pertains to
uncompressed formats. Compressed formats always decode
to 16 bits internally.
0 = 8-bit samples
1 = 16-bit samples
The fourth bit gives the channel layout.
Mono or stereo sound
0 = Mono sound
1 = Stereo sound
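The four fields packed into that first byte can be pulled apart as follows. This is only an illustrative sketch; `AudioTagHeader` and `parse_audio_tag_header` are names invented here.

```python
from typing import NamedTuple


class AudioTagHeader(NamedTuple):
    sound_format: int  # high 4 bits, 10 = AAC
    sound_rate: int    # next 2 bits, 3 = 44 kHz
    sound_size: int    # next bit, 1 = 16-bit samples
    sound_type: int    # last bit, 1 = stereo


def parse_audio_tag_header(first_byte: int) -> AudioTagHeader:
    """Split the first byte of the audio data into its four fields."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("expected a single byte value")
    return AudioTagHeader(
        sound_format=(first_byte >> 4) & 0x0F,
        sound_rate=(first_byte >> 2) & 0x03,
        sound_size=(first_byte >> 1) & 0x01,
        sound_type=first_byte & 0x01,
    )
```

Applied to `0xAF`, this gives format 10 (AAC), rate 3 (44 kHz), size 1 (16-bit), and type 1 (stereo), matching the breakdown above.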
3) The second byte
AACPacketType: this field gives the type of the AACAUDIODATA, where 0 = AAC sequence header and 1 = AAC raw. The first audio packet uses 0 and all later ones use 1.
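4) The third and fourth bytes hold the AudioSpecificConfig, as follows.
The AAC sequence header carries an AudioSpecificConfig structure, which is described in ISO/IEC 14496-3 (Audio). The full description of AudioSpecificConfig is quite complex, so I simplify it here by fixing the audio format in advance: AAC-LC as the codec and 44100 Hz as the sample rate. AudioSpecificConfig then reduces to the following:
0x13 0x90 (binary 00010 0111 0010 000) means ObjectProfile = 2 (AAC-LC), SamplingFrequencyIndex = 7, ChannelConfiguration = 2 (two channels). Note that index 7 is 22050 Hz; for 44100 Hz (index 4) the same layout gives 0x12 0x10.
AudioSpecificConfig consists of ObjectProfile, SamplingFrequencyIndex, ChannelConfiguration, and TFSpecificConfig, occupying 5, 4, 4, and 3 bits respectively in this simplified form.
ObjectProfile: AAC Main = 1, AAC LC = 2, AAC SSR = 3.
SamplingFrequencyIndex: 0 = 96000, 1 = 88200, 2 = 64000, 3 = 48000, 4 = 44100, 5 = 32000, 6 = 24000, 7 = 22050, 8 = 16000, and so on. AAC is normally fixed at 44100, which should correspond to 4. However, experimental results reportedly indicate that 3 should be used when the sample rate is 44100 or lower, and 2 when it is 48000.
ChannelConfiguration is the number of audio channels: 1 for mono, 2 for stereo, and so on.
TFSpecificConfig is explained in standard 14496-3 (1.2 T/F Audio Specific Configuration); here it is always set to 0.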
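The two-byte value can be assembled directly from the three fields. The sketch below assumes the simplified 16-bit layout described above (5 bits of object type, 4 bits of frequency index, 4 bits of channel configuration, 3 trailing zero bits) and does not handle the escape values that make the real structure longer. The function names are hypothetical.

```python
from typing import Tuple


def build_audio_specific_config(object_type: int, freq_index: int, channel_config: int) -> bytes:
    """Pack the simplified 2-byte AudioSpecificConfig (trailing 3 bits set to 0)."""
    if not 1 <= object_type <= 30:
        raise ValueError("object_type must fit in 5 bits without the escape value 31")
    if not 0 <= freq_index <= 12:
        raise ValueError("freq_index must be one of the 13 table entries")
    if not 0 <= channel_config <= 15:
        raise ValueError("channel_config is a 4-bit field")
    value = (object_type << 11) | (freq_index << 7) | (channel_config << 3)
    return value.to_bytes(2, "big")


def parse_audio_specific_config(asc: bytes) -> Tuple[int, int, int]:
    """Return (object_type, freq_index, channel_config) from the first 2 bytes."""
    if len(asc) < 2:
        raise ValueError("need at least 2 bytes")
    value = int.from_bytes(asc[:2], "big")
    return (value >> 11) & 0x1F, (value >> 7) & 0x0F, (value >> 3) & 0x0F
```

With these helpers, `build_audio_specific_config(2, 7, 2)` reproduces the `13 90` bytes from the example, and `build_audio_specific_config(2, 4, 2)` gives `12 10` for 44100 Hz stereo AAC-LC.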
The index values have the following meanings:
There are 13 supported frequencies:
channel_configuration: the number of channels
All subsequent audio packets are AF 01 followed by the AAC data with its 7-byte header removed.
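Putting the pieces together, both kinds of audio payload can be derived from an ADTS frame. This is a sketch under the assumptions used throughout this article: the first byte is fixed at 0xAF, and the AudioSpecificConfig uses the simplified 2-byte form. `make_sequence_header` and `make_raw_packet` are illustrative names, and the RTMP header itself is not built here.

```python
AAC_TAG_BYTE = 0xAF         # AAC, 44 kHz, 16-bit, stereo, as in the article
PACKET_SEQUENCE_HEADER = 0  # AACPacketType for the first audio packet
PACKET_RAW = 1              # AACPacketType for every later packet


def make_sequence_header(adts_frame: bytes) -> bytes:
    """Build 'AF 00' plus a 2-byte AudioSpecificConfig from an ADTS header."""
    if len(adts_frame) < 7:
        raise ValueError("need a full ADTS header")
    object_type = (adts_frame[2] >> 6) + 1  # ADTS stores the object type minus 1
    freq_index = (adts_frame[2] >> 2) & 0x0F
    channel_config = ((adts_frame[2] & 0x01) << 2) | (adts_frame[3] >> 6)
    asc = (object_type << 11) | (freq_index << 7) | (channel_config << 3)
    return bytes([AAC_TAG_BYTE, PACKET_SEQUENCE_HEADER]) + asc.to_bytes(2, "big")


def make_raw_packet(adts_frame: bytes) -> bytes:
    """Build 'AF 01' plus the AAC data with the ADTS header removed."""
    if len(adts_frame) < 7:
        raise ValueError("need a full ADTS header")
    # 7-byte header when protection_absent is 1, otherwise 9 bytes with CRC.
    header_size = 7 if adts_frame[1] & 0x01 else 9
    return bytes([AAC_TAG_BYTE, PACKET_RAW]) + adts_frame[header_size:]
```

A sender would call `make_sequence_header` once on the first frame, then `make_raw_packet` on every frame, placing each result in the data portion of an RTMP audio message.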