A summary of the approaches to live audio streaming in a web page and the problems encountered.
Code: (GitHub, to be cleaned up)
Result: with Opus audio encoding and Web Audio API playback, we achieve live audio streaming with under 100 ms latency, high quality, and low bandwidth.
Background: a feasibility study for a web-based H.264 VDI (virtual desktop) client. After the H.264 live video solution, this is another live-streaming problem with strict latency requirements, this time for audio (interactivity, audio/video sync).
Premise: the flexVDI open-source project only supports uncompressed PCM audio data, and the result is poor: it either stutters or lags, with traffic at 2–3 Mbps (depending on buffer size).
Solution: encode the audio with Opus on the spice server side; when the flexVDI playback channel receives an Opus packet, decode it to PCM with an Opus JS decoding library and feed the PCM to an AudioContext for playback.
Pipeline overview: the flexVDI playback channel receives Opus audio data and calls libopus.js to decode it into PCM, which is saved to a buffer. A ScriptProcessorNode is created; in its onaudioprocess callback, PCM data is pulled from the buffer and written into the outputBuffer channel by channel, and the ScriptProcessorNode is connected to audioContext.destination for playback. See the code below or GitHub for details.
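The buffering between the decoder and onaudioprocess can be sketched as follows. PcmQueue is a hypothetical helper, not part of flexVDI; the libopus.js decode call that fills it is omitted here:

```javascript
// Hypothetical PCM queue sitting between the Opus decoder and
// onaudioprocess. Decoded samples are appended as packets arrive;
// onaudioprocess drains exactly one output buffer's worth of samples
// per callback, padding with silence on underrun.
class PcmQueue {
  constructor() {
    this.chunks = [];   // list of Float32Array chunks of decoded PCM
    this.length = 0;    // total samples currently queued
  }
  push(samples) {       // samples: Float32Array from the decoder
    this.chunks.push(samples);
    this.length += samples.length;
  }
  pull(out) {           // out: Float32Array for one channel; returns samples filled
    let filled = 0;
    while (filled < out.length && this.chunks.length > 0) {
      const head = this.chunks[0];
      const n = Math.min(head.length, out.length - filled);
      out.set(head.subarray(0, n), filled);
      filled += n;
      if (n === head.length) this.chunks.shift();
      else this.chunks[0] = head.subarray(n);
      this.length -= n;
    }
    out.fill(0, filled); // underrun: pad the rest with silence
    return filled;
  }
}
```

In the real pipeline, data from the playback channel would go decoder → queue.push, and onaudioprocess would call queue.pull on each channel's outputBuffer data.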
Introduction to the Opus encode/decode API:
Reference: http://opus-codec.org/docs/opus_api-1.2/index.html
1. Below is a demo that decodes Opus audio with the Opus C library and then plays the PCM data with ffplay; it shows how the Opus decoding API is used:
```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "opus.h"

/* Read a 4-byte big-endian length prefix. */
static opus_uint32 char_to_int(unsigned char ch[4])
{
    return ((opus_uint32)ch[0]<<24) | ((opus_uint32)ch[1]<<16)
         | ((opus_uint32)ch[2]<< 8) |  (opus_uint32)ch[3];
}

int main(int argc, char** argv)
{
    opus_int32 sampleRate = 0;
    int channels = 0, err = 0, len = 0;
    int max_payload_bytes = 1500;
    int max_frame_size = 48000*2;
    OpusDecoder* dec = NULL;

    sampleRate = (opus_int32)atol(argv[1]);
    channels = atoi(argv[2]);
    FILE* fin  = fopen(argv[3], "rb");
    FILE* fout = fopen(argv[4], "wb+");

    short *out;
    unsigned char *fbytes, *data;
    out = (short*)malloc(max_frame_size*channels*sizeof(short));
    /* We need to allocate for 16-bit PCM data, but we store it as unsigned char. */
    fbytes = (unsigned char*)malloc(max_frame_size*channels*sizeof(short));
    data = (unsigned char*)calloc(max_payload_bytes, sizeof(unsigned char));
    dec = opus_decoder_create(sampleRate, channels, &err);

    int nBytesRead = 0;
    opus_uint64 tot_out = 0;
    while (1) {
        /* Each record: 4-byte big-endian length, then one Opus packet. */
        unsigned char ch[4] = {0};
        nBytesRead = fread(ch, 1, 4, fin);
        if (nBytesRead != 4)
            break;
        len = char_to_int(ch);
        nBytesRead = fread(data, 1, len, fin);
        if (nBytesRead != len)
            break;

        opus_int32 output_samples = max_frame_size;
        output_samples = opus_decode(dec, data, len, out, output_samples, 0);

        /* Write the decoded samples out as little-endian 16-bit PCM. */
        int i;
        for (i = 0; i < output_samples*channels; i++) {
            short s = out[i];
            fbytes[2*i]   = s & 0xFF;
            fbytes[2*i+1] = (s >> 8) & 0xFF;
        }
        if (fwrite(fbytes, sizeof(short)*channels, output_samples, fout)
                != (unsigned)output_samples) {
            fprintf(stderr, "Error writing.\n");
            return EXIT_FAILURE;
        }
        tot_out += output_samples;
    }
    printf("tot_out: %llu \n", tot_out);

    opus_decoder_destroy(dec);
    fclose(fin);
    fclose(fout);
    free(out);
    free(fbytes);
    free(data);
    return 0;
}
```
This program decodes a file of Opus packets (in a simple length+packet format) into PCM data; then ffplay is used to play the PCM and check that it sounds right:
ffplay -f f32le -ac 1 -ar 48000 input_audio // play float32 PCM data
ffplay -f s16le -ac 1 -ar 48000 input_audio // play signed 16-bit PCM data
-ac is the channel count, -ar is the sample rate, and input_audio is the PCM audio file.
2. To get the PCM data file, we first need the Opus packet binary, which raises the question of how to save binary data from the browser to a local file:
Reference code:
```javascript
var saveFile = (function () {
  var a = document.createElement("a");
  document.body.appendChild(a);
  a.style = "display:none";
  return function (data, name) {
    var blob = new Blob([data]);
    var url = window.URL.createObjectURL(blob);
    a.href = url;
    a.download = name;
    a.click();
    window.URL.revokeObjectURL(url);
  };
}());

saveFile(data, 'test.pcm');
```
Note: first write the binary data into a typed array, then construct a Blob from that buffer, create an object URL, and use an <a> element to download the blob to the local machine.
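For the C demo above to read the saved file, each packet must be framed with the 4-byte big-endian length prefix that char_to_int expects. A sketch of that framing step (framePackets is a hypothetical helper, not part of flexVDI or ws-audio-api):

```javascript
// Frame a list of Opus packets (Uint8Arrays) into one buffer:
// [4-byte big-endian length][packet bytes], repeated. This matches
// the length+packet file format the C decoding demo reads.
function framePackets(packets) {
  const total = packets.reduce((n, p) => n + 4 + p.length, 0);
  const out = new Uint8Array(total);
  const view = new DataView(out.buffer);
  let off = 0;
  for (const p of packets) {
    view.setUint32(off, p.length, false); // false = big-endian
    off += 4;
    out.set(p, off);
    off += p.length;
  }
  return out;
}
```

The result can then be handed to saveFile as-is to download the framed packet stream.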
3. Two ways to play PCM audio data with an AudioContext:
(1) The flexVDI implementation
Reference: https://github.com/flexVDI/spice-web-client
```javascript
function play(buffer, dataTimestamp) {
  // Each data packet is 16 bits, the first being left channel data and
  // the second being right channel data (LR-LR-LR-LR...)
  //var audio = new Int16Array(buffer);
  var audio = new Float32Array(buffer);
  // We split the audio buffer in two channels. Float32Array is the type
  // required by the Web Audio API.
  var left = new Float32Array(audio.length / 2);
  var right = new Float32Array(audio.length / 2);
  var channelCounter = 0;
  var audioContext = this.audioContext;
  var len = audio.length;
  for (var i = 0; i < len; ) {
    // The audio data spice gives us is 16-bit signed int (32768) and we
    // want a float out of it (between -1.0 and 1.0).
    left[channelCounter] = audio[i++] / 32768;
    right[channelCounter] = audio[i++] / 32768;
    channelCounter++;
  }
  var source = audioContext['createBufferSource'](); // creates a sound source
  var audioBuffer = audioContext['createBuffer'](2, channelCounter, this.frequency);
  audioBuffer['getChannelData'](0)['set'](left);
  audioBuffer['getChannelData'](1)['set'](right);
  source['buffer'] = audioBuffer;
  source['connect'](this.audioContext['destination']);
  source['start'](0);
}
```
Note: the buffer holds signed 16-bit (short) PCM data. For simplicity, the timestamp handling has been removed here, since source.start(0) means play immediately. If the data is float, there is no need to divide by 32768.
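The int16-to-float de-interleaving loop in play() can be isolated into a small, testable helper; a sketch (deinterleaveInt16 is a hypothetical name, not from the flexVDI source):

```javascript
// De-interleave 16-bit signed PCM (L-R-L-R...) into two normalized
// Float32 channels in [-1.0, 1.0), as the Web Audio API requires.
function deinterleaveInt16(audio) {
  const frames = audio.length / 2;
  const left = new Float32Array(frames);
  const right = new Float32Array(frames);
  for (let i = 0, f = 0; f < frames; f++) {
    left[f] = audio[i++] / 32768;
    right[f] = audio[i++] / 32768;
  }
  return { left, right };
}
```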
(2) The ws-audio-api implementation
Reference: https://github.com/Ivan-Feofanov/ws-audio-api
```javascript
var bufL = new Float32Array(this.config.codec.bufferSize);
var bufR = new Float32Array(this.config.codec.bufferSize);
this.scriptNode = audioContext.createScriptProcessor(this.config.codec.bufferSize, 0, 2);
if (typeof AudioBuffer.prototype.copyToChannel === "function") {
  this.scriptNode.onaudioprocess = function (e) {
    var buf = e.outputBuffer;
    _this.process(bufL, bufR); // fetch PCM data into bufL, bufR
    buf.copyToChannel(bufL, 0);
    buf.copyToChannel(bufR, 1);
  };
} else {
  // Fallback for browsers without AudioBuffer.copyToChannel.
  this.scriptNode.onaudioprocess = function (e) {
    var buf = e.outputBuffer;
    _this.process(bufL, bufR);
    buf.getChannelData(0).set(bufL);
    buf.getChannelData(1).set(bufR);
  };
}
this.scriptNode.connect(audioContext.destination);
```
The latency/stutter problem: some browsers default the AudioContext to a 48000 Hz sample rate and others to 44100 Hz. If the sample rate of the PCM data fed to the AudioContext does not match the context's rate, latency and stuttering result.
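One fix is to request a matching rate where the browser supports the AudioContextOptions.sampleRate option (new AudioContext({ sampleRate: 48000 })); another is to resample the decoded PCM to audioContext.sampleRate before feeding it in. A minimal linear-interpolation resampler sketch (a hypothetical helper; a production version would want a proper low-pass filter to avoid aliasing when downsampling):

```javascript
// Resample Float32 PCM from srcRate to dstRate by linear interpolation.
// Good enough to remove the rate-mismatch stutter; not production quality.
function resampleLinear(input, srcRate, dstRate) {
  if (srcRate === dstRate) return input;
  const outLen = Math.round(input.length * dstRate / srcRate);
  const out = new Float32Array(outLen);
  const ratio = srcRate / dstRate;
  for (let i = 0; i < outLen; i++) {
    const pos = i * ratio;           // fractional position in the input
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, input.length - 1);
    const frac = pos - i0;
    out[i] = input[i0] * (1 - frac) + input[i1] * frac;
  }
  return out;
}
```

With this in place, 44100 Hz Opus output can be converted to whatever rate the browser's AudioContext actually runs at.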