Related source code: https://github.com/YeDaxia/MusicPlus
Before we implement anything, let's go over a few basic properties of digital audio.
Sample Rate: the number of audio samples captured per second, expressed in hertz (Hz). The higher the sample rate, the closer the samples track the original waveform.
Bit Depth: the dynamic range that each sample can record (the difference in amplitude), expressed in bits.
Channels: the number of audio channels, for example left and right.
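Together, these three properties determine the raw data rate of uncompressed audio. As a quick back-of-the-envelope check (my own illustration, not from the original project), CD-quality stereo works out as follows:

```java
public class PcmByteRate {
    public static void main(String[] args) {
        int sampleRate = 44100; // Hz
        int channels = 2;       // stereo
        int bitDepth = 16;      // bits per sample
        // Raw PCM data rate: samples/s x channels x bytes per sample
        int bytesPerSecond = sampleRate * channels * (bitDepth / 8);
        System.out.println(bytesPerSecond + " bytes per second"); // prints 176400
    }
}
```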
After sampling and quantization, audio ends up as a sequence of numbers. The loudness (amplitude) of the sound shows up in the magnitude of each value, while pitch (frequency) and timbre are functions of time and therefore show up in how the values change from one sample to the next.
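To make this concrete, here is a minimal sketch (my own illustration, not from the project) that synthesizes one second of a sine tone as 16-bit little-endian PCM; the amplitude parameter controls loudness and the frequency controls pitch:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class SineWavePcm {
    // Generate one second of 16-bit mono PCM. Amplitude controls loudness,
    // frequency controls pitch; all names here are illustrative.
    static byte[] sine(int sampleRate, double frequency, short amplitude) {
        ByteBuffer buf = ByteBuffer.allocate(sampleRate * 2)
                .order(ByteOrder.LITTLE_ENDIAN); // PCM samples are little-endian
        for (int i = 0; i < sampleRate; i++) {
            double t = (double) i / sampleRate;
            buf.putShort((short) (amplitude * Math.sin(2 * Math.PI * frequency * t)));
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] pcm = sine(44100, 440.0, (short) 16000); // a 440 Hz "A" tone
        System.out.println(pcm.length + " bytes of raw PCM generated.");
    }
}
```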
Before getting into encoding and decoding, let's get a feel for what raw audio data actually looks like. We know that a wav file holds raw PCM data, so below we use AudioTrack to write that data straight to the audio hardware and play it. For details of the wav format, see http://soundfile.sapp.org/doc/WaveFormat/ ; you can inspect the binary content of a wav file with a tool such as Binary Viewer.
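As a quick sanity check before playback, here is a small sketch (my addition, based on the WaveFormat spec linked above) that reads the channel count, sample rate, and bit depth from the canonical 44-byte wav header; it assumes a standard PCM file whose fmt chunk follows directly after the 12-byte RIFF header:

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class WavHeaderReader {
    // Reads the canonical 44-byte PCM wav header described at
    // http://soundfile.sapp.org/doc/WaveFormat/ . Multi-byte fields
    // are little-endian, so the bytes are assembled manually.
    public static void printFormat(String path) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
            byte[] header = new byte[44];
            in.readFully(header);
            int channels   = (header[22] & 0xff) | (header[23] & 0xff) << 8;
            int sampleRate = (header[24] & 0xff) | (header[25] & 0xff) << 8
                           | (header[26] & 0xff) << 16 | (header[27] & 0xff) << 24;
            int bitDepth   = (header[34] & 0xff) | (header[35] & 0xff) << 8;
            System.out.println(channels + " channel(s), "
                    + sampleRate + " Hz, " + bitDepth + " bit");
        }
    }
}
```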
Playing a wav file:
```java
int sampleRateInHz = 44100;
int channelConfig = AudioFormat.CHANNEL_OUT_STEREO;
int audioFormat = AudioFormat.ENCODING_PCM_16BIT;
int bufferSizeInBytes = AudioTrack.getMinBufferSize(sampleRateInHz, channelConfig, audioFormat);
AudioTrack audioTrack = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRateInHz,
        channelConfig, audioFormat, bufferSizeInBytes, AudioTrack.MODE_STREAM);
audioTrack.play();

FileInputStream audioInput = null;
try {
    audioInput = new FileInputStream(audioFile); // put your wav file in
    audioInput.read(new byte[44]);               // skip the 44-byte wav header
    byte[] audioData = new byte[512];
    int readCount;
    while ((readCount = audioInput.read(audioData)) != -1) {
        audioTrack.write(audioData, 0, readCount); // play raw audio bytes
    }
} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
} finally {
    audioTrack.stop();
    audioTrack.release();
    if (audioInput != null) {
        try {
            audioInput.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
```
If you have tried the example above, you should now have a feel for what raw audio data looks like.
From the discussion so far, it should be clear that the goal of decoding is to restore encoded data to the raw data found in a wav file.
We use MediaExtractor and MediaCodec to extract the encoded audio data and decompress it back into raw audio data:
```java
final String encodeFile = "your encode audio file path";
MediaExtractor extractor = new MediaExtractor();
extractor.setDataSource(encodeFile);

// Find and select the first audio track in the file.
MediaFormat mediaFormat = null;
for (int i = 0; i < extractor.getTrackCount(); i++) {
    MediaFormat format = extractor.getTrackFormat(i);
    String mime = format.getString(MediaFormat.KEY_MIME);
    if (mime.startsWith("audio/")) {
        extractor.selectTrack(i);
        mediaFormat = format;
        break;
    }
}
if (mediaFormat == null) {
    DLog.e("not a valid file with audio track..");
    extractor.release();
    return null;
}

FileOutputStream fosDecoder = new FileOutputStream(outDecodeFile); // your out file path
String mediaMime = mediaFormat.getString(MediaFormat.KEY_MIME);
MediaCodec codec = MediaCodec.createDecoderByType(mediaMime);
codec.configure(mediaFormat, null, null, 0);
codec.start();

ByteBuffer[] codecInputBuffers = codec.getInputBuffers();
ByteBuffer[] codecOutputBuffers = codec.getOutputBuffers();
final long kTimeOutUs = 5000;
MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
boolean sawInputEOS = false;
boolean sawOutputEOS = false;
int totalRawSize = 0;
try {
    while (!sawOutputEOS) {
        if (!sawInputEOS) {
            int inputBufIndex = codec.dequeueInputBuffer(kTimeOutUs);
            if (inputBufIndex >= 0) {
                ByteBuffer dstBuf = codecInputBuffers[inputBufIndex];
                int sampleSize = extractor.readSampleData(dstBuf, 0);
                if (sampleSize < 0) {
                    DLog.i(TAG, "saw input EOS.");
                    sawInputEOS = true;
                    codec.queueInputBuffer(inputBufIndex, 0, 0, 0,
                            MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                } else {
                    long presentationTimeUs = extractor.getSampleTime();
                    codec.queueInputBuffer(inputBufIndex, 0, sampleSize, presentationTimeUs, 0);
                    extractor.advance();
                }
            }
        }
        int res = codec.dequeueOutputBuffer(info, kTimeOutUs);
        if (res >= 0) {
            int outputBufIndex = res;
            // Simply ignore codec config buffers.
            if ((info.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) != 0) {
                DLog.i(TAG, "audio decoder: codec config buffer");
                codec.releaseOutputBuffer(outputBufIndex, false);
                continue;
            }
            if (info.size != 0) {
                ByteBuffer outBuf = codecOutputBuffers[outputBufIndex];
                outBuf.position(info.offset);
                outBuf.limit(info.offset + info.size);
                byte[] data = new byte[info.size];
                outBuf.get(data);
                totalRawSize += data.length;
                fosDecoder.write(data); // write decoded PCM to file
            }
            codec.releaseOutputBuffer(outputBufIndex, false);
            if ((info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
                DLog.i(TAG, "saw output EOS.");
                sawOutputEOS = true;
            }
        } else if (res == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
            codecOutputBuffers = codec.getOutputBuffers();
            DLog.i(TAG, "output buffers have changed.");
        } else if (res == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
            MediaFormat oformat = codec.getOutputFormat();
            DLog.i(TAG, "output format has changed to " + oformat);
        }
    }
} finally {
    fosDecoder.close();
    codec.stop();
    codec.release();
    extractor.release();
}
```
After decompression, you can play the data with AudioTrack to verify that it is correct.
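The playback parameters should match the decoded track rather than being hard-coded. As a sketch (my addition, assuming the decoder outputs 16-bit PCM), the sample rate and channel count can be read from the `mediaFormat` obtained in the extraction loop above:

```java
// Derive AudioTrack parameters from the decoder's MediaFormat.
int sampleRate = mediaFormat.getInteger(MediaFormat.KEY_SAMPLE_RATE);
int channelCount = mediaFormat.getInteger(MediaFormat.KEY_CHANNEL_COUNT);
int channelConfig = (channelCount == 1)
        ? AudioFormat.CHANNEL_OUT_MONO
        : AudioFormat.CHANNEL_OUT_STEREO;
int minBufSize = AudioTrack.getMinBufferSize(sampleRate, channelConfig,
        AudioFormat.ENCODING_PCM_16BIT);
AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRate,
        channelConfig, AudioFormat.ENCODING_PCM_16BIT, minBufSize,
        AudioTrack.MODE_STREAM);
track.play(); // then stream the decoded PCM bytes with track.write(...)
```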
The principle of audio mixing: superimposing quantized audio signals is equivalent to superimposing the sound waves in the air.
Reflected in the audio data, this simply means adding up the values of the same channel sample by sample. The problem is that the sum may overflow: with 16-bit samples, for instance, 30000 + 10000 = 40000 exceeds the maximum short value of 32767, while the average (30000 + 10000) / 2 = 20000 still fits. Many schemes exist to handle this; here we use a simple averaging approach (the average audio mixing algorithm, or V algorithm for short). In the demo program below we assume that all audio files have the same sample rate, channel count, and bit depth, which keeps the processing simple. Also note that the raw audio data is stored in little-endian order, and a PCM value of 0 means silence (zero amplitude).
```java
public void mixAudios(File[] rawAudioFiles) {
    final int fileSize = rawAudioFiles.length;
    FileInputStream[] audioFileStreams = new FileInputStream[fileSize];
    File audioFile = null;
    FileInputStream inputStream;
    byte[][] allAudioBytes = new byte[fileSize][];
    boolean[] streamDoneArray = new boolean[fileSize];
    byte[] buffer = new byte[512];
    int offset;
    try {
        for (int fileIndex = 0; fileIndex < fileSize; ++fileIndex) {
            audioFile = rawAudioFiles[fileIndex];
            audioFileStreams[fileIndex] = new FileInputStream(audioFile);
        }
        while (true) {
            for (int streamIndex = 0; streamIndex < fileSize; ++streamIndex) {
                inputStream = audioFileStreams[streamIndex];
                if (!streamDoneArray[streamIndex] && (offset = inputStream.read(buffer)) != -1) {
                    allAudioBytes[streamIndex] = Arrays.copyOf(buffer, buffer.length);
                    // Zero the tail after a partial read so stale bytes mix as silence.
                    Arrays.fill(allAudioBytes[streamIndex], offset, buffer.length, (byte) 0);
                } else {
                    streamDoneArray[streamIndex] = true;
                    allAudioBytes[streamIndex] = new byte[512];
                }
            }
            byte[] mixBytes = averageMix(allAudioBytes); // mixBytes is the mixed data
            boolean done = true;
            for (boolean streamEnd : streamDoneArray) {
                if (!streamEnd) {
                    done = false;
                }
            }
            if (done) {
                break;
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
        if (mOnAudioMixListener != null) mOnAudioMixListener.onMixError(1);
    } finally {
        try {
            for (FileInputStream in : audioFileStreams) {
                if (in != null) in.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

/**
 * Each row holds the data of one audio stream.
 */
byte[] averageMix(byte[][] bMulRoadAudioes) {
    if (bMulRoadAudioes == null || bMulRoadAudioes.length == 0) return null;
    byte[] realMixAudio = bMulRoadAudioes[0];
    if (bMulRoadAudioes.length == 1) return realMixAudio;
    for (int rw = 0; rw < bMulRoadAudioes.length; ++rw) {
        if (bMulRoadAudioes[rw].length != realMixAudio.length) {
            Log.e("app", "audio data length of road " + rw + " is different.");
            return null;
        }
    }
    int row = bMulRoadAudioes.length;
    int column = realMixAudio.length / 2;
    // Reassemble the little-endian byte pairs into 16-bit samples.
    short[][] sMulRoadAudioes = new short[row][column];
    for (int r = 0; r < row; ++r) {
        for (int c = 0; c < column; ++c) {
            sMulRoadAudioes[r][c] = (short) ((bMulRoadAudioes[r][c * 2] & 0xff)
                    | (bMulRoadAudioes[r][c * 2 + 1] & 0xff) << 8);
        }
    }
    // Average the samples of all streams so the result cannot overflow.
    short[] sMixAudio = new short[column];
    int mixVal;
    int sr = 0;
    for (int sc = 0; sc < column; ++sc) {
        mixVal = 0;
        sr = 0;
        for (; sr < row; ++sr) {
            mixVal += sMulRoadAudioes[sr][sc];
        }
        sMixAudio[sc] = (short) (mixVal / row);
    }
    // Split the mixed samples back into little-endian bytes.
    for (sr = 0; sr < column; ++sr) {
        realMixAudio[sr * 2] = (byte) (sMixAudio[sr] & 0x00FF);
        realMixAudio[sr * 2 + 1] = (byte) ((sMixAudio[sr] & 0xFF00) >> 8);
    }
    return realMixAudio;
}
```
Likewise, you can play the mixed data with AudioTrack to check the result of the mix.
The purpose of encoding audio is to use less space for storage and transmission. Codecs can be lossy or lossless; the familiar MP3 and AAC formats are both lossy. In the example below we use MediaCodec to encode the mixed data, and we will encode it as AAC.
AAC streams come in two forms, ADIF and ADTS: the former is suited to disk storage, while the latter is a sequence of frames that can be streamed. We will use ADTS here, so first let's look at how its frames are put together:
ADTS frame structure: each frame consists of a header followed by the body (the raw AAC frame data).
The fields of the ADTS frame header:
| Length (bits) | Description |
| --- | --- |
| 12 | syncword 0xFFF, all bits must be 1 |
| 1 | MPEG Version: 0 for MPEG-4, 1 for MPEG-2 |
| 2 | Layer: always 0 |
| 1 | protection absent: set to 1 if there is no CRC, 0 if there is a CRC |
| 2 | profile: the MPEG-4 Audio Object Type minus 1 |
| 4 | MPEG-4 Sampling Frequency Index (15 is forbidden) |
| 1 | private bit, guaranteed never to be used by MPEG, set to 0 when encoding, ignore when decoding |
| 3 | MPEG-4 Channel Configuration (in the case of 0, the channel configuration is sent via an in-band PCE) |
| 1 | originality, set to 0 when encoding, ignore when decoding |
| 1 | home, set to 0 when encoding, ignore when decoding |
| 1 | copyrighted id bit, the next bit of a centrally registered copyright identifier, set to 0 when encoding, ignore when decoding |
| 1 | copyright id start, signals that this frame's copyright id bit is the first bit of the copyright id, set to 0 when encoding, ignore when decoding |
| 13 | frame length, this value must include the 7 or 9 bytes of header length: FrameLength = (ProtectionAbsent == 1 ? 7 : 9) + size(AACFrame) |
| 11 | buffer fullness |
| 2 | number of AAC frames (RDBs) in the ADTS frame minus 1; for maximum compatibility always use 1 AAC frame per ADTS frame |
| 16 | CRC if protection absent is 0 |
Our plan is now clear: prepend a header to each encoded frame and write the frames to a file; the resulting .aac file should be recognized and played by ordinary players. For simplicity, we again assume that the mixed source data generated earlier has a 44100 Hz sample rate, 2 channels, and 16-bit samples.
Complete source code for encoding raw audio data into AAC:
```java
class AACAudioEncoder {

    private final static String TAG = "AACAudioEncoder";
    private final static String AUDIO_MIME = "audio/mp4a-latm";
    // Bytes per second for a single 16-bit channel at 44100 Hz.
    private final static long audioBytesPerSample = 44100 * 16 / 8;

    private String rawAudioFile;

    AACAudioEncoder(String rawAudioFile) {
        this.rawAudioFile = rawAudioFile;
    }

    public void encodeToFile(String outEncodeFile) {
        FileInputStream fisRawAudio = null;
        FileOutputStream fosAccAudio = null;
        try {
            fisRawAudio = new FileInputStream(rawAudioFile);
            fosAccAudio = new FileOutputStream(outEncodeFile);
            final MediaCodec audioEncoder = createAACAudioEncoder();
            audioEncoder.start();
            ByteBuffer[] audioInputBuffers = audioEncoder.getInputBuffers();
            ByteBuffer[] audioOutputBuffers = audioEncoder.getOutputBuffers();
            boolean sawInputEOS = false;
            boolean sawOutputEOS = false;
            long audioTimeUs = 0;
            BufferInfo outBufferInfo = new BufferInfo();
            boolean readRawAudioEOS = false;
            byte[] rawInputBytes = new byte[4096];
            int readRawAudioCount = 0;
            int rawAudioSize = 0;
            long lastAudioPresentationTimeUs = 0;
            int inputBufIndex, outputBufIndex;
            while (!sawOutputEOS) {
                if (!sawInputEOS) {
                    inputBufIndex = audioEncoder.dequeueInputBuffer(10000);
                    if (inputBufIndex >= 0) {
                        ByteBuffer inputBuffer = audioInputBuffers[inputBufIndex];
                        inputBuffer.clear();
                        int bufferSize = inputBuffer.remaining();
                        if (bufferSize != rawInputBytes.length) {
                            rawInputBytes = new byte[bufferSize];
                        }
                        if (!readRawAudioEOS) {
                            readRawAudioCount = fisRawAudio.read(rawInputBytes);
                            if (readRawAudioCount == -1) {
                                readRawAudioEOS = true;
                            }
                        }
                        if (readRawAudioEOS) {
                            audioEncoder.queueInputBuffer(inputBufIndex, 0, 0, 0,
                                    MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                            sawInputEOS = true;
                        } else {
                            inputBuffer.put(rawInputBytes, 0, readRawAudioCount);
                            rawAudioSize += readRawAudioCount;
                            audioEncoder.queueInputBuffer(inputBufIndex, 0, readRawAudioCount, audioTimeUs, 0);
                            // Derive the timestamp from the number of stereo PCM bytes consumed so far.
                            audioTimeUs = (long) (1000000 * (rawAudioSize / 2.0) / audioBytesPerSample);
                        }
                    }
                }
                outputBufIndex = audioEncoder.dequeueOutputBuffer(outBufferInfo, 10000);
                if (outputBufIndex >= 0) {
                    // Simply ignore codec config buffers.
                    if ((outBufferInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) != 0) {
                        DLog.i(TAG, "audio encoder: codec config buffer");
                        audioEncoder.releaseOutputBuffer(outputBufIndex, false);
                        continue;
                    }
                    if (outBufferInfo.size != 0) {
                        ByteBuffer outBuffer = audioOutputBuffers[outputBufIndex];
                        outBuffer.position(outBufferInfo.offset);
                        outBuffer.limit(outBufferInfo.offset + outBufferInfo.size);
                        DLog.i(TAG, String.format(" writing audio sample : size=%s , presentationTimeUs=%s",
                                outBufferInfo.size, outBufferInfo.presentationTimeUs));
                        if (lastAudioPresentationTimeUs < outBufferInfo.presentationTimeUs) {
                            lastAudioPresentationTimeUs = outBufferInfo.presentationTimeUs;
                            int outBufSize = outBufferInfo.size;
                            int outPacketSize = outBufSize + 7; // 7 bytes for the ADTS header
                            outBuffer.position(outBufferInfo.offset);
                            outBuffer.limit(outBufferInfo.offset + outBufSize);
                            byte[] outData = new byte[outPacketSize];
                            addADTStoPacket(outData, outPacketSize);
                            outBuffer.get(outData, 7, outBufSize);
                            fosAccAudio.write(outData, 0, outData.length);
                            DLog.i(TAG, outData.length + " bytes written.");
                        } else {
                            DLog.e(TAG, "error sample! its presentationTimeUs should not be lower than before.");
                        }
                    }
                    audioEncoder.releaseOutputBuffer(outputBufIndex, false);
                    if ((outBufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
                        sawOutputEOS = true;
                    }
                } else if (outputBufIndex == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
                    audioOutputBuffers = audioEncoder.getOutputBuffers();
                } else if (outputBufIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
                    MediaFormat audioFormat = audioEncoder.getOutputFormat();
                    DLog.i(TAG, "format change : " + audioFormat);
                }
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (fisRawAudio != null) fisRawAudio.close();
                if (fosAccAudio != null) fosAccAudio.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    private MediaCodec createAACAudioEncoder() throws IOException {
        MediaCodec codec = MediaCodec.createEncoderByType(AUDIO_MIME);
        MediaFormat format = new MediaFormat();
        format.setString(MediaFormat.KEY_MIME, AUDIO_MIME);
        format.setInteger(MediaFormat.KEY_BIT_RATE, 128000);
        format.setInteger(MediaFormat.KEY_CHANNEL_COUNT, 2);
        format.setInteger(MediaFormat.KEY_SAMPLE_RATE, 44100);
        format.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC);
        codec.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
        return codec;
    }

    /**
     * Add ADTS header at the beginning of each and every AAC packet.
     * This is needed as MediaCodec encoder generates a packet of raw
     * AAC data.
     *
     * Note the packetLen must count in the ADTS header itself.
     **/
    private void addADTStoPacket(byte[] packet, int packetLen) {
        int profile = 2;  // AAC LC
                          // 39=MediaCodecInfo.CodecProfileLevel.AACObjectELD;
        int freqIdx = 4;  // 44.1KHz
        int chanCfg = 2;  // CPE
        // fill in ADTS data
        packet[0] = (byte) 0xFF;
        packet[1] = (byte) 0xF9;
        packet[2] = (byte) (((profile - 1) << 6) + (freqIdx << 2) + (chanCfg >> 2));
        packet[3] = (byte) (((chanCfg & 3) << 6) + (packetLen >> 11));
        packet[4] = (byte) ((packetLen & 0x7FF) >> 3);
        packet[5] = (byte) (((packetLen & 7) << 5) + 0x1F);
        packet[6] = (byte) 0xFC;
    }
}
```
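To check that the written header bytes match the table above, here is a small verification sketch (my addition, not part of the original project) that reads the first 7 bytes of the generated .aac file and decodes the fields; for the encoder above it should report syncword=0xFFF, profile=2 (AAC LC), freqIdx=4 (44.1 kHz), and channels=2:

```java
import java.io.FileInputStream;
import java.io.IOException;

public class AdtsHeaderCheck {
    // Decode the fixed 7-byte ADTS header of the first frame,
    // following the field layout in the table above.
    public static void check(String aacPath) throws IOException {
        try (FileInputStream in = new FileInputStream(aacPath)) {
            byte[] h = new byte[7];
            if (in.read(h) != 7) throw new IOException("file too short");
            int syncword = ((h[0] & 0xff) << 4) | ((h[1] & 0xf0) >> 4);
            int profile  = ((h[2] & 0xc0) >> 6) + 1;   // MPEG-4 Audio Object Type
            int freqIdx  = (h[2] & 0x3c) >> 2;         // sampling frequency index
            int chanCfg  = ((h[2] & 0x01) << 2) | ((h[3] & 0xc0) >> 6);
            int frameLen = ((h[3] & 0x03) << 11) | ((h[4] & 0xff) << 3) | ((h[5] & 0xe0) >> 5);
            System.out.printf("syncword=0x%X profile=%d freqIdx=%d channels=%d frameLength=%d%n",
                    syncword, profile, freqIdx, chanCfg, frameLen);
        }
    }
}
```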
References:
Digital audio: http://en.flossmanuals.net/pure-data/ch003_what-is-digital-audio/
WAV file format: http://soundfile.sapp.org/doc/WaveFormat/
AAC file format: http://www.cnblogs.com/caosiyang/archive/2012/07/16/2594029.html
Android media CTS tests: https://android.googlesource.com/platform/cts/+/jb-mr2-release/tests/tests/media/src/android/media/cts
WAV-to-AAC encoding issues: http://stackoverflow.com/questions/18862715/how-to-generate-the-aac-adts-elementary-stream-with-android-mediacodec