[WebRTC] Audio Codec Encoder Base Class, Annotated

Date: 2020-06-28

Tags: webrtc audio codec encoder annotations

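The Audio Codec Encoder interface contains many hooks for optimizations such as error recovery and retransmission, reducing the amount of data sent, QoS analysis, and adapting to network quality. For example: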

FEC (forward error correction),
VAD (voice activity detection, i.e. silence detection),
DTX (discontinuous transmission),
RTT (round-trip time),
ANA (audio network adaptor) ...

The actual encoding happens in the EncodeImpl() function.

How to compute the audio frame rate:

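Sample format (bits per sample): format = s16 = 16 bits
Number of channels: channels = 2
Sample rate: sampling = 44100 Hz
Bits per second of audio: TotalBit = sampling * channels * format = 44100 * 2 * 16 = 1411200
Bytes per second of audio: TotalByte = TotalBit / 8 = 176400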

AAC:

nb_samples and frame_size = 1024
Data per frame: 1024 * 2 * 16 / 8 = 4096 bytes.
AAC frame rate (frames played per second) = TotalByte / 4096 = 43.06640625 frames

MP3:

nb_samples and frame_size = 1152
Data per frame: 1152 * 2 * 16 / 8 = 4608 bytes.
MP3 frame rate (frames played per second) = TotalByte / 4608 = 38.28125 frames
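
The arithmetic above can be checked with a few lines of C++. This is a minimal sketch, assuming a 44100 Hz sample rate (the value implied by TotalBit = 1411200) and 16-bit stereo PCM; the function names (PcmBytesPerSecond, PcmBytesPerFrame, FramesPerSecond) are made up for illustration and do not come from WebRTC or any codec library.

```cpp
#include <cassert>

// Bytes of interleaved PCM produced per second of audio.
inline double PcmBytesPerSecond(int sample_rate_hz, int channels,
                                int bits_per_sample) {
  return static_cast<double>(sample_rate_hz) * channels * bits_per_sample / 8.0;
}

// Bytes of interleaved PCM covered by one codec frame.
inline double PcmBytesPerFrame(int samples_per_frame, int channels,
                               int bits_per_sample) {
  return static_cast<double>(samples_per_frame) * channels * bits_per_sample / 8.0;
}

// Number of codec frames needed per second of audio.
inline double FramesPerSecond(int sample_rate_hz, int channels,
                              int bits_per_sample, int samples_per_frame) {
  assert(samples_per_frame > 0 && channels > 0 && bits_per_sample > 0);
  return PcmBytesPerSecond(sample_rate_hz, channels, bits_per_sample) /
         PcmBytesPerFrame(samples_per_frame, channels, bits_per_sample);
}
```

For example, FramesPerSecond(44100, 2, 16, 1024) evaluates to 43.06640625 (the AAC case) and FramesPerSecond(44100, 2, 16, 1152) to 38.28125 (the MP3 case), matching the figures above. The channel count and sample width cancel out, so the frame rate is simply the sample rate divided by the samples per frame.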

webrtc\src\api\audio_codecs\audio_encoder.h

// This is the interface class for encoders in AudioCoding module. Each codec
// type must have an implementation of this class.
class AudioEncoder {
 public:
  

  virtual ~AudioEncoder() = default;

  // Returns the input sample rate in Hz and the number of input channels.
  virtual int SampleRateHz() const = 0;
  virtual size_t NumChannels() const = 0;

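  // Returns the rate at which RTP timestamps are updated. The default
  // implementation returns SampleRateHz().
  // Sample rates range from 8 kHz (narrowband) to 48 kHz (fullband).
  // Human hearing covers roughly 20 Hz to 20 kHz.
  // Telephony samples at 8 kHz.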
  virtual int RtpTimestampRateHz() const;

  // Number of 10 ms frames the encoder will put in the next packet.
  // This count may differ from packet to packet.
  virtual size_t Num10MsFramesInNextPacket() const = 0;

  // Maximum number of 10 ms frames that can be put in one packet.
  virtual size_t Max10MsFramesInAPacket() const = 0;

  // Current target bitrate in bits/s.
  // Opus supports both constant bitrate (CBR) and variable bitrate (VBR).
  virtual int GetTargetBitrate() const = 0;


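  // Encodes audio; the actual work is delegated to EncodeImpl().
  // Accepts one 10 ms block of input audio.
  // Multi-channel audio must be sample-interleaved.
  // audio.size() == SampleRateHz() * NumChannels() / 100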
  EncodedInfo Encode(uint32_t rtp_timestamp,
                     rtc::ArrayView<const int16_t> audio,
                     rtc::Buffer* encoded);

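  // Resets the encoder to its starting state, discarding any input that has
  // been fed to the encoder but not yet emitted in a packet.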
  virtual void Reset() = 0;

  // Enables or disables codec-internal FEC (forward error correction).
  // Terms: NACK (retransmission), FEC (forward error correction).
  // Aside: when H264 is selected as the video codec, FEC and RED are disabled.
  //   See rtp_video_sender.cc.
  virtual bool SetFec(bool enable);

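  // Enables or disables codec-internal VAD (voice activity detection) and
  // DTX (discontinuous transmission).
  // Mainly used during periods of inactive speech to lower the transmission
  // rate while keeping the output quality acceptable.
  // VAD classifies the input signal as active speech, inactive speech, or
  // background noise.
  // Based on the VAD decision, DTX inserts silence insertion descriptor (SID)
  // frames during silent periods.
  // During silence, SID frames are sent periodically to the CNG (comfort noise
  // generation) module, which produces ambient noise at the receiver during
  // inactive speech.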
  virtual bool SetDtx(bool enable);

  // Returns the status of codec-internal DTX. 
  virtual bool GetDtx() const;

  // Sets the application mode. 
  enum class Application { kSpeech, kAudio };
  virtual bool SetApplication(Application application);

  // Sets the maximum sample rate used for playback on the decoder side.
  virtual void SetMaxPlaybackRate(int frequency_hz);


  RTC_DEPRECATED virtual void SetTargetBitrate(int target_bps);

  // NOTE: This method is subject to change. Do not call or override it.
  virtual rtc::ArrayView<std::unique_ptr<AudioEncoder>>
  ReclaimContainedEncoders();

  // Enables audio network adaptor. Returns true if successful.
  virtual bool EnableAudioNetworkAdaptor(const std::string& config_string,
                                         RtcEventLog* event_log);

  // Disables audio network adaptor.
  virtual void DisableAudioNetworkAdaptor();

  // Provides uplink packet loss fraction to this encoder to allow it to adapt.
  // Reports the fraction of packets lost on the uplink.
  // |uplink_packet_loss_fraction| is in the range [0.0, 1.0].
  // The uplink packet loss fractions as set by the ANA FEC controller.
  virtual void OnReceivedUplinkPacketLossFraction(
      float uplink_packet_loss_fraction);

  // The portion of packet loss that FEC can recover.
  // Provides 1st-order-FEC-recoverable uplink packet loss rate to this encoder
  // to allow it to adapt.
  // |uplink_recoverable_packet_loss_fraction| is in the range [0.0, 1.0].
  virtual void OnReceivedUplinkRecoverablePacketLossFraction(
      float uplink_recoverable_packet_loss_fraction);

  // Provides target audio bitrate to this encoder to allow it to adapt.
  virtual void OnReceivedTargetAudioBitrate(int target_bps);

  // Provides target audio bitrate and corresponding probing interval of
  // the bandwidth estimator to this encoder to allow it to adapt.
  virtual void OnReceivedUplinkBandwidth(int target_audio_bitrate_bps,
                                         absl::optional<int64_t> bwe_period_ms);

  // Provides target audio bitrate and corresponding probing interval of
  // the bandwidth estimator to this encoder to allow it to adapt.
  virtual void OnReceivedUplinkAllocation(BitrateAllocationUpdate update);

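  // RTT (round-trip time): the total delay from the moment the sender sends a
  // packet until it receives the acknowledgment that the receiver sent back.
  // Note: it excludes the time the receiver spends processing.
  // Sender:s(t0)-------------------------------------->Receiver:r(t1)
  // Sender:r(t3)<---------------------------------------Receiver:s(t2)
  //  rtt = (t1 - t0) + (t3 - t2) = (t3 - t0) - (t2 - t1) = t3 - t0 - d,
  //  where d is the receiver's processing time.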
  // Provides RTT to this encoder to allow it to adapt.
  virtual void OnReceivedRtt(int rtt_ms);

  // Overhead: per-packet bytes added on top of the encoder's output.
  // Provides overhead to this encoder to adapt. The overhead is the number of
  // bytes that will be added to each packet the encoder generates.
  virtual void OnReceivedOverhead(size_t overhead_bytes_per_packet);

  // To allow encoder to adapt its frame length, it must be provided the frame
  // length range that receivers can accept.
  virtual void SetReceiverFrameLengthRange(int min_frame_length_ms,
                                           int max_frame_length_ms);

  // Used for network adaptation statistics.
  // Get statistics related to audio network adaptation.
  virtual ANAStats GetANAStats() const;

 protected:
  // The actual encoding happens here.
  // Subclasses implement this to perform the actual encoding. Called by
  // Encode().
  virtual EncodedInfo EncodeImpl(uint32_t rtp_timestamp,
                                 rtc::ArrayView<const int16_t> audio,
                                 rtc::Buffer* encoded) = 0;
};
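
The split between the public Encode() and the protected EncodeImpl() can be illustrated with a small self-contained sketch. Everything below is a hypothetical, simplified stand-in written for this note: SketchEncoder, SketchEncodedInfo, SamplesPer10Ms and RawPcmEncoder are not WebRTC types, and std::vector replaces rtc::ArrayView and rtc::Buffer. The sketch assumes that the public entry point validates the 10 ms input size and then delegates to the subclass, which is what the comments in the header suggest.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for EncodedInfo; not the WebRTC type.
struct SketchEncodedInfo {
  std::size_t encoded_bytes = 0;
  std::uint32_t encoded_timestamp = 0;
};

// Hypothetical base class mirroring the Encode()/EncodeImpl() split.
class SketchEncoder {
 public:
  virtual ~SketchEncoder() = default;
  virtual int SampleRateHz() const = 0;
  virtual std::size_t NumChannels() const = 0;

  // Number of interleaved samples in one 10 ms input block.
  std::size_t SamplesPer10Ms() const {
    return static_cast<std::size_t>(SampleRateHz()) * NumChannels() / 100;
  }

  // Public, non-virtual entry point: validates the input, then delegates.
  SketchEncodedInfo Encode(std::uint32_t rtp_timestamp,
                           const std::vector<std::int16_t>& audio,
                           std::vector<std::uint8_t>* encoded) {
    assert(encoded != nullptr);
    assert(audio.size() == SamplesPer10Ms());
    const std::size_t old_size = encoded->size();
    SketchEncodedInfo info = EncodeImpl(rtp_timestamp, audio, encoded);
    // The subclass must append exactly as many bytes as it reports.
    assert(encoded->size() - old_size == info.encoded_bytes);
    return info;
  }

 protected:
  virtual SketchEncodedInfo EncodeImpl(std::uint32_t rtp_timestamp,
                                       const std::vector<std::int16_t>& audio,
                                       std::vector<std::uint8_t>* encoded) = 0;
};

// Toy "codec": writes each 16-bit sample as two little-endian bytes.
class RawPcmEncoder : public SketchEncoder {
 public:
  RawPcmEncoder(int sample_rate_hz, std::size_t num_channels)
      : sample_rate_hz_(sample_rate_hz), num_channels_(num_channels) {}
  int SampleRateHz() const override { return sample_rate_hz_; }
  std::size_t NumChannels() const override { return num_channels_; }

 protected:
  SketchEncodedInfo EncodeImpl(std::uint32_t rtp_timestamp,
                               const std::vector<std::int16_t>& audio,
                               std::vector<std::uint8_t>* encoded) override {
    for (std::int16_t sample : audio) {
      const std::uint16_t bits = static_cast<std::uint16_t>(sample);
      encoded->push_back(static_cast<std::uint8_t>(bits & 0xFF));
      encoded->push_back(static_cast<std::uint8_t>(bits >> 8));
    }
    SketchEncodedInfo info;
    info.encoded_bytes = audio.size() * 2;
    info.encoded_timestamp = rtp_timestamp;
    return info;
  }

 private:
  int sample_rate_hz_;
  std::size_t num_channels_;
};
```

With a 48 kHz stereo RawPcmEncoder, each Encode() call expects 48000 * 2 / 100 = 960 interleaved samples and, in this toy codec, appends 1920 bytes. Keeping the checks in the non-virtual function means every subclass gets the same input validation without repeating it.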

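The RTT formula in the comment on OnReceivedRtt(), rtt = t3 - t0 - (t2 - t1), can be written as a small helper. This is a sketch only: RttSample and ComputeRttMs are hypothetical names introduced here, and the code says nothing about how WebRTC actually measures RTT before passing it to the encoder.

```cpp
#include <cstdint>

// Hypothetical container for the four timestamps, all in milliseconds.
// t0 and t3 are read from the sender's clock, t1 and t2 from the receiver's.
struct RttSample {
  std::int64_t t0_sent_ms;            // sender transmits the packet
  std::int64_t t1_received_ms;        // receiver gets the packet
  std::int64_t t2_reply_sent_ms;      // receiver transmits the acknowledgment
  std::int64_t t3_reply_received_ms;  // sender gets the acknowledgment
};

// rtt = (t3 - t0) - (t2 - t1): total elapsed time at the sender minus the
// time the receiver spent processing before replying.
inline std::int64_t ComputeRttMs(const RttSample& s) {
  const std::int64_t receiver_delay_ms = s.t2_reply_sent_ms - s.t1_received_ms;
  return (s.t3_reply_received_ms - s.t0_sent_ms) - receiver_delay_ms;
}
```

Because the formula only subtracts timestamps taken on the same clock (t3 - t0 on the sender, t2 - t1 on the receiver), the two clocks do not need to be synchronized.
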