Ogg logical bitstream framing
The Ogg transport bitstream is designed to provide framing, error protection and seeking structure for higher-level codec streams that consist of raw, unencapsulated data packets, such as the Vorbis audio codec or Theora video codec.
Vorbis encodes short-time blocks of PCM data into raw packets of bit-packed data. These raw packets may be used directly by transport mechanisms that provide their own framing and packet-separation mechanisms (such as UDP datagrams). For stream-based storage (such as files) and transport (such as TCP streams or pipes), Vorbis uses the Ogg bitstream format to provide framing/sync, sync recapture after error, landmarks during seeking, and enough information to properly separate data back into packets at the original packet boundaries without relying on decoding to find packet boundaries.
A logical Ogg bitstream is a contiguous stream of sequential pages belonging only to the logical bitstream. A physical Ogg bitstream is constructed from one or more logical Ogg bitstreams (the simplest physical bitstream is simply a single logical bitstream). We describe below the exact formatting of an Ogg logical bitstream. Combining logical bitstreams into more complex physical bitstreams is described in the Ogg bitstream overview. The exact mapping of raw Vorbis packets into a valid Ogg Vorbis physical bitstream is described in the Vorbis I Specification.
An Ogg stream is structured by dividing incoming packets into segments of up to 255 bytes and then wrapping a group of contiguous packet segments into a variable-length page preceded by a page header. Both the header size and page size are variable; the page header contains sizing information and checksum data to determine header/page size and data integrity.

The bitstream is captured (or recaptured) by looking for the beginning of a page, specifically the capture pattern. Once the capture pattern is found, the decoder verifies page sync and integrity by computing and comparing the checksum. At that point, the decoder can extract the packets themselves.
Packets are logically divided into multiple segments before encoding into a page. Note that the segmentation and fragmentation process is a logical one; it's used to compute page header values and the original page data need not be disturbed, even when a packet spans page boundaries.

The raw packet is logically divided into [n] 255-byte segments and a last fractional segment of < 255 bytes. A packet may well consist only of the trailing fractional segment, and a fractional segment may be zero length. The segment sizes, called "lacing values", are then saved and placed into the header segment table.

An example should make the basic concept clear:

    raw packet:
      |______packet data______________|  753 bytes

    lacing values for page header segment table: 255,255,243

We simply add the lacing values for the total size; the last lacing value for a packet is always the value that is less than 255. Note that this encoding both avoids imposing a maximum packet size and imposes minimum overhead on small packets. The alternative of, eg, simply using two bytes at the head of every packet would cap the packet size at 32k and penalize small packets (<255 bytes, the typical case) with twice the segmentation overhead. Using the lacing values as suggested, small packets see the minimum possible byte-aligned overhead (1 byte) and large packets, over 512 bytes or so, see a fairly constant ~0.5% overhead on encoding space.

Note that a lacing value of 255 implies that a second lacing value follows in the packet, and a value of < 255 marks the end of the packet after that many additional bytes. A packet of 255 bytes (or a multiple of 255 bytes) is terminated by a lacing value of 0:

    raw packet:
      |______packet data______________|  255 bytes

    lacing values: 255, 0

Note also that a 'nil' (zero length) packet is not an error; it consists of nothing more than a lacing value of zero in the header.
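The lacing rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from any Ogg library; the function name `lacing_values` is made up for this example.

```python
def lacing_values(packet_len: int) -> list:
    """Return the lacing values describing one packet of packet_len bytes."""
    if packet_len < 0:
        raise ValueError("packet length must be non-negative")
    full_segments, remainder = divmod(packet_len, 255)
    # The final value is always < 255, so an exact multiple of 255
    # (including a zero-length packet) ends with a lacing value of 0.
    return [255] * full_segments + [remainder]


print(lacing_values(753))  # [255, 255, 243]
print(lacing_values(255))  # [255, 0]
print(lacing_values(0))    # [0]
```

Summing the returned values always gives back the packet length, which is how a decoder recovers the packet boundary.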
Packets are not restricted to beginning and ending within a page, although individual segments are, by definition, required to do so. Packets are not restricted to a maximum size, although excessively large packets in the data stream are discouraged.

After segmenting a packet, the encoder may decide not to place all the resulting segments into the current page; to do so, the encoder places the lacing values of the segments it wishes to belong to the current page into the current segment table, then finishes the page. The next page begins with the first value in its segment table belonging to the next packet segment, thus continuing the packet. Data in the packet body must also correspond properly to the lacing values in the spanned pages: the segment data corresponding to the lacing values of the first page belongs in that page, and the packet segments listed in the segment table of the following page must begin that page's body.

The last step in spanning a page boundary is to set the header flag in the new page to indicate that the first lacing value in the segment table continues rather than begins a packet; a header flag of 0x01 is set to indicate a continued packet. Although setting the flag is mandatory, it is not actually algorithmically necessary; one could inspect the preceding segment table to determine if the packet is new or continued. Adding the information to the header type flag allows a simpler design (with no overhead) that needs only inspect the current page header after frame capture. This also allows faster error recovery in the event that the packet originates in a corrupt preceding page, implying that the previous page's segment table cannot be trusted.

Note that a packet can span an arbitrary number of pages; the above spanning process is repeated for each spanned page boundary.
Also, a 'zero termination' on a packet whose size is an exact multiple of 255 must appear even if the lacing value appears in the next page as a zero-length continuation of the current packet. The header flag should be set to 0x01 to indicate that the packet spanned, even though the span is a nil case as far as data is concerned.

The encoding looks odd, but is properly optimized for speed and the expected case of the majority of packets being between 50 and 200 bytes. It is designed such that packets of wildly different sizes can be handled within the model; placing packet size restrictions on the encoder would have only slightly simplified design in page generation and increased overall encoder complexity.

The main point behind tracking individual packets (and packet segments) is to allow more flexible encoding tricks that require explicit knowledge of packet size. An example is simple bandwidth limiting, implemented by simply truncating packets in the nominal case if the packet is arranged so that the least sensitive portion of the data comes last.
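The page-spanning rules above, including the zero-length continuation case, can be illustrated with a small Python sketch. It only distributes lacing values over segment tables and derives the continued-packet flag; it does not build real pages. The function name `paginate` and the policy of filling every page to `max_segments` entries are assumptions made for this example (a real encoder may finish a page earlier for its own reasons).

```python
def paginate(packet_lengths, max_segments=255):
    """Split packets into per-page segment tables.

    Returns a list of (continued, segment_table) tuples, where
    continued is True when the page's first lacing value continues
    a packet begun on an earlier page (header flag 0x01).
    """
    entries = []  # (lacing value, True if it is the first value of a packet)
    for length in packet_lengths:
        full, rest = divmod(length, 255)
        values = [255] * full + [rest]
        entries.extend((v, i == 0) for i, v in enumerate(values))

    pages = []
    for start in range(0, len(entries), max_segments):
        chunk = entries[start:start + max_segments]
        continued = not chunk[0][1]
        pages.append((continued, [v for v, _ in chunk]))
    return pages


# A tiny page limit makes the spanning visible.
print(paginate([753], max_segments=2))  # [(False, [255, 255]), (True, [243])]
# Exact multiple of 255: the terminating 0 lands on the next page,
# which is still flagged as a continuation.
print(paginate([510], max_segments=2))  # [(False, [255, 255]), (True, [0])]
```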
The headering mechanism is designed to avoid copying and re-assembly of the packet data (ie, making the packet segmentation process a logical one); the header can be generated directly from incoming packet data. The encoder buffers packet data until it finishes a complete page, at which point it writes the header followed by the buffered packet segments.
A header begins with a capture pattern that simplifies identifying pages; once the decoder has found the capture pattern it can do a more intensive job of verifying that it has in fact found a page boundary (as opposed to an inadvertent coincidence in the byte stream).
    byte value

     0  0x4f 'O'
     1  0x67 'g'
     2  0x67 'g'
     3  0x53 'S'
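As a rough sketch of the first capture step, the following Python generator lists every offset where the four capture bytes occur. The name `candidate_page_offsets` is hypothetical. A hit is only a candidate: as noted above, the pattern can occur by coincidence in the byte stream, so a decoder still has to verify the page before trusting it.

```python
CAPTURE_PATTERN = b"OggS"  # bytes 0x4f 0x67 0x67 0x53


def candidate_page_offsets(data: bytes):
    """Yield each offset in data where the capture pattern appears."""
    pos = data.find(CAPTURE_PATTERN)
    while pos != -1:
        yield pos
        pos = data.find(CAPTURE_PATTERN, pos + 1)


print(list(candidate_page_offsets(b"xxOggSyyOggS")))  # [2, 8]
```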
The capture pattern is followed by the stream structure revision:

    byte value

     4  0x00
The header type flag identifies this page's context in the bitstream:
    byte value

     5  bitflags: 0x01: unset = fresh packet
                        set   = continued packet
                  0x02: unset = not first page of logical bitstream
                        set   = first page of logical bitstream (bos)
                  0x04: unset = not last page of logical bitstream
                        set   = last page of logical bitstream (eos)
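Decoding the three flag bits from byte 5 is a plain bit test. A minimal Python sketch follows; the constant and function names are chosen for this example only.

```python
FLAG_CONTINUED = 0x01  # first lacing value continues a packet
FLAG_BOS = 0x02        # first page of the logical bitstream
FLAG_EOS = 0x04        # last page of the logical bitstream


def decode_header_type(flag_byte: int) -> dict:
    """Break the header type flag (header byte 5) into named booleans."""
    return {
        "continued": bool(flag_byte & FLAG_CONTINUED),
        "bos": bool(flag_byte & FLAG_BOS),
        "eos": bool(flag_byte & FLAG_EOS),
    }


print(decode_header_type(0x02))  # {'continued': False, 'bos': True, 'eos': False}
print(decode_header_type(0x05))  # {'continued': True, 'bos': False, 'eos': True}
```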
Next comes the absolute granule position. It is packed in the same way the rest of Ogg data is packed: LSb of LSB first. Note that the 'position' data specifies a 'sample' number (eg, a CD-quality sample is four octets, 16 bits for left and 16 bits for right; in video it would likely be the frame number). It is up to the specific codec in use to define the semantic meaning of the granule position value. The position specified is the total samples encoded after including all packets finished on this page (packets begun on this page but continuing on to the next page do not count). The rationale here is that the position specified in the frame header of the last page tells how long the data coded by the bitstream is. A truncated stream will still return the proper number of samples that can be decoded fully.

A special value of '-1' (in two's complement) indicates that no packets finish on this page.

    byte value

     6  0xXX LSB
     7  0xXX
     8  0xXX
     9  0xXX
    10  0xXX
    11  0xXX
    12  0xXX
    13  0xXX MSB
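Reading the position field amounts to interpreting bytes 6 through 13 as a little-endian signed 64-bit integer. The sketch below uses Python's standard `struct` module; the function name `read_granule_position` and the choice to return `None` for the special value -1 are assumptions made for this illustration.

```python
import struct
from typing import Optional


def read_granule_position(header: bytes) -> Optional[int]:
    """Return the granule position stored in header bytes 6..13.

    Returns None when the field holds -1, meaning no packet
    finishes on this page.
    """
    (position,) = struct.unpack_from("<q", header, 6)  # signed, LSB first
    return None if position == -1 else position


header = bytes(6) + (88200).to_bytes(8, "little")
print(read_granule_position(header))                 # 88200
print(read_granule_position(bytes(6) + b"\xff" * 8))  # None
```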
Ogg allows for separate logical bitstreams to be mixed at page granularity in a physical bitstream. The most common case would be sequential arrangement, but it is possible to interleave pages for two separate bitstreams to be decoded concurrently. The serial number is the means by which physical pages are associated with a particular logical stream. Each logical stream must have a unique serial number within a physical stream:

    byte value

    14  0xXX LSB
    15  0xXX
    16  0xXX
    17  0xXX MSB
Page counter; lets us know if a page is lost (useful where packets span page boundaries).

    byte value

    18  0xXX LSB
    19  0xXX
    20  0xXX
    21  0xXX MSB
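One way to use the serial number and page counter together is to track, per logical stream, which counter value should come next. The Python sketch below assumes the counter increases by one for each page of a given logical stream, which the text above does not spell out; the function name `find_lost_pages` is likewise hypothetical.

```python
def find_lost_pages(pages):
    """Report counter discontinuities per logical stream.

    pages is an iterable of (serial_number, page_counter) tuples in
    physical order. Returns (serial, expected, found) for each gap.
    Assumes the counter increments by one per page of a stream.
    """
    expected = {}
    gaps = []
    for serial, counter in pages:
        if serial in expected and counter != expected[serial]:
            gaps.append((serial, expected[serial], counter))
        expected[serial] = (counter + 1) & 0xFFFFFFFF  # 32-bit field
    return gaps


# Two interleaved streams; stream 1 is missing its page 2.
seen = [(1, 0), (2, 0), (1, 1), (2, 1), (1, 3)]
print(find_lost_pages(seen))  # [(1, 2, 3)]
```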
32-bit CRC value (direct algorithm, initial val and final XOR = 0, generator polynomial=0x04c11db7). The value is computed over the entire header (with the CRC field in the header set to zero) and then continued over the page. The CRC field is then filled with the computed value.

(A thorough discussion of CRC algorithms can be found in "A Painless Guide to CRC Error Detection Algorithms" by Ross Williams ross@ross.net.)

    byte value

    22  0xXX LSB
    23  0xXX
    24  0xXX
    25  0xXX MSB
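The checksum parameters above translate into a table-driven CRC that processes the most significant bit first, with no input or output reflection. The Python sketch below is an illustration under that reading, not the reference implementation; the names `ogg_crc`, `with_checksum` and `checksum_ok` are invented for this example.

```python
POLY = 0x04C11DB7


def _build_table():
    table = []
    for byte in range(256):
        reg = byte << 24
        for _ in range(8):
            if reg & 0x80000000:
                reg = ((reg << 1) ^ POLY) & 0xFFFFFFFF
            else:
                reg = (reg << 1) & 0xFFFFFFFF
        table.append(reg)
    return table


CRC_TABLE = _build_table()


def ogg_crc(data: bytes, crc: int = 0) -> int:
    """CRC-32 with initial value 0, no final XOR, MSB-first."""
    for byte in data:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ CRC_TABLE[(crc >> 24) ^ byte]
    return crc


def _zero_crc_field(page: bytes) -> bytes:
    return page[:22] + bytes(4) + page[26:]


def with_checksum(page: bytes) -> bytes:
    """Return the page with bytes 22..25 filled in (LSB first)."""
    crc = ogg_crc(_zero_crc_field(page))
    return page[:22] + crc.to_bytes(4, "little") + page[26:]


def checksum_ok(page: bytes) -> bool:
    stored = int.from_bytes(page[22:26], "little")
    return stored == ogg_crc(_zero_crc_field(page))


page = with_checksum(b"OggS" + bytes(23))  # header only, no segments
print(checksum_ok(page))  # True
```

Because the initial value is zero, the function can be called once over the whole page or continued chunk by chunk by passing the previous result as `crc`.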
The number of segment entries to appear in the segment table. The maximum number of 255 segments (255 bytes each) sets the maximum possible physical page size at 65307 bytes or just under 64kB. Thus we know that a header corrupted so as to destroy sizing/alignment information will not cause a runaway bitstream: we'll read in the page according to the corrupted size information (which is guaranteed to be a reasonable size regardless), notice the checksum mismatch, drop sync and then look for recapture.

    byte value

    26  0x00-0xff (0-255)
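The 65307-byte figure follows from the layout: 27 fixed header bytes (bytes 0 through 26), up to 255 segment table entries, and up to 255 bytes of data per segment. The short Python check below just reproduces that arithmetic; the constant names are illustrative.

```python
FIXED_HEADER_BYTES = 27   # capture pattern through the segment count byte
MAX_SEGMENTS = 255        # largest value the segment count byte can hold
MAX_SEGMENT_BYTES = 255   # largest lacing value

max_header = FIXED_HEADER_BYTES + MAX_SEGMENTS   # 282
max_body = MAX_SEGMENTS * MAX_SEGMENT_BYTES      # 65025
max_page = max_header + max_body

print(max_page)              # 65307
print(64 * 1024 - max_page)  # 229 bytes short of 64kB
```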
The lacing values for each packet segment physically appearing in this page are listed in contiguous order.
    byte value

    27  0x00-0xff (0-255)
    [...]
     n  0x00-0xff (0-255, n=page_segments+26)

Total page size is calculated directly from the known header size and lacing values in the segment table. Packet data segments follow immediately after the header.

Page headers typically impose a flat 0.25-0.5% space overhead assuming nominal ~8k page sizes. The segmentation table needed for exact packet recovery in the streaming layer adds approximately 0.5-1% nominal assuming expected encoder behavior in the 44.1kHz, 128kbps stereo encodings.
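Putting the header layout together, the following Python sketch reads one page header, derives the total page size from the segment table, and splits the page body back into packets. It is a minimal illustration under the byte layout listed above, with no checksum verification; `parse_page`, `split_packets` and the dictionary keys are names invented for this example.

```python
import struct

HEADER_FORMAT = "<4sBBqIIIB"  # bytes 0..26, little-endian, no padding
FIXED_HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 27


def parse_page(data: bytes, offset: int = 0) -> dict:
    (capture, version, flags, granule, serial,
     sequence, crc, n_segments) = struct.unpack_from(HEADER_FORMAT, data, offset)
    if capture != b"OggS":
        raise ValueError("no capture pattern at this offset")
    table_start = offset + FIXED_HEADER_SIZE
    table = list(data[table_start:table_start + n_segments])
    header_size = FIXED_HEADER_SIZE + n_segments
    body_start = offset + header_size
    return {
        "flags": flags, "granule": granule, "serial": serial,
        "sequence": sequence, "crc": crc, "segment_table": table,
        "header_size": header_size,
        "page_size": header_size + sum(table),
        "body": data[body_start:body_start + sum(table)],
    }


def split_packets(body: bytes, table):
    """Return (finished_packets, unfinished_tail).

    If the page has the continued flag set, the first finished entry is
    really the tail of a packet begun on an earlier page. The tail is
    None unless the table ends in 255 (packet continues on next page).
    """
    packets, current, pos = [], bytearray(), 0
    for lacing in table:
        current += body[pos:pos + lacing]
        pos += lacing
        if lacing < 255:
            packets.append(bytes(current))
            current = bytearray()
    tail = bytes(current) if table and table[-1] == 255 else None
    return packets, tail


header = struct.pack(HEADER_FORMAT, b"OggS", 0, 0x02, 0, 1234, 0, 0, 3)
page = header + bytes([255, 45, 0]) + b"a" * 300
info = parse_page(page)
print(info["header_size"], info["page_size"])  # 30 330
packets, tail = split_packets(info["body"], info["segment_table"])
print([len(p) for p in packets], tail)  # [300, 0] None
```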