Posted by 키트 on 2017-08-25 14:13
Vorbis I specification
February 27, 2015
6. Floor type 0 setup and decode
6.1. Overview
Vorbis floor type zero uses Line Spectral Pair (LSP, also known as Line Spectral Frequency or LSF) representation to encode a smooth spectral envelope curve as the frequency response of the LSP filter. This representation is equivalent to a traditional all-pole infinite impulse response filter as would be used in linear predictive coding; LSP representation may be converted to LPC representation and vice-versa.
6.2. Floor 0 format
Floor zero configuration consists of six integer fields and a list of VQ codebooks for use in coding/decoding the LSP filter coefficient values used by each frame.
6.2.1. header decode
Configuration information for instances of floor zero is decoded from the codec setup header (third packet). Configuration decode proceeds as follows:
1) [floor0_order] = read an unsigned integer of 8 bits
2) [floor0_rate] = read an unsigned integer of 16 bits
3) [floor0_bark_map_size] = read an unsigned integer of 16 bits
4) [floor0_amplitude_bits] = read an unsigned integer of six bits
5) [floor0_amplitude_offset] = read an unsigned integer of eight bits
6) [floor0_number_of_books] = read an unsigned integer of four bits and add 1
7) array [floor0_book_list] = read a list of [floor0_number_of_books] unsigned integers of eight bits each;
An end-of-packet condition during any of these bitstream reads renders this stream undecodable. In addition, any element of the array [floor0_book_list] that is greater than the maximum codebook number for this bitstream is an error condition that also renders the stream undecodable.
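The header reads above can be sketched as follows. This is a minimal illustration, not a reference implementation: the `BitReader` class, the `EndOfPacket` exception and the `codebook_count` parameter are hypothetical names introduced here, and the reader assumes the LSb-first bit packing that Vorbis defines in its bitpacking convention.

```python
class EndOfPacket(Exception):
    """Raised when a read runs past the end of the packet (hypothetical name)."""


class BitReader:
    """Hypothetical LSb-first bit reader over a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # absolute bit position within the packet

    def read(self, nbits: int) -> int:
        if self.pos + nbits > 8 * len(self.data):
            raise EndOfPacket()
        value = 0
        for i in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (self.pos & 7)) & 1
            value |= bit << i  # first bit read is the least significant
            self.pos += 1
        return value


def decode_floor0_header(reader: BitReader, codebook_count: int) -> dict:
    """Read the six integer fields and the book list, in the order of steps 1 to 7.

    EndOfPacket is deliberately not caught: during header decode it means the
    stream is undecodable.
    """
    cfg = {
        "floor0_order": reader.read(8),
        "floor0_rate": reader.read(16),
        "floor0_bark_map_size": reader.read(16),
        "floor0_amplitude_bits": reader.read(6),
        "floor0_amplitude_offset": reader.read(8),
    }
    number_of_books = reader.read(4) + 1
    book_list = [reader.read(8) for _ in range(number_of_books)]
    # Codebooks are numbered 0 .. codebook_count - 1, so anything at or above
    # codebook_count exceeds the maximum codebook number.
    if any(book >= codebook_count for book in book_list):
        raise ValueError("floor0_book_list references an undefined codebook")
    cfg["floor0_number_of_books"] = number_of_books
    cfg["floor0_book_list"] = book_list
    return cfg
```

A caller would typically treat both `EndOfPacket` and `ValueError` from this function as "stream undecodable".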
6.2.2. packet decode
Extracting a floor0 curve from an audio packet consists of first decoding the curve amplitude and [floor0_order] LSP coefficient values from the bitstream, and then computing the floor curve, which is defined as the frequency response of the decoded LSP filter.
Packet decode proceeds as follows:
1) [amplitude] = read an unsigned integer of [floor0_amplitude_bits] bits
2) if ( [amplitude] is greater than zero ) {
   3) [coefficients] is an empty, zero length vector
   4) [booknumber] = read an unsigned integer of ilog( [floor0_number_of_books] ) bits
   5) if ( [booknumber] is greater than the highest number decode codebook ) then packet is undecodable
   6) [last] = zero;
   7) vector [temp_vector] = read vector from bitstream using codebook number [floor0_book_list] element [booknumber] in VQ context.
   8) add the scalar value [last] to each scalar in vector [temp_vector]
   9) [last] = the value of the last scalar in vector [temp_vector]
   10) concatenate [temp_vector] onto the end of the [coefficients] vector
   11) if ( length of vector [coefficients] is less than [floor0_order] ) continue at step 6
}
12) done.
Take note of the following properties of decode:
- An [amplitude] value of zero must result in a return code that indicates this channel is unused in this frame (the output of the channel will be all-zeroes in synthesis). Several later stages of decode don’t occur for an unused channel.
- An end-of-packet condition during decode should be considered a nominal occurrence; if end-of-packet is reached during any read operation above, floor decode is to return ’unused’ status as if the [amplitude] value had read zero at the beginning of decode.
- The book number used for decode can, in fact, be stored in the bitstream in ilog( [floor0_number_of_books] - 1 ) bits. Nevertheless, the above specification is correct and values greater than the maximum possible book value are reserved.
- The number of scalars read into the vector [coefficients] may be greater than [floor0_order], the number actually required for curve computation. For example, if the VQ codebook used for the floor currently being decoded has a [codebook_dimensions] value of three and [floor0_order] is ten, the only way to fill all the needed scalars in [coefficients] is to read a total of twelve scalars as four vectors of three scalars each. This is not an error condition, and care must be taken not to allow a buffer overflow in decode. The extra values are not used and may be ignored or discarded.
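The packet decode steps and the properties listed above can be sketched as below. The two callables are hypothetical stand-ins: `read_bits(n)` returns an unsigned integer and raises `EOFError` at end-of-packet, and `read_vq_vector(book)` returns one decoded VQ vector for the given codebook number. Two choices are assumptions of this sketch rather than something the text settles: a [booknumber] that cannot index [floor0_book_list] is treated as undecodable, and [last] is carried from one vector to the next (reading "continue at step 6" literally would reset it to zero on every pass).

```python
def ilog(x: int) -> int:
    """Position of the highest set bit; 0 for zero or negative input."""
    return x.bit_length() if x > 0 else 0


def decode_floor0_packet(read_bits, read_vq_vector, cfg):
    """Return (amplitude, coefficients), or None when the channel is 'unused'."""
    try:
        amplitude = read_bits(cfg["floor0_amplitude_bits"])
        if amplitude == 0:
            return None  # unused channel in this frame
        booknumber = read_bits(ilog(cfg["floor0_number_of_books"]))
        if booknumber >= len(cfg["floor0_book_list"]):
            # Assumption: an out-of-range book number makes the packet undecodable.
            raise ValueError("packet is undecodable: bad floor0 book number")
        book = cfg["floor0_book_list"][booknumber]

        coefficients = []
        last = 0.0
        while len(coefficients) < cfg["floor0_order"]:
            # Assumption: [last] accumulates across vectors (see lead-in).
            temp_vector = [value + last for value in read_vq_vector(book)]
            last = temp_vector[-1]
            coefficients.extend(temp_vector)
    except EOFError:
        # End-of-packet is nominal: behave as if [amplitude] had read zero.
        return None
    # The list may be longer than floor0_order; the extra values are simply unused.
    return amplitude, coefficients
```

Because the result is a growable list, the overflow concern in the last property does not arise here; a fixed-size buffer implementation would need to size for the rounded-up vector count.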
6.2.3. curve computation
Given an [amplitude] integer and [coefficients] vector from packet decode as well as the [floor0_order], [floor0_rate], [floor0_bark_map_size], [floor0_amplitude_bits] and [floor0_amplitude_offset] values from floor setup, and an output vector size [n] specified by the decode process, we compute a floor output vector.
If the value [amplitude] is zero, the return value is a length [n] vector with all-zero scalars. Otherwise, begin by assuming the following definitions for the given vector to be synthesized:
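$$\mathrm{map}_i = \begin{cases} \min\left(\mathtt{floor0\_bark\_map\_size} - 1,\ \mathit{foobar}\right) & \text{for } i \in [0, n-1] \\ -1 & \text{for } i = n \end{cases}$$
where
$$\mathit{foobar} = \left\lfloor \mathrm{bark}\left(\frac{\mathtt{floor0\_rate} \cdot i}{2n}\right) \cdot \frac{\mathtt{floor0\_bark\_map\_size}}{\mathrm{bark}(.5 \cdot \mathtt{floor0\_rate})} \right\rfloor$$
and
$$\mathrm{bark}(x) = 13.1 \arctan(.00074x) + 2.24 \arctan(.0000000185x^2) + .0001x$$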
The above is used to synthesize the LSP curve on a Bark-scale frequency axis, then map the result to a linear-scale frequency axis. Similarly, the below calculation synthesizes the output LSP curve [output] on a log (dB) amplitude scale, mapping it to linear amplitude in the last step:
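1. [i] = 0
2. [ω] = π * map element [i] / [floor0_bark_map_size]
3. if ( [floor0_order] is odd )
   a) calculate [p] and [q] according to:
   $$p = (1 - \cos^2\omega) \prod_{j=0}^{\frac{\mathtt{floor0\_order}-3}{2}} 4\left(\cos([\mathtt{coefficients}]_{2j+1}) - \cos\omega\right)^2$$
   $$q = \frac{1}{4} \prod_{j=0}^{\frac{\mathtt{floor0\_order}-1}{2}} 4\left(\cos([\mathtt{coefficients}]_{2j}) - \cos\omega\right)^2$$
   else [floor0_order] is even
   a) calculate [p] and [q] according to:
   $$p = \frac{1 - \cos\omega}{2} \prod_{j=0}^{\frac{\mathtt{floor0\_order}-2}{2}} 4\left(\cos([\mathtt{coefficients}]_{2j+1}) - \cos\omega\right)^2$$
   $$q = \frac{1 + \cos\omega}{2} \prod_{j=0}^{\frac{\mathtt{floor0\_order}-2}{2}} 4\left(\cos([\mathtt{coefficients}]_{2j}) - \cos\omega\right)^2$$
4. calculate [linear_floor_value] according to:
   $$\mathtt{linear\_floor\_value} = \exp\left(.11512925\left(\frac{\mathtt{amplitude} \cdot \mathtt{floor0\_amplitude\_offset}}{\left(2^{\mathtt{floor0\_amplitude\_bits}} - 1\right)\sqrt{p+q}} - \mathtt{floor0\_amplitude\_offset}\right)\right)$$
5. [iteration_condition] = map element [i]
6. [output] element [i] = [linear_floor_value]
7. increment [i]
8. if ( map element [i] is equal to [iteration_condition] ) continue at step 5
9. if ( [i] is less than [n] ) continue at step 2
10. done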
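The ten steps above can be sketched as a single function. This is an illustration under stated assumptions: the function name and the `cfg` dictionary are hypothetical, `bark_map` is assumed to hold [n] + 1 entries with a final -1 sentinel, and the product formulas for [p] and [q] and the dB-to-linear conversion are written out as this sketch understands them from the Vorbis I specification.

```python
import math


def floor0_curve(amplitude, coefficients, cfg, bark_map, n):
    """Compute a length-n floor 0 output vector (hypothetical helper)."""
    if amplitude == 0:
        return [0.0] * n

    order = cfg["floor0_order"]
    offset = cfg["floor0_amplitude_offset"]
    max_amplitude = (1 << cfg["floor0_amplitude_bits"]) - 1
    output = [0.0] * n

    i = 0
    while i < n:
        omega = math.pi * bark_map[i] / cfg["floor0_bark_map_size"]
        cos_w = math.cos(omega)

        # p runs over odd-indexed coefficients, q over even-indexed ones.
        p = 1.0
        for j in range(1, order, 2):
            p *= 4.0 * (math.cos(coefficients[j]) - cos_w) ** 2
        q = 1.0
        for j in range(0, order, 2):
            q *= 4.0 * (math.cos(coefficients[j]) - cos_w) ** 2

        if order % 2 == 1:
            p *= 1.0 - cos_w * cos_w
            q *= 0.25
        else:
            p *= (1.0 - cos_w) / 2.0
            q *= (1.0 + cos_w) / 2.0

        # dB-scale value mapped to linear amplitude in the last step.
        linear_floor_value = math.exp(
            0.11512925
            * (amplitude * offset / (max_amplitude * math.sqrt(p + q)) - offset)
        )

        # Steps 5 to 8: reuse the value while the map entry stays the same.
        # The -1 sentinel at bark_map[n] ends the final run.
        iteration_condition = bark_map[i]
        while True:
            output[i] = linear_floor_value
            i += 1
            if bark_map[i] != iteration_condition:
                break
    return output
```

Only the first [floor0_order] entries of `coefficients` are read, so a longer vector from packet decode can be passed unchanged.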
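Errata 20150227: Bark scale computation
Due to a typo when typesetting this version of the specification from the original HTML document, the Bark scale computation previously erroneously read:
$$\mathrm{bark}(x) = 13.1 \arctan(.00074x) + 2.24 \arctan(.0000000185x^2 + .0001x)$$
Note that the last parenthesis is misplaced. This document now uses the correct equation as it appeared in the original HTML spec document:
$$\mathrm{bark}(x) = 13.1 \arctan(.00074x) + 2.24 \arctan(.0000000185x^2) + .0001x$$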
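As an illustration of the corrected Bark computation and the map it feeds in section 6.2.3, a minimal sketch follows. The function names are hypothetical. The map formula used here (the floored, rescaled Bark value clamped to [floor0_bark_map_size] - 1, with a -1 sentinel at index [n]) is as this sketch understands it from the Vorbis I specification.

```python
import math


def bark(x: float) -> float:
    """Bark scale with the final term outside the second arctan (corrected form)."""
    return (
        13.1 * math.atan(0.00074 * x)
        + 2.24 * math.atan(0.0000000185 * x * x)
        + 0.0001 * x
    )


def floor0_map(n: int, floor0_rate: int, floor0_bark_map_size: int) -> list:
    """Build the n + 1 entry map from linear bins to Bark-scale map positions."""
    scale = floor0_bark_map_size / bark(0.5 * floor0_rate)
    result = []
    for i in range(n):
        foobar = math.floor(bark(floor0_rate * i / (2 * n)) * scale)
        result.append(min(floor0_bark_map_size - 1, foobar))
    result.append(-1)  # sentinel for i = n
    return result
```

Since the map depends only on [n] and the setup header values, a decoder can compute it once per block size and reuse it for every packet.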
7. Floor type 1 setup and decode
7.1. Overview
Vorbis floor type one uses a piecewise straight-line representation to encode a spectral envelope curve. The representation plots this curve mechanically on a linear frequency axis and a logarithmic (dB) amplitude axis. The integer plotting algorithm used is similar to Bresenham’s algorithm.
7.2. Floor 1 format
7.2.1. model
Floor type one represents a spectral curve as a series of line segments. Synthesis constructs a floor curve using iterative prediction in a process roughly equivalent to the following simplified description:
- the first line segment (base case) is a logical line spanning from x_0,y_0 to x_1,y_1 where in the base case x_0=0 and x_1=[n], the full range of the spectral floor to be computed.
- the induction step chooses a point x_new within an existing logical line segment and produces a y_new value at that point computed from the existing line’s y value at x_new (as plotted by the line) and a difference value decoded from the bitstream packet.
- floor computation produces two new line segments, one running from x_0,y_0 to x_new,y_new and the other from x_new,y_new to x_1,y_1. This step is performed logically even if y_new represents no change to the amplitude value at x_new so that later refinement is additionally bounded at x_new.
- the induction step repeats, using a list of x values specified in the codec setup header at floor 1 initialization time. Computation is completed at the end of the x value list.
Consider the following example, with values chosen for ease of understanding rather than representing typical configuration:
For the below example, we assume a floor setup with an [n] of 128. The list of selected X values in increasing order is 0,16,32,48,64,80,96,112 and 128. In list order, the values interleave as 0, 128, 64, 32, 96, 16, 48, 80 and 112. The corresponding list-order Y values as decoded from an example packet are 110, 20, -5, -45, 0, -25, -10, 30 and -10. We compute the floor in the following way, beginning with the first line:
Figure 7: graph of example floor
We now draw new logical lines to reflect the correction to new_Y, and iterate for X positions 32 and 96:
Figure 8: graph of example floor
Although the new Y value at X position 96 is unchanged, it is still used later as an endpoint for further refinement. From here on, the pattern should be clear; we complete the floor computation as follows:
Figure 9: graph of example floor
Figure 10: graph of example floor
A more efficient algorithm with carefully defined integer rounding behavior is used for actual decode, as described later. The actual algorithm splits Y value computation and line plotting into two steps with modifications to the above algorithm to eliminate noise accumulation through integer roundoff/truncation.
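The simplified model can be traced in a few lines of code using the example values above. This sketch follows only the simplified description: it uses plain linear interpolation, not the integer-exact prediction and plotting that actual decode uses, and the function name is hypothetical.

```python
def simplified_floor1(xs_list_order, ys_list_order):
    """Apply the simplified induction step to X/Y values given in list order.

    The first two entries are the endpoints and carry absolute Y values;
    every later Y is a difference from the line predicted at that X.
    """
    points = {
        xs_list_order[0]: ys_list_order[0],
        xs_list_order[1]: ys_list_order[1],
    }
    for x_new, difference in zip(xs_list_order[2:], ys_list_order[2:]):
        # Nearest already-placed neighbours on either side of x_new.
        x0 = max(x for x in points if x < x_new)
        x1 = min(x for x in points if x > x_new)
        y0, y1 = points[x0], points[x1]
        predicted = y0 + (y1 - y0) * (x_new - x0) / (x1 - x0)
        points[x_new] = predicted + difference
    return sorted(points.items())


xs = [0, 128, 64, 32, 96, 16, 48, 80, 112]
ys = [110, 20, -5, -45, 0, -25, -10, 30, -10]
for x, y in simplified_floor1(xs, ys):
    print(x, y)
```

With these inputs the point at X position 64 comes out at 60 (predicted 65, corrected by -5), and the point at 96 stays on the predicted line at 40, matching the remark that its Y value is unchanged.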
7.2.2. header decode
A list of floor X values is stored in the packet header in interleaved format (used in list order during packet decode and synthesis). This list is split into partitions, and each partition is assigned to a partition class. X positions 0 and [n] are implicit and do not belong to an explicit partition or partition class.
A partition class consists of a representation vector width (the number of Y values which the partition class encodes at once), a ’subclass’ value representing the number of alternate entropy books the partition class may use in representing Y values, the list of [subclass] books and a master book used to encode which alternate books were chosen for representation in a given packet. The master/subclass mechanism is meant to be used as a flexible representation cascade while still using codebooks only in a scalar context.