Variable-Rate Finite-State Vector Quantization and Applications to Speech and Image Coding
A finite-state vector quantizer is a finite-state machine that can be viewed as a collection of memoryless full-search vector quantizers: each input vector is encoded using the vector quantizer associated with the current encoder state, and the current state and selected codeword determine the next encoder state. It is generally assumed that the state codebooks are unstructured and have the same cardinality, leading to a fixed-rate scheme. In this paper, we present two variable-rate variations of this fixed-rate scheme, with the possibility of using structured as well as unstructured state codebooks. In the first scheme, we let the state codebook sizes differ from state to state, so that the rate is distributed unevenly among the states. In the second scheme, in addition to this flexibility, we use pruned tree-structured vector quantizers as the state quantizers, i.e., we let each of the state quantizers be a variable-rate encoder. For encoding sampled speech data, both of these schemes perform significantly better than the fixed-rate scheme. The second scheme gives the best performance of all; performance improvements of up to 4.25 dB at the rate of 0.5 bits/sample are obtained over the fixed-rate scheme.

We also consider the 2-D extension of the above-mentioned schemes and describe two low-bit-rate image coding systems based on them. The first system subtracts the mean from each input block and then encodes the mean-subtracted block by means of the 2-D versions of the fixed-rate and variable-rate finite-state vector quantizers; the block mean is encoded separately in an efficient manner by exploiting the high correlation present in the means of adjacent blocks. In the second system, a prediction is made on each pixel using a 5th-order predictor and the residual is again encoded using the 2-D versions of the fixed-rate and variable-rate finite-state vector quantizers. At a bit rate of 0.3 bits per pixel, a peak signal-to-noise ratio in excess of 31 dB is achieved for encoding the 512 x 512 version of "Lena" using the schemes employing variable-rate finite-state vector quantizers.
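
The encoding rule described above, and the first variable-rate variation (state codebooks of different sizes), can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy codebooks, and the next-state table are all hypothetical values chosen for illustration, and the pruned tree-structured state quantizers of the second scheme are not shown. The decoder tracks the same state sequence as the encoder, so only the codeword indices need to be transmitted.

```python
import math

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def fsvq_encode(vectors, state_codebooks, next_state, initial_state=0):
    """Encode vectors with a finite-state VQ (illustrative sketch).

    state_codebooks[s] is the codebook used in state s; sizes may differ
    between states, which makes the overall scheme variable-rate.
    next_state[s][i] is the state entered after choosing codeword i in state s.
    Returns the list of codeword indices and the total number of bits spent.
    """
    state = initial_state
    indices = []
    total_bits = 0
    for x in vectors:
        codebook = state_codebooks[state]
        # Full search of the current state's codebook.
        idx = min(range(len(codebook)),
                  key=lambda i: squared_distance(codebook[i], x))
        indices.append(idx)
        # Fixed-length index within this state: ceil(log2(codebook size)) bits.
        total_bits += math.ceil(math.log2(len(codebook)))
        # Current state and selected codeword determine the next state.
        state = next_state[state][idx]
    return indices, total_bits

def fsvq_decode(indices, state_codebooks, next_state, initial_state=0):
    """Reconstruct vectors by replaying the encoder's state sequence."""
    state = initial_state
    output = []
    for idx in indices:
        output.append(state_codebooks[state][idx])
        state = next_state[state][idx]
    return output

# Toy example (assumed values): state 0 has 2 codewords, state 1 has 4.
state_codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]],
]
next_state = [
    [0, 1],        # transitions out of state 0
    [0, 1, 1, 1],  # transitions out of state 1
]
vectors = [[0.1, 0.0], [0.9, 1.2], [2.8, 3.1], [0.2, 0.9]]

indices, total_bits = fsvq_encode(vectors, state_codebooks, next_state)
decoded = fsvq_decode(indices, state_codebooks, next_state)
rate = total_bits / (len(vectors) * len(vectors[0]))  # bits per sample
print(indices, total_bits, rate)
```

In this toy setup the two vectors encoded in state 0 cost 1 bit each and the two encoded in state 1 cost 2 bits each, so the average rate depends on how often each state is visited.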