Scalable audio coding using the nonuniform modulated complex lapped transform

This paper introduces a scalable audio coder based on the nonuniform modulated complex lapped transform (NMCLT) [1], a new nonuniform oversampled filter bank that combines time- and frequency-domain localization better than previous designs. Masking functions for the critical (Bark) bands are first calculated directly from the NMCLT coefficients and used as perceptual weights; arithmetic coding is then applied to the bit planes of the weighted NMCLT coefficients to generate a perceptually scalable audio bitstream. The loss in coding performance due to oversampling is offset by limiting the amount of redundancy in the transform and by exploiting the correlations among the NMCLT basis functions. Experiments show that the new coder outperforms a coder based on the modulated lapped transform (MLT) [2] both objectively and subjectively.
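The bit-plane stage is what makes the bitstream scalable, and the idea can be illustrated with a minimal sketch. The code below is not the coder described in this paper: it assumes real-valued integer coefficients that have already been perceptually weighted and quantized (NMCLT coefficients are complex), it omits the arithmetic coding of each plane, and the function names `to_bit_planes` and `from_bit_planes` are hypothetical.

```python
def to_bit_planes(coeffs, num_planes):
    """Split integer coefficients into signs and magnitude bit planes.

    Planes are ordered most significant first, so a truncated stream
    still carries the coarsest information about every coefficient.
    """
    signs = [1 if c >= 0 else -1 for c in coeffs]
    mags = [abs(c) for c in coeffs]
    planes = [[(m >> p) & 1 for m in mags]
              for p in range(num_planes - 1, -1, -1)]
    return signs, planes


def from_bit_planes(signs, planes, num_planes):
    """Rebuild coefficients from however many leading planes were received."""
    mags = [0] * len(signs)
    for i, plane in enumerate(planes):
        shift = num_planes - 1 - i  # bit position this plane represents
        for k, bit in enumerate(plane):
            mags[k] |= bit << shift
    return [s * m for s, m in zip(signs, mags)]


# Assumed example input: weighted, quantized coefficients that fit in 4 bits.
weighted = [13, -6, 3, 0]
signs, planes = to_bit_planes(weighted, num_planes=4)

full = from_bit_planes(signs, planes, num_planes=4)        # all planes decoded
coarse = from_bit_planes(signs, planes[:2], num_planes=4)  # stream cut after 2 planes
```

Decoding all four planes recovers the input exactly, while decoding only the first two yields a coarser approximation of the same coefficients. Truncating the stream at any plane boundary therefore trades bit rate for precision without re-encoding.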

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
