A Three-Component Image Model Based on Human Visual Perception and Its Applications in Image Coding and Processing
Abstract
In this work, results of psychovisual studies of the human visual system are discussed and interpreted in a mathematical framework. The formation of perception is described by appropriate minimization problems, and edge information is found to be of primary importance in visual perception. Having introduced the concept of edge strength, we demonstrate that strong edges are of higher perceptual importance than weaker edges (textures). We also find that smooth areas of an image influence human visual perception together with the edge information, and that this influence can be described mathematically via a minimization problem. Based on this study, we propose decomposing the image into three components: (i) primary, (ii) smooth and (iii) texture, which contain, respectively, the strong edges, the background and the textures. An algorithm is developed to generate the three-component image model.
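The three-way split described above can be illustrated with a minimal sketch. It is not the algorithm developed in this work, which is formulated through minimization problems. As a stated assumption, a box blur stands in for the smooth component, a thresholded gradient magnitude stands in for edge strength, and the names `box_blur`, `three_component_sketch`, `radius` and `strength_threshold` are hypothetical. The sketch only shows the additive structure: primary + smooth + texture reproduces the image.

```python
import numpy as np

def box_blur(img, radius):
    """Mean filter with edge-replicated padding (a stand-in smoother)."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img, dtype=float)
    # Sum the k*k shifted copies of the padded image, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def three_component_sketch(img, radius=2, strength_threshold=0.2):
    """Split img into (primary, smooth, texture) so that they sum to img.

    Assumption: edge strength is approximated by the gradient magnitude
    of the blurred image; the source defines it differently.
    """
    img = np.asarray(img, dtype=float)
    smooth = box_blur(img, radius)               # background stand-in
    gy, gx = np.gradient(smooth)
    strength = np.hypot(gx, gy)                  # crude edge strength
    strong = strength > strength_threshold * strength.max()
    residual = img - smooth
    primary = np.where(strong, residual, 0.0)    # detail at strong edges
    texture = residual - primary                 # remaining weak detail
    return primary, smooth, texture
```

Applied to a step edge with slight noise, the primary component is nonzero only in a narrow band around the step, while the noise away from the step ends up in the texture component.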
Then, the use of this perceptually motivated image model in the context of image compression is investigated. The primary component is encoded separately by encoding the intensity and geometric information of the strong edge brim contours. Two alternatives for coding the smooth and texture components are studied: entropy-coded adaptive DCT and entropy-coded subband coding. It is shown via extensive simulations that the proposed schemes, which can be thought of as a hybrid of waveform coding and feature-based coding techniques, result in both subjective and objective performance improvements over several other image coding schemes and, in particular, over the JPEG continuous-tone image compression standard. These improvements are especially noticeable at low bit rates. Furthermore, it is shown that perceptual tuning based on the contrast sensitivity of the human visual system can be used in the DCT-based scheme, which, in conjunction with the three-component model, leads to additional subjective performance improvements.
Finally, a scheme for structurally representing planar curves is developed based on the ideas behind the three-component image model. This scheme does not have the ambiguity problem associated with scale-space-based schemes.