Institute for Systems Research

Permanent URI for this communityhttp://hdl.handle.net/1903/4375

Browse

Search Results

Now showing 1 - 10 of 20

Enhancing LZW Coding Using a Variable-Length Binary Encoding
(1995) Acharya, Tinku; JaJa, Joseph F.; ISR
We present here a methodology to enhance the LZW coding for text compression using a variable-length binary encoding scheme. The basic principle of this encoding is based on allocating a set of prefix codes to a set of integers growing dynamically. The prefix property enables unique decoding of a string of elements from this set. We presented the experimental results to show the effectiveness of this variable-length binary encoding scheme.
Optimal Unified Architectures for the Real-Time Computation of Time-Recursive Discrete Sinusoidal Transforms
(1993) Liu, K.J. Ray; Chiu, Ching-Te; Kolagotla, Ravi K.; JaJa, Joseph F.; ISR
An optimal unified architecture that can efficiently compute the Discrete Cosine, Sine, Hartley, Fourier, Lapped Orthogonal, and Complex Lapped transforms for a continuous input data stream is proposed. This structure uses only half as many multipliers as the previous best known scheme [1]. The proposed architecture is regular, modular, and has only local interconnections in both data and control paths. There is no limitation on the transform size N and only 2N - 2 multipliers are needed for the DCT. The throughput of this scheme is one input sample per clock cycle. We provide a theoretical justification by showing that any discrete transform whose basis functions satisfy the Fundamental recurrence Formula has a second-order autoregressive structure in its filter realization. We also demonstrate that dual generation transform pairs share the same autoregressive structure. We extend these time-recursive concepts to multi- dimensional transforms. The resulting d-dimensional structures are fully- pipelined and consist of only d 1-D transform arrays and shift registers.
Optimal Unified Architectures for the Real-Time Computation of Time-Recursive Discrete Sinusoidal Transforms
(1992) Liu, K.J. Ray; Chiu, Ching-Te; Kolagotla, Ravi K.; JaJa, Joseph F.; ISR
An optimal unified architecture that can efficiently compute the Discrete Cosine, Sine, Hartley, Fourier, Lapped Orthogonal, and the Complex Lapped transforms for a continuous input data stream is proposed. This structure uses only half as many multipliers as the previous best known scheme [1]. This architecture is regular, modular, and has only local interconnections in both the data and control paths. There is no limitation on the transform size N and only 2N - 2 multipliers are needed for the DCT. The throughput of this scheme is one input sample per clock cycle. We provide a theoretical justification by showing that any discrete transform whose basis functions satisfy the Fundamental Recurrence Formula has a second-order autoregressive structure in its filter realization. We also demonstrate that dual generation transform pairs share the same autoregressive structure. We extend these time-recursive concepts to multi-dimensional transforms. The resulting multi-dimensional structure are fully- pipelined and consist of only d 1-D transform arrays and shift registers, where d is the dimension.
VLSI Architectures and Implementation of Predictive Tree- Searched Vector Quantizers for Real-Time Video Compression
(1992) Yu, S-S.; Kolagotla, Ravi K.; JaJa, Joseph F.; ISR
We describe a pipelined systolic architecture for implementing predictive Tree-Searched Vector Quantization (PTSVQ) for real- time image and speech coding applications. This architecture uses identical processors for both the encoding and decoding processes. the overall design is regular and the control is simple. Input data is processed at a rate of 1 pixel per clock cycle, which allows real-time processing of images at video rates. We implemented these processors using 1.2um CMOS technology. Spice simulations indicate correct operation at 40 MHz. Prototype version of these chips fabricated using 2um CMOS technology work at 20 MHz.
VLSI Implementation of Real-Time Parallel DCT/DST Lattice Structures for Video
(1992) Chiu, Ching-Te; Kolagotla, Ravi K.; Liu, K.J. Ray; JaJa, Joseph F.; ISR
The alternate use [1] of the discrete cosine transform (DCT) and the discrete sine transform (DST) can achieve a higher data compression rate and less block effect in image processing. A parallel lattice structure that can dually generate the 1-D DCT and DST is proposed. We also develop a fully-pipelined 2-D DCT lattice architecture that consists of two 1-D DCT/DST arrays without transposition. Both architectures are ideally suited for VLSI implementation because they are modular, regular, and have only local interconnections. the VLSI implementation of the lattice module using the distributed arithmetic approach is described. This realization of the lattice module using 2 um CMOS technology can achieve an 80Mb/s data rate.
Systolic Architectures for Finite-State Vector Quantization
(1991) Kolagotla, Ravi K.; Yu, S-S.; JaJa, Joseph F.; ISR
We present a new systolic architecture for implementing Finite State Vector Quantization in real-time for both speech and image data. This architecture is modular and has a very simple control flow. Only one processor is needed for speech compression. A linear array of processors is used for image compression; the number of processors needed is independent of the size of the image. We also present a simple architecture for converting line- scanned image data into the format required by this systolic architecture. Image data is processed at a rate of 1 pixel per clock cycle. An implementation at 31.5 MHz can quantize 1024 x 1024 pixel images at 30 frames/sec in real-time. We describe a VLSI implementation of these FSTSVQ processors.
VLSI Implementation of a Tree Searched Vector Quantizer
(1990) Kolagotla, Ravi K.; Yu, S.S.; JaJa, Joseph F.; ISR
The VLSI design and implementation of a Tree Searched Vector Quantizer is presented. The number of processors needed is equal to the depth of the tree. All processors are identical and data flow between processors is regular. No global control signals are needed. The processors have been fabricated using MOSIS' 2mm N- well process on a 7.9mm x 9.2mm die. Each processor chip contains 25,000 transistors and has 84 pins. The processors have been thoroughly tested at a clock frequency of 10 MHz. These processors will be used in an adaptive image compression system to compress LANDSAT images.
Parallel Algorithms for VLSI Layout.
(1989) JaJa, Joseph F.; Krishnamurthy, Sridhar; ISR
Efficient automatic layout tools are clearly essential for designing complex VLSI systems. Recent efforts have been directed toward developing parallel algorithms to handle the different subproblems involved in the layout phase. Some of these algorithms have been shown to offer significant speed-ups over the sequential ones. In this chapter, basic parallel techniques that have been found to be effective for handling problems arising in the layout phase are reviewed. In particular, parallel algorithms for problems arising in paffitioning, placement and routing are discussed. The algorithms used to handle these problems can be classified into two broad categories: iterative or direct. The iterative techniques such as simulated annealing and the Kernighan-Lin algorithm are very effective for partitioning and placment. The direct methods seem to be dominant in routing. These methods and some new methods are discussed in the general context of parallel processing. Efficient algorithms for the shared-memory model and for distributed-memory multiprocessors such as the hypercube are described. In addition, several special-purpose hardware for placement and routing are also outlined and their merits discussed.
Provably Good Parallel Algorithms for Channel Routing of Multi- terminal Nets.
(1988) Krishnamurthy, Sridhar; JaJa, Joseph F.; ISR
We consider the channel routing problem of a set of multi- terminal nets in the knockknee model. We develop a new approach to route all the nets within d + ALPHA tracks, where d is the channel density, and 0 < ALPHA < d, such that the corresponding layout can be realized with three layers. Both the routing and the layer assignment algorithms have linear time sequential implementations. In addition both can be implemented on the CREW- PRAM model in O( n/p + log n) time, with p processors, 1 < p < n and n is the size of the input.
Optimal Architectures for Multidimensional Transforms.
(1988) Chakrabarti, Chaitali; JaJa, Joseph F.; ISR
Multidimensional transforms have widespread applications in computer vision, pattern analysis and image processing. The only existing optimal architecture for computing multidimensional DFT on data of size n = Nd requires very large rotator units of area O(n^2) and pipeline-time O(log n). In this paper we propose a family of optimal architectures with areatime trade-offs for computing multidimensional transforms. The large rotator unit is replaced by a combination of a small rotator unit, a transpose unit and a block rotator unit. The combination has an area of O(N^(d+2a)) and a pipeline time of O(N^(d/2-a)log n), for 0 < a < d/2. We apply this scheme to design optimal architectures for two-dimensional DFT, DHT and DCT. The computation is made efficient by mapping each of the one-dimensional transforms involved into two dimensions.

Institute for Systems Research

Browse

Filters

Settings

Sort By

Results per page

Search Results