Institute for Systems Research Technical Reports

Permanent URI for this collection: http://hdl.handle.net/1903/4376

This archive contains a collection of reports generated by the faculty and students of the Institute for Systems Research (ISR), a permanent, interdisciplinary research unit in the A. James Clark School of Engineering at the University of Maryland. ISR-based projects are conducted through partnerships with industry and government, bringing together faculty and students from multiple academic departments and colleges across the university.

Search Results

Now showing 1 - 7 of 7
  • Ripple Analysis in Ferret Primary Auditory Cortex III. Prediction of Unit Responses to Arbitrary Spectral Profiles
    (1995) Shamma, S.; Versnel, H.; ISR
    We examined whether AI responses to arbitrary spectral profiles can be explained by the superposition of responses to the individual ripple components that make up the spectral pattern. For each unit, the ripple transfer function was first measured using ripple stimuli consisting of broadband complexes with sinusoidally modulated spectral envelopes (Shamma et al. 1994). Unit responses to various combinations of ripples were compared to those predicted from the superposition of responses according to the transfer function. Spectral profiles included combinations of 2-5 ripples of equal amplitudes and random phases, and vowel-like profiles composed of 10 ripples with various amplitudes and phases. The results demonstrate that predicted and measured responses are reasonably well matched, and hence support the notion that AI analyzes the acoustic spectrum in a substantially linear manner.
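The superposition test described above can be sketched in a few lines. This is an illustration only, not the report's implementation: the function names, the tonotopic-axis parameterization, and the example transfer functions are all assumptions.

```python
import numpy as np

# Hypothetical sketch: predict a unit's response to a multi-ripple spectral
# profile by superposing its responses to each ripple component, each scaled
# and phase-shifted by the measured ripple transfer function.

def predict_response(ripples, gain, phase_shift, x):
    """Superpose component responses along the tonotopic axis x (octaves).

    ripples: list of (ripple_frequency, amplitude, phase) triples
    gain, phase_shift: the measured transfer function, given here as
    callables of ripple frequency (illustrative placeholders)
    """
    r = np.zeros_like(x)
    for omega, a, p in ripples:
        r += a * gain(omega) * np.cos(2 * np.pi * omega * x + p + phase_shift(omega))
    return r
```

Linearity, in this sketch, means the prediction for a ripple combination equals the sum of the single-ripple predictions; the report's comparison is between such predictions and the measured unit responses.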
  • Normalization and Noise-Robustness in Early Auditory Representations
    (1993) Wang, K.; Shamma, S.; ISR
    A common sequence of operations in the early stages of most sensory systems is a multiscale transform followed by a compressive nonlinearity. In this paper, we explore the contribution of these operations to the formation of robust and perceptually significant representation in the early auditory system. It is shown that auditory representation of the acoustic spectrum is effectively a self-normalized spectral analysis, i.e., the auditory system computes a spectrum that is divided by a smoothed version of itself. Such a self-normalization induces significant effects such as spectral shape enhancement and robustness against scaling and noise corruption. Examples using synthesized signals and a natural speech vowel are presented to illustrate these results. Furthermore, the characteristics of auditory representation are discussed in the context of several psychoacoustical findings, together with the possible benefits of this model for various engineering applications.
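The self-normalization idea admits a minimal numeric sketch. The smoothing kernel and its width below are assumptions for illustration; the point is only that dividing a spectrum by a smoothed version of itself cancels any overall scaling of the input.

```python
import numpy as np

# Minimal sketch of self-normalized spectral analysis: the spectrum is
# divided by a moving-average-smoothed version of itself, so a uniform
# gain applied to the input drops out of the representation.

def self_normalize(spectrum, width=5):
    kernel = np.ones(width) / width                # moving-average smoother
    smoothed = np.convolve(spectrum, kernel, mode="same")
    return spectrum / (smoothed + 1e-12)           # small floor avoids divide-by-zero
```

With this sketch, `self_normalize(2 * s)` is numerically indistinguishable from `self_normalize(s)`, which is the scaling robustness the abstract refers to.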
  • Classification of the Transient Signals via Auditory Representations
    (1991) Teolis, A.; Shamma, S.; ISR
    We use a model of processing in the human auditory system to develop robust representations of signals. These reduced representations are then presented to a neural network for training and classification.

    Empirical studies demonstrate that auditory representations compare favorably to direct frequency (magnitude spectrum) representations with respect to classification performance (i.e., probabilities of detection and false alarm). For this comparison, Receiver Operating Characteristic (ROC) curves are generated from signals derived from the standard transient data set (STDS) distributed by DARPA/ONR.
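An ROC curve of the kind used for this comparison can be sketched as follows; the scores and labels here are synthetic, and this is not the report's evaluation code.

```python
import numpy as np

# Hedged sketch: sweep a decision threshold over classifier scores and
# record, at each threshold, the probability of detection (P_D, true-positive
# rate) and the probability of false alarm (P_FA, false-positive rate).

def roc_points(scores, labels, thresholds):
    points = []
    for th in thresholds:
        detected = scores >= th
        p_d = np.mean(detected[labels == 1])   # detections among true signals
        p_fa = np.mean(detected[labels == 0])  # alarms among noise-only cases
        points.append((p_fa, p_d))
    return points
```

Plotting P_D against P_FA over many thresholds traces the ROC curve; a representation that pushes the curve toward the upper-left corner classifies better, which is the sense in which the auditory representations "compare favorably."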

  • Cochlear Filters Design Using a Parallel Dilating-Biquad Switched-Capacitor Filter Bank
    (1991) Lin, Jyhfong; Ki, Wing-Hung; Thompson, K.E.; Shamma, S.; ISR
    A parallel filter bank is proposed to implement cochlear filters using very large time-constant (VLT) switched-capacitor (SC) filters. Significant hardware reduction is achieved in three ways. First, VLT SC biquads are used whose capacitor spread ratio is approximately inversely proportional to the square root of wT, where w is the center frequency of the filter and T is the inverse of the sampling frequency of the biquad. Second, the number of biquads is reduced by biquad sharing, whereby n-biquad-per-channel cochlear filters are realized with only one additional biquad per channel after the first channel. Finally, an LPN-type filter is used to avoid the very small capacitor in the forward path of each biquad. Furthermore, this filter bank is not only parasitics-insensitive but also gain-and-offset compensated using biphase clocking.
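The scaling rule in the abstract is worth making concrete. In the sketch below, the proportionality constant `k` is a made-up placeholder; only the 1/sqrt(wT) dependence comes from the abstract.

```python
import math

# Illustrative only: capacitor spread ratio of a VLT SC biquad scales
# roughly as k / sqrt(w * T), where w is the filter's center frequency
# and T is the inverse of the biquad's sampling frequency.

def spread_ratio(w, T, k=1.0):
    return k / math.sqrt(w * T)
```

The consequence is that halving wT grows the required spread by a factor of sqrt(2), so the low-frequency (apical) cochlear channels are the expensive ones; that is the pressure the VLT techniques in this report are designed to relieve.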
  • Realization of Cochlear Model by VLT Switched-Capacitor Filter Biquads
    (1991) Lin, Jyhfong; Ki, Wing-Hung; Thompson, K.E.; Shamma, S.; ISR
    We describe here the realization of a cochlear model using switched-capacitor filters (SCFs). This approach is made possible by a new design technique, called charge differencing (CD), which reduces by up to 50% the silicon area required to implement very large time-constant (VLT) filter biquads. In this technique, filter time constants are controlled by ratios of capacitor differences, making the capacitor spread ratio very small. The new SCFs are also parasitics-free and are stabilized against op-amp inaccuracies, such as input offsets and finite gains, using a two-phase gain-offset-compensation method.
  • Cascaded Neural-Analog Networks for Real Time Decomposition of Superposed Radar Signals in the Presence of Noise.
    (1989) Teolis, A.; Pati, Y.C.; Peckerar, M.C.; Shamma, S.; ISR
    Among the numerous problems which arise in the context of radar signal processing is the extraction of information from a noise-corrupted signal. In this application the signal is assumed to be the superposition of outputs from multiple radar emitters. Associated with the output of each emitter is a unique set of parameters which are in general unknown. The significant parameters of each emitter are (i) the pulse repetition frequency, (ii) the pulse duration (width) of its pulse trains, and (iii) the pulse amplitude. A superposition of the outputs of multiple emitters together with additive noise is observed at the receiver. In this study we consider the problem of decomposing such a noise-corrupted linear combination of emitter outputs into an underlying set of basis signals while also identifying the parameters associated with each of the emitters involved. Foremost among our objectives is to design a system capable of performing this decomposition/classification in a demanding real-time environment. We present here a system composed of three cascaded neural-analog networks which, in simulation, has demonstrated an ability to nominally perform the task of decomposition and classification of superposed radar signals under extremely high noise conditions.
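The signal model the abstract assumes (superposed pulse trains plus noise) can be stated in a few lines. Every number below is invented for illustration; the report's emitters, noise levels, and decomposition networks are not reproduced here.

```python
import numpy as np

# Hypothetical illustration of the observed signal: a sum of rectangular
# pulse trains, each with its own pulse repetition frequency (PRF), pulse
# width, and amplitude, plus additive receiver noise.

def pulse_train(t, prf, width, amplitude):
    # A pulse is "on" during the first `width` seconds of each PRF period.
    return amplitude * (np.mod(t, 1.0 / prf) < width)

rng = np.random.default_rng(0)
t = np.arange(0.0, 1e-3, 1e-6)                    # 1 ms at 1 MHz sampling
observed = (pulse_train(t, 5e3, 20e-6, 1.0)       # emitter 1
            + pulse_train(t, 7e3, 10e-6, 0.5)     # emitter 2
            + 0.2 * rng.standard_normal(t.size))  # additive noise
```

The decomposition task is the inverse problem: recover the per-emitter (PRF, width, amplitude) triples from `observed` alone, which the report addresses with cascaded neural-analog networks.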
  • The Acoustic Features of Speech Phonemes in a Model of Auditory Processing: Vowels and Unvoiced Fricatives.
    (1987) Shamma, S.; ISR
    The acoustic features of three types of stimuli (a harmonic series, naturally spoken vowels, and unvoiced fricatives) are analyzed based on the response patterns they evoke in a model of auditory processing. The model consists of a peripheral cochlear stage, followed by two central neural networks. At the peripheral stage, the asymmetrical nature of the cochlear filters, combined with the preservation of the fine temporal structure of their outputs, provides a robust and level-tolerant spatiotemporal representation of the speech signals. At the subsequent central stages, the cochlear patterns are processed by two layers of lateral inhibitory networks (LIN) to extract the perceptually important parameters of the stimuli. For the harmonic series, an in-phase and an out-of-phase version (one harmonic inverted) are used to illustrate the role of the spatiotemporal cues in encoding the spectral and temporal features of the stimuli. With the more complex vowel sounds, the primary acoustic features encoded by the LIN outputs are the few largest harmonic components of the stimuli, i.e., those closest to the formant frequencies. The output patterns computed for different (male and female) speakers display moderate variability, especially in the locations of the output peaks. However, the results also suggest that the relative levels of the LIN peaks (or the weight distribution of the patterns) are a more stable and characteristic feature of the different vowel groups. The results for the unvoiced fricatives indicate that the most invariant and distinctive acoustic feature the auditory model extracts is the location of the high-frequency edge of each stimulus spectrum.
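A lateral inhibitory network is often simplified, for intuition, as a spatial first difference followed by half-wave rectification. The sketch below is that textbook simplification, not the two-layer LIN of this report; it only illustrates why such a stage emphasizes peaks and spectral edges.

```python
import numpy as np

# Common one-line caricature of a lateral inhibitory network (LIN):
# each channel is inhibited by its lower neighbor (a spatial first
# difference across the tonotopic axis), and the result is half-wave
# rectified. Flat regions cancel; peaks and edges survive.

def lin_stage(pattern):
    diff = np.diff(pattern, prepend=pattern[:1])  # inhibition by left neighbor
    return np.maximum(diff, 0.0)                  # half-wave rectification
```

On a flat spectral band, this sketch responds only at the band edge, which gives a feel for how an LIN cascade could isolate features like the high-frequency spectral edge of a fricative.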