Electrical & Computer Engineering Research Works
Permanent URI for this collectionhttp://hdl.handle.net/1903/1658
Browse
Item Achieving Utility Arbitrarily Close to the Optimal with Limited Energy(IEEE, 2000-07) Qu, Gang; Potkonjak, MiodragEnergy is one of the limited resources for modern systems, especially the battery-operated devices and personal digi- tal assistants. The backlog in new technologies for more powerful battery is changing the traditional system design philosophies. For example, due to the limitation on battery life, it is more realistic to design for the optimal benefit from limited resource rather than design to meet all the applica- tions' requirement. We consider the following problem: a system achieves a certain amount of utility from a set of applications by providing them certain levels of quality of service (QoS). We want to allocate the limited system re- sources to get the maximal system utility. We formulate this utility maximization problem, which is NP-hard in gen- eral, and propose heuristic algorithms that are capable of finding solutions provably arbitrarily close to the optimal. We have also derived explicit formulae to guide the alloca- tion of resources to actually achieve such solutions. Simu- lation shows that our approach can use 99.9% of the given resource to achieve 25.6% and 32.17% more system utilities over two other heuristics, while providing QoS guarantees to the application program.Item Adversarial Machine Learning for NextG Covert Communications Using Multiple Antennas(MDPI, 2022-07-29) Kim, Brian; Sagduyu, Yalin; Davaslioglu, Kemal; Erpek, Tugba; Ulukus, SennurThis paper studies the privacy of wireless communications from an eavesdropper that employs a deep learning (DL) classifier to detect transmissions of interest. There exists one transmitter that transmits to its receiver in the presence of an eavesdropper. In the meantime, a cooperative jammer (CJ) with multiple antennas transmits carefully crafted adversarial perturbations over the air to fool the eavesdropper into classifying the received superposition of signals as noise. While generating the adversarial perturbation at the CJ, multiple antennas are utilized to improve the attack performance in terms of fooling the eavesdropper. Two main points are considered while exploiting the multiple antennas at the adversary, namely the power allocation among antennas and the utilization of channel diversity. To limit the impact on the bit error rate (BER) at the receiver, the CJ puts an upper bound on the strength of the perturbation signal. Performance results show that this adversarial perturbation causes the eavesdropper to misclassify the received signals as noise with a high probability while increasing the BER at the legitimate receiver only slightly. Furthermore, the adversarial perturbation is shown to become more effective when multiple antennas are utilized.Item Algorithmic Composition as a Model of Creativity(1996-12) Jacob, Bruce““’There are two distinct types of creativity: the flash out of the blue (inspiration? genius?), and the process of incremental revisions (hard work). Not only are we years away from modeling the former, we do not even begin to understand it. The latter is algorithmic in nature and has been modeled in many systems both musical and non-musical. Algorithmic composition is as old as music composition. It is often considered a cheat, a way out when the composer needs material and/or inspiration. It can also be thought of as a compositional tool that simply makes the composer's work go faster. This article makes a case for algorithmic composition as such a tool. The 'hard work' type of creativity often involves trying many different combinations against each other and choosing one over others. This iterative task seems natural to be expressed as a computer algorithm. The implementation issues can be reduced to two components: how to understand one's own creative process well enough to reproduce it as an algorithm, and how to program a computer to differentiate between 'good' and 'bad' music. The philosophical issues reduce to the question who or what is responsible for the music produced?Item Analogue Quantum Gravity in Hyperbolic Metamaterials(MDPI, 2022-04-14) Smolyaninov, Igor I.; Smolyaninova, Vera N.It is well known that extraordinary photons in hyperbolic metamaterials may be described as living in an effective Minkowski spacetime, which is defined by the peculiar form of the strongly anisotropic dielectric tensor in these metamaterials. Here, we demonstrate that within the scope of this approximation, the sound waves in hyperbolic metamaterials look similar to gravitational waves, and therefore the quantized sound waves (phonons) look similar to gravitons. Such an analogue model of quantum gravity looks especially interesting near the phase transitions in hyperbolic metamaterials where it becomes possible to switch quantum gravity effects on and off as a function of metamaterial temperature. We also predict strong enhancement of sonoluminescence in ferrofluid-based hyperbolic metamaterials, which looks analogous to particle creation in strong gravitational fields.Item Analysis of Energy Reduction on Dynamic Scaling-Enabled Systems(IEEE, 2005-12) Yuan, Lin; Qu, GangDynamic voltage scaling (DVS) is a technique that varies the supply voltage and clock frequency, based on the computation load, to provide desired performance with the minimal amount of energy consumption. It has been demonstrated as one of the most effective low power system design techniques, particularly for real time embedded systems. Most existing work are on two different system models that enable DVS: the ideal DVS system that can change its operating voltage with no physical constraints, and the multiple DVS system that has only a number of discrete voltages available. Although the ideal DVS system provides the theoretical lower bound on system’s energy consumption, it is the practicability of multiple DVS systems and the emergence of other DVS-enabled systems, which do not fit either model, that challenges system designers the following questions: should DVS be implemented in the design or not? if so, how should DVS be implemented? In this paper, we answer these questions by studying the DVS-enabled systems that can vary the operating voltage dynamically under various real-life physical constraints. Based on system’s different behavior during voltage transition, we define the optimistic feasible DVS system and the pessimistic feasible DVS system. We buildmathematical model for each DVSenabled system and analyze their potential in energy reduction. Finally, we simulate a secure wireless communication network with different DVS-enabled systems. The results show that DVS gives significant energy saving over system with fixed voltage. Interestingly, we also observe that although multiple DVS system may consume more energy than the theoretical lower bound, the optimistic and pessimistic feasible DVS systems can achieve energy savings very close to the theoretical bound provided by the ideal DVS system.Item Analysis of Watermarking Techniques for Graph Coloring Problem(IEEE, 1998-11) Qu, Gang; Pokonjak, Miodrag;Item An Analytical Model for Designing Memory Hierarchies(1996-10) Jacob, Bruce; Chen, Peter M.; Silverman, Seth R.; Mudge, Trevor N.Memory hierarchies have long been studied by many means: system building, trace-driven simulation, and mathematical analysis. Yet little help is available for the system designer wishing to quickly size the different levels in a memory hierarchy to a first-order approximation. In this paper, we present a simple analysis for providing this practical help and some unexpected results and intuition that come out of the analysis. By applying a specific, parametized model of workload locality, we are able to derive a closed-form solution for the optimal size of each hierarchy level. We verify the accuracy of this solution against exhaustive simulation with two case studies: a three-level I/O storage hierarchy and a three-level processor-cache hierarchy. In all but one case, the configuration recommended by the model performs within 5% of optimal. One result of our analysis is that the first place to spend money is the cheapest (rather than the fastest) cache level, particularly with small system budgets. Another is that money spent on an n-level hierarchy is spent in a fixed proportion until another level is added.Item Analytical solution for the side-fringing fields of narrow beveled heads(American Institue of Physics, 1997-04-15) Mayergoyz, I. D.; Madabhushi, R.; Burke, E. R.; Gomez, R. D.By using conical coordinates, exact analytical solutions for three-dimensional side-fringing fields of recording heads that are beveled in the down-track direction are found. These solutions are derived under the assumption of zero gap length. The side-fringing fields for the two limiting cases of infinitesimally narrow heads and semi-infinitely wide heads are presented and compared.Item Approaching the Maximum Energy Saving on Embedded Systems with Multiple Voltages(IEEE, 2003-11) Hua, Shaoxiong; Qu, Gang;Dynamic voltage scaling (DVS) is arguably the most effective energy reduction technique. The multiple-voltage DVS systems, which can operate only at pre-determined discrete voltages, are practical and have been well studied. However, one important unsolved problem is how many levels and at which values should voltages be implemented on a multiple-voltage DVS system to achieve the maximum energy saving. We refer this as the voltage set-up problem. In this paper, (1) we derive analytical solutions for dual-voltage system. (2) For the general case that does not have analytic solutions, we develop efficient numerical methods. (3) We demonstrate how to apply the proposed algorithms on system design. (4) Interestingly, the experimental results suggest that the multiple-voltage DVS system, when the voltages are set up properly, can reach DVS technique’s full potential in energy saving. Specifically, on the design of an ad hoc application specific system and the design of the MPEG video encoder, we find that the best single-voltage systems consume 150% and 20% more energy than the tight theoretical lower bounds, respectively. However, our approach gives dual-, 3-, and 4-voltage DVS system settings that are only 17.6%, 4.9%, and 2.6% for the ad hoc system, and 4.0%, 1.1%, and 0.2% for the MPEG video encoder, over the same lower bounds.Item ARBITRATE-AND-MOVE PRIMITIVES FOR HIGH THROUGHPUT ON-CHIP INTERCONNECTION NETWORKS(IEEE, 2004-05) Balkan, Aydin O.; Vishkin, U.; Qu, GangAn n-leaf pipelined balanced binary tree is used for arbitration of order and movement of data from n input ports to one output port. A novel arbitrate-and-move primitive circuit for every node of the tree, which is based on a concept of reduced synchrony that benefits from attractive features of both asynchronous and synchronous designs, is presented. The design objective of the pipelined binary tree is to provide a key building block in a high-throughput mesh-of-trees interconnection network for Explicit Multi Threading (XMT) architecture, a recently introduced parallel computation framework. The proposed reduced synchrony circuit was compared with asynchronous and synchronous designs of arbitrate-and-move primitives. Simulations with 0.18m technology show that compared to an asynchronous design, the proposed reduced synchrony implementation achieves a higher throughput, up to 2 Giga- Requests per second on an 8-leaf binary tree. Our circuit also consumes less power than the synchronous design, and requires less silicon area than both the synchronous and asynchronous designs.Item The Associative-Skew Clock Routing Problem(IEEE, 1999-11) Chen, Yu; Kahng, Andrew B.; Qu, Gang; Zelikovsky, AlexanderWe introduce the associative skew clock routing problem, which seeks a clock routing tree such that zero skew is preserved only within identified groups of sinks. The associative skew problem is easier to address within current EDA frameworks than useful-skew (skew-scheduling) approaches, and defines an interesting tradeoff between the traditional zero-skew clock routing problem (one sink group) and the Steiner minimum tree problem (n sink groups). We present a set of heuristic building blocks, including an efficient and optimal method of merging two zero-skew trees such that zero skew is preserved within the sink sets of each tree. Finally, we list a number of open issues for research and practical application.Item Automation: A New Open-Access Journal with a Broad Scope and an Exciting Mission(MDPI, 2020-10-29) Abed, Eyad H.Item Bilaterally Reduced Rolandic Beta Band Activity in Minor Stroke Patients - Dataset(2022) Kulasingham, Joshua; Brodbeck, Christian; Khan, Sheena; Simon, Jonathan; Marsh, ElisabethStroke patients with hemiparesis display decreased beta band (13–25Hz) rolandic activity, correlating to impaired motor function. However, clinically, patients without significant weakness, with small lesions far from sensorimotor cortex, exhibit bilateral decreased motor dexterity and slowed reaction times. We investigate whether these minor stroke patients also display abnormal beta band activity. Magnetoencephalographic (MEG) data were collected from nine minor stroke patients (NIHSS < 4) without significant hemiparesis, at ~1 and ~6 months postinfarct, and eight age-similar controls. Rolandic relative beta power during matching tasks and resting state, and Beta Event Related (De)Synchronization (ERD/ERS) during button press responses were analyzed. Regardless of lesion location, patients had significantly reduced relative beta power and ERS compared to controls. abnormalities persisted over visits, and were present in both ipsi- and contra-lesional hemispheres, consistent with bilateral impairments in motor dexterity and speed. Minor stroke patients without severe weakness display reduced rolandic beta band activity in both hemispheres, which may be linked to bilaterally impaired dexterity and processing speed, implicating global connectivity dysfunction affecting sensorimotor cortex independent of lesion location. Findings not only illustrate global network disruption after minor stroke, but suggest rolandic beta band activity may be a potential biomarker and treatment target, even for minor stroke patients with small lesions far from sensorimotor areas.Item BioBench: A Benchmark Suite of Bioinformatics Applications(2005-03) Albayraktaroglu, Kursad; Jaleel, Aamer; Wu, Xue; Franklin, Manoj; Jacob, Bruce; Tseng, Chau-Wen; Yeung, DonaldRecent advances in bioinformatics and the significant increase in computational power available to researchers have made it possible to make better use of the vast amounts of genetic data that has been collected over the last two decades. As the uses of genetic data expand to include drug discovery and development of gene-based therapies, bioinformatics is destined to take its place in the forefront of scientific computing application domains. Despite the clear importance of this field, common bioinformatics applications and their implication on microarchitectural design have received scant attention from the computer architecture community so far. The availability of a common set of bioinformatics benchmarks could be the first step to motivate further research in this crucial area. To this end, this paper presents BioBench, a benchmark suite that represents a diverse set of bioinformatics applications. The first version of BioBench includes applications from different application domains, with a particular emphasis on mature genomics applications. The applications in the benchmark are described briefly, and basic execution characteristics obtained on a real processor are presented. Compared to SPEC INT and SPEC FP benchmarks, applications in BioBench display a higher percentage of load/store instructions, almost negligible floating point operation content, and higher IPC than either SPEC INT and SPEC FP applications. Our evaluation suggests that bioinformatics applications have distinctly different characteristics from the applications in both of the mentioned SPEC suites; and our findings indicate that bioinformatics workloads can benefit from architectural improvements to memory bandwidth and techniques that exploit their high levels of ILP. The entire BioBench suite and accompanying reference data will be made freely available to researchers.Item Cache Design for Embedded Real-Time Systems(1999-06-30) Jacob, BruceCaches have long been a mechanism for speeding memory access and are popular in embedded hardware architectures from microcontrollers to core-based ASIC designs. However, caches are considered ill-suited for embedded real-time systems because they provide a probabilistic performance boost— a cache may or may not contain the desired data at any given moment. Analysis that guarantees when an item will or will not be in the cache has proven difficult, so many real-time systems simply disable caching and schedule tasks based on worst-case memory access time. Yet there are several cache organizations that provide the benefit of caching without the real-time drawbacks of hardware-managed caches. These are software-managed caches, and several different examples can be found, from DSP-style on-chip RAM to academic designs. This paper compares the operation and organization of caches as found in general-purpose processors, microcontrollers, and DSPs; it also discusses designs for embedded realtime systems.Item Can Cooling Technology Save Many-Core Parallel Programming from Its Programming Woes?(2015-10-16) O'Brien, Sean; Vishkin, Uzi; Edwards, James; Waks, Edo; Yang, BaoThis paper is advancing the following premise (henceforth, "vision"): that it is feasible to greatly enhance data movement in the short term, and do it in ways that would be both power efficient and pragmatic in the long term. The paper spells this premise out in greater detail: 1. it is feasible to build first generations of a variety of (power-inefficient) designs for which data movement will not be a restriction and begin application software development for them; 2. growing reliance on silicon compatible photonic technologies, and feasible advances in them with proper investment, will allow reduction of power consumption in these design by several orders of magnitude; 3. successful high performance application software, the ease of programming demonstrated and growing adoption by customers, software vendors and programmers will incentivize (hardware vendor) investment in new application-software-compatible generations of these designs (a new "software spiral" a la former Intel CEO, Andy Grove) with further reduction of power consumption in each generation; 4. microfluidic cooling is instrumental for enabling item 1, as well as for midwifing this overall vision. The opening paragraph of the paper provides a preamble to that vision, the body of the paper supports it and the paragraph "Moore's-Law-type vision" summarizes it. The scope of the paper is a bit forward looking and it may not exactly fit any particular community. However, its new directions for interaction among architecture and programming may suggest new horizons for representing and exposing a greater variety of data and task parallelism.Item Can Querying for Bias Leak Protected Attributes? Achieving Privacy With Smooth Sensitivity(Association for Computer Machinery (ACM), 2023-06-12) Hamman, Faisal; Chen, Jiahao; Dutta, SanghamitraExisting regulations often prohibit model developers from accessing protected attributes (gender, race, etc.) during training. This leads to scenarios where fairness assessments might need to be done on populations without knowing their memberships in protected groups. In such scenarios, institutions often adopt a separation between the model developers (who train their models with no access to the protected attributes) and a compliance team (who may have access to the entire dataset solely for auditing purposes). However, the model developers might be allowed to test their models for disparity by querying the compliance team for group fairness metrics. In this paper, we first demonstrate that simply querying for fairness metrics, such as, statistical parity and equalized odds can leak the protected attributes of individuals to the model developers. We demonstrate that there always exist strategies by which the model developers can identify the protected attribute of a targeted individual in the test dataset from just a single query. Furthermore, we show that one can reconstruct the protected attributes of all the individuals from O (𝑁𝑘log(𝑛/𝑁𝑘)) queries when 𝑁𝑘 ≪ 𝑛 using techniques from compressed sensing (𝑛 is the size of the test dataset and 𝑁𝑘 is the size of smallest group therein). Our results pose an interesting debate in algorithmic fairness: Should querying for fairness metrics be viewed as a neutral-valued solution to ensure compliance with regulations? Or, does it constitute a violation of regulations and privacy if the number of queries answered is enough for the model developers to identify the protected attributes of specific individuals? To address this supposed violation of regulations and privacy, we also propose Attribute-Conceal, a novel technique that achieves differential privacy by calibrating noise to the smooth sensitivity of our bias query function, outperforming naive techniques such as the Laplace mechanism. We also include experimental results on the Adult dataset and synthetic dataset (broad range of parameters).Item A Case for Studying DRAM Issues at the System Level(IEEE Micro, 2003-08) Jacob, BruceTHE WIDENING GAP BETWEEN TODAY’S PROCESSOR AND MEMORY SPEEDS MAKES DRAM SUBSYSTEM DESIGN AN INCREASINGLY IMPORTANT PART OF COMPUTER SYSTEM DESIGN. IF THE DRAM RESEARCH COMMUNITY WOULD FOLLOW THE MICROPROCESSOR COMMUNITY’S LEAD BY LEANING MORE HEAVILY ON ARCHITECTURE- AND SYSTEM-LEVEL SOLUTIONS IN ADDITION TO TECHNOLOGY-LEVEL SOLUTIONS TO ACHIEVE HIGHER PERFORMANCE, THE GAP MIGHT BEGIN TO CLOSE.Item CASPER: An Integrated Energy-Driven Approach for Task Graph Scheduling on Distributed Embedded Systems(IEEE, 2005-07) Kianzad, Vida; Bhattacharyya, Shuvra S.; Qu, GangFor multiprocessor embedded systems, the dynamic voltage scaling (DVS) technique can be applied to scheduled applications for energy reduction. DVS utilizes slack in the schedule to slow down processes and save energy. Therefore, it is generally believed that the maximal energy saving is achieved on a schedule with the minimum makespan (maximal slack). Most current approaches treat task assignment, scheduling, and DVS separately. In this paper, we present a framework called CASPER (Combined Assignment, Scheduling, and PowER-management) that challenges this common belief by integrating task scheduling and DVS under a single iterative optimization loop via genetic algorithm. We have conducted extensive experiments to validate the energy efficiency of CASPER. For homogeneous multiprocessor systems (in which all processors are of the same type), we consider a recently proposed slack distribution algorithm (PDP-SPM) [3]: applying PDP-SPM on the schedule with the minimal makespan gives an average of 53.8% energy saving; CASPER finds schedules with slightly larger makespan but a 57.3% energy saving, a 7.8% improvement. For heterogeneous systems, we consider the power variation DVS (PV-DVS) algorithm [13], CASPER improves its energy efficiency by 8.2%. Finally, our results also show that the proposed single loop CASPER framework saves 23.3% more energy over GMA+EE-GLSA [12], the only other known integrated approach with a nested loop that combines scheduling and power management in the inner loop but leaves assignment in the outer loop.Item Changes in Cortical Directional Connectivity during Difficult Listening in Younger and Older Adults(Wiley, 2023-05) Soleimani, Behrad; Dushyanthi Karunathilake, I. M.; Das, Proloy; Kuchinsky, Stephanie E.; Babadi, Behtash; Simon, Jonathan E.One way to investigate the mechanisms that underlie speech comprehension under difficult listening conditions is via cortical connectivity. The innovative Network Localized Granger Causality (NLGC) framework was applied to magnetoencephalography (MEG) data, obtained from older and younger subjects performing a speech listening task in noisy conditions, in delta and theta frequency bands. Directional connectivity between frontal, temporal, and parietal lobes was analyzed. Both aging- and condition-related changes were found, particularly in theta. In younger adults, as background noise increased, there was a transition from predominantly temporal-to-frontal (bottom-up) connections, to predominantly frontal-to-temporal (top-down). In contrast, older adults showed bidirectional information flow between frontal and temporal cortices even for speech in quiet, not changing substantially with increased noise. Additionally, younger listeners did not show changes in the nature of their cortical links for different listening conditions, whereas older listeners exhibited a switch from predominantly facilitative links to predominantly sharpening, when noise increased.