Tech Reports in Computer Science and Engineering
Permanent URI for this community: http://hdl.handle.net/1903/5
The technical reports collections in this community are deposited by the Library of the Computer Science department. If you have questions about these collections, please contact library staff at library@cs.umd.edu
1224 results
Search Results
Pipelined CPU-GPU Scheduling for Caches (2021-03-23)
Gerzhoy, Daniel; Yeung, Donald
Heterogeneous microprocessors integrate a CPU and GPU with a shared cache hierarchy on the same chip, affording low-overhead communication between the CPU and GPU cores. Often, large array data structures are communicated from the CPU to the GPU and back. While the on-chip cache hierarchy can support such CPU-GPU producer-consumer sharing, this almost never happens due to poor temporal reuse. Because the data structures can be quite large, by the time the consumer reads the data, it has been evicted from cache even though the producer had brought it on-chip when it originally wrote the data. As a result, the CPU-GPU communication happens through main memory instead of the cache, hurting performance and energy. This paper exploits the on-chip caches in a heterogeneous microprocessor to improve CPU-GPU communication efficiency. We divide streaming computations executed by the CPU and GPU that exhibit producer-consumer sharing into chunks, and overlap the execution of CPU chunks with GPU chunks in a software pipeline. To enforce data dependences, the producer executes one chunk ahead of the consumer at all times. We also propose a low-overhead synchronization mechanism in which the CPU directly controls thread-block scheduling in the GPU to maintain the producer's "run-ahead distance" relative to the consumer. By adjusting the chunk size or run-ahead distance, we can make the CPU-GPU working set fit in the last-level cache (LLC), thus permitting the producer-consumer sharing to occur through the LLC. We show through simulation that our technique reduces the number of DRAM accesses by 30.4%, improves performance by 26.8%, and lowers memory system energy by 27.4%, averaged across 7 benchmarks.

Nervous system maps on the C. elegans genome (2020-09-28)
Cherniak, Christopher; Mokhtarzada, Zekeria; Rodriguez-Esteban, Raul
This project begins from a synoptic point of view, focusing upon the large-scale (global) landscape of the genome. This is along the lines of combinatorial network optimization in computational complexity theory [1]. Our research program here in turn originated along parallel lines in computational neuroanatomy [2,3,4,5]. Rather than mapping body structure onto the genome, the present report focuses upon statistically significant mappings of the Caenorhabditis elegans nervous system onto its genome. Via published datasets, evidence is derived for a "wormunculus", on the model of a homunculus representation, but on the C. elegans genome. The main method of testing somatic-genomic position-correlations here is via public genome databases, with r^2 analyses and p evaluations. These findings appear to yield some of the basic structural and functional organization of invertebrate nucleus and chromosome architecture. The design rationale for somatic maps on the genome in turn may be efficient interconnections. A next question this study raises: How do these various somatic maps mesh (interrelate, interact) with each other?

Boundary Element Solution of Electromagnetic Fields for Non-Perfect Conductors at Low Frequencies and Thin Skin Depths (2020-05-13)
Gumerov, Nail A.; Adelman, Ross N.; Duraiswami, Ramani
A novel boundary element formulation for solving problems involving eddy currents in the thin skin depth approximation is developed. It is assumed that the time-harmonic magnetic field outside the scatterers can be described using the quasistatic approximation. A two-term asymptotic expansion with respect to a small parameter characterizing the skin depth is derived for the magnetic and electric fields outside and inside the scatterer, which can be extended to higher order terms if needed.
The introduction of a special surface operator (the inverse surface gradient) allows the reduction of the problem complexity. A method to compute this operator is developed. The obtained formulation operates only with scalar quantities and requires computation of surface operators that are usual for boundary element (method of moments) solutions to the Laplace equation. The formulation can be accelerated using the fast multipole method. The method is much faster than solving the vector Maxwell equations. The obtained solutions are compared with the Mie solution for scattering from a sphere and the error of the solution is studied. Computations for much more complex shapes of different topologies, including magnetic and electric field cages used in testing, are also performed and discussed.

Design and Evaluation of Monolithic Computers Implemented Using Crossbar ReRAM (2019-07-16)
Jagasivamani, Meenatchi; Walden, Candace; Singh, Devesh; Li, Shang; Kang, Luyi; Asnaashari, Mehdi; Dubois, Sylvain; Jacob, Bruce; Yeung, Donald
A monolithic computer is an emerging architecture in which a multicore CPU and a high-capacity main memory system are all integrated in a single die. We believe such architectures will be possible in the near future due to nonvolatile memory technology, such as the resistive random access memory, or ReRAM, from Crossbar Incorporated. Crossbar's ReRAM can be fabricated in a standard CMOS logic process, allowing it to be integrated into a CPU's die. The ReRAM cells are manufactured in between metal wires and do not employ per-cell access transistors, leaving the bulk of the base silicon area vacant. This means that a CPU can be monolithically integrated directly underneath the ReRAM memory, allowing the cores to have massively parallel access to the main memory. This paper presents the characteristics of Crossbar's ReRAM technology, informing architects on how ReRAM can enable monolithic computers. Then, it develops a CPU and memory system architecture around those characteristics, especially to exploit the unprecedented memory-level parallelism. The architecture employs a tiled CPU, and incorporates memory controllers into every compute tile that support a variable access granularity to enable high scalability. Lastly, the paper conducts an experimental evaluation of monolithic computers on graph kernels and streaming computations. Our results show that compared to a DRAM-based tiled CPU, a monolithic computer achieves 4.7x higher performance on the graph kernels, and achieves rough parity on the streaming computations. Given a future 7 nm technology node, a monolithic computer could outperform the conventional system by 66% for the streaming computations.

Cell Maps on the Human Genome (2018-06-01)
Cherniak, Christopher; Rodriguez-Esteban, Raul
Sub-cellular organization is significantly mapped onto the human genome: Evidence is reported for a "cellunculus", on the model of a homunculus, on the H. sapiens genome. We have previously described a statistically significant, global, supra-chromosomal representation of the human body that appears to extend over the entire genome. Here, we extend the genome mapping model, zooming down to the typical individual animal cell.
Basic cell structure turns out to map onto the total genome, mirrored via genes that express in particular cell organelles (e.g., “nuclear membrane”); evidence also suggests similar cell maps appear on individual chromosomes that map the dorsoventral body axis.

Digital Words: Moving Forward with Measuring the Readability of Online Texts (2018-10-26)
Redmiles, Elissa M.; Maszkiewicz, Lisa; Hwang, Emily; Kuchhal, Dhruv; Liu, Everest; Morales, Miraida; Peskov, Denis; Rao, Sudha; Stevens, Rock; Gligoric, Kristina; Kross, Sean; Mazurek, Michelle L.; Daumé, Hal III
The readability of a digital text can influence people’s information acquisition (Wikipedia articles), online security (how-to articles), and even health (WebMD). Readability metrics can also alter search rankings and are used to evaluate AI system performance. However, prior work on measuring readability has significant gaps, especially for HCI applications. Prior work has (a) focused on grade-school texts, (b) ignored domain-specific, jargon-heavy texts (e.g., health advice), and (c) failed to compare metrics, especially in the context of scaling to use with online corpora. This paper addresses these shortcomings by comparing well-known readability measures and a novel domain-specific approach across four different corpora: crowd-worker generated stories, Wikipedia articles, security and privacy advice, and health information. We evaluate the convergent, discriminant, and content validity of each measure and detail tradeoffs in domain-specificity and participant burden. These results provide a foundation for more accurate readability measurements in HCI.

A Comparison of Header and Deep Packet Features when Detecting Network Intrusions (2018-07-07)
Watson, Gavin
A classical multilayer perceptron algorithm and a novel convolutional neural network payload classifying algorithm are presented for use on a realistic network intrusion detection dataset. The payload classifying algorithm is judged to be inferior to the multilayer perceptron but shows significance in being able to distinguish between network intrusions and benign traffic. The multilayer perceptron that is trained on less than 1% of the available classification data is judged to be a good modern estimate of usage in the real world when compared to prior research. It boasts an average true positive rate of 94.5% and an average false positive rate of 4.68%.

Ethics Emerging: The Story of Privacy and Security Perceptions in Virtual Reality (2018-02-20)
Adams, Devon; Bah, Alseny; Barwulor, Catherine; Musabay, Nureli; Pitkin, Kadeem; Redmiles, Elissa M.
Virtual reality (VR) technology aims to transport the user to a virtual world, fully immersing them in an experience entirely separate from the real world. VR devices can use sensor data to draw deeply personal inferences (e.g., medical conditions, emotions) and can enable virtual crimes (e.g., theft, assault on virtual representations of the user) from which users have been shown to experience real, significant emotional pain. As such, VR may involve especially sensitive user data and interactions. To effectively mitigate such risks and design for safer experiences, we aim to understand end-user perceptions of VR risks and how, if at all, developers are considering and addressing those risks. In this paper, we present the first work on VR security and privacy perceptions: a mixed-methods study involving semi-structured interviews with 20 VR users and developers, a survey of VR privacy policies, and an ethics co-design study with VR developers.
We establish a foundational understanding of perceived risks in VR; raise concerns about the state of VR privacy policies; and contribute a concrete VR developer "code of ethics", created by developers, for developers.

An Application of Jeeves for Honeypot Sanitization (2018-02-15)
Webster, Ashton
Being able to quickly create realistic honeypots is very useful for obtaining accurate information about attacker behavior. However, creating realistic honeypots requires sanitization of the original system from which the honeypot is derived. To achieve this, the use of Jeeves, a language based on faceted values, is extended to rapidly replace secret values with believable and non-interfering sanitized values. By making several changes to the source code of Jelf, a web server implemented in Jeeves, we are able to quickly and easily create sanitized honeypots. Our experiments show that the sanitized and unsanitized versions of Jelf differ in response time by less than 1%.

Fast and Service-preserving Recovery from Malware Infections using CRIU (2018-02-15)
Webster, Ashton; Eckenrod, Ryan; Purtilo, James
Once a computer system has been infected with malware, restoring it to an uninfected state often requires costly service-interrupting actions such as rolling back to a stable snapshot or reimaging the system entirely. We present CRIU-MR: a technique for restoring an infected server system running within a Linux container to an uninfected state in a service-preserving manner using Checkpoint/Restore in Userspace (CRIU). We modify the CRIU source code to flexibly integrate with existing malware detection technologies so that it can remove suspected malware processes within a Linux container during a checkpoint/restore event. This allows for infected containers with a potentially damaged filesystem to be checkpointed and subsequently restored on a fresh backup filesystem while both removing malware processes and preserving the state of trusted ones. This method can be quickly performed with minimal impact on service availability, restoring active TCP connections and completely removing several types of malware from infected Linux containers.
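
The checkpoint-time filtering idea in the abstract above can be illustrated with a toy Python model. This is only a sketch under stated assumptions, not CRIU-MR's code (which, per the abstract, modifies CRIU itself): `ProcessRecord`, `filter_checkpoint`, and the `is_suspect` callback are hypothetical names, a checkpoint is modeled as a plain list of process records, and dropping the descendants of a suspect process is an assumption made here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class ProcessRecord:
    """Hypothetical stand-in for one process captured in a checkpoint."""
    pid: int
    parent_pid: int
    name: str


def filter_checkpoint(
    records: Iterable[ProcessRecord],
    is_suspect: Callable[[ProcessRecord], bool],
) -> List[ProcessRecord]:
    """Return the records to restore, dropping suspects and their descendants.

    Dropping descendants is an assumption of this sketch, not a documented
    behavior of CRIU-MR.
    """
    records = list(records)
    removed = {r.pid for r in records if is_suspect(r)}
    # Propagate removal down the process tree until nothing new is added.
    changed = True
    while changed:
        changed = False
        for r in records:
            if r.pid not in removed and r.parent_pid in removed:
                removed.add(r.pid)
                changed = True
    return [r for r in records if r.pid not in removed]


# Toy checkpoint: a trusted web server next to a flagged process and its child.
checkpoint = [
    ProcessRecord(1, 0, "init"),
    ProcessRecord(10, 1, "httpd"),
    ProcessRecord(20, 1, "cryptominer"),
    ProcessRecord(21, 20, "sh"),
]
restored = filter_checkpoint(checkpoint, lambda r: r.name == "cryptominer")
print([r.name for r in restored])
```

Here the detector is reduced to a name match; in practice the abstract describes integrating with existing malware detection technologies, which this sketch stands in for with the `is_suspect` callback.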