Technical Reports of the Computer Science Department

Recent Submissions

  • Item
    Boundary Element Solution of Electromagnetic Fields for Non-Perfect Conductors at Low Frequencies and Thin Skin Depths
    (2020-05-13) Gumerov, Nail A.; Adelman, Ross N.; Duraiswami, Ramani
    A novel boundary element formulation for solving problems involving eddy currents in the thin skin depth approximation is developed. It is assumed that the time-harmonic magnetic field outside the scatterers can be described using the quasistatic approximation. A two-term asymptotic expansion with respect to a small parameter characterizing the skin depth is derived for the magnetic and electric fields outside and inside the scatterer, which can be extended to higher order terms if needed. The introduction of a special surface operator (the inverse surface gradient) allows the reduction of the problem complexity. A method to compute this operator is developed. The obtained formulation operates only with scalar quantities and requires computation of surface operators that are usual for boundary element (method of moments) solutions to the Laplace equation. The formulation can be accelerated using the fast multipole method. The method is much faster than solving the vector Maxwell equations. The obtained solutions are compared with the Mie solution for scattering from a sphere and the error of the solution is studied. Computations for much more complex shapes of different topologies, including magnetic and electric field cages used in testing, are also performed and discussed.
  • Item
    Cell Maps on the Human Genome
    (2018-06-01) Cherniak, Christopher; Rodriguez-Esteban, Raul
    Sub-cellular organization is significantly mapped onto the human genome: Evidence is reported for a "cellunculus" -- on the model of a homunculus, on the H. sapiens genome. We have previously described a statistically significant, global, supra-chromosomal representation of the human body that appears to extend over the entire genome. Here, we extend the genome mapping model, zooming down to the typical individual animal cell. Basic cell structure turns out to map onto the total genome, mirrored via genes that express in particular cell organelles (e.g., “nuclear membrane”); evidence also suggests similar cell maps appear on individual chromosomes that map the dorsoventral body axis.
  • Item
    Digital Words: Moving Forward with Measuring the Readability of Online Texts
    (2018-10-26) Redmiles, Elissa M.; Maszkiewicz, Lisa; Hwang, Emily; Kuchhal, Dhruv; Liu, Everest; Morales, Miraida; Peskov, Denis; Rao, Sudha; Stevens, Rock; Gligoric, Kristina; Kross, Sean; Mazurek, Michelle L.; Daumé, Hal III
    The readability of a digital text can influence people’s information acquisition (Wikipedia articles), online security (how-to articles), and even health (WebMD). Readability metrics can also alter search rankings and are used to evaluate AI system performance. However, prior work on measuring readability has significant gaps, especially for HCI applications. Prior work has (a) focused on grade-school texts, (b) ignored domain-specific, jargon-heavy texts (e.g., health advice), and (c) failed to compare metrics, especially in the context of scaling to use with online corpora. This paper addresses these shortcomings by comparing well-known readability measures and a novel domain-specific approach across four different corpora: crowd-worker generated stories, Wikipedia articles, security and privacy advice, and health information. We evaluate the convergent, discriminant, and content validity of each measure and detail tradeoffs in domain-specificity and participant burden. These results provide a foundation for more accurate readability measurements in HCI.
  • Item
    A Comparison of Header and Deep Packet Features when Detecting Network Intrusions
    (2018-07-07) Watson, Gavin
    A classical multilayer perceptron algorithm and a novel convolutional neural network payload classifying algorithm are presented for use on a realistic network intrusion detection dataset. The payload classifying algorithm is judged to be inferior to the multilayer perceptron but shows significance in being able to distinguish between network intrusions and benign traffic. The multilayer perceptron that is trained on less than 1% of the available classification data is judged to be a good modern estimate of usage in the real world when compared to prior research. It boasts an average true positive rate of 94.5% and an average false positive rate of 4.68%.
  • Item
    Ethics Emerging: The Story of Privacy and Security Perceptions in Virtual Reality
    (2018-02-20) Adams, Devon; Bah, Alseny; Barwulor, Catherine; Musabay, Nureli; Pitkin, Kadeem; Redmiles, Elissa M.
    Virtual reality (VR) technology aims to transport the user to a virtual world, fully immersing them in an experience entirely separate from the real world. VR devices can use sensor data to draw deeply personal inferences (e.g., medical conditions, emotions) and can enable virtual crimes (e.g., theft, assault on virtual representations of the user) from which users have been shown to experience real, significant emotional pain. As such, VR may involve especially sensitive user data and interactions. To effectively mitigate such risks and design for safer experiences, we aim to understand end-user perceptions of VR risks and how, if at all, developers are considering and addressing those risks. In this paper, we present the first work on VR security and privacy perceptions: a mixed-methods study involving semi-structured interviews with 20 VR users and developers, a survey of VR privacy policies, and an ethics co-design study with VR developers. We establish a foundational understanding of perceived risks in VR; raise concerns about the state of VR privacy policies; and contribute a concrete VR developer "code of ethics", created by developers, for developers.
  • Item
    An Application of Jeeves for Honeypot Sanitization
    (2018-02-15) Webster, Ashton
    Being able to quickly create realistic honeypots is very useful for obtaining accurate information about attacker behavior. However, creating realistic honeypots requires sanitization of the original system from which the honeypot is derived. To achieve this, the use of Jeeves, a language based on faceted values, is extended to rapidly replace secret values with believable and non-interfering sanitized values. By making several changes to the source code of Jelf, a web server implemented in Jeeves, we are able to quickly and easily create sanitized honeypots. Our experiments show that the sanitized and unsanitized versions of Jelf differ in response times by less than 1%.
  • Item
    Fast and Service-preserving Recovery from Malware Infections using CRIU
    (2018-02-15) Webster, Ashton; Eckenrod, Ryan; Purtilo, James
    Once a computer system has been infected with malware, restoring it to an uninfected state often requires costly service-interrupting actions such as rolling back to a stable snapshot or reimaging the system entirely. We present CRIU-MR: a technique for restoring an infected server system running within a Linux container to an uninfected state in a service-preserving manner using Checkpoint/Restore in Userspace (CRIU). We modify the CRIU source code to flexibly integrate with existing malware detection technologies so that it can remove suspected malware processes within a Linux container during a checkpoint/restore event. This allows for infected containers with a potentially damaged filesystem to be checkpointed and subsequently restored on a fresh backup filesystem while both removing malware processes and preserving the state of trusted ones. This method can be quickly performed with minimal impact on service availability, restoring active TCP connections and completely removing several types of malware from infected Linux containers.
  • Item
    A Summary of Survey Methodology Best Practices for Security and Privacy Researchers
    (2017-05-03) Redmiles, Elissa M.; Acar, Yasemin; Fahl, Sascha; Mazurek, Michelle L.
    "Given a choice between dancing pigs and security, users will pick dancing pigs every time," warns an oft-cited quote from well-known security researcher Bruce Schneier. This issue of understanding how to make security tools and mechanisms work better for humans (often categorized as usability, broadly construed) has become increasingly important over the past 17 years, as illustrated by the growing body of research. Usable security and privacy research has improved our understanding of how to help users stay safe from phishing attacks and control access to their accounts, as just two examples. One key technique for understanding and improving how human decision making affects security is the gathering of self-reported data from users. This data is typically gathered via survey and interview studies, and serves to inform the broader security and privacy community about user needs, behaviors, and beliefs. The quality of this data, and the validity of subsequent research results, depends on the choices researchers make when designing their experiments. Contained here is a set of essential guidelines for conducting self-report usability studies distilled from prior work in survey methodology and related fields. Other fields that rely on self-report data, such as the health and social sciences, have established guidelines and recommendations for collecting high quality self-report data.
  • Item
    How Well Do My Results Generalize? Comparing Security and Privacy Survey Results from MTurk and Web Panels to the U.S.
    (2017-02-21) Redmiles, Elissa M.; Kross, Sean; Pradhan, Alisha; Mazurek, Michelle L.
    Security and privacy researchers often rely on data collected from Amazon Mechanical Turk (MTurk) to evaluate security tools, to understand users' privacy preferences, to measure online behavior, and for other studies. While the demographics of MTurk are broader than some other options, researchers have also recently begun to use census-representative web-panels to sample respondents with more representative demographics. Yet, we know little about whether security and privacy results from either of these data sources generalize to a broader population. In this paper, we compare the results of a survey about security and privacy knowledge, experiences, advice, and internet behavior distributed using MTurk (n=480), a nearly census-representative web-panel (n=428), and a probabilistic telephone sample (n=3,000) statistically weighted to be accurate within 2.7% of the true prevalence in the U.S. Surprisingly, we find that MTurk responses are slightly more representative of the U.S. population than are responses from the census-representative panel, except for users who hold no more than a high-school diploma or who are 50 years of age or older. Further, we find that statistical weighting of MTurk responses to balance demographics does not significantly improve generalizability. This leads us to hypothesize that differences between MTurkers and the general public are due not to demographics, but to differences in factors such as internet skill. Overall, our findings offer tempered encouragement for researchers using MTurk samples and enhance our ability to appropriately contextualize and interpret the results of crowdsourced security and privacy research.
  • Item
    A Comparison of Transfer Learning Algorithms for Defect and Vulnerability Detection
    (2017-02-08) Webster, Ashton
    Machine learning techniques for defect and vulnerability detection have the potential to quickly direct developers' attention to software components with faulty implementations. Effective application of such defect prediction methods in practical software development environments requires transfer learning algorithms so that models built using existing projects can recognize defects as they emerge in a new project. Up until this study, comparing the efficacy of transfer learning algorithms was challenging because previous studies used differing data sets, baselines, and performance metrics. By providing open source implementations and baseline performance metrics for several transfer learning algorithms on two different data sets, our project offers software engineers the tools to objectively compare methods and readily identify top performing transfer learning algorithms in the domain of both vulnerability and defect prediction.
  • Item
    Identifying Fixed Points in Recurrent Neural Networks using Directional Fibers: Supplemental Material on Theoretical Results and Practical Aspects of Numerical Traversal
    (2016-12-12) Katz, Garrett; Reggia, James
    Fixed points of recurrent neural networks can represent many things, including stored memories, solutions to optimization problems, and waypoints along non-fixed attractors. As such, they are relevant to a number of neurocomputational phenomena, ranging from low-level motor control and tool use to high-level problem solving and decision making. Therefore, global solution of the fixed point equations can improve our understanding and engineering of recurrent neural networks. While local solvers and statistical characterizations abound, we do not know of any method for efficiently and precisely locating all fixed points of an arbitrary network. To solve this problem we have proposed a novel strategy for global fixed point location, based on numerical traversal of mathematical objects we defined called directional fibers [2]. This report supplements our results in [2] by presenting certain technical aspects of our method in more depth.
  • Item
    Where is the Digital Divide? A Survey of Security, Privacy, and Socioeconomics
    (2016-11-03) Redmiles, Elissa M.; Kross, Sean; Mazurek, Michelle L.
    The behavior of the least-secure user can influence security and privacy outcomes for everyone else. Thus, it is important to understand the factors that influence the security and privacy of a broad variety of people. Prior work has suggested that users with differing socioeconomic status (SES) may behave differently; however, no research has examined how SES, advice sources, and resources relate to the security and privacy incidents users report. To address this question, we analyze a 3,000 respondent, census-representative telephone survey. We find that, contrary to prior assumptions, people with lower educational attainment report equal or fewer incidents than more educated people, and that users’ experiences are significantly correlated with their advice sources, regardless of SES or resources.
  • Item
    SMILE: Simulator for Maryland Imitation Learning Environment
    (2016-05-19) Huang, Di-Wei; Katz, Garrett E.; Gentili, Rodolphe J.; Reggia, James A.
    As robot imitation learning is beginning to replace conventional hand-coded approaches in programming robot behaviors, much work is focusing on learning from the actions of demonstrators. We hypothesize that in many situations, procedural tasks can be learned more effectively by observing object behaviors while completely ignoring the demonstrator's motions. To support studying this hypothesis and robot imitation learning in general, we built a software system named SMILE that is a simulated 3D environment. In this virtual environment, both a simulated robot and a user-controlled demonstrator can manipulate various objects on a tabletop. The demonstrator is not embodied in SMILE, and therefore a recorded demonstration appears as if the objects move on their own. In addition to recording demonstrations, SMILE also allows programming the simulated robot via Matlab scripts, as well as creating highly customizable objects for task scenarios via XML. This report describes the features and usages of SMILE.
  • Item
    I Think They're Trying To Tell Me Something: Advice Sources and Selection for Digital Security
    (2015-11-30) Redmiles, Elissa M.; Malone, Amelia; Mazurek, Michelle L.
    Users receive a multitude of digital- and physical-security advice every day. Indeed, if we implemented all the security advice we received, we would never leave our houses or use the Internet. Instead, users selectively choose some advice to accept and some (most) to reject; however, it is unclear whether they are effectively prioritizing what is most important or most useful. If we can understand from where users take security advice and how they subsequently develop security behaviors, we can develop more effective security interventions. As a first step, we conducted 25 semi-structured interviews of security-sensitive (those users who deal with sensitive data or hold security clearances) and general users. These interviews resulted in several key findings: (1) users' main sources of digital-security advice include IT professionals, workplaces, and negative events, whether experienced personally or retold through TV; (2) users determine whether to accept digital-security advice based on the trustworthiness of the advice-source, as they feel inadequately able to evaluate the advice content; (3) users reject advice for many reasons, from believing that someone else is responsible for their security to finding that the advice contains too much marketing material or threatens their privacy; and (4) security-sensitive users differ from general users in a number of ways, including feeling that digital-security advice is more useful in their day-to-day lives and relying heavily on their workplace as a source of security information. These and our other findings inform a set of design recommendations for enhancing the efficacy of digital-security advice.
  • Item
    Cybersecurity - What's Language got to do with it?
    (2015-09-18) Klavans, Judith L.
    A new opportunity to explore and leverage the power of computational linguistic methods and analysis in ensuring effective Cybersecurity is presented. This White Paper discusses some of the specific emerging research opportunities, covering human language technologies such as language identification, topic modeling, and information extraction for keyword recognition.
  • Item
    Democratizing Facial Recognition with Google Glass
    (2015-09-14) Krach, Jeremy
    Lightweight and camera-equipped wearable devices such as Android-backed Google Glass, with their potential for widespread and mobile data capture, have piqued the imagination of technologists and privacy advocates alike. This paper describes an experimental system which confirms the feasibility of such devices for surveillance through live data collection and facial recognition. Furthermore, even though effective surveillance tasks are computationally demanding, this work illustrates that performance of such systems is scalable through careful architecting of communication between static servers and mobile collection devices. When the bulk of the complexity can be offloaded to the server, and with highly available communication channels between collector and processor, we have the foundation upon which future surveillance systems might be constructed. Such systems awaken nightmares for those advocating privacy of the modern citizen, while inspiring innovators to push forward the bounds of what can be accomplished with today’s technology. The present project enables advocates from both ends of the spectrum to debate privacy policy as it can be seen through the lens of systems that are possible today.
  • Item
    Checking Interaction-Based Declassification Policies for Android Using Symbolic Execution
    (2015-07-01) Micinski, Kristopher; Fetter-Degges, Jonathan; Jeon, Jinseong; Foster, Jeffrey S.; Clarkson, Michael R.
    Mobile apps can access a wide variety of secure information, such as contacts and location. However, current mobile platforms include only coarse access control mechanisms to protect such data. In this paper, we introduce interaction-based declassification policies, in which the user's interactions with the app constrain the release of sensitive information. Our policies are defined extensionally, so as to be independent of the app's implementation, based on sequences of security-relevant events that occur in app runs. Policies use LTL formulae to precisely specify which secret inputs, read at which times, may be released. We formalize a semantic security condition, interaction-based noninterference, to define our policies precisely. Finally, we describe a prototype tool that uses symbolic execution of Dalvik bytecode to check interaction-based declassification policies for Android, and we show that it enforces policies correctly on a set of apps.
  • Item
    Accurate computation of Galerkin double surface integrals in the 3-D boundary element method
    (2015-05-29) Adelman, Ross; Gumerov, Nail A.; Duraiswami, Ramani
    Many boundary element integral equation kernels are based on the Green’s functions of the Laplace and Helmholtz equations in three dimensions. These include, for example, the Laplace, Helmholtz, elasticity, Stokes, and Maxwell equations. Integral equation formulations lead to more compact, but dense linear systems. These dense systems are often solved iteratively via Krylov subspace methods, which may be accelerated via the fast multipole method. There are advantages to Galerkin formulations for such integral equations, as they treat problems associated with kernel singularity, and lead to symmetric and better conditioned matrices. However, the Galerkin method requires each entry in the system matrix to be created via the computation of a double surface integral over one or more pairs of triangles. There are a number of semi-analytical methods to treat these integrals, which all have some issues, and are discussed in this paper. We present novel methods to compute all the integrals that arise in Galerkin formulations involving kernels based on the Laplace and Helmholtz Green’s functions to any specified accuracy. Integrals involving completely geometrically separated triangles are non-singular and are computed using a technique based on spherical harmonics and multipole expansions and translations, which results in the integration of polynomial functions over the triangles. Integrals involving cases where the triangles have common vertices, edges, or are coincident are treated via scaling and symmetry arguments, combined with automatic recursive geometric decomposition of the integrals. Example results are presented, and the developed software is available as open source.
  • Item
    Automating Efficient RAM-Model Secure Computation
    (2014-03-13) Liu, Chang; Huang, Yan; Shi, Elaine; Katz, Jonathan; Hicks, Michael
    RAM-model secure computation addresses the inherent limitations of circuit-model secure computation considered in almost all previous work. Here, we describe the first automated approach for RAM-model secure computation in the semi-honest model. We define an intermediate representation called SCVM and a corresponding type system suited for RAM-model secure computation. Leveraging compile-time optimizations, our approach achieves order-of-magnitude speedups compared to both circuit-model secure computation and the state-of-the-art RAM-model secure computation.
  • Item
    A Stochastic Approach to Uncertainty in the Equations of MHD Kinematics
    (2014-07-10) Phillips, Edward G.; Elman, Howard C.
    The magnetohydrodynamic (MHD) kinematics model describes the electromagnetic behavior of an electrically conducting fluid when its hydrodynamic properties are assumed to be known. In particular, the MHD kinematics equations can be used to simulate the magnetic field induced by a given velocity field. While prescribing the velocity field leads to a simpler model than the fully coupled MHD system, this may introduce some epistemic uncertainty into the model. If the velocity of a physical system is not known with certainty, the magnetic field obtained from the model may not be reflective of the magnetic field seen in experiments. Additionally, uncertainty in physical parameters such as the magnetic resistivity may affect the reliability of predictions obtained from this model. By modeling the velocity and the resistivity as random variables in the MHD kinematics model, we seek to quantify the effects of uncertainty in these fields on the induced magnetic field. We develop stochastic expressions for these quantities and investigate their impact within a finite element discretization of the kinematics equations. We obtain mean and variance data through Monte-Carlo simulation for several test problems. Toward this end, we develop and test an efficient block preconditioner for the linear systems arising from the discretized equations.