A. James Clark School of Engineering

Permanent URI for this community: http://hdl.handle.net/1903/1654

The collections in this community comprise faculty research works, as well as graduate theses and dissertations.

Search Results

Now showing 1 - 3 of 3
  • A COMPREHENSIVE EVALUATION OF FEATURE-BASED MALICIOUS WEBSITE DETECTION
    (2020) McGahagan, John Francis; Cukier, Michel; Reliability Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Although the internet enables many important functions of modern life, it is also a ground for nefarious activity by malicious actors and cybercriminals. For example, malicious websites facilitate phishing attacks, malware infections, data theft, and disruption. A major component of cybersecurity is to detect and mitigate attacks enabled by malicious websites. Although prior researchers have presented promising results – specifically in the use of website features to detect malicious websites – malicious website detection continues to pose major challenges. This dissertation presents an investigation into feature-based malicious website detection. We conducted six studies on malicious website detection, with a focus on discovering new features for malicious website detection, challenging assumptions of features from prior research, comparing the importance of the features for malicious website detection, building and evaluating detection models over various scenarios, and evaluating malicious website detection models across different datasets and over time. We evaluated this approach on various datasets, including: a dataset composed of several threats from industry; a dataset derived from the Alexa top one million domains and supplemented with open source threat intelligence information; and a dataset consisting of websites gathered repeatedly over time. Results led us to postulate that new, unstudied, features could be incorporated to improve malicious website detection models, since, in many cases, models built with new features outperformed models built from features used in prior research and did so with fewer features. We also found that features discovered using feature selection could be applied to other datasets with minor adjustments. 
    In addition, we demonstrated that the performance of detection models decreased over time; we measured the change of websites in relation to our detection model; and we demonstrated the benefit of re-training in various scenarios.
  • Improving Existing Static and Dynamic Malware Detection Techniques with Instruction-level Behavior
    (2019) Kim, Danny; Barua, Rajeev; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    My Ph.D. focuses on detecting malware by leveraging information obtained at the instruction level. Instruction-level information is obtained by looking at the instructions or disassembly that make up an executable. My initial work focused on using a dynamic binary instrumentation (DBI) tool. A DBI tool enables the study of instruction-level behavior while the malware is executing, which I show to be valuable in detecting malware. To expand on my work with dynamic instruction-level information, I integrated it with machine learning to increase the scalability and robustness of my detection tool. To further increase the scalability of the dynamic detection of malware, I created a two-stage static-dynamic malware detection scheme aimed at achieving the accuracy of a fully dynamic detection scheme without the high computational resources and time required. Lastly, I show the improvement of static analysis-based detection of malware by automatically generating machine learning features based on opcode sequences with the help of convolutional neural networks. The first part of my research focused on obfuscated malware. Obfuscation is the process by which malware tries to hide itself from static analysis and trick disassemblers. I found that by using a DBI tool, I was able not only to detect obfuscation, but also to detect the differences in how it occurred in malware versus goodware. Through dynamic program-level analysis, I was able to detect specific obfuscations and use the varying ways in which they were used by programs to differentiate malware from goodware. I found that by using the mere presence of obfuscation as a method of detecting malware, I was able to detect previously undetected malware. I then focused on using my knowledge of dynamic program-level features to build a highly accurate machine learning-based malware detection tool.
    Machine learning is useful in malware detection because it can process a large amount of data to determine meaningful relationships that distinguish malware from benign programs. Through the integration of machine learning, I was able to expand my obfuscation detection schemes to address a broader class of malware, which ultimately led to a malware detection tool that can detect 98.45% of malware with a 1% false positive rate. Understanding the pitfalls of dynamic analysis of malware, I focused on creating a more efficient method of detecting malware. Malware detection comes in three forms: static analysis, dynamic analysis, and hybrids. Static analysis is fast and effective for detecting previously seen malware, whereas dynamic analysis can be more accurate and robust against zero-day or polymorphic malware, but at the cost of a high computational load. Most modern defenses use a hybrid approach, which combines static and dynamic analysis, but these hybrids are suboptimal. I created a two-phase malware detection tool that approaches the accuracy of the dynamic-only system with only a small fraction of its computational cost, while maintaining real-time detection timeliness similar to a static-only system, thus achieving the best of both approaches. Lastly, my Ph.D. focused on reducing the need for manual feature generation by utilizing Convolutional Neural Networks (CNNs) to automatically generate feature vectors from raw input data. My work shows that feeding raw opcode sequences from static disassembly to a CNN model can automatically produce feature vectors that are useful for detecting malware. Because this process is automated, it offers a scalable method of consistently producing useful malware detection features without human intervention or labor.
  • CYBERSECURITY FOR INTELLECTUAL PROPERTY: DEVELOPING PRACTICAL FINGERPRINTING TECHNIQUES FOR INTEGRATED CIRCUITRY
    (2015) Dunbar, Carson Joseph; Qu, Gang; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The system on a chip (SoC) paradigm for computing has become more prevalent in modern society. Because of this, reuse of different functional integrated circuits (ICs), with standardized inputs and outputs, makes designing SoC systems easier. As a result, the theft of intellectual property for different ICs has become a highly profitable business. One method of theft prevention is to add a signature, or fingerprint, to ICs so that they may be tracked after they are sold. The contribution of this dissertation is the creation and simulation of three new fingerprinting methods that can be implemented automatically during the design process. In addition, because manufacturing and design costs are significant, three of the fingerprinting methods presented attempt to alleviate costs by determining the fingerprint in the post-silicon stage of the VLSI design cycle. Our first two approaches to fingerprinting ICs use Observability Don’t Cares (ODCs) and Satisfiability Don’t Cares (SDCs), which are almost always present in ICs, to hide our fingerprint. ODCs cause an IC to ignore certain internal signals, which we can utilize to create fingerprints that have a minimal performance overhead. Using a heuristic approach, we are also able to choose the overhead the gate will have by removing some fingerprint locations. The experiments show that this work is effective and can provide a large number of fingerprints for more substantial circuits, with a minimal overhead. SDCs are similar to ODCs except that they focus on input patterns to gates that cannot exist. For this work, we found a way to quickly locate most of the SDCs in a circuit and, depending on the input patterns that we know will not occur, replace the gates to create a fingerprint with a minimal overhead. We also created two methods to implement this SDC fingerprinting method, each with its own advantages and disadvantages.
    Both the ODC and SDC fingerprinting methods can be implemented in the circuit design or physical design of the IC, and finalized in the post-silicon phase, thus reducing the cost of manufacturing several different circuits. The third method developed for this dissertation was based on our previous work on finite state machine (FSM) protection to generate a fingerprint. We show that we can edit ICs with incomplete FSMs by adding transitions from the set of don’t care transitions. Although the best candidates for this method are those with unused states and transitions, additional states can be added to the circuit to generate additional don’t care transitions and states, which are useful for generating more fingerprints. This method has the potential for an astronomical number of fingerprints, but the generated fingerprints need to be filtered for designs that have an acceptable design overhead in comparison to the original circuit. Our fourth and final method for IC fingerprinting utilizes scan chains, which help to monitor the internal state of a sequential circuit. By modifying the interconnects between flip-flops in a scan chain, we can create unique fingerprints that are easy for the user to detect. These modifications are done after the design for test and during the fabrication stage, which helps reduce redesign overhead. These changes can also be finalized in the post-silicon stage, similar to the work for the ODC and SDC fingerprinting, to minimize manufacturing costs. The hope with this dissertation is to demonstrate that these methods for generating fingerprints for ICs will improve upon the current state of the art. First, these methods will create a significant number of unique fingerprints. Second, they will create fingerprints that have an acceptable overhead, are easy for the developer to detect, and are harder for the adversary to detect or remove.
    Finally, we show that three of the methods will reduce the cost of manufacturing because they can be implemented in the later stages of the design cycle.
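
    The Observability Don't Care idea described in the abstract above can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the dissertation's algorithm or tooling: the two-gate circuit F = AND(G1, y) with G1 = AND(a, b), and all function names, are hypothetical. Whenever y = 0 the value of G1 cannot be observed at F, so G1 can be altered under that condition without changing the circuit's function.

```python
from itertools import product


def original(a, b, y):
    """Hypothetical circuit: internal gate G1 feeds an output AND gate."""
    g1 = a & b       # internal gate G1
    return g1 & y    # G1 is unobservable at the output whenever y == 0


def fingerprinted(a, b, y):
    """Same circuit with G1 modified by wiring y in as an extra input."""
    g1 = a & b & y   # differs from the original G1 only when y == 0
    return g1 & y


def equivalent(f, g, n_inputs):
    """Exhaustively compare two Boolean functions over all input patterns."""
    return all(
        f(*bits) == g(*bits) for bits in product((0, 1), repeat=n_inputs)
    )


def internal_gate_differs():
    """Check that G1 itself changed for at least one input pattern."""
    return any(
        (a & b) != (a & b & y) for a, b, y in product((0, 1), repeat=3)
    )


if __name__ == "__main__":
    print("outputs identical:", equivalent(original, fingerprinted, 3))
    print("internal gate differs:", internal_gate_differs())
```

    In this sketch, applying or not applying the modification at one such location encodes one bit. With n independent locations of this kind, up to 2^n distinct copies would be possible, which is consistent with the abstract's remark that dropping some locations trades fingerprint count for lower overhead.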