Theses and Dissertations from UMD

Permanent URI for this community: http://hdl.handle.net/1903/2

New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4-month delay in the appearance of a given thesis/dissertation in DRUM.

More information is available at Theses and Dissertations at University of Maryland Libraries.

  • A COMPREHENSIVE EVALUATION OF FEATURE-BASED MALICIOUS WEBSITE DETECTION
    (2020) McGahagan, John Francis; Cukier, Michel; Reliability Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Although the internet enables many important functions of modern life, it is also a ground for nefarious activity by malicious actors and cybercriminals. For example, malicious websites facilitate phishing attacks, malware infections, data theft, and disruption. A major component of cybersecurity is to detect and mitigate attacks enabled by malicious websites. Although prior researchers have presented promising results – specifically in the use of website features to detect malicious websites – malicious website detection continues to pose major challenges. This dissertation presents an investigation into feature-based malicious website detection. We conducted six studies on malicious website detection, with a focus on discovering new features for malicious website detection, challenging assumptions of features from prior research, comparing the importance of the features for malicious website detection, building and evaluating detection models over various scenarios, and evaluating malicious website detection models across different datasets and over time. We evaluated this approach on various datasets, including: a dataset composed of several threats from industry; a dataset derived from the Alexa top one million domains and supplemented with open source threat intelligence information; and a dataset consisting of websites gathered repeatedly over time. Results led us to postulate that new, unstudied features could be incorporated to improve malicious website detection models, since, in many cases, models built with new features outperformed models built from features used in prior research and did so with fewer features. We also found that features discovered using feature selection could be applied to other datasets with minor adjustments. In addition, we demonstrated that the performance of detection models decreased over time; we measured the change of websites in relation to our detection model; and we demonstrated the benefit of re-training in various scenarios.