|dc.description.abstract||Malware still is a vital security threat. Adversaries continue to distribute various types of malicious programs to victims around the world. In this study, we try to understand the strategies the miscreants take to distribute malware, develop systems to detect malware delivery and explore the benefit of a transparent platform for blocking malware distribution in advance.
At the first part of the study, to understand the malware distribution, we conduct several measurements. We initiate the study by investigating the dynamics of malware delivery. We share several findings including the downloaders responsible for the malware delivery and the high ratio of signed malicious downloaders. We further look into the problem of signed malware. To successfully distribute malware, the attacker exploits weaknesses in the code-signing PKI, which falls into three categories: inadequate client-side protections, publisher-side key mismanagement, and CA-side verification failures. We propose an algorithm to identify malware that exploits those weaknesses and to classify to the corresponding weakness. Using the algorithm, We conduct a systematic study of the weaknesses of code-signing PKI on a large scale. Then, we move to the problem of revocation. Certificate revocation is the primary defense against the abuse in code-signing PKI. We identify the effective revocation process, which includes the discovery of compromised certificates, the revocation date setting, and the dissemination of revocation information; moreover, we systematically measure the problems in the revocation process and new threats introduced by these problems.
For the next part, we explore two different approaches to detect the malware distribution. We study the executable files known as downloader Trojans or droppers, which are the core of the malware delivery techniques. The malware delivery networks instruct these downloaders across the Internet to access a set of DNS domain address to retrieve payloads. We first focus on the downloaded by relationship between a downloader and a payload recorded by different sensors and introduce the downloader graph abstraction. The downloader graph captures the download activities across end hosts and exposes large parts of the malware download activity, which may otherwise remain undetected, by connecting the dots. By combining telemetry from anti-virus and intrusion-prevention systems, we perform a large-scale analysis on 19 million downloader graphs from 5 million real hosts. The analysis revealed several strong indicators of malicious activity, such as the slow growth rate and the high diameter. Moreover, we observed that, besides the local indicators, taking into account the global properties boost the performance in distinguishing between malicious and benign download activity. For example, the file prevalence (i.e., the number of hosts a file appears on) and download patterns (e.g., number of files downloaded per domain) are different from malicious to benign download activities. Next, we target the silent delivery campaigns, which is the critical method for quickly delivering malware or potentially unwanted programs (PUPs) to a large number of hosts at scale. Such large-scale attacks require coordination activities among multiple hosts involved in malicious activity. We developed Beewolf, a system for detecting silent delivery campaigns from Internet-wide records of download events. We exploit the behavior of downloaders involved in campaigns for this system: they operate in lockstep to retrieve payloads. We utilize Beewolf to identify these locksteps in an unsupervised and deterministic fashion at scale. Moreover, the lockstep detection exposes the indirect relationships among the downloaders. We investigate the indirect relationships and present novel findings such as the overlap between the malware and PUP ecosystem.
The two different studies revealed the problems caused by the opaque software distribution ecosystem and the importance of the global properties in detecting malware distribution. To address both of these findings, we propose a transparent platform for software distribution called Download Transparency. Transparency guarantees openness and accountability of the data, however, itself does not provide any security guarantees. Although there exists an anecdotal example showing the benefit of transparency, it is still not clear how beneficial it is to security. In the last part of this work, we explore the benefit of transparency in the domain of downloads. To measure the performance, we designed the participants and the policies they might take when utilizing the platform. We then simulate different policies with five years of download events and measure the block performance. The results suggest that the Download Transparency can help to block a significant part of the malware distribution before the community can flag it as malicious.||en_US