Measuring and Mitigating Potential Risks of Third-party Resource Inclusions

Indela, Soumya

Measuring and Mitigating Potential Risks of Third-party Resource Inclusions

Files

Indela_umd_0117E_21931.pdf (3.12 MB)

No. of downloads: 249

Date

2021

Authors

Indela, Soumya

Advisor

Levin, Dave

DRUM DOI

https://doi.org/10.13016/zniq-0oud

Abstract

In today's computer services, developers commonly use third-party resources like libraries, hosting infrastructure and advertisements. Using third-party components improves the efficiency and enhances the quality of developing custom applications. However, while using third-party resources adopts their benefits, it adopts their vulnerabilities, as well. Unfortunately, developers are uninformed about the risks, as a result of which, the services are susceptible to various attacks. There has been a lot of work on how to develop first-hand secure services. The key focus in my thesis is quantifying the risks in the inclusion of third-party resources and looking into possible ways of mitigating them. Based on the fundamental ways that risks arise, we broadly classify them into Direct and Indirect Risks. Direct risk is the risk that comes with invoking the third-party resource incorrectly—even if the third party is otherwise trustworthy whereas indirect risk is the risk that comes with the third-party resource potentially acting in an untrustworthy manner—even if it were invoked correctly.

To understand the security related direct risks in third-party inclusions, we study cryptographic frameworks. Developers often use these frameworks incorrectly and introduce security vulnerabilities. This is because current cryptographic frameworks erode abstraction boundaries, as they do not encapsulate all the framework-specific knowledge and expect developers to understand security attacks and defenses. Starting from the documented misuse cases of cryptographic APIs, we infer five developer needs and we show that a good API design would address these needs only partially. Building on this observation, we propose APIs that are semantically meaningful for developers. We show how these interfaces can be implemented consistently on top of existing frameworks using novel and known design patterns, and we propose build management hooks for isolating security workarounds needed during the development and test phases.

To understand the performance related direct risks in third-party inclusions, we study resource hints in webpage HTML. Today's websites involve loading a large number of resources, resulting in a considerable amount of time issuing DNS requests, requesting resources, and waiting for responses. As an optimization for these time sinks, websites may load resource hints, such as DNS prefetch, preconnect, preload, pre-render, and prefetch tags in their HTML files to cause clients to initiate DNS queries and resource fetches early in their web-page downloads before encountering the precise resource to download. We explore whether websites are making effective use of resource hints using techniques based on the tool we developed to obtain a complete snapshot of a webpage at a given point in time. We find that many popular websites are highly ineffective in their use of resource hints, causing clients to query and connect to extraneous domains, download unnecessary data, and may even use resource hints to bypass ad blockers.

To evaluate the indirect risks, we study the web topology. Users who visit benign, popular websites are unfortunately bombarded with malicious popups, malware- loading sites, and phishing sites. The questions we want to address here are: Which domains are responsible for such malicious activity? At what point in the process of loading a popular, trusted website does the trust break down to loading dangerous content? To answer these questions, we first understand what third-party resources websites load (both directly and indirectly). I present a tool that constructs the most complete map of a website’s resource-level topology to date. This is surprisingly nontrivial; most prior work used only a single run of a single tool (e.g., Puppeteer or Selenium), but I show that this misses a significant fraction of resources. I then apply my tool to collect the resource topology graphs of 20,000 websites from the Alexa ranking, and analyze them to understand which third-party resource inclusions lead to malicious resources. I believe that these third-party inclusions are not always constant or blocked by existing Ad-blockers. We argue that greater accountability of these third parties can lead to a safer web.

URI (handle)

http://hdl.handle.net/1903/27947

Collections

UMD Theses and Dissertations
Electrical & Computer Engineering Theses and Dissertations

Full item page