DIAGNOSING AND IMPROVING THE PERFORMANCE OF INTERNET ANYCAST


Files

Li_umd_0117E_20036.pdf (1.6 MB)
No. of downloads: 207

Publication or External Link

Date

2019

Citation

Abstract

IP anycast is widely used in Internet infrastructure, including many of the root and top-level DNS servers, major open DNS resolvers, and content delivery networks (CDNs). Because anycast is increasingly popular among DNS resolvers, it is involved in most activities of Internet users. As a result, the performance of anycast deployments is critical to all Internet users.

What makes IP anycast such an attractive option for these globally replicated services is the set of desirable properties that anycast appears to offer: reduced overall access latency for clients, improved scalability by distributing traffic across servers, and enhanced resilience to DDoS attacks. These properties, however, are not guaranteed. In anycast, a packet is directed to a particular anycast site through inter-domain routing, which can fail to pick the route with better performance in terms of latency or load balance. Prior work has studied anycast deployments and painted a mixed picture of anycast performance: many anycast clients are not served by their nearby anycast servers and experience large latency overheads; anycast sometimes does not balance load across sites effectively; and the catchment of an anycast site is mostly stable, but it is very sensitive to routing changes.

Although it was observed over a decade ago that anycast deployments can be inefficient, there are surprisingly few explanations of the causes and few proposed solutions. In addition, most prior work evaluated only one or a few deployments using measurement snapshots. I extend previous studies with large-scale, longitudinal measurements of distinct anycast deployments, which provide more complete insight into performance bottlenecks and potential improvements. More importantly, I develop novel measurement techniques to identify the major causes of inefficiency in anycast, and I propose a fix. In this dissertation, I defend the following thesis: Performance-unawareness of BGP routing leads to larger path inflation in anycast than in unicast; and with current topology and protocol support, a policy that selects routes based on geographic information could significantly reduce anycast inflation.

In the first part of the dissertation, I use longitudinal measurements collected from a large Internet measurement platform towards distinct anycast deployments to quantitatively demonstrate the performance inefficiency of anycast. I measure most root DNS servers, popular open DNS resolvers, and one of the major CDNs. With passive and active measurements across multiple years, I illustrate that anycast performs poorly for most of the deployments that I measured: anycast is neither effective at directing queries to nearby sites, nor does it distribute traffic in a balanced manner. Furthermore, this longitudinal study over distinct anycast deployments shows that performance has little correlation with the number of sites.
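As a rough illustration of the kind of metric such measurements feed into, the sketch below computes a per-client latency inflation and the share of clients above a threshold. The definition used here (anycast RTT minus the lowest RTT to any individual site) and the names `latency_inflation_ms` and `fraction_inflated` are assumptions made for illustration; the abstract does not specify the exact metric used in the dissertation.

```python
from typing import Dict, Iterable


def latency_inflation_ms(anycast_rtt_ms: float,
                         per_site_rtts_ms: Dict[str, float]) -> float:
    """Extra latency a client sees versus its lowest-latency site.

    Assumed definition: RTT to the anycast address minus the smallest
    RTT measured to any individual site, floored at zero.
    """
    if not per_site_rtts_ms:
        raise ValueError("need at least one per-site RTT")
    best_rtt_ms = min(per_site_rtts_ms.values())
    return max(0.0, anycast_rtt_ms - best_rtt_ms)


def fraction_inflated(inflations_ms: Iterable[float],
                      threshold_ms: float = 50.0) -> float:
    """Fraction of clients whose inflation is at least threshold_ms."""
    values = list(inflations_ms)
    if not values:
        raise ValueError("need at least one client")
    return sum(1 for v in values if v >= threshold_ms) / len(values)


# Hypothetical client: anycast sends it to a far site (120 ms) even
# though a site reachable in 20 ms exists.
example = latency_inflation_ms(120.0, {"ams": 20.0, "nrt": 180.0})
```

With the hypothetical numbers above, the client's inflation is 100 ms, since the closest site is 20 ms away but anycast delivers it to one 120 ms away.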

In the second part of the dissertation, I focus on identifying the root causes of the performance deficits in anycast. I develop novel measurement techniques to compare AS-level routes from a client to multiple anycast sites. These techniques allow me to reaffirm that the major cause of the inefficiency in anycast is the performance-unawareness of inter-domain routing. With measurements from two anycast deployments, I illustrate how much of the latency inflation among clients can be attributed to the policy-based, performance-unaware decisions made by BGP routing. In addition, I design BGP control plane experiments to directly reveal the relative preference among routes, and how much such preference affects anycast performance. The newly discovered relative preferences can help researchers improve state-of-the-art models of inter-domain routing.

In the last part of the dissertation, I describe an incrementally deployable fix to the inefficiency of IP anycast. Prior work has proposed a particular deployment scheme for anycast to improve its performance: anycast servers should be deployed such that they all share the same upstream provider. However, this solution would require renegotiating service for deployments that are not already arranged this way. Moreover, putting the entire anycast service behind a single upstream provider introduces a single point of failure. In the last chapter, I show that a static hint with embedded geographic information in BGP announcements fixes most of the inefficiency in anycast. I evaluate the improvements from such static hints in BGP route selection mechanisms through simulation with real network traces. The simulation results show that the fix is promising: in the anycast deployments I evaluated, the fix reduces latency inflation for almost all clients, and reduces latency by 50 ms for 23% to 33% of the clients. I further conduct control plane experiments to evaluate the effectiveness of the static hints in BGP announcements with real-world anycast deployments.
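The idea of a geographic hint can be sketched as follows. This is only an illustration under stated assumptions, not the mechanism defined in the dissertation: the `Route` type, its `site_location` field, and the rule of preferring the geographically closest announced site among routes the router already considers acceptable are all hypothetical, and the abstract does not say how the hint is encoded or where it sits in the BGP decision process.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees


@dataclass(frozen=True)
class Route:
    """Hypothetical candidate route to the anycast prefix."""
    next_hop: str
    as_path_len: int
    site_location: Optional[LatLon]  # assumed geographic hint, may be absent


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def select_route(routes: List[Route], router_location: LatLon) -> Route:
    """Prefer the route whose hinted site is closest to this router.

    Routes without a hint sort last; AS-path length and next hop are
    used only as tie-breakers (an assumption of this sketch).
    """
    if not routes:
        raise ValueError("no candidate routes")

    def sort_key(r: Route):
        if r.site_location is None:
            distance = math.inf
        else:
            distance = haversine_km(router_location, r.site_location)
        return (distance, r.as_path_len, r.next_hop)

    return min(routes, key=sort_key)


# Hypothetical router near Washington, DC with two candidate routes.
candidates = [
    Route("peer-a", as_path_len=2, site_location=(52.37, 4.90)),    # Amsterdam
    Route("peer-b", as_path_len=3, site_location=(39.04, -77.49)),  # Ashburn
]
chosen = select_route(candidates, router_location=(38.99, -76.94))
```

In this example a selection based on shortest AS path alone would pick the Amsterdam route, while the hint-based rule picks the nearby Ashburn route via `peer-b`.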

This dissertation provides a broad, longitudinal performance evaluation of distinct anycast deployments for different services, and identifies a weakness of BGP routing that is particularly amplified in anycast: route selection is based on policies and is unaware of performance. While the model of BGP routing is applied to diagnose anycast, anycast itself serves as a magnifying glass that reveals new insights into the route selection process of BGP in general. This work can help refine the model of the route selection process, which can be applied to various BGP-related studies. Finally, this dissertation provides suggestions to the community on improving anycast performance, which in turn improves performance and reliability for much critical Internet infrastructure and ultimately benefits Internet users worldwide.

Notes

Rights