Thursday, September 3, 2009

Understanding BGP Misconfiguration

Ratul Mahajan, David Wetherall, Tom Anderson, "Understanding BGP Misconfiguration," ACM SIGCOMM 2002
The paper's motivation is that accidental BGP misconfigurations have the potential to disrupt connectivity across the whole Internet, which makes it important to delve deeper into their patterns and better understand both their frequency and their causes. To that end, the authors set out to answer the following questions:
  • Frequency of misconfigurations?
  • Their impact on global connectivity?
  • Their causes?
  • What are the solutions to reduce their frequency and impact?
The study spanned 21 days and analyzed the entire stream of BGP updates taken from 23 different vantage points across the Internet. To validate the results, the authors subsequently surveyed the ISP operators involved in each incident. What makes the problem interesting is that there is no trivial way to detect these misconfigurations. The paper therefore studies only a subset of them, and so describes its results as a 'lower bound' on the real picture. The methodology covers two broad classes of globally visible faults:
  • Origin Misconfiguration: The unintentional insertion of a route into the global BGP tables. The paper further divides these into self-deaggregation, related-origin and foreign-origin misconfigurations.
  • Export Misconfiguration: The inadvertent export of a route to a BGP peer in violation of the exporter's policy. Such exports typically break the valley-free property of AS paths.
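To make the valley-free property concrete, here is a minimal Python sketch of the check. It is not the paper's code. The relationship table `rel`, the labels `CUSTOMER`, `PEER` and `PROVIDER`, and the function name `is_valley_free` are all hypothetical, and the path is assumed to be listed in the direction the announcement travelled (origin first, the reverse of how an AS_PATH attribute is written). A path is valley-free if it goes uphill (customer to provider) zero or more times, crosses at most one peer link, and then only goes downhill (provider to customer).

```python
from typing import Dict, List, Tuple

# Hypothetical labels: rel[(a, b)] says what AS b is to AS a.
CUSTOMER, PEER, PROVIDER = "customer", "peer", "provider"


def is_valley_free(path: List[int], rel: Dict[Tuple[int, int], str]) -> bool:
    """Return True if the path never goes uphill or sideways after
    it has crossed a peer link or started going downhill.

    `path` is ordered in the direction the route was propagated
    (origin AS first). Raises KeyError if a relationship is unknown.
    """
    past_the_top = False  # set once we cross a peer link or go downhill
    for a, b in zip(path, path[1:]):
        relationship = rel[(a, b)]
        if relationship == PROVIDER:
            # a exported the route to its provider b (uphill step).
            if past_the_top:
                return False
        elif relationship == PEER:
            # At most one peer link, and only at the top of the path.
            if past_the_top:
                return False
            past_the_top = True
        else:
            # a exported the route to its customer b (downhill step).
            past_the_top = True
    return True


# Hypothetical topology: AS 30 is a customer of both AS 20 and AS 40.
rel = {
    (10, 20): PROVIDER,  # 20 is 10's provider
    (20, 30): CUSTOMER,  # 30 is 20's customer
    (30, 40): PROVIDER,  # 40 is 30's provider
}
ok_path = [10, 20, 30]        # up, then down
leaked_path = [10, 20, 30, 40]  # 30 re-exports a provider route to another provider
```

In this toy topology, `leaked_path` is the kind of path an export misconfiguration would produce: AS 30 learns the route from one provider and passes it on to another.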
The methodology for identifying misconfigurations was to treat short-lived routes (lasting less than a day) as potential misconfigurations and then confirm with the AS operators whether each one really was a misconfiguration. The argument is that a short-lived new route could be an origin misconfiguration, while a short-lived AS path that violates policy could be an export misconfiguration. The AS relationships needed for the second check were inferred using Gao's algorithm. The paper reports that misconfigurations were detected quite frequently, and that the majority of origin misconfigurations occurred due to faulty redistribution or initialization bugs. Prefix-based configuration, on the other hand, was responsible for 22% of the export misconfigurations.
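The short-lived-route heuristic can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the authors' tooling: the event tuple layout, the `"A"`/`"W"` markers and the function name `short_lived_origins` are hypothetical, and each withdrawal is assumed to be already matched to the origin AS it withdraws (real BGP withdrawals carry only the prefix).

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple

# Hypothetical event layout: (time, prefix, origin AS, "A" or "W"),
# where "A" is an announcement and "W" a withdrawal.
Event = Tuple[datetime, str, int, str]


def short_lived_origins(
    events: Iterable[Event],
    threshold: timedelta = timedelta(days=1),
) -> List[Tuple[str, int, timedelta]]:
    """Return (prefix, origin, lifetime) for every prefix/origin pair
    that was announced and then withdrawn within `threshold`."""
    first_seen: Dict[Tuple[str, int], datetime] = {}
    candidates: List[Tuple[str, int, timedelta]] = []
    for when, prefix, origin, kind in sorted(events):
        key = (prefix, origin)
        if kind == "A":
            # Remember only the first announcement of this pair.
            first_seen.setdefault(key, when)
        elif kind == "W" and key in first_seen:
            lifetime = when - first_seen.pop(key)
            if lifetime < threshold:
                candidates.append((prefix, origin, lifetime))
    return candidates


t0 = datetime(2002, 1, 1, 12, 0)
events = [
    (t0, "192.0.2.0/24", 64500, "A"),
    (t0 + timedelta(hours=2), "192.0.2.0/24", 64500, "W"),   # gone after 2 hours
    (t0, "198.51.100.0/24", 64501, "A"),
    (t0 + timedelta(days=3), "198.51.100.0/24", 64501, "W"),  # lasted 3 days
]
flagged = short_lived_origins(events)
```

Anything this returns is only a candidate. As in the paper, confirming that a short-lived route really was a misconfiguration still requires asking the operators.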

Finally, the paper proposes a variety of fixes, such as configuration checkers, automatic verification and better user interfaces. Overall, I liked the paper's approach because it tackles the problem at a real, hands-on level. Misconfigurations are clearly commonplace, and although they disrupt connectivity only in certain scenarios, they do have a considerable impact on routing load. I would really like us to discuss what steps could be taken towards developing automatic verification techniques that minimize these misconfigurations, and how the related design decisions would change in the case of S-BGP.

1 comment:

  1. I agree the key question is automated verification. I did some work on this with one of my students, Lakshmi@NYU: L. Subramanian, V. Roth, I. Stoica, R. H. Katz, S. Shenker, “Listen and Whisper: Security Mechanisms for BGP,” USENIX/ACM Symposium on Networked Systems Design and Implementation (NSDI’04), San Francisco, CA, (March 2004). Unfortunately, no real practical effect.
