Notes on Border Gateway Protocol (BGP)
Peter Dordal, Loyola University Computer Science Department, ~2006
BGP is used for communication of routing information between different
Autonomous Systems, where agreement on internal metrics is impractical
(or politically impossible). BGP thus does not
send information about the relative costs of various paths. In intra-AS
routing protocols such as RIP (a distance-vector implementation), the
fact that shortest routes are found guarantees that the routes are
loop-free, as a route with a loop cannot be shortest (in fact, this is
probably a more important reason for finding shortest routes than
saving on network link usage). BGP receivers must thus have some other
way for choosing loop-free routes; BGP achieves this by sending full AS-path information for each route (below).
BGP's predecessor was EGP, which guaranteed loop-free routes by
allowing only a single route to any AS, thus forcing the internet into
a tree topology, at least at the level of ASs. The AS graph could
contain no cycles or alternative routes, and hence there could be no
redundancy provided by alternative paths. EGP also thus avoided having
to make decisions as to the preferred path; there was never more than
one choice. EGP was sometimes described as a reachability protocol; its only concern was whether a given network was reachable.
Like with EGP, BGP is primarily concerned with reachability, and we
ignore any internal distance metrics. Such internal metrics include
- hop count
- weighted hop count
- preference values
- propagation-delay time
- bandwidth
There is no universal metric!
BGP, however, does support an arbitrary AS-graph (topology of ASs). To avoid loops, BGP tags each destination network with the full AS-path
needed to reach that network; as information about a destination
network is propagated through the Internet, the AS-path gets longer.
However, real AS-paths are seldom longer than about five, so the space
consumed by the path information is manageable.
BGP is also intended to support policy routing;
that is, routing based on managerial or administrative concerns in
addition to technical ones. Recall that now, with BGP, there can be
multiple routes to a destination. To choose the one route that we
will use, we combine a mixture of optimization choices and policy choices. Some examples of policy choices might be:
- one route might involve a competitor we wish to avoid
- one route might involve an AS that doesn't provide transit.
Here is a brief summary of the policy-routing strategy, from [VGE]:
Each domain receives one or more routes
from each of its neighbors. A route indicates its sender's reachability
to an address prefix [a network or aggregate of networks]. Each route
also contains one or more path attributes. For each received loop-free route to a given address prefix, the domain first computes an integer preference,
then selects the route with the highest preference. Route preference
assignment reflects domains' policies. The preference function takes as
input a route's path attributes [which may be set by neighbors, but
which may also be ignored or discounted]. However, domains can independently choose their preference functions. [Each domain then exports some, but not necessarily all, of its chosen routes to its neighbors.]
[The subject of the [VGE] paper from which this quote was taken is a
demonstration that independently selected preference functions may,
under some circumstances, cause a failure of convergence.]
It is helpful to distinguish between two kinds of traffic, as seen from a given AS. Local traffic is traffic that either originates or terminates at that AS. Other traffic is transit traffic; the AS is forwarding it along on behalf of some nonlocal party.
Note that if you are within your ISP's AS, then your ISP is handling
your traffic as local traffic, but if you have your own AS, then your
ISP is handling your traffic as transit traffic. Whether you choose to
set up your own AS or not at this level is largely an administrative
matter.
Note also that there may be a chain of customer-provider relationships.
AS1 may be a customer of an ISP that corresponds to AS2, which in turn
may be a customer of a regional provider AS3.
Let AS1 be an "end-user" AS, that is, a typical organization large enough to have its own AS number but in the business of using
the Internet, not of providing services to others. AS1 may be one of
two types. If AS1 is connected to exactly one provider AS, say AS2,
then AS1 is sometimes called a stub AS. If, however, AS1 has connections to two or more providers, eg AS2 and AS3,
___AS2....
/ rest of
Internet
AS1----AS3....
then AS1 is called multi-homed.
The normal understanding of a multi-homed end-user AS is that the
second connection is there solely as a backup; under no circumstances
will the AS1 provide transit services to anybody. That is, if AS3 loses
its connection to the rest of the Internet, traffic could still reach AS3 by the path ->AS2->AS1->AS3, but AS1 would likely decline to provide that transit service.
HOWEVER, that traditional multi-homed arrangement is often replaced by
a reciprocal agreement to provide transit services. This is not
commonly done when AS1 has connections to two separate providers, AS2 and AS3 above, but suppose AS2 is simply another customer:
AS1-------AS3....
|
rest of Internet
AS2-------AS4....
AS1 and AS2 are both ordinary end-user organizations. The link between them could be used just for private data exchange, but it might also be their common backup link. In the latter case, it would be used for transit traffic: if the AS2--AS4 link broke, then AS1 might agree to carry transit traffic from AS3 to AS2.
AS-paths
When an AS advertises reachability to a given dest-net, it includes a
list of all ASs that must be traversed to reach that dest-net. This
list is called the AS-path; this is done solely to provide for loop detection.
If AS1 receives a reachability announcement <net1, AS-path1>,
then it forwards this to its neighbors after prepending itself to the
AS-path: <net1, AS1^AS-path1>. If AS1 receives a report <net1,
AS-path1, and AS1 itself appears within AS-path1, then AS1 does not use that path,
as it contains a loop. Actually, if AS-path1 were <AS2, AS3, AS1,
AS4>, then AS1 would presumably already know that it could reach
net1 via <AS1, AS4>. However, it is possible that the AS1--AS4
link had broken, and we are in what would amount to slow convergence if
a vector-distance protocol were in use. Having the full AS-path allows
detection of loops without the risk of slow convergence.
Information about <dest-net, AS-path> is sent as updates only;
there is no regular broadcast interval for these announcements.
It is possible for a route to have a routing loop even if the AS-path
has no duplicates in it; the routing loop can be internal to one AS.
Similarly, it is possible for a loopless route, or even the best
route, to traverse the same AS more than once; BGP would not support
this kind of route. Neither of these situations are important in
practice
AS-paths potentially add a lot to the size of the AS database. The
number of paths a site must keep track of is proportional to the number
of ASs, because there will be one AS-path to each AS. (Actually, an AS
may have to record many times that many AS-paths, as an AS will hear of
AS-paths that it elects not to use.) Typically there are several
thousand ASs in the world. Let A = number of ASs, N = number of
networks. Typically the average length of an AS-path is about log A,
although this depends on connectivity. If ASs are arranged on a square
grid and are directy connected only to their neighbors in the grid,
then the average length of an AS-path is about sqrt(A).
Assuming the first case, the amount of memory required by BGP is
C*A*log A + K*N,
where C and K are constants.
Every site has one or more BGP speakers.
These must remain coordinated with one another, by internal network
connections, so as to present a consistent view of the site's
connections and advertisements. The bgp speakers are often not the busy border routers. The bgp speakers must maintain a database of all paths received, not just of the paths actually used. However, the speakers advertise only the paths they use themselves.
"Route reflectors" are sometimes used to help with maintaining consistency among all the BGP speakers.
BGP uses TCP for connections. Some consequences of using TCP:
- vulnerability to TCP injection attacks
- no "K-in-N" testing (that is, send N packets; see if at least K get through)
- polling is hard to do in TCP
- route updates are subject to TCP congestion[!]
- regular polling is not needed, due to TCP reliability
AS-path v route aggregation
There is some conflict between the goal of reporting precise AS-paths
to each destination, and of consolidating as many address prefixes as
possible into a single prefix (single CIDR block). Consider:
Z----T--X X,Y,Z,T are ASs
\
Y
Suppose T has paths
- path=<T>,net 200.0.0/23
- path=<T,X>,net 200.0.2/24
- path=<T,Z>,net 200.0.3/24
If T wants to optimize address-block aggregation using CIDR, it will
prefer to aggregate the destinations into the single block 200.0.0/22;
T then has two options in how it reports to Z:
option 1: report 200.0.0/22
with path <T>. But then we ignore X and Z! These are legitimately
part of the AS-paths to some of the destinations; loop detection may
now fail.
option 2: report 200.0.0/22
with path <T,X,Z>, not a real path. This ensures that the
loop-detection algorithm works, but artificially
inflates the length of the AS-path, which is used for certain tie-breaking decisions.
Solution: instead of either option above, we divide the AS-path into AS-sequence and AS-set; all AS-paths included must begin with the sequence and may continue with any members of the set.
Sequence=<T>,set={X,Y},net 200.0.0/22
Filters
There are three levels of "filtering". First, an AS decides which of
the routes that it receives from its neighbors it will actually
consider. This is called import filtering. Then, the AS chooses the most-preferred path from among all the received routes that survive import filtering; this is called best-path selection. Finally, the AS decides which of the routes it actually uses are to be exported to its neighbors; this is called export filtering.
There are some examples below.
BGP Path attributes
These are specified by flags. Attributes can be transitive (carried
along the AS-path to other ASs) or local (internal); minbandwidth is an
example of a transitive attribute. Other flags are used to indicate
required/optional and partial/complete.
NEXT_HOP
This is used as a per-destination attribute when BGP speaker A wants to
let the other side D know when to use B and when to use C as next-hop
router to destinations. Note that this is low-level routing
information.
AS border
A B C | D
| | | | |
+-----+-----+-----+-----+
LOCAL_PREF
Used internally to decide between various exit points. As a given BGP
speaker receives a path and shares it internally, it can tag it with a
LOCAL_PREF value. This allows AS1 to prefer one connection to AS2 over
another, all else being equal. For example, a provider might assign
higher LOCAL_PREF values for routes it gets from its customers, versus
routes it learns from other ISPs. If a customer is multihomed, for
example, then it connects to multiple providers. Each provider must
assign an appropriate LOCAL_PREF value to its direct route to the
customer, versus the route that traverses the other provider.
In the following diagram, where AS1 and AS2 are customers and AS3, AS4 are providers,
AS1-------AS3....
|
rest of Internet
AS2-------AS4....
if the AS1--AS2 link is to be exclusively for backup, then AS3 needs to
adjust LOCAL_PREF values to prefer reaching AS2 via AS4, and AS4 must
do the same for AS1. Otherwise the "backup" link would be used all the
time.
MULTI_EXIT_DISC
This is low-ranking (in the best-path-selection process) information
provided by AS1 to AS2, provided so AS2 can choose which of several
gateways to AS1 to be used to reach AS1's net N. The alternative is to
use only the closest gateway to AS1. The MED value is of great
importance in minimizing total path length, or in moving traffic from
being carried primarily by the destination's provider to being carried
by the sender's provider.
As an example, consider
A
|
AS1: R1-----R2---+-R3
| | |
| | |
AS2: Q1-+---Q2-----Q3
|
B
In the absense of the MED, AS1 will send traffic from A to B via
the R3--Q3 link, and AS2 will return the traffic via Q1--R1. However,
with the MED, the R2--Q2 link might be use. Note the importance of fact
that AS2 is allowed to ignore
the MED; use of it may shift costs from AS1 to AS2! MED must not force
peers to absorb traffic, or allow them to evade their traffic
responsibilities.
COMMUNITY
This is simply a tag to attach to routes. Routes can have multiple tags
corresponding to membership in multiple communities. Some communities
are defined globally; for example, NO_EXPORT and NO_ADVERTISE. Other
communities are relevant only to a given AS; the first two bytes of
such community values are the AS-number of the AS that is defining the
community.
The importance of communities is that they are transitive;
that is, they are passed along as routes are exported from one AS to
the next. The receiving AS is not obligated to honor the community
memberships, of course, but when it does so, this means that, for
example, customers of an AS can place themselves into routing
communities that the provider AS makes use of. This allows customers to
"configure themselves"; if they want to make changes in how their
routes are managed, they don't have to involve their provider in making
the entries.
A customer would have to find out from the provider what the provider's
AS number is, and what communities the provider defines, and what their
numeric codes are. This is likely to be available, say, at the
provider's website. At that point the customer can place itself into
the provider's community at will.
Here are some of the community values supported by digex, from support.digex.net/cst/policies/externalbgp.html.
The AS number is 2548, and the 4-byte community values are all of the
form 2548:N, where N is a 16-bit number. ALGX.com is used here as the
name of the provider (=AS2548), due to mergers.
2548:90 set local_pref used by AS2548 to 90
2548:100 set local_pref to 100, the default
2548:105 set local pref to 105
2548:110 set to 110
2548:990 Routes do not leave
ALGX.COM's backbone. This is the same effect as the well-known
community "no-export." Sending 2548:990 is the preferred method.
2548:991 Routes do not leave
ALGX.COM's customers. ALGX.COM will not announce the routes to other
providers, but will to its own customers.
Filtering Examples
Example of best-path selection
Do we use MED value or not? A neighbor advertises in their MED value
which of multiple routes into their network THEY would prefer we use to
reach a given destination. Using the MED value means that we choose a
lower-cost path for the neighbor, at the cost of a higher-cost path for
ourselves. We might do this if our neighbor is paying us, or if we have
a higher-reliability network and want traffic to remain in our network
for as long as possible (a possible selling point with our customers).
Example of export filtering
The classic example is setting up a private link (AS1--AS2 in the
diagram below) that is to be used only for non-transit traffic (traffic
that starts or ends at AS1 or AS2). In that case, AS1 does not
advertise the link to AS3 and AS2 does not advertise to AS4.
Catch: Can AS1 use the link to reach AS4? Return traffic would necessarily go via AS3.
In the following diagram, where AS1 and AS2 are customers and AS3, AS4 are providers,
AS1-------AS3....
|
rest of Internet
AS2-------AS4....
if the AS1--AS2 link is to be exclusively for backup, then AS3 needs to
adjust LOCAL_PREF values to prefer reaching AS2 via AS4, and AS4 must
do the same for AS1. Otherwise the "backup" link would be used all the
time.
This means that AS1/AS2 must trust AS3/AS4 to treat their private link
as having a low preference. They can't force this, though. If they
really want to be sure their link isn't used normally, they can use
EXPORT FILTERING to not tell AS3/AS4 of the existence of their route.
Example of import filtering
Ignoring routes from a provider we're contractually obligated not to use.
Ignoring routes from providers with recent route-flap (instability) history
Route Flaps
These are sudden (which may be a relative term) changes in the routing
database, that must be propagated to other ASs through BGP.
P--Q-----A
\__R___/
For this to be an issue, there must be alternative routes! As the Q--A
link goes up and down, P must deal with changing paths to A, and must
announce these to others, too. Early studies saw many more route flaps
than expected, with a 24-hour cycle due to congestion
More Filters
One AS might filter out paths
advertised by a neighbor; this amounts to awarding those routes a 0 for
the local_pref value. One backbone provider [Sprintnet] has [or once
had] a rule that it would not accept route blocks of size /24 or
smaller, for example. Another reason for not allowing routes is that
there has been too much recent route flapping; route dampening is the technique of maintaining an exponential leaky bucket "flappage counter":
Add 1 for every flap, divide by 2 every X minutes
An AS might then ignore advertisements about sites whose flappage counter is too high.
Also, one can limit the rate of advertisements.
What if somebody advertises destination D at cost 0? This is a common
configuration error in intra-AS routing; it often makes destination D
unavailable. In BGP you can make a real mess of things if you announce
that you can reach everyone at cost 0; this is one reason for continued
work on BGP authentication protocols.
A provider might filter out advertisements from a customer for networks not assigned to that customer (see policy, above).
RADB/IRR: [Routing Arbiter DataBase/Internet Route Registry]
What happens if an advertised route is completely bogus? These
databases were/are supposed to list each destination network and its
AS; as a simple sanity check, if we get an AS path advertised for net
N, the last AS in the path should be the one listed in the IRR. If not,
we reject it. But often the IRR data is incomplete, or (worse) wrong.
BGP decision process
The first step is to eliminate paths with AS-loops; in general this
means that AS1 will eliminate all route advertisements it receives with
AS1 somewhere in the AS-path. At this stage we may also eliminate paths
with forbidden ASs, or with too much recent route flappage.
The next step, generally the most important in AS routing configuration, is to assign a local preference,
or weight, to each route received. The community attribute is often
helpful here; an AS may also have policies that add a certain amount
for routes that use a certain AS, etc. We may, for example, have a
contract to prefer sprintnet over worldcom. Provider ASs will in
general prefer routes that go through their customers, as these are
"cheaper". Finally, a smaller ISP may have connections to multiple
larger ISPs, but prefer one of them due to contractual reasons. The
paths with the best local-pref are then selected.
The third step is to prefer routes that go through fewer ASs; that is,
that have shorter AS-paths. Note that this means that if an AS includes
itself twice in the AS-path it sends to its neighbor -- a strategy
called AS-prepending -- then that path will have a lower preference. A
decision must also be made at this level as to whether we count AS-seq
or AS-set.
Note that any difference in
local_pref value makes AS-path length completely immaterial. It is,
however, possible to have AS-path length used in the process of
calculation of local_pref.
After this step we apply Multi_Exit_Disc if we are configured to do so; this amounts to honoring our neighbor's preferences. Even then we may disable the use of MED; honoring our neighbor's preferences may raise our own costs!
Fourth we apply trivial tie-breakers. Note that if a tie-breaker rule
assigns significant traffic to one AS over another, then it may have
significant economic consequences and shouldn't be considered
"trivial". If we detect this situation, we would probably address it in
the local-preferences phase.
There are similar rules for exporting paths: conform to policy first, etc. Typically, we can control by policy
which paths we announce. This can lead to peculiarities. Suppose, for
example, that A reaches N via B, and announces this to D. Later A
switches to reaching N via C, but A is forbidden by policy to announce C-paths to D! Then A must announce that old path is no longer ok.
Policy Examples
Arbitrarily complex policies may be created; however, in general policies are determined by a few simple AS relationships:
customer-provider: A provider contracts to transit traffic from the customer to the rest of the internet.
siblings:
siblings have a connection between them that they intend as a backup
route in case either sib loses its route to its primary provider.
Siblings may or may not have the same provider as parent.
peers: peers essentially are
each large providers, that agree to exchange all their customer traffic
with each other. (Generally the idea is to exchange comparable volumes
of traffic, with no exchange of cash, but if the volume flow is
significantly asymmetric, compensation can certainly be negotiated.)
These relationships can be described in terms of what routes are accepted (from Gao). First, an AS has customer routes, that is, the routes of its direct customers, and its customers' customers, etc. Other routes we might call provider/peer routes. Essentially every AS exports its customer routes to all its AS neighbors; we might refer to these routes as its own routes; the issue is with provider-peer routes. Here are Gao's rules:
- Customers do not export provider/peer routes to their providers.
- Providers do export
provider/peer routes to their customers (though note that the provider
may just provide a single consolidated default route to a single-homed
customer).
- Peers do not export provider/peer routes to their providers; peers do not (usually) provide transit services to third parties.
- Siblings do export provider/peer routes to each other.
There may be more complicated variations: if a regional provider is a
customer of a large transit backbone, then the backbone might only
announce routes listed in transit agreement (rather than all routes, as
above). There is a supposition here that the regional provider is
multihomed, and has contracted with that particular transit backbone
only for certain routes.
Gao's rules describe a simplified BGP world. Special cases for special situations may introduce non-convergence or instability.
tier-1 peer to tier-1 peer: full peering is sometimes thought of as
meaning no restrictions on route announcements, but generally peers
don't accept routes belonging
to third parties. If AS1 peers with AS2, then they generally won't
transfer traffic belonging to a third non-customer AS3; AS3 would
either have to become a customer of one or else arrange to peer with both. Each backbone AS must, then, peer with every other backbone AS. [This often takes place at big exchanges like MAE-EAST.]
A consequence of all this is the no-valley theorem [Gao]: if we consider the full AS-path from A to B, then there is at most one
peer-peer link. Those to the left of the peer-peer link are (moving
from left to right) either customer->provider links or
sibling->sibling links; that is, they are non-downwards (ie upwards
or level). To the right of the peer-peer link, we see
provider->customer or sib->sib links; that is, these are
non-upwards. If there is no peer-peer link, then we can still divide
the AS-path into a non-downwards first part and a non-upwards second
part.
The above assumes certain implicit import/export rules. Here is an explicit such rule, from Gao-Rexford:
If AS1 gets two routes r1 and r2 to a
destination, and the first AS of the r1 route is a customer of AS1, and
the first AS of r2 is not, then r1 will be assigned a higher local_pref
value than r2.
More complex rules exist that allow for cases when the local_prefs can
be equal; one such rule states that strict inequality is only required
when r2 is a provider route.
Assuming these rules, one can prove that the BGP algorithm must always converge to a stable arrangement.
Examples
First, let us consider the example of configuring a private link,
such as a hypothetical link between Loyola and Northwestern:
----ISP1---nwu
|
|link1
|
----ISP2---luc
Note that the issue is the use of the ISP1--nwu link by luc, and the ISP2--luc link by nwu; link1 use might be an issue but let's assume that it is not the bottleneck.
Three common options: no-transit, backup, load-balancing
1. nwu,luc don't export link1: no transit at all. This is done via export filtering.
2a. Export but have ISP1, ISP2 rank at low preference:
This means that the link will be used for inbound transit for backup only;
ISP1 prefers route to luc through ISP2, & vice-versa
(you can't necessarily specify ISP1's rank in your advertisement)
2b. Have luc have its DEFAULT path be ISP2 by default, but ISP1-via-nwu
if ISP2 becomes unreachable. This is the outbound side of backup.
3. How could we achieve inbound load balancing?
(outbound load-balancing is sort of "up to us") There's no easy fix
here, in that if ISP1 and ISP2 both have routes to luc, we have lost
all control over how other sites will prefer one to the other. We may be able to artificially make one path appear more expensive.
Another basic situation:
/-------transit B------\
end-cust
transit A---
\-------transit C------/
How can end-cust express preference for B over C? For inbound traffic,
it is A that makes the decision. For outbound traffic, end-cust can add
an appropriate route preference.
In the next example, consider traffic from A to D. Here is one picture:
/-------transit B------\
A
D which path is used?
\-------transit C------/
Here is another picure showing possible exchange points 1 and 2.
+-------transit B------+
|
| which path is used? 1,2 might be
MAE/NAPs
A--1-------transit C------2---D
B and C peer at two points. B would probably like to make sure C
doesn't cheap it out by routing D-bound traffic to B at 1, and A-bound
traffic to B at 2.
Note that if C and B try to
get away with this, a routing loop is created! This is why it's so
important in BGP for sites to advertise the AS-path they actually use
(versus another path); in this example, if C routes its traffic via B,
then it must advertise to B that it is using that AS-path through B.
This rule:
- Tells B not to route to C, since then the AS-path becomes loopful
- Notifies B of what C is doing, if for example there is a contract to handle this differently.
BGP enforces "hop-by-hop consistency": if A sends traffic to B, destination N, then this traffic must follow the same path that B's traffic to N takes. That is, BGP does not provide any support for the following combination:
B's own traffic to N is sent by route X
B routes A's traffic to N via route Y
You might, however, want to do this if your own traffic has higher
priority, or if A's traffic has higher priority (they paid more?)
IP tunneling is a way around this.
provider-based addressing and web farms (with asymmetrical traffic flows); need for MED
+-------S------+ AS1
1 2 3
| | |
+A------+-----B+ AS2
Traffic from A to S goes via link 1, and is carried by AS1. Traffic
from B to S goes via link 3, and is also carried by AS1. But traffic
from S to either A or B goes via link 2, and AS2 pays! Even though S is
a customer of AS1! Fix: AS1 & AS2 might negotiate that each use MED
information from the other.
Some instability examples
Route flaps in the real world are usually caused by mundane things:
routers going up and down, backhoes and other things causing cable
breaks, etc. However, BGP allows genuinely unstable situations to
occur; this is a consequence of allowing each AS a completely
independent hand in selecting preference functions. Here are two simple
examples, from [GaoRexford].
1. Temporary instability:
AS1---
| \
| AS0--d
| /
AS2---
We assume AS1 prefers AS-paths to destination d in the following order:
(2,0), (0)
(that is, <AS2,AS0> is preferred to the direct path <AS0>.)
Similarly, AS2 prefers paths in the order (1,0), (0). Both AS1 and AS2
start out using path (0); they advertise this to each other. As each
receives the other's advertisement, they each switch to routing traffic to the other;
that is, AS1 switches to path (2,0) and AS2 switches to (1,0). This, of
course, causes a routing loop, and so both will switch back to (0) as soon as they announce to each other the change in what they use.
This oscillation may continue for some time, as long as both AS1 and
AS2 switch away from (0) at the same moment. If, however, AS1 switches
to (2,0) while AS2 continues to use (0), then AS2 is stuck and the
situation is stable.
[AS1 and AS2 might choose not to export their d-route to each other for just this reason.]
2. Permanent instability [originally due to Varadhan, Govindan, & Estrin, 2000]
AS1
/ | \
/ | \
/ AS0 \
/ / \ \
/ / \ \
/ / \ \
AS2----------AS3
The destination is within AS0. AS1-3 each have a direct route to AS0, but each prefers
the AS-path that takes their clockwise neighbor; that is, AS1 prefers
(3,0) to (0); AS3 prefers (2,0) to (0), and AS2 prefers (1,0) to (0).
Suppose all adopt (0), and advertise this, and AS1 is the first to look
at the routes and update its table. AS1 switches to the route (3,0),
and announces this.
At this point, AS2 sees that AS1 uses (3,0); if AS2 switches to AS1
then its path would be (1,3,0) rather than (1,0) and so it does not make the switch.
But AS3 does switch: it prefers (2,0) and this is still available. Once
it makes this switch, and advertises it, AS1 sees that the route it had
been using, (3,0), has become (3,1,0). At this point AS1 switches back
to (0).
Now AS2 can switch to using AS1, and does so. After that, AS3 finds it is now using (2,1,0) and it
switches back to (0). This allows AS1 to switch to the longer route,
and then AS2 switches back to the direct route, and then AS3 gets the
longer route, then AS2 again, etc, forever rotating clockwise.
Bibliography
[VGE] Varadhan, Govindan, and Estrin, Persistent route oscillations in inter-domain routing, Computer Networks, Volume 32, January 2000.
[Gao] Lixin Gao, On Inferring Autonomous System Relationships in the Internet, IEEE Transactions on Networking, Volume 9, December 2001.
[GaoRexford] Lixin Gao and Jennifer Rexford, Stable internet routing without global coordination, IEEE Transactions on Networking, Volume 9, December 2001.