Quality of Service

Any sufficiently advanced incompetence is indistinguishable from malice.

Archive for July, 2008

Bombs over Grandma

Posted by qualityofservice on July 28, 2008

Came across some old and interesting news while researching older and equally interesting news related to the Cogent/Telia de-peering dispute* (resolved well over a year ago; as a quaint addendum, Cogent as of one month ago has become a transit-free AS). From NetworkWorld last year:

If the United States found itself under a major cyberattack aimed at undermining the nation’s critical information infrastructure, the Department of Defense is prepared, based on the authority of the president, to launch a cyber counterattack or an actual bombing of an attack source.

Anyone else get visions of a phalanx of oblivious grandmothers (with zombie-bots simultaneously attempting to exploit Kaminsky’s recent DNS vulnerability) suddenly finding themselves on the receiving end of Apache gunfire?

I am so turned on right now.

*The long-term goal here is to be able to solidly and confidently converse in the language of large-scale backbone providers, so that I might not make an ass out of myself immediately upon joining their ranks.

Posted in Awesome, Miscellany | Leave a Comment »

Whole lot of awesome going on here.

Posted by qualityofservice on July 26, 2008

Not only the most excellent song from Dr Horrible’s Sing-Along Blog, but quite clearly the best song of this year.

Posted in Awesome | Leave a Comment »

Air Travel is Stupid

Posted by qualityofservice on July 17, 2008

Hay guyz, wuts goin’ on hear.

Posted in Dumbassery | 4 Comments »

IPSEC and Multicrap

Posted by qualityofservice on July 17, 2008

A question came up on some other forums: I have an ospf domain with 10 routers. I wanted to configure ipsec between two ospf running routers for the sack of security but I came to know that multicast can not be forward over ipsec. Can anyone tell me the reason why we can not do that? And please also state solution for that. If anyone know useful document on this, please tell me.

The solution is easy.  GRE over IPSEC.

The reason isn’t so simple.  Here’s what I gathered and I’m sticking to it.

The reason is specified in one of the original IPSEC RFCs: http://tools.ietf.org/html/rfc2401#section-4.7

For multicast traffic, there are multiple destination systems per
multicast group. So some system or person will need to coordinate
among all multicast groups to select an SPI or SPIs on behalf of each
multicast group and then communicate the group’s IPsec information to
all of the legitimate members of that multicast group via mechanisms
not defined here.

The RFC doesn’t say that it can’t be done. It just says that doing so isn’t easy, and they decided to leave the job of actually doing it to someone else.

One of the big concerns with multicast IPSEC is obviously scaling. It’s something we don’t think about when we’re running point-to-point GRE over IPSEC from our remote sites to our hubs in low-to-no multicast networks.

However, what happens when you start multicasting from your hub site? Each point-to-point connection is encrypted with a separate key, hence each packet of the multicast stream causes the router to perform a key lookup and encryption for each remote site, adding significant overhead on your router. The router has to “encrypt many” and “send many” for each multicast packet; in a network of 1500 peers, that means 1500 replications followed by 1500 encryptions of each multicast packet at the hub router.

Another way of looking at it is that for a single multicast stream, your hub site needs to be able to support the stream’s bandwidth multiplied by the number of peers. Since each multicast packet going to your crypto engine has to be copied first, THEN encrypted, a 1 Mbps stream going to 1500 peers requires that your engine be able to handle 1.5 Gbps of encryption throughput. You can’t encrypt the packet first and then send it out to multiple sites, because each site decrypts the packet differently using its own key.

Resolving this requires a re-examination of how IPSEC works. IPSEC is basically point-to-point, with two “connections” on each endpoint: one SA to encrypt outbound traffic, one SA to decrypt inbound traffic. For multicast to scale, IPSEC has to become either point-to-multipoint, or any-to-any. If the IPSEC peers had a common “group,” they could then encrypt a packet with the “group” key just one time, and replicate that packet to the number of peers; we go from “encrypt many, send many” to “encrypt ONCE, send many.” While the packet in my previous examples would still have to be replicated 1500 times, only one encryption would be performed on that packet. This reduces the burden on your crypto engine: it now would only have to encrypt the single 1 Mbps stream, instead of 1500 separate 1 Mbps streams. All peer sites would have the correct SA to decrypt the stream.
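To put numbers on the two models, here is a rough sketch of the arithmetic in Python. The function names are my own invention, and the model is deliberately naive: it ignores IPSEC and GRE header overhead and assumes the crypto engine's load scales linearly with the bits it has to encrypt.

```python
def crypto_load_mbps(stream_mbps, peers, group_key):
    """Encryption throughput the hub must sustain for one multicast stream.

    Hypothetical helper for illustration only; header overhead is ignored.
    """
    if stream_mbps < 0 or peers < 0:
        raise ValueError("stream rate and peer count must be non-negative")
    if peers == 0:
        return 0
    if group_key:
        # "Encrypt ONCE, send many": one encryption, then replication.
        return stream_mbps
    # "Encrypt many, send many": every replicated copy is encrypted
    # separately with that peer's own key.
    return stream_mbps * peers


def egress_mbps(stream_mbps, peers):
    """Replicated traffic leaving the hub; identical in both models."""
    return stream_mbps * peers


per_peer = crypto_load_mbps(1, 1500, group_key=False)  # 1500 Mbps, i.e. 1.5 Gbps
grouped = crypto_load_mbps(1, 1500, group_key=True)    # 1 Mbps
print(per_peer, grouped, egress_mbps(1, 1500))
```

Note that the group key only relieves the crypto engine. The hub still has to push 1500 copies out the door either way.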

I believe Cisco’s solution to this issue was the introduction of Group Encrypted Transport. See here for more: http://www.cisco.com/en/US/docs/ios/12_4t/12_4t11/htgetvpn.html

Other concerns, including security concerns, are raised in the following IETF draft: http://tools.ietf.org/html/draft-ietf-msec-ipsec-multicast-issues-01

Posted in IPSEC | Leave a Comment »

How do you do neighbor?*

Posted by qualityofservice on July 16, 2008

Most of the OSPF documentation you’ll read states that OSPF uses multicast traffic to build adjacencies and flood Link State Advertisements. Turns out that at one key point in the building of neighbour relationships, it actually unicasts a packet (IF the media is broadcast; point-to-point media will use multicast for everything). One little packet out of millions can potentially bring down your network should you fail to account for it. Rare, but it can happen, and I’d like to document why (because in our case, it actually did happen; thankfully not to a network under my control, which made figuring out the cause all the more enjoyable). I am indebted to the excellent Cisco Press Troubleshooting IP Routing Protocols text for pointing me in the correct direction.

If you happen to block access to an interface’s IP address for security reasons without explicitly permitting all OSPF traffic earlier in the ACL (with something like “permit ospf any any”), you will find yourself in an outage situation if a previously established neighbour relationship breaks and tries to re-form. This can easily happen in hastily configured infrastructure ACLs.

This creates a “time-bomb” on your network: it could be stable for MONTHS…until the relationship is torn down. Hell breaks loose when it tries to come back up. Without going into too much detail, OSPF goes through a series of states while building the relationship. The circumstances described above will cause the router to become stuck in the EXSTART state. For more background on OSPF states, please see the following: http://www.cisco.com/warp/public/104/1.html#t20

To test this, I set up OSPF between two routers, applied an access-list to block unicast packets destined to the IP address of one of the router interfaces (and permit everything else), and created another ACL to apply to “debug ip packet” that would monitor all OSPF traffic (IP Protocol #89).
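To see why the blocking ACL lets hellos through but kills the adjacency, here is a toy first-match ACL evaluator in Python. It is only a sketch: the Rule class and evaluate function are made up for illustration, they match on IP protocol number and exact destination address only, and real IOS ACLs do far more (wildcard masks, source matching, ports).

```python
from dataclasses import dataclass
from typing import List, Optional

OSPF = 89  # IP protocol number for OSPF
TCP = 6


@dataclass(frozen=True)
class Rule:
    action: str              # "permit" or "deny"
    protocol: Optional[int]  # None matches any IP protocol
    dst: Optional[str]       # None matches any destination address


def evaluate(acl: List[Rule], protocol: int, dst: str) -> str:
    """Return the action of the first matching rule (implicit deny at the end)."""
    for rule in acl:
        if rule.protocol not in (None, protocol):
            continue
        if rule.dst not in (None, dst):
            continue
        return rule.action
    return "deny"


# The ACL from the test: block anything addressed to the interface itself.
block_ospf = [Rule("deny", None, "172.20.28.1"), Rule("permit", None, None)]
# The fix: explicitly permit OSPF before the deny.
fixed = [Rule("permit", OSPF, None)] + block_ospf

print(evaluate(block_ospf, OSPF, "224.0.0.5"))    # multicast hello
print(evaluate(block_ospf, OSPF, "172.20.28.1"))  # unicast packet to the interface
print(evaluate(fixed, OSPF, "172.20.28.1"))       # same packet with the fix
```

The multicast destination never matches the deny line, so hellos keep flowing and everything looks healthy. Only the unicast packet addressed to the interface itself gets dropped.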

All the major work was done on one router…important parts of test config are as follows:

interface Loopback0
 description OSPF Picks Loopback Address to Become Router ID
 ip address 20.20.20.20 255.255.255.255
!
interface Ethernet0/0
 ip address 172.20.28.1 255.255.255.0
 ip access-group BlockOSPF in
 ip ospf priority 100
 half-duplex
!
router ospf 1
 log-adjacency-changes
 passive-interface Loopback0
 network 20.20.20.20 0.0.0.0 area 0
 network 172.20.28.0 0.0.0.255 area 0
!
ip access-list extended BlockOSPF
 deny ip any host 172.20.28.1
 permit ip any any
!
access-list 198 permit ospf any any

Then I enabled debug logging to the VTY and started the debug:

conf t
logging monitor debugging
exit
terminal monitor
debug ip packet 198
debug ip ospf adjacency

When the ACL was applied, the neighbour adjacency remained up as expected (the issue only shows up during adjacency formation; after the relationship is established, it remains in the FULL state and only sends HELLO and LSA packets). So I shut the interface down and brought it back up to force it to go through adjacency setup again. Here’s what happened. “ip packet” debugs are indicated by “IP,” “ospf adj” debugs indicated by “OSPF”:
*Apr 22 02:36:42.684: OSPF: Interface Ethernet0/0 going Up
*Apr 22 02:36:42.684: IP: s=172.20.28.1 (local), d=224.0.0.5 (Ethernet0/0), len 76, sending broad/multicast
*Apr 22 02:36:42.688: IP: s=172.20.28.2 (Ethernet0/0), d=172.20.28.1, len 80, access denied
*Apr 22 02:36:44.296: IP: s=172.16.120.105 (local), d=224.0.0.5 (Serial1/0), len 120, sending broad/multicast
*Apr 22 02:36:44.296: OSPF: Build router LSA for area 0, router ID 20.20.20.20, seq 0x8000003B
*Apr 22 02:36:44.672: %LINK-3-UPDOWN: Interface Ethernet0/0, changed state to up
*Apr 22 02:36:45.672: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet0/0, changed state to up
*Apr 22 02:36:46.268: IP: s=172.16.120.105 (local), d=224.0.0.5 (Serial1/0), len 80, sending broad/multicast
*Apr 22 02:36:51.064: IP: s=172.20.28.2 (Ethernet0/0), d=224.0.0.5, len 80, rcvd 0
*Apr 22 02:36:51.064: OSPF: 2 Way Communication to 30.30.30.30 on Ethernet0/0, state 2WAY
*Apr 22 02:36:51.064: OSPF: Backup seen Event before WAIT timer on Ethernet0/0
*Apr 22 02:36:51.068: OSPF: DR/BDR election on Ethernet0/0
*Apr 22 02:36:51.068: OSPF: Elect BDR 20.20.20.20
*Apr 22 02:36:51.068: OSPF: Elect DR 30.30.30.30
*Apr 22 02:36:51.068: OSPF: Elect BDR 20.20.20.20
*Apr 22 02:36:51.068: OSPF: Elect DR 30.30.30.30
*Apr 22 02:36:51.068: DR: 30.30.30.30 (Id) BDR: 20.20.20.20 (Id)
*Apr 22 02:36:51.068: OSPF: Send DBD to 30.30.30.30 on Ethernet0/0 seq 0x1BEE opt 0x52 flag 0x7 len 32
*Apr 22 02:36:51.068: IP: s=172.20.28.1 (local), d=172.20.28.2 (Ethernet0/0), len 64, sending
*Apr 22 02:36:51.072: IP: s=172.20.28.1 (local), d=172.20.28.2 (Ethernet0/0), len 80, sending
*Apr 22 02:36:52.684: IP: s=172.20.28.1 (local), d=224.0.0.5 (Ethernet0/0), len 80, sending broad/multicast
*Apr 22 02:36:56.072: OSPF: Send DBD to 30.30.30.30 on Ethernet0/0 seq 0x1BEE opt 0x52 flag 0x7 len 32
*Apr 22 02:36:56.072: IP: s=172.20.28.1 (local), d=172.20.28.2 (Ethernet0/0), len 64, sending
*Apr 22 02:36:56.072: OSPF: Retransmitting DBD to 30.30.30.30 on Ethernet0/0 [1]
*Apr 22 02:36:56.072: IP: s=172.20.28.2 (Ethernet0/0), d=172.20.28.1, len 64, access denied
*Apr 22 02:36:56.268: IP: s=172.16.120.105 (local), d=224.0.0.5 (Serial1/0), len 80, sending broad/multicast
*Apr 22 02:37:01.064: IP: s=172.20.28.2 (Ethernet0/0), d=224.0.0.5, len 80, rcvd 0
*Apr 22 02:37:01.064: OSPF: Neighbor change Event on interface Ethernet0/0
*Apr 22 02:37:01.064: OSPF: DR/BDR election on Ethernet0/0
*Apr 22 02:37:01.064: OSPF: Elect BDR 20.20.20.20
*Apr 22 02:37:01.068: OSPF: Elect DR 30.30.30.30
*Apr 22 02:37:01.068: DR: 30.30.30.30 (Id) BDR: 20.20.20.20 (Id)
*Apr 22 02:37:01.072: OSPF: Send DBD to 30.30.30.30 on Ethernet0/0 seq 0x1BEE opt 0x52 flag 0x7 len 32
*Apr 22 02:37:01.072: IP: s=172.20.28.1 (local), d=172.20.28.2 (Ethernet0/0), len 64, sending
*Apr 22 02:37:01.072: OSPF: Retransmitting DBD to 30.30.30.30 on Ethernet0/0 [2]
*Apr 22 02:37:01.072: IP: s=172.20.28.2 (Ethernet0/0), d=172.20.28.1, len 64, access denied
*Apr 22 02:37:02.684: IP: s=172.20.28.1 (local), d=224.0.0.5 (Ethernet0/0), len 80, sending broad/multicast
The output on the neighbouring router is equally interesting: it will be sending and receiving DBD packets. But the two routers are trying to establish which one will be the master and which will be the slave during the initial exchange, and the test router never sees the neighbour’s DBD packets (they are addressed to its interface IP, so the ACL drops them). The negotiation can never finish, so each side just keeps retransmitting. Flag 0x7 means “I am initializing the relationship, I have more to send, and I am the Master”; normally, both sides send DBD packets with this message and the one with the highest Router ID wins.

For reference, the DBD packet contains three flag bits: I (Initialization), M (More), and MS (Master/Slave). During the EXSTART stage, each router sets them all (giving the flag 0x7 seen above). During the actual exchange, only the M and MS bits are used. The master will have a flag of 0x3 and the slave will have 0x2, assuming both have DBD packets to send. When they’re finished, the master will say so by setting its flag to 0x1, and the slave’s will be 0x0.
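The flag values in the debugs are easy to pick apart by hand, but a few lines of Python make the bit layout explicit. The function name here is my own; the bit positions (I = 0x4, M = 0x2, MS = 0x1) follow the Database Description packet format in RFC 2328.

```python
from typing import List

I_BIT = 0x4   # Init: first packet in the DBD sequence
M_BIT = 0x2   # More: further DBD packets will follow
MS_BIT = 0x1  # Master/Slave: set when the sender is (or claims to be) master


def decode_dbd_flags(flag: int) -> List[str]:
    """Return the names of the bits set in a DBD flag value."""
    names = []
    if flag & I_BIT:
        names.append("I")
    if flag & M_BIT:
        names.append("M")
    if flag & MS_BIT:
        names.append("MS")
    return names


for value in (0x7, 0x3, 0x2, 0x1, 0x0):
    print(hex(value), decode_dbd_flags(value))
```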

Now watch what happens when all OSPF traffic is explicitly permitted to hit the interface; note the state of the DBD flags as each device progresses through the EXCHANGE state:
*Apr 22 03:19:40.204: OSPF: Interface Ethernet0/0 going Up
*Apr 22 03:19:40.208: OSPF: 2 Way Communication to 30.30.30.30 on Ethernet0/0, state 2WAY
*Apr 22 03:19:40.208: OSPF: Backup seen Event before WAIT timer on Ethernet0/0
*Apr 22 03:19:40.208: OSPF: DR/BDR election on Ethernet0/0
*Apr 22 03:19:40.208: OSPF: Elect BDR 20.20.20.20
*Apr 22 03:19:40.208: OSPF: Elect DR 30.30.30.30
*Apr 22 03:19:40.208: OSPF: Elect BDR 20.20.20.20
*Apr 22 03:19:40.212: OSPF: Elect DR 30.30.30.30
*Apr 22 03:19:40.212: DR: 30.30.30.30 (Id) BDR: 20.20.20.20 (Id)
*Apr 22 03:19:40.212: OSPF: Send DBD to 30.30.30.30 on Ethernet0/0 seq 0x115E opt 0x52 flag 0x7 len 32
*Apr 22 03:19:40.704: OSPF: Build router LSA for area 0, router ID 20.20.20.20, seq 0x80000041
*Apr 22 03:19:45.212: OSPF: Send DBD to 30.30.30.30 on Ethernet0/0 seq 0x115E opt 0x52 flag 0x7 len 32
*Apr 22 03:19:45.212: OSPF: Retransmitting DBD to 30.30.30.30 on Ethernet0/0 [1]
*Apr 22 03:19:45.212: OSPF: Rcv DBD from 30.30.30.30 on Ethernet0/0 seq 0x2122 opt 0x52 flag 0x7 len 32 mtu 1500 state EXSTART
*Apr 22 03:19:45.216: OSPF: NBR Negotiation Done. We are the SLAVE
*Apr 22 03:19:45.216: OSPF: Send DBD to 30.30.30.30 on Ethernet0/0 seq 0x2122 opt 0x52 flag 0x2 len 112
*Apr 22 03:19:45.220: OSPF: Rcv DBD from 30.30.30.30 on Ethernet0/0 seq 0x2123 opt 0x52 flag 0x3 len 92 mtu 1500 state EXCHANGE
*Apr 22 03:19:45.220: OSPF: Send DBD to 30.30.30.30 on Ethernet0/0 seq 0x2123 opt 0x52 flag 0x0 len 32
*Apr 22 03:19:45.224: OSPF: Rcv DBD from 30.30.30.30 on Ethernet0/0 seq 0x2124 opt 0x52 flag 0x1 len 32 mtu 1500 state EXCHANGE
*Apr 22 03:19:45.224: OSPF: Exchange Done with 30.30.30.30 on Ethernet0/0
*Apr 22 03:19:45.224: OSPF: Send LS REQ to 30.30.30.30 length 12 LSA count 1
*Apr 22 03:19:45.224: OSPF: Send DBD to 30.30.30.30 on Ethernet0/0 seq 0x2124 opt 0x52 flag 0x0 len 32
*Apr 22 03:19:45.228: OSPF: Rcv LS REQ from 30.30.30.30 on Ethernet0/0 length 48 LSA count 2
*Apr 22 03:19:45.228: OSPF: Send UPD to 172.20.28.2 on Ethernet0/0 length 108 LSA count 2
*Apr 22 03:19:45.232: OSPF: Rcv LS UPD from 30.30.30.30 on Ethernet0/0 length 76 LSA count 1
*Apr 22 03:19:45.232: OSPF: Synchronized with 30.30.30.30 on Ethernet0/0, state FULL
*Apr 22 03:19:45.232: %OSPF-5-ADJCHG: Process 1, Nbr 30.30.30.30 on Ethernet0/0 from LOADING to FULL, Loading Done

After both sides agree that they are finished with the EXCHANGE state, they proceed to the LOADING state, and request any LSAs they are missing or hold only stale copies of. Once both sides agree that their link-state databases are the same, they proceed to the FULL state and stay there until disrupted. The state of a relationship can be viewed with “show ip ospf neighbor”:
Neighbor ID     Pri   State       Dead Time   Address          Interface
10.10.10.10       0   FULL/  -    00:00:31    172.16.120.120   Serial1/0
30.30.30.30       1   FULL/BDR    00:00:32    172.20.28.2      Ethernet0/0
Here we see that the neighbour at 30.30.30.30 is the Backup Designated Router for the segment. It has the higher Router ID (> 20.20.20.20), but note that I’ve configured the Eth0/0 interface of my test router with “ip ospf priority 100” to force the test router to be the DR for the segment; higher priority will always win the DR election**. The default priority is 1, and if both neighbours use the default, the higher Router ID will win. In Cisco IOS, Router ID can be statically defined with a “router ospf”-level command; otherwise it will be the highest loopback address on the router. If there are no loopback addresses or statically defined IDs, OSPF will select the highest interface IP on the router to become the Router ID (which can lead to instabilities if that interface goes down).
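The Router ID selection order described above can be sketched in a few lines of Python. This is an illustration of the order of preference as I understand it, not IOS’s actual code; select_router_id and its parameters are hypothetical names, and the interface list is assumed to hold only addresses of interfaces that are up.

```python
import ipaddress
from typing import Iterable, Optional


def highest(addresses: Iterable[str]) -> str:
    """Pick the numerically highest IPv4 address (not the highest string)."""
    return str(max(ipaddress.IPv4Address(a) for a in addresses))


def select_router_id(static_id: Optional[str],
                     loopbacks: Iterable[str],
                     interfaces: Iterable[str]) -> str:
    """Static ID first, then highest loopback, then highest interface IP."""
    loopbacks = list(loopbacks)
    interfaces = list(interfaces)
    if static_id:
        return static_id
    if loopbacks:
        return highest(loopbacks)
    if interfaces:
        return highest(interfaces)
    raise ValueError("no IPv4 address available to use as a Router ID")


# Addresses taken from the test router in this post.
print(select_router_id(None, ["20.20.20.20"], ["172.20.28.1", "172.16.120.105"]))
print(select_router_id(None, [], ["172.20.28.1", "172.16.120.105"]))
```

The comparison is numeric, which is why the sketch leans on the ipaddress module: compared as strings, “9.9.9.9” would beat “10.0.0.1”.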

Note that even though 20.20.20.20 is the DR, 30.30.30.30 still becomes the Master of the EXCHANGE state, since master/slave is determined by highest Router ID and has nothing to do with interface priority. Also note that the neighbour at 10.10.10.10 has no DR or BDR; as this link is a point-to-point serial link, no DR election occurs.
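Since the DR election and the master/slave negotiation use different tie-breakers, it helps to see them side by side. The Python below is a simplified sketch under loud assumptions: Router, elect_dr and dbd_master are names I made up, the election models a fresh segment where everyone comes up together (it ignores the BDR step and the “sticky” DR behaviour), and priority 0 is treated as ineligible.

```python
import ipaddress
from typing import List, NamedTuple, Optional


class Router(NamedTuple):
    router_id: str
    priority: int


def rid_value(router: Router) -> int:
    """Router ID as an integer, so comparisons are numeric."""
    return int(ipaddress.IPv4Address(router.router_id))


def elect_dr(routers: List[Router]) -> Optional[Router]:
    """Highest priority wins; the higher Router ID breaks ties."""
    eligible = [r for r in routers if r.priority > 0]  # priority 0 never becomes DR
    if not eligible:
        return None
    return max(eligible, key=lambda r: (r.priority, rid_value(r)))


def dbd_master(a: Router, b: Router) -> Router:
    """Master of the database exchange: higher Router ID, priority ignored."""
    return max((a, b), key=rid_value)


test_router = Router("20.20.20.20", 100)
neighbour = Router("30.30.30.30", 1)

print(elect_dr([test_router, neighbour]).router_id)   # DR by priority
print(dbd_master(test_router, neighbour).router_id)   # master by Router ID
```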

I’m venturing into new and exciting territory here, by putting my studies up to public scrutiny. If you’ve stumbled upon this page by accident (perhaps you were looking to read up on MQC or something) and you see a statement that strikes you as dumb or wrong, I welcome questions, comments, concerns and critiques.

*comma purposefully omitted

**In IOS at least, the DR is “sticky”; after it’s elected, you have to clear the OSPF process on the DR in order to force a re-election on the segment.

Posted in OSPF | Leave a Comment »