Quality of Service

Any sufficiently advanced incompetence is indistinguishable from malice.

Fun with EIGRP

Posted by qualityofservice on October 19, 2009

Recently had a chance to play around with my very first T1, interfacing two routers in a place we’ll call Branch1, across a T1 link between buildings on the site.  Fun times!  I’ve only ever interfaced with a T1 initially configured by the provider, or by way of a frame-relay switch, but have never had to go from serial interface directly into CSU/DSU, oddly enough.  Spent the better part of a week wondering what cable I should use, though I needn’t have worried…a router that’s already connected to a CSU/DSU will give you the right answer if you know what question to ask:

R3#sho controllers serial 0/0

Interface Serial0/0
Hardware is PowerQUICC MPC860
DTE V.35 TX and RX clocks detected.
idb at 0x81122904, driver data structure at 0x8112A398

The truly fun part came in figuring out how to do the routing.  My newly acquired remote sites are all set up as EIGRP stub routers, since I don’t need them transiting traffic for other sites.  Sites that are NOT set up as stub routers (i.e. the default when bringing up EIGRP), when brought into a dual-hub-and-spoke topology such as ours (where all the remote sites have at least two VPN tunnels: one to each “hub” site), will advertise routes learned from Hub1 back to Hub2 and vice versa.  That makes for a messy and complicated EIGRP topology table, and makes routes from Hub1 appear (from Hub2’s perspective) to be available via Branch1 (and vice versa), meaning that if the links between Hub1 and Hub2 fail, Branch1 could be used as a transit site for Hub1-Hub2 traffic.  EIGRP stubs stop this behaviour; they will accept routes from their peers, but by default will only advertise connected and summary routes back to their peers.

This is fine when the site has just one router; it’s more complicated when there are routers downstream from your stub router, as is the case with Branch1.  A router downstream of the EIGRP Stub router carries traffic for the 172.29.18.0/24 LAN in a different building at the facility.  Setting up EIGRP between this router and the Stub router will cause 172.29.18.0/24 to be advertised to the Stub router, but the Stub router will not advertise the 172.29.18.0/24 route to Hub1 or Hub2, making the network unreachable from our hub sites.  You could just point default on the downstream router, and redistribute a static route on the Stub router, but where’s the fun in that…

In my case, the Branch1 office LAN and the adjacent building’s LAN are bitwise-adjacent: 172.29.19.0/24 and 172.29.18.0/24, respectively.  This neatly summarizes to 172.29.18.0/23, so all I had to do was configure the summary route to be advertised on the tunnel interfaces of R2, and point default on R3 –  since stub routers will by default advertise summaries, no more work had to be done.  There are other options to do this, for the curious (mostly covered here: http://www.nil.com/ipcorner/EigrpStub/).
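For anyone who wants to sanity-check a summary before typing it into a router, here is a minimal sketch using Python's standard ipaddress module.  The function name is made up for illustration; only the two /24s come from the setup above.

```python
import ipaddress

def summarize(prefixes):
    """Collapse adjacent prefixes into the fewest covering networks."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return list(ipaddress.collapse_addresses(nets))

# The Branch1 office LAN and the adjacent building's LAN
summary = summarize(["172.29.19.0/24", "172.29.18.0/24"])
for net in summary:
    # network address and dotted mask, as used on the "ip summary-address" line
    print(net.network_address, net.netmask)
```

If the two prefixes were not adjacent (say, .18 and .20), the list would come back with two entries instead of one, which is a quick hint that a single clean summary isn't available.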

Config on the stub is as follows (for simplicity, the link to only one hub site is shown).  No need to configure a static route to 172.29.18.0/24 on R2, because it will learn the route from its neighbour R3 — but since R2 is a stub, it won’t advertise the 172.29.18.0/24 upstream to R1.  I had a static route in anyway from earlier testing, and I’ll get to that in a second.

interface Tunnel1011105
description “To R1 Hub Site”
bandwidth 3072
ip summary-address eigrp 102 172.29.18.0 255.255.254.0 5
!
i
!
router eigrp 102
eigrp stub connected summary
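To make the stub behaviour a bit more concrete, here is a toy model in Python.  It is only a sketch of the logic described above, not anything IOS actually runs: the function name and the route list are hypothetical, and the way more-specific routes are suppressed behind a summary is deliberately simplified.

```python
import ipaddress

def stub_advertisements(routes, allowed=("connected", "summary")):
    """Return the prefixes a stub would send upstream.

    routes: list of (prefix, source) tuples, where source is a label such as
    "connected", "summary" or "eigrp" (learned from a downstream neighbour).
    """
    nets = [(ipaddress.ip_network(p), src) for p, src in routes]
    summaries = [n for n, src in nets if src == "summary"]
    advertised = []
    for net, src in nets:
        if src not in allowed:
            continue  # e.g. routes learned from R3 are not passed upstream
        if any(net != s and net.subnet_of(s) for s in summaries):
            continue  # more-specifics are hidden behind the summary
        advertised.append(str(net))
    return advertised

without_summary = [("172.29.19.0/24", "connected"), ("172.29.18.0/24", "eigrp")]
with_summary = without_summary + [("172.29.18.0/23", "summary")]

print(stub_advertisements(without_summary))  # 172.29.18.0/24 never reaches the hubs
print(stub_advertisements(with_summary))     # only the /23 goes upstream
```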

As mentioned a second ago, for a while I was using a static route on the Stub router, pointing to the downstream router in the other building, and redistributing static into EIGRP.  When I actually brought up EIGRP between the two, got a chance to see something neat.  Picture attached for ease of reference.

eigrp

Doing a “show ip route” showed the proper next hop as per the configured static route:

R2#sho ip route 172.29.18.0
Routing entry for 172.29.18.0/24
Known via “static”, distance 1, metric 0
Routing Descriptor Blocks:
* 172.29.254.62
Route metric is 0, traffic share count is 1

Admin distance of 1 beats everything but a better-match (i.e. longer prefix) or a directly connected route.

But with EIGRP now running between the two, the route also showed up in the EIGRP topology table, but with zero successors and a max-value FD (note that the composite metric is actually correct; it’s a function of min bandwidth and total delay along the path, and the configured K values i.e. “metric weights …” and we use weird K-values in our environment, so YMMV):

R2#sho ip eigrp topology 172.29.18.0 255.255.255.0
IP-EIGRP (AS 102): Topology entry for 172.29.18.0/24
State is Passive, Query origin flag is 1, 0 Successor(s), FD is 4294967295
Routing Descriptor Blocks:
172.29.254.62 (Serial0/0/1), from 172.29.254.62, Send flag is 0x0
Composite metric is (4230656/38400), Route is Internal
Vector metric:
Minimum bandwidth is 1544 Kbit
Total delay is 20100 microseconds
Reliability is 255/255
Load is 1/255
Minimum MTU is 1500
Hop count is 1

I then removed the static route, and ran the same commands.  EIGRP took over, and nary a packet was lost.  The metric given by FD also correctly matches that given by the “composite metric”:

R2#sho ip route 172.29.18.0
Routing entry for 172.29.18.0/24
Known via “eigrp 102”, distance 90, metric 4230656, type internal
Redistributing via eigrp 102
Last update from 172.29.254.62 on Serial0/0/1, 00:00:51 ago
Routing Descriptor Blocks:
* 172.29.254.62, from 172.29.254.62, 00:00:51 ago, via Serial0/0/1
Route metric is 4230656, traffic share count is 1
Total delay is 20100 microseconds, minimum bandwidth is 1544 Kbit
Reliability 255/255, minimum MTU 1500 bytes
Loading 1/255, Hops 1

R2#sho ip eigrp topology 172.29.18.0 255.255.255.0
IP-EIGRP (AS 102): Topology entry for 172.29.18.0/24
State is Passive, Query origin flag is 1, 1 Successor(s), FD is 4230656
Routing Descriptor Blocks:
172.29.254.62 (Serial0/0/1), from 172.29.254.62, Send flag is 0x0
Composite metric is (4230656/38400), Route is Internal
Vector metric:
Minimum bandwidth is 1544 Kbit
Total delay is 20100 microseconds
Reliability is 255/255
Load is 1/255
Minimum MTU is 1500
Hop count is 1

Of course, the rest of the network doesn’t see any of this complexity at all.  It just sees the summary route.  From R1:

R1#sho ip route 172.29.18.0
Routing entry for 172.29.18.0/23
Known via “eigrp 102”, distance 90, metric 7246080, type internal
Redistributing via eigrp 102
Last update from 10.1.1.106 on Tunnel1011105, 01:29:47 ago
Routing Descriptor Blocks:
* 10.1.1.106, from 10.1.1.106, 01:29:47 ago, via Tunnel1011105
Route metric is 7246080, traffic share count is 1
Total delay is 50100 microseconds, minimum bandwidth is 3072 Kbit
Reliability 255/255, minimum MTU 1440 bytes
Loading 1/255, Hops 1

Also note the change in minimum bandwidth; locally in Branch1, the minimum bandwidth to 172.29.18.0/24 is that of the T1 interconnecting the two EIGRP peers: 1544 kbps.  When this route gets hidden behind the summary, it inherits the minimum bandwidth as seen on the path between remote sites and the originator of the summary route.  The tunnel between R1 and the summary originator (a tunnel on vpn-01.mtj) has a configured bandwidth of 3072, which explains the change in min bandwidth.

The delay value comes from the sum of delays along the path: the delay of the lowest-delay component route on the summary-originating router (which is the fa0/1 interface – 172.29.19.0/24 — on vpn-01.mtj, with a delay of 100 usec) + the delay of the next-hop tunnel interface (default 50,000 usec, seen with “sho int tunnel…” command) == 50,100 usec.
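If you want to check the arithmetic yourself, the sketch below uses the classic EIGRP composite formula with K2 = K4 = K5 = 0.  Big caveat: the K1 = 1, K3 = 5 pair is not stated anywhere in this note; it is simply the combination that happens to reproduce the three metrics shown in the outputs above, so treat it as an inference rather than the actual configuration.  The function name is made up for illustration.

```python
def eigrp_composite_metric(min_bw_kbps, total_delay_usec, k1=1, k3=1):
    """Classic (non-wide) EIGRP metric, assuming K2 = K4 = K5 = 0."""
    scaled_bw = 10_000_000 // min_bw_kbps  # 10^7 / slowest link in kbps, truncated
    scaled_delay = total_delay_usec // 10  # delay in tens of microseconds
    return 256 * (k1 * scaled_bw + k3 * scaled_delay)

# Inferred (not confirmed) K values: K1 = 1, K3 = 5
print(eigrp_composite_metric(1544, 20100, k1=1, k3=5))   # 4230656: R2's view of 172.29.18.0/24
print(eigrp_composite_metric(100000, 100, k1=1, k3=5))   # 38400: the distance reported by R3
print(eigrp_composite_metric(3072, 50100, k1=1, k3=5))   # 7246080: R1's view of the /23 summary

# For comparison, the default K values (K1 = K3 = 1) would give:
print(eigrp_composite_metric(1544, 20100))               # 2172416
```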

So if you ever see something like that max metric in the EIGRP topology table, that’s where it came from: there was already a route with a better administrative distance on the router, but EIGRP keeps its route installed in its topology table “just in case.”  Contrast with RIP, which only keeps a route in its database if it’s the best route; all others will not show up in the output of “show ip rip database.”

Posted in EIGRP

Catalyst Campus Design

Posted by qualityofservice on September 2, 2009

Appears to be a work in progress here, but since it flatters my own approach to design where I can get away with it, I consider it worth sharing. ^_^

http://www.cisco.com/en/US/docs/solutions/Enterprise/Campus/campover.html

Among the highlights: “proper” use of the 3750 series Catalyst switches.  Apart from what may be ever-so-slightly higher backplane capacity that will never be used in my existing $dayJob environment anyway, there is one architectural difference between the 3750 and the 3560, and one architectural difference only: the ability to do a proper switch stack via the StackWise feature.

If one does not plan to stack the switches, there is no reason to shell out the extra money for a 3750 when the 3560 line will do the job more than adequately.  If the money’s already been spent, however, you can do some great things to simplify and optimize your network.  These include reduced service requirements (no need to run STP* or HSRP), simplified configuration and manageability (done only once on the master switch in the stack as opposed to two physical switches) and increased availability and load-balancing.  Of course, this precludes physical separation of the switches by distances greater than the longest available StackWise cable; but if it’s physical separation one wanted, it’s cheaper to do so with the 3560.

By running etherchannels (which can span multiple switches in the stack) to your up/downstream switches, you can make use of multiple links to your “core” stack instead of having the redundant links blocked by STP.  By definition, loops cannot form, reducing complexity associated with STP (especially in multi-vendor environments, with HP defaulting to MSTP, Cisco defaulting to proprietary PVST+, Dell defaulting to god knows what…), and more easily facilitating the spanning of VLANs across access-layer switches – generally not desired for your user-access, but sometimes preferred for server-access where layer-2 adjacencies are preferred for convenience/clustering’s sake.  This is best illustrated on page 18 of the document, with the next page illustrating the benefit in terms of number of active, traffic-passing links vs. an STP design.

Not running HSRP eliminates the risk of split-braining, which occurs with physically separated layer-3 switches when they lose connectivity to one another; both switches go active for the HSRP virtual IP address, but return traffic only goes towards one of the two switches, blackholing traffic destined for one of the switches.

And it’s certainly easy enough to do; assuming your existing separate 3xxx series each have links to the same upstream and downstream devices, just take one of ’em offline, configure the other with a nice high priority (“switch priority 15”), connect the “secondary” switch via the StackWise cables and power it up.  I recommend administratively disabling all the ports until the appropriate EtherChannels have been set up on all switches.

In summation, a quick general-usage guide for the current models of Catalyst switches, all of which come in PoE and non-PoE, and 12/24/48-port flavours of various speeds (10/100/1000); FYI, their QoS architectures and configuration are identical, differing only in buffer pool size and policer granularity (1Mbps policer intervals on the 2960, 8kbps intervals on the 3560/3750):

2960: Dumb layer-2 wiring closet switch.  Replaces the 2950.  No layer-3, no routing protocols, no VRF stuff, nothing.  Can be managed via IPv4/v6.

3560: Smarter layer-3 wiring-closet/distribution/core switch; successor to the 3550.  Can route IPv4 and v6 over all protocols.

3750: Smartest layer-3 1RU switch.  Exactly the same as the 3560, but with the StackWise feature to enable virtual switching — presenting multiple physical switches as a single switch.

3560-E and 3750-E: Same as their counterparts, but more backplane capacity (StackWise backplane doubles from 32 Gbps to 64 Gbps), dual AC power supplies, and support for 10GigE.

Save for the E-series, all of the above feature a single AC power supply, and a proprietary DC power input.  They can be fed redundant power via this DC input from the RPS 675 (EoS) or the RPS 2300.  Each RPS can feed up to six different devices; however, the RPS 675 can only actively back up one device.  The 2300 can provide redundant power for up to two.  If you load up 6 switches on a 2300 and all of their internal supplies fail, four of the switches are out of luck.

As far as dual AC supplies go (again, save for the E-series), one has to take a large step up to the 4500 and 6500-series chassis.  Each features the potential for in-box power and supervisor redundancy.  The 4500 is the relatively low-cost choice (of the two) for port density and minimal features; a fully loaded 6500 can hold most if not all of the internet routing table, all manner of line-cards for all manner of port types, a variety of service modules (load-balancing and SSL-terminating ACE modules, firewall modules, network analysis modules, etc), and a ton of features.  Not likely to see deployment in $dayJob any time soon, but the datacenter aggregation switch of choice in large environments.

*Yah, yah, you should still “run” STP so as to prevent accidental loops from forming, I know, I know: http://www.cisco.com/web/strategy/docs/gov/turniton_stpt.pdf

Posted in Miscellany, Switching

Browsing safely

Posted by qualityofservice on July 29, 2009

A lot of my real-life friends ask me how I go about securing my PC, mistakenly believing that because I’m a network guy, I also know something about security.  I know enough to secure the devices for which I’m typically responsible, but the home-user use-case tends to be so widely different and varied that the answer is always “it depends,” and there are infinite dependent variables.  It’s easy for me to secure my LAN against rogue DHCP servers, but less so to (cheaply, in terms of time and money) secure a family PC from a teenager who opens every attachment in their email, or clicks through every page on 4chan.  So take what follows with a grain of salt.  It’s also written with Windows in mind (in other words, spare me the “use linux!”).

As far as personal, home-use computing goes, I find the talk that goes on about the relative insecurity of my Windows Vista and XP boxes to be little more than histrionics.  If someone wants the pictures from my digital camera, it’s easier just to add themselves to the Ottawa, ON network on Facebook, and they can probably get a look at them that way.  There are lots of easier ways to steal information about a person than breaking into their PC, when we so readily make information about ourselves available publicly.

(And even once they’ve done that, what can happen, exactly?  Steal my banking information?  That’s not exactly “risky.” A phone-call to my bank, and the problem goes away.  At worst, I show up to a branch in person to verify my identity with all manner of official documentation.  Inconvenient, but hardly life-endangering.)

Want to get someone’s phone number?  Look them up on Facebook.  Check out their friends list.  Randomly send messages to people on the list, say that you’re an old friend trying to contact them for a high-school reunion, but they aren’t responding to your notes and you’re not sure if they check Facebook that often.  Use your imagination.

From there, you can start to draw the conclusion that security isn’t just technical;  it’s social.  It’s about who and what you trust.

But since not everyone can root through my trash for banking statements and hydro bills, remote compromise of my machine may be their only convenient option.  The goal isn’t complete and total security; as the saying goes, the only way to completely secure your PC is to turn it off.  The goal is to make accessing your PC as inconvenient as possible.  For many users, sticking their PC behind a broadband router provides a cheap form of firewalling that’s more than enough to protect them from outside threats.  Personally, I just turn on Windows Firewall and connect directly to my cable modem.  Never was one for trying to hide my PC behind a router.

Next up, the absolute minimum required to easily get around the internets is a browser.  I’m primarily a Firefox man; Google Chrome is incredibly fast and lightweight, but seems a bit lacking in the feature department.  I’ll confess to not giving it a true college try.

But here’s what I use:

Firefox: http://www.mozilla.com/en-US/firefox/personal.html

With the following addons:

AdBlock Plus: http://adblockplus.org/en/

NoScript:  http://noscript.net/getit

Install the add-ons, configure ABP to subscribe to the easylist filter, and that’s it.  NoScript is a bit annoying to work with at first, but you’ll soon get a feel for it.  Under Firefox’s menu, go Tools -> Options -> Advanced -> Update, and make sure everything is checked off to automatically check for and install updates to the browser and add-ons.

After that, configure Windows Update to check for updates frequently, and to download and install frequently.

I don’t run real-time signature-based AV e.g. Norton Anti-Virus, at all.  Vista takes up 800MB of RAM on its own; I don’t need a few million signatures adding another 250MB.  Every month, Microsoft releases an updated Malicious Software Removal Tool (MSRT website: http://www.microsoft.com/security/malwareremove/default.aspx) which runs in the background and is very quiet (as in, you don’t even know it’s running unless it finds something).  You can force it to run manually so that you can actually watch what it’s doing by going Start -> Run -> mrt.exe.

As a backup to the MSRT, I’ll occasionally run MalwareBytes’ Anti-Malware tool: http://www.malwarebytes.org/mbam.php

I run Threatfire (http://www.threatfire.com/) on its most sensitive settings.  It’s another one of those things that appears obnoxious at first, but you get used to it.

For generic cleanup, I run ccleaner (http://www.ccleaner.com/).  I check off just about everything possible, with the exception of “Wipe Free Space,” because that takes FOREVER.  I clear out all histories and saved-form information.  If you’re the kind of person who checks off “Have browser remember passwords,” you may find using this annoying.  But that’s the kind of person who makes themselves most vulnerable when sharing their PC/laptop with someone else.  If you’re incredibly serious about wanting to secure your information, clear out that crap and start getting better at memorizing complicated passwords.  You can configure the program to add itself to your Recycle Bin, so you can just right-click the bin and open it up.  Close all your browsers (so it can access and delete browser caches) and run it every few days.  Clean out everything, and do the “Clean Registry” step, too.

As a bonus, the program also includes a feature to uninstall programs and disable programs that start on bootup.  Check to see if there are programs listed that you don’t recognize, or those which you know don’t need to actually start on boot.  Disable them until needed.

Complicated passwords are another obvious thing;  the more valuable the information, the harder it should be to access.  If you protect it with a username and password of admin/admin, it obviously isn’t that valuable to you.

If you need to fill out a registration form or something, make up a gmail account that you’ll never actually use (I sign up to things using qos.recyclebin@gmail.com, for example, and if you need a gmail invite, let me know), and fill in all the registration forms with fake information unless absolutely necessary.

That way, you hide as many things as you can from being harvested, and you have a convenient place to find your account info when you use it to sign up for something.  If you’re trying to get vendor whitepapers from a place like http://techrepublic.com.com/, for example, do they REALLY need your work email and phone number?  Hell no.  Make some up.  The only reason you need a “real” throwaway email address is so that you can check it occasionally in the event that they require you to verify your email address.  You can publish this email address ANYWHERE and never worry because you know for a fact that there’s nothing useful in it anyway.

If you approach every form on the internet with the attitude that it might some day be used against you, you protect yourself against all manner of information harvesting.

Last and most importantly, I don’t install anything I don’t need, and I don’t open weird shit that I’m not expecting.

The more you add to anything, the more you have to protect/trust.  The larger your friends list, the higher probability of someone telling someone else something you don’t want them to know.  Do you really need 4-5 different cellphone related applications on your PC?  No, get them off there.  Limit your exposure to applications to only those which you can easily update, and update often.

And don’t bother giving someone’s chain-letter email due courtesy.  It doesn’t deserve it.  Especially if there’s some sort of weird attachment or link.  Most web-based email clients can disable the display of images or running of scripts, and have good anti-spam and malware practices established.   Don’t turn on the display of images or run weird attachments.  Chances are, you didn’t request it, ergo you’re unlikely to need it.

As mentioned at the beginning, I am hardly a security professional, and am quite amenable to comments and adjustments.  Happy internetting!

Posted in Miscellany, Security

Posted by qualityofservice on July 2, 2009

Light posting in the last little while due to a death in the family.

Will resume over the coming days with more than one could ever care to learn about QoS architecture of the Catalyst 2960/2970/3560/3750 series switches!

Posted in Miscellany

NetFlow

Posted by qualityofservice on June 20, 2009

NetFlow is a Cisco proprietary standard, soon to become (if it’s not already) an international standard in the form of IPFIX (http://www.ietf.org/html.charters/ipfix-charter.html). It tracks flows ingress into an interface and does accounting based on source/dest IP/port, TOS, originating autonomous system, and all manner of other cool things.  This info can be exported to central collectors which can store the data in a DB and mangle it as they see fit.

NetFlow is supported on any and all recent IOS routers (read as: 1800/2800/3800 ISR series, 7200/7600, etc).  Alas, no support on Catalyst dumb Layer-2 and multilayer switches outside of the 4500/6500 line, and even then it requires special hardware in the form of proper line cards/Supervisor Engine(s).

However, you can do a poor-man’s NetFlow by building a “probe” that accepts mirrored traffic from a SPAN port on a switch, and crafts its own NetFlow data from the observed traffic (see also: nTop).  Your mileage may vary depending on your IOS version; this note’s test router uses 12.4(15)T7.

You don’t need a collector to get some use out of the feature, though; it maintains a local cache and that’s what this note’s going to be about.  Quite easy to turn on:

interface FastEthernet0/1
ip address x.x.x.x y.y.y.y
ip flow ingress

Verification:

TEST-VPN-Hub-01#sho ip flow interface
FastEthernet0/1
ip flow ingress

Then turn on the top-talkers feature:

TEST-VPN-Hub-01#conf t
TEST-VPN-Hub-01(config)#ip flow-top-talkers
TEST-VPN-Hub-01(config-flow-top-talkers)#top 100

Then we get the option of viewing un-aggregated cache data, or aggregated cache data:

TEST-VPN-Hub-01#sho ip flow top-talkers ?

Display aggregated top talkers:
<1-100>  Number of aggregated top talkers to show

Display unaggregated top flows:
verbose  Display extra information about unaggregated top flows
|        Output modifiers

Un-aggregated provides a very granular view of flows stored in cache; one flow per source/dest IP/port and IP Protocol number (with protocol number and src/dst ports reported in very obnoxious hex), and by default sorted by bytes ingress to the interface:

TEST-VPN-Hub-01#sho ip flow top-talkers

SrcIf         SrcIPaddress    DstIf         DstIPaddress    Pr SrcP DstP Bytes
Tu110232      10.0.30.63      Fa0/1         192.168.141.81  06 170C 88ED  2074K
Tu110232      10.0.30.40      Fa0/1         192.168.141.71  06 0FD9 F727  1519K
Tu110232      10.0.30.140     Fa0/1         192.168.141.144 06 0BFE DB22  1275K
Tu110232      10.0.30.140     Fa0/1         10.1.250.81     06 0BFE B70E  1243K
Tu110232      10.0.30.140     Fa0/1         10.1.250.81     06 0BFE B70F  1242K
Tu110232      10.0.30.62      Fa0/1         192.168.141.80  06 170C 83ED   532K
Tu110232      10.0.30.140     Fa0/1         10.1.250.81     06 0BFE A204   340K
Tu110232      10.0.30.140     Fa0/1         192.168.141.144 06 0BFE D3D8   251K
Fa0/1         192.168.141.81  Tu110232      10.0.30.63      06 88ED 170C    69K
Fa0/1         192.168.141.80  Tu110232      10.0.30.62      06 83ED 170C    60K
Fa0/1         192.168.141.144 Tu110232      10.0.30.140     06 D3D8 0BFE    38K

Useful if you have a source that’s just pounding away; you can easily see where it’s coming from (and the interface through which it enters) and where it’s going (and the interface through which it leaves).
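Since the hex columns are a pain to read by eye, here is a small sketch that converts the Pr/SrcP/DstP fields back to decimal.  The function name and the (deliberately short) protocol table are my own; the sample values are taken from the first row of the output above.

```python
# A handful of IP protocol numbers; extend as needed.
IP_PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP", 47: "GRE", 50: "ESP", 51: "AH", 88: "EIGRP"}

def decode_flow_fields(pr_hex, srcp_hex, dstp_hex):
    """Turn the hex Pr / SrcP / DstP columns into (protocol, src port, dst port)."""
    proto_num = int(pr_hex, 16)
    protocol = IP_PROTOCOLS.get(proto_num, str(proto_num))
    return protocol, int(srcp_hex, 16), int(dstp_hex, 16)

# First row of the top-talkers output: Pr 06, SrcP 170C, DstP 88ED
print(decode_flow_fields("06", "170C", "88ED"))  # ('TCP', 5900, 35053)
```

Source port 5900 is the port conventionally used by VNC, which would fit a few large flows all coming from the same handful of hosts.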

Aggregated view allows you to aggregate the NetFlow data a whole bunch of different ways (I’ve cut a bunch of ways out for sake of brevity):

TEST-VPN-Hub-01#sho ip flow top-talkers 100 aggregate ?
bytes                  number of bytes
destination-address    Destination address
destination-interface  Destination interface
destination-port       Destination port
icmp                   ICMP type and code
ip-nexthop-address     IP nexthop address
max-packet-length      Maximum packet length
min-packet-length      Minimum packet length
packets                number of packets
source-address         Source address
source-interface       Source interface
source-port            Source port
tcp-flags              TCP flags

What follows are ways to find the hot destination ports from your router’s point of view:

TEST-VPN-Hub-01#sho ip flow top-talkers 100 aggregate destination-port sorted-by packets

There are 20 top talkers:

TRNS DST PORT       bytes        pkts       flows
=============  ==========  ==========  ==========
        35053     1638362        8922           1
        54232     1462512        4017           1
        33773      861529        3757           1
        63271     1161960        2904           1
        56098      950000        2609           1
        46862      916876        2518           1
        46863      916472        2516           1
         5900      110858        2226           2
            0      688278        1030          13
         2048      658800         549           1
         3070       12480         312           5
         4056        3492          70           1
         4057        2680          67           1
        57556        6804          67           1
        41476       15288          42           1
         3092        2860          35           3
          161        2556          35           3

Note that “Port 0” shows up in the above; I believe this may be related to packet fragmentation.  Non-initial fragments will not contain a transport-layer header; rather, they’ll simply have more transport-layer payload.  NetFlow can relate such a packet to a particular transport-layer protocol on account of the IP Protocol field of the IP packet (6 = TCP, 17 = UDP), but that’s as good as it can do without reassembling the entire packet.

Mind you, the traffic could also be IPSEC, which uses IP Protocol 50 or 51 for ESP or AH, respectively, and does not have port numbers for NetFlow to count.  This test bed was also running EIGRP and GRE tunnels; this traffic may have also been counted as “Port 0” traffic.

And to see some equally hot source hosts:

TEST-VPN-Hub-01#sho ip flow top-talkers 100 aggregate source-add sorted-by packets

There are 25 top talkers:

IPV4 SRC ADDR         bytes        pkts       flows
===============  ==========  ==========  ==========
10.0.30.63          1758749        9609           1
10.0.30.140         3161180        8681           5
10.0.30.62           996875        4319           1
10.0.30.40          1266040        3226           5
192.168.141.80       121738        2444           1
10.1.250.81           35960         899           3
192.168.139.66       990000         825           1
192.168.139.129      988800         824           1
192.168.141.144       24640         616           2
192.168.141.81        22520         451           2
192.168.141.71        12372         309           2
192.168.191.234       19008         288           1
192.168.191.242        9900         150           1
192.168.141.70         3944          81           2
192.168.141.66         3360          56           1
192.168.141.65         3300          55           1
192.168.191.238        2508          38           1
192.168.141.70         1680          28           1
192.168.141.76         1680          28           1
192.168.141.75         1680          28           1
192.168.141.71         1620          27           1
192.168.141.72         1620          27           1
192.168.191.230        1650          25           1
10.1.40.169              72           1           1

The command “show ip cache flow” also produces interesting results, including timers associated with the flow cache.

TEST-VPN-Hub-01#sho ip cache flow

IP packet size distribution (26090 total packets):
1-32   64   96  128  160  192  224  256  288  320  352  384  416  448  480
.001 .500 .155 .007 .005 .005 .006 .007 .007 .007 .006 .204 .045 .004 .004

512  544  576 1024 1536 2048 2560 3072 3584 4096 4608
.003 .002 .002 .012 .008 .000 .000 .000 .000 .000 .000

IP Flow Switching Cache, 278544 bytes
60 active, 4036 inactive, 675 added
29520 ager polls, 0 flow alloc failures
Active flows timeout in 30 minutes
Inactive flows timeout in 15 seconds
IP Sub Flow Cache, 25800 bytes
0 active, 1024 inactive, 0 added, 0 added to flow
0 alloc failures, 0 force free
1 chunk, 0 chunks added
last clearing of statistics 00:07:23

Protocol         Total    Flows   Packets Bytes  Packets Active(Sec) Idle(Sec)
--------         Flows     /Sec     /Flow  /Pkt     /Sec     /Flow     /Flow
TCP-other          338      0.7        29   155     22.6      18.6       9.6
UDP-NTP             37      0.0         1    76      0.0       0.0      15.4
UDP-other          184      0.4        13    75      5.8       6.2      15.5
ICMP               100      0.2         2   757      0.5       0.5      15.6
Total:             659      1.4        19   151     29.1      11.4      12.5

From the above output, you can see that flows will age out of the cache 15 seconds after data associated with the flow stops flowing.  You can test this by pinging something through the router (in my tests, locally-originated ICMP traffic was not counted by NetFlow, but there’s a chance I may have just been doing it wrong), and filtering the output of “show ip flow top-talkers” or “show ip cache flow”, until there’s been enough transferred data associated with the flow for it to work its way into the cache.

Then stop the ping.  15 seconds later, the flow won’t be there anymore; so by definition, flows that have accumulated a lot of traffic have been active for a very, very long time.  This technique is incredibly handy for tracking DoS activity; if you’re able to log into a terminal, you can work backwards to find the source address and input interface of potential DoS’ers, misbehaving hosts, etc.  Taken to its logical conclusion – assuming cooperation with a supportive and clueful ISP — you can even trace a spoofed IP address back to its real source. How this would be accomplished is left as an exercise for the reader.

There’s also a packet-size histogram; from the above, you can deduce that 50% of the packets transiting the router are between 32-64 bytes; 15.5% are between 64-96 bytes; and 20% are between 352-384 bytes.

Over at $dayJob, I use http://www.plixer.com/products/free-netflow.php to keep track of a day’s worth of NetFlow data; for a free tool, it’s incredible for providing point-in-time analysis of application use on my network.  As they say,  in network analysis, there is no substitute for knowing your network.  While longer-term analysis would be ideal, I don’t have long-term enterprise NetFlow collection in my budget, nor the time to build out my own; though after you’ve kept a watchful eye on links for a few weeks, you start to see patterns, and deviations from that pattern should be either easily explained or quickly investigated.

Posted in Management, Security

Monitoring/managing logins and config changes with IOS

Posted by qualityofservice on June 9, 2009

For the purposes of this note, I’m going to pretend Telnet doesn’t exist.  Most of the stuff applies regardless of whether you use it or not, but I’m happier working under the assumption that all VTY configs look like this:

line vty 0 15
 transport input ssh

I’m going to digress already and say that it’s a good idea to restrict access to certain networks:

line vty 0 15
 access-class 101 in
 transport input ssh

Some go a step further and reserve the last VTY line as a last resort in the event that the other 15 are occupied by someone with less-than-benevolent purposes; that way, the host(s) specified in ACL 102 can still manage the router:

line vty 0 14
 access-class 101 in
 transport input ssh

line vty 15
 access-class 102 in
 transport input ssh

But back to the point.  In the early days of IOS 12.3, they introduced the “login” command-set for login security enhancement (http://www.cisco.com/en/US/docs/ios/12_3t/12_3t4/feature/guide/gt_login.html).

login block-for 60 attempts 3 within 60
login delay 3
login on-failure log
login on-success log

This gives you three chances to pass the test before the router blocks all logins for 60 seconds (this period is called “quiet mode”).  There’s also a 3-second delay between attempts.  This mitigates someone throwing the kitchen sink at your device; it takes them 9 seconds just to try three times, and they can only do so once a minute without changing IP addresses.  A “quiet mode” list can be configured to allow certain hosts to get around these restrictions; this is a good idea, because someone spamming login attempts can lock you out, and it’s a race to log in when quiet time ends.  Luckily, the “on-failure log” will tell you which IP address is responsible for the attack. Info on configuring quiet-mode bypass is in the documentation linked at the end of this note.
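To get a feel for how “block-for 60 attempts 3 within 60” plays out over time, here is a rough Python model. It is a sketch under my own assumptions (a simple sliding window, and failures during quiet mode not being counted because logins are refused); the function name is hypothetical and the real IOS bookkeeping may differ.

```python
def quiet_periods(failure_times, attempts=3, within=60, block_for=60):
    """Return (start, end) quiet-mode windows for failed-login times.

    `failure_times` is a sorted list of seconds. Simplified model only.
    """
    periods = []
    recent = []          # failures still inside the sliding window
    quiet_until = None
    for t in failure_times:
        if quiet_until is not None and t < quiet_until:
            continue     # logins are refused in quiet mode; nothing is counted
        recent = [f for f in recent if t - f < within] + [t]
        if len(recent) >= attempts:
            quiet_until = t + block_for
            periods.append((t, quiet_until))
            recent = []
    return periods


# Three failures 3 seconds apart (the "login delay 3"), then another burst:
print(quiet_periods([0, 3, 6, 10, 70, 73, 76]))  # [(6, 66), (76, 136)]
```

Failures spread further apart than the window (say at 0, 40 and 80 seconds) never trip quiet mode in this model.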

Of course, the problem with this is that by default, IOS will let you attempt four SSH logins before terminating the session.  You can fix that, too. I use this:

ip ssh authentication-retries 2
ip ssh logging events
ip ssh version 2

“authentication-retries” is, literally, retries.  It lets you make two additional attempts after the first failed attempt; hence three in total, which matches up with the three attempts before you’re locked out for a minute, configured above in the “login” section. “Version 2” forces clients to use SSHv2; SSHv1/v1.5 has been considered insecure and deprecated for well over a decade.

Finally, the built-in config-change archiver/logger:

archive
 log config
  logging enable
  logging size 200
  notify syslog
  hidekeys

This will take any change made in config mode, save a copy of said change to a small local buffer, and spit it out to syslog. “hidekeys” keeps sensitive info obscured (syslog packets being unencrypted and all).  How many times have you asked yourself “well, what’s changed?”  This lets you know in real-time.

All this and more over at the IOS Security Configuration Guide and Command Reference, which can be found here for IOS 12.4: http://www.cisco.com/en/US/docs/ios/security/configuration/guide/12_4/sec_12_4_book.html

Whole bunch of examples below the cut!


Posted in Management, Security

What is a “slow link”?

Posted by qualityofservice on May 21, 2009

Cisco’s slightly-outdated QoS SRND (http://www.cisco.com/univercd/cc/td/doc/solution/esm/qossrnd.pdf) refers to slow, medium, and fast-speed links.

What’s the distinction? Why is anything less than 768kbps considered “slow”?

Well, one of the big reasons is serialization delay.  This is somewhat covered in the SRND, but I wanted to highlight some of the key reasoning behind the distinction, as it’s applicable regardless of how you define “slow.”

Slow links suck, on that everyone can agree, but for real-time traffic (voice and interactive video, for example) they suck even more than only being able to torrent at 80 KB/s.

First, a few constants, some of which influence the “slow” characterization more so than others, and are specified for informational purposes only:

Well-designed VoIP expects a one-way delay (mouth of speaker to ear of receiver) of no more than approx 150ms.  This is largely going to be a function of processing and propagation delays along the link; as link speeds increase, the effects of serialization delay on the total delay budget are reduced.

Second, the standard G.729 codec specifies 10ms of voice per frame.  Cisco IP phones pack two G.729 frames into each VoIP packet, for a total of 20ms of voice carried per packet; I’m focusing exclusively on the G.729 codec as it is the de-facto standard for VoIP over the WAN, due to its compression algorithm producing an 8kbps voice stream (before IP overhead is taken into account) compared to the 64kbps of G.711.

G.729 codecs can compensate for approx. 30ms of lost voice.  Given that a single VoIP packet contains 20ms, a loss of two consecutive VoIP packets (40ms worth of voice) will be noticed by the receiver.  Naturally you want to avoid losses, but more so in a real-time voice environment than any other.  This is what Low-Latency Queuing is for, but it will not be the focus of this note.

Third, IP phones expect a relatively constant stream of voice packets.  Non-constant delays experienced along the path produce jitter; if one packet takes 160ms to get to the receiver, and the following packet takes 170ms, the stream has experienced 10ms of jitter.  DSPs in IP phones can compensate for approx 40ms of jitter by buffering received packets and replaying them to the receiver at a constant rate.  For this reason, you always want to have packets falling in the window of the jitter buffer (20ms to 50ms is a decent target for the IP phones I’ve read about).
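For anyone who wants to play with the constants above, here is a small Python sketch. The 8kbps rate, the 20ms of voice per packet and the 160ms/170ms jitter example come straight from the text; the 40 bytes of header overhead (uncompressed IPv4 + UDP + RTP, no layer-2 framing) is my own assumption, and the function names are purely illustrative.

```python
G729_BITRATE_BPS = 8000      # codec output before IP overhead (from the text)
VOICE_PER_PACKET_MS = 20     # two 10ms G.729 frames per packet (from the text)
HEADER_BYTES = 20 + 8 + 12   # assumption: uncompressed IPv4 + UDP + RTP headers


def g729_payload_bytes():
    """Bytes of encoded voice carried in each packet."""
    return G729_BITRATE_BPS * VOICE_PER_PACKET_MS // 1000 // 8


def g729_l3_bandwidth_bps():
    """Per-call bandwidth at layer 3, one direction, under the header assumption."""
    packets_per_second = 1000 // VOICE_PER_PACKET_MS
    return (g729_payload_bytes() + HEADER_BYTES) * 8 * packets_per_second


def jitter_ms(one_way_delays_ms):
    """Delay variation between each pair of consecutive packets."""
    return [abs(b - a) for a, b in zip(one_way_delays_ms, one_way_delays_ms[1:])]


print(g729_payload_bytes())      # 20
print(g729_l3_bandwidth_bps())   # 24000
print(jitter_ms([160, 170]))     # [10]
```

So the 8kbps stream roughly triples once headers are added, which is worth keeping in mind when sizing a slow link.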

So, given the above, what makes a 768kbps link slow?

Well, you want to stay within the jitter-buffer window, which means approx 30ms of jitter.  Cisco’s speed characterization is based on the assumption of 10ms of jitter per hop.

If we take one extreme, a link that carries a single voice call experiences very low jitter, as there is extremely low delay between voice packets:

----------[voice][voice][voice] ----------->

But what happens when you throw a large, unfragmented data packet into the mix?

----------[voice][------Data Payload-----][voice] ----------->

If that data packet is 1500 bytes, and the line speed is 768kbps, it takes just over 15ms to clock that packet onto the physical line (warning: math!):

(1500*8) = 12000 bits/data-packet

12kb / 768kbps = 15.625ms

Thus, the 2nd voice packet in the stream experiences over 15ms of jitter.  This is fine if you have a site-to-site WAN link, but less fine if your voice path transits multiple routers, as it significantly impacts your jitter and delay budget.
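The same arithmetic as a tiny Python helper, in case you want to plug in your own packet sizes and link speeds. The function name is mine; the formula is just bits divided by line rate.

```python
def serialization_delay_ms(packet_bytes, link_kbps):
    """Time to clock one packet onto the wire, in milliseconds."""
    bits = packet_bytes * 8
    return bits / link_kbps      # bits / (kbit/s) works out to milliseconds


print(serialization_delay_ms(1500, 768))            # 15.625
print(round(serialization_delay_ms(1500, 56), 2))   # 214.29
```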

Remember, these numbers are based on Cisco’s assumptions.  If you can get within ~30ms jitter while going through a single slow link and multiple extremely fast links (with very little serialization delay), then the number of routers you transit becomes less of a concern; whereas Cisco’s numbers are based on a three-hop path experiencing 10ms of jitter per hop.
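A quick way to sanity-check a path against the ~30ms window is to add up the per-hop serialization delays. The sketch below is a deliberately pessimistic model of my own (a voice packet gets stuck behind exactly one full-size data packet on every link, and nothing else contributes jitter); the names and the example link speeds are illustrative, not from Cisco’s documentation.

```python
JITTER_BUDGET_MS = 30  # approximate jitter-buffer window from the text


def worst_case_path_jitter_ms(link_speeds_kbps, data_packet_bytes=1500):
    """Sum of per-hop serialization delay for one full-size data packet per link."""
    return sum(data_packet_bytes * 8 / kbps for kbps in link_speeds_kbps)


one_slow_two_fast = worst_case_path_jitter_ms([768, 100_000, 100_000])
three_slow = worst_case_path_jitter_ms([768, 768, 768])

print(round(one_slow_two_fast, 1), one_slow_two_fast <= JITTER_BUDGET_MS)  # 15.9 True
print(round(three_slow, 1), three_slow <= JITTER_BUDGET_MS)                # 46.9 False
```

One slow link plus a couple of fast ones stays inside the window; three 768kbps hops in a row does not.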

The important thing to take away is that you want to stay within the window, and that at low speeds, serialization delays profoundly impact your jitter budget.  See the following spreadsheet for reference:

Serialization delay (ms); rows are link bandwidth in kbps, columns are packet size in bytes

kbps            20      40      80     128     256     512    1024    1200    1500
56            2.86    5.71   11.43   18.29   36.57   73.14  146.29  171.43  214.29
128           1.25    2.50    5.00    8.00   16.00   32.00   64.00   75.00   93.75
256           0.63    1.25    2.50    4.00    8.00   16.00   32.00   37.50   46.88
512           0.31    0.63    1.25    2.00    4.00    8.00   16.00   18.75   23.44
768           0.21    0.42    0.83    1.33    2.67    5.33   10.67   12.50   15.63
1024          0.16    0.31    0.63    1.00    2.00    4.00    8.00    9.38   11.72
1544          0.10    0.21    0.41    0.66    1.33    2.65    5.31    6.22    7.77
4632          0.03    0.07    0.14    0.22    0.44    0.88    1.77    2.07    2.59
10000         0.02    0.03    0.06    0.10    0.20    0.41    0.82    0.96    1.20
100000        0.00    0.00    0.01    0.01    0.02    0.04    0.08    0.10    0.12
1000000       0.00    0.00    0.00    0.00    0.00    0.00    0.01    0.01    0.01
10000000      0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00

Note that above 768kbps, serialization delay stays under roughly 15ms even for a 1500-byte packet; at these speeds, it becomes less of a concern.

There are a few things you can do to overcome the serialization limitations imposed by slow-speed links, and there are good reasons why [some of] these same techniques should NOT be employed on high-speed links; these will be covered in a future note.

Right now, I’ve got a buddy’s 30th birthday to attend and I plan on getting drunk enough to forget most of this.

Posted in QoS

Add a little flash (to your IOS router)

Posted by qualityofservice on May 19, 2009

Can’t believe I’ve never played with these before, they’re brilliant.   12.4T Advanced IP Services images are over 32MB in size and it’s not possible to store two different images on the same stock flash drive, which introduces a risk when remote upgrades are required.  If an upgrade goes bad, there are some sites where I can count on remote hands capable of solid support; others, I’m not so fortunate.  So all the remote sites are getting USB keys, now, which will do more for my ability to keep my sites consistent and stable than any other measure implemented in my three years in this position.

 The ISR routers come with a USB port.  Insert USB stick, router recognizes it immediately. 

Did a “format usbflash0:” and it was ready to go.  TFTP’d an image, and set it to boot from the USB stick with “boot system usbflash0:[imagename]”, rebooted, and came back up on an upgraded image.  Removed the memory key, rebooted, and it ignored the “boot system” specification and booted back into the old image from flash.

Copied the old image from flash onto the USB stick (“copy flash:[oldimage] usbflash0:”), deleted the old image from flash, copied the new image to flash, and done.  Known working image in flash, and both old and new images stored on the USB stick.  In my case, an 1841 recognized a 4GB USB key, which provides 64x the image storage capacity of the default 64MB of Flash that ships with the ISR bundles I order.

 No need to worry about a reboot leaving you high-and-dry mid-upgrade after you’ve removed an old image to make room for the new one; which should remove any reticence to keeping IOS images current.  Just copy to USB and boot from the stick, first (caveat: takes about 220 seconds to load a 36MB image from USB into RAM on an 1841; takes about 120 seconds to load the same image from flash).  Worst case, you fall back to a known good image in flash.

 For the security conscious, yes, this opens up the ability to have someone stick their own file onto the USB key and somehow get your router to load it; but if they have the physical access to permit them to do this in the first place, it’s simpler for them to just reboot into password recovery mode and do whatever they like.

Caveats: Cisco will sell you their own USB keys, but they’re about $300 after discount to add 256MB (part number: MEMUSB-64/128/256FT); I’d rather pay $10 to add 4GB.  I’ve only tested this with a Kingston DataTraveler stick; YMMV.  I also move the “new” image to Flash once I’m ready to go into production with it; the risk being that if you find yourself having to work through a TAC case and they notice that you’re booting from a non-Cisco flash, they may tell you to suck rocks, which is a risk I’m willing to take in order to be able to test and upgrade on my own terms.

Posted in Awesome, Management

Eponymous.

Posted by qualityofservice on May 18, 2009

I’m currently studying/practicing for the Cisco 642-642 QoS exam, and I’ve gotta say, it’s opened up an entirely new toolset for me.  I’m but a mere enterprise admin, but I’ve seen a lot of routers and switches in my day, and it’s rare that I’ve seen anything (configuration-wise) that’s truly difficult.  We’ve all got our own little configuration quirks (given a device with a legacy configuration at $dayJob, I can likely tell you who configured it within a 95% confidence interval*) that we’ve either picked up from our peers, or found on our own and thought “neat, I’m going to try that everywhere!”, but this is the first “feature” I’ve seen that requires some truly in-depth planning.  I’ve no doubt this perceived difficulty contributes to my never having seen it in production, so I’ll likely spend some time in an upcoming post ruminating over some of the reasons for or against QoS deployment in enterprise networks.**

But first up, a quick note on what actually constitutes a “slow link,” as far as Cisco documentation is concerned.  I say Cisco doc, but there is math involved and I believe the concepts to be vendor-independent.

But before that, there’s a hockey game on. ^_^

*We still have some routers that are explicitly denying IP protocols 53, 55, 77, 103 ingress on all interfaces, for anyone with memories stretching back a few years: http://www.cisco.com/warp/public/707/cisco-sa-20030717-blocked.shtml

**I make the distinction because those with service provider backgrounds have the luxury of bandwidth; bandwidth can solve any QoS-related problem, but I think there’s still a home for the concepts in MPLS VPN networks.

Posted in Deprecated practice, QoS

1841 Modules.

Posted by qualityofservice on May 6, 2009

I’m putting this here for my own reference.  I forget this all the time.  Stemmed from an argument with another admin who insisted that his Cisco SE told him that an 1841 supported FXS/FXO and E&M modules.

FAQ: http://www.cisco.com/en/US/prod/collateral/routers/ps5853/prod_qas0900aecd80181208.html

Module support: http://www.cisco.com/en/US/prod/collateral/routers/ps5853/product_data_sheet0900aecd8016a59b.html (bottom of page)

Long story short: supports wireless and every WAN under the sun (DSL/Cable/T1/E1/ISDN/Serial); no support for voice cards.

Voice can transit it like any other data packet, but the router itself cannot terminate voice circuits.

If the above is untrue, then Cisco’s documentation is woefully out of date.

Posted in Miscellany