Quality of Service

Any sufficiently advanced incompetence is indistinguishable from malice.

Quis custodiet ipsos custodes?

Posted by qualityofservice on September 2, 2008

I’ve been playing around with a lot of bleeding-edge IOS releases lately. My newest trick is based on the Embedded Resource Manager (ERM) documentation in Cisco’s network management configuration guide: http://www.cisco.com/en/US/docs/ios/netmgmt/configuration/guide/nm_erm_resource_ps6441_TSD_Products_Configuration_Guide_Chapter.html

 

Ever want to immediately pinpoint a transient CPU spike?  Consistently high utilization is one thing, but quick spikes are less obvious and tend not to show up on graphs averaged over 5-minute intervals.  The following is running on an 1841 with IOS 12.4(15)T7:

 

resource policy
  policy HighCPU type iosprocess
   system
    cpu process
     critical rising 40 falling 25
     major rising 20 falling 10
    !
   !
  !
  user group Hogger type iosprocess
   instance "IP Input"
   policy HighCPU
  !
snmp-server enable traps resource-policy
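The rising and falling values in the policy above behave like hysteresis thresholds: an alert trips once utilization climbs past the rising value and only clears once it drops below the falling value. Here is a minimal Python sketch of that idea. It is an illustration, not how IOS implements it; the class and function names, the sample values, and the strict comparisons are all my assumptions, and it ignores any sampling intervals the router applies.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    """One severity level with a rising (trip) and falling (clear) limit."""
    name: str
    rising: int
    falling: int
    active: bool = False

    def update(self, cpu_percent):
        """Return 'RISING', 'FALLING', or None for one utilization sample."""
        # Assumption: trip strictly above rising, clear strictly below falling.
        if not self.active and cpu_percent > self.rising:
            self.active = True
            return "RISING"
        if self.active and cpu_percent < self.falling:
            self.active = False
            return "FALLING"
        return None


def run_policy(samples, thresholds):
    """Feed each sample through every threshold; collect (sample, level, event)."""
    events = []
    for cpu in samples:
        for threshold in thresholds:
            event = threshold.update(cpu)
            if event:
                events.append((cpu, threshold.name, event))
    return events


# Limits copied from the HighCPU policy.
policy = [Threshold("critical", 40, 25), Threshold("major", 20, 10)]
# Hypothetical samples: idle, a spike, a dip that stays above 10%, then idle.
samples = [3, 25, 15, 22, 0]
events = run_policy(samples, policy)
for cpu, level, event in events:
    print(f"cpu={cpu}% {level} {event}")
```

With these samples the major alert trips once at 25% and clears once at 0%. The dip to 15% does not clear it, which is the point of having the falling value sit well below the rising one.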

 

The thresholds are CPU utilization percentages, and the specific values are arbitrary.  I’ll still use them in production, since our routers are heavily oversized for the kind of work they have to do.  To test the policy, I hammered the router’s control plane with 100 Mbps of ICMP aimed at one of its interfaces.  Here’s what happened immediately:

 

000247: *Sep  2 2008 16:38:28.991 UTC: %SYS-4-CPURESRISING: Resource group Hogger is seeing local cpu util 25% at process level more than the configured major limit 20 %

 

Then immediately after I stopped:

 

000278: *Sep  2 2008 16:39:33.971 UTC: %SYS-6-CPURESFALLING: Resource group Hogger is no longer seeing local high cpu at process level for the configured major limit 10%, current value 0%

 

ERM is phenomenal; it basically lets your router monitor itself.  The policy above generates a warning-level (severity 4) syslog message when the alert triggers and an informational-level (severity 6) message when it resets.
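If you collect these messages on a syslog server, pairing each rising message with the next falling one gives you the length of the spike. The Python sketch below does that for the two messages shown above. The regular expression is fitted only to those two lines, so treat the field layout as an assumption (other IOS releases or timestamp settings may format things differently), and the function names are made up for illustration. The month parsing also assumes an English locale.

```python
import re
from datetime import datetime

# The two messages from the test, copied verbatim.
LOG_LINES = [
    "000247: *Sep  2 2008 16:38:28.991 UTC: %SYS-4-CPURESRISING: Resource group Hogger is seeing local cpu util 25% at process level more than the configured major limit 20 %",
    "000278: *Sep  2 2008 16:39:33.971 UTC: %SYS-6-CPURESFALLING: Resource group Hogger is no longer seeing local high cpu at process level for the configured major limit 10%, current value 0%",
]

# Assumed layout: "*Mon D YYYY HH:MM:SS.mmm UTC: %SYS-<sev>-<mnemonic>: Resource group <name>"
PATTERN = re.compile(
    r"\*(?P<ts>\w{3}\s+\d+ \d{4} \d{2}:\d{2}:\d{2}\.\d{3}) UTC: "
    r"%SYS-(?P<severity>\d)-(?P<mnemonic>CPURES(?:RISING|FALLING)): "
    r"Resource group (?P<group>\S+)"
)

# Standard syslog severity names for the two levels seen here.
SEVERITY_NAMES = {4: "warning", 6: "informational"}


def parse(line):
    """Return a dict describing one CPURES message, or None if it doesn't match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    # Collapse the padded day ("Sep  2") so strptime sees single spaces.
    ts_text = " ".join(match["ts"].split())
    severity = int(match["severity"])
    return {
        "time": datetime.strptime(ts_text, "%b %d %Y %H:%M:%S.%f"),
        "severity": SEVERITY_NAMES.get(severity, str(severity)),
        "mnemonic": match["mnemonic"],
        "group": match["group"],
    }


def spike_durations(lines):
    """Pair each rising message with the next falling one for the same group."""
    open_since = {}
    spikes = []
    for line in lines:
        event = parse(line)
        if event is None:
            continue
        group = event["group"]
        if event["mnemonic"] == "CPURESRISING":
            open_since[group] = event["time"]
        elif group in open_since:
            start = open_since.pop(group)
            spikes.append((group, (event["time"] - start).total_seconds()))
    return spikes


spikes = spike_durations(LOG_LINES)
for group, seconds in spikes:
    print(f"{group}: alert active for {seconds:.2f} s")
```

For the test above, the Hogger alert was active for roughly 65 seconds, which matches how long I kept the ICMP flood running.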

 

Next on the list is Control-Plane Policing, which mitigates the effect of someone blasting your router’s interfaces the way I did for this test.  (That isn’t to say a sufficiently motivated user couldn’t simply point the firehose at your link and fill it with DoS traffic, but that’s a separate issue best dealt with by controls elsewhere.)  =P

 

2 Responses to “Quis custodiet ipsos custodes?”

  1. Trolan said

    So the switch ran into Hogger and died?

  2. qualityofservice said

    It’s only an 1841, after all. That’s like sending a Catalyst 2950 to do battle with an XMR lololol.

    God I hate myself.
