Friday, July 26, 2013

The joy of supervisor replacement

Every once in a while you have to replace a supervisor card. The issue then is what code is on the old supervisor and whether you do (or can) upgrade the code before using it. One thing you have to remember is that different versions of code support different line cards, so an old version of code may not support your newer cards.

The other issue (and I am going to do a script for this) is that ports default to admin down, and no shut does NOT appear in the config, only shutdown. So now you have to figure out which ports and VLANs were up before the sup swap and bring them up. I will do a Perl script to read a config and insert a no shut command on any interface that does not have a shutdown in the config.
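A minimal sketch of that idea, written here in Python rather than Perl. It assumes the saved config uses the usual layout where each stanza starts with an unindented "interface ..." line and its commands are indented underneath; the function name add_no_shutdown and the two-space indent are made up for illustration. It has not been run against a real supervisor config, so treat it as a starting point.

```python
def add_no_shutdown(config_text, indent="  "):
    """Append 'no shutdown' to every interface stanza that has no
    explicit 'shutdown' (or 'no shutdown') line. Assumes stanza
    commands are indented under an unindented 'interface ...' line."""
    out = []
    in_stanza = False   # currently inside an interface stanza
    seen_shut = False   # stanza already has shutdown / no shutdown

    for line in config_text.splitlines():
        indented = line[:1] in (" ", "\t")
        # Any unindented line (including a blank one) ends the stanza.
        if in_stanza and not indented:
            if not seen_shut:
                out.append(indent + "no shutdown")
            in_stanza = False
        if line.startswith("interface "):
            in_stanza = True
            seen_shut = False
        elif in_stanza and line.strip() in ("shutdown", "no shutdown"):
            seen_shut = True
        out.append(line)

    # Close the last stanza if the file ends inside one.
    if in_stanza and not seen_shut:
        out.append(indent + "no shutdown")
    return "\n".join(out) + "\n"
```

Feed it the text of the config saved before the swap and paste the result onto the new sup. Because SVIs also start with "interface", an "interface Vlan..." stanza gets the same treatment as a physical port.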

Of course, getting a show interface brief before and after and doing a WinDiff goes a long way to ensure that you are back to where you were.
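If WinDiff is not handy, the same before/after comparison can be sketched with Python's standard difflib module. The function name changed_lines is hypothetical, and the inputs are assumed to be the raw text of the two captures.

```python
import difflib

def changed_lines(before_text, after_text):
    """Return only the lines that differ between two captures,
    prefixed with '-' (only in before) or '+' (only in after)."""
    diff = list(difflib.unified_diff(
        before_text.splitlines(),
        after_text.splitlines(),
        fromfile="before",
        tofile="after",
        lineterm="",
        n=0,  # no context lines, just the changes
    ))
    # The first two entries are the '--- before' / '+++ after' headers;
    # '@@' hunk markers are dropped by the startswith filter.
    return [l for l in diff[2:] if l.startswith(("+", "-"))]
```

An empty list means the two captures match line for line. Anything else is a port whose state changed across the swap.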

Monday, July 22, 2013

A firewall with permit any any doesn't always permit any any

Most of the time we think of firewalls as packet filtering devices, so when we are bringing up new services, DMZs, or security partitions and we hear that all traffic is permitted, we assume that all traffic will pass.

Of course this is not right; there are a few other things to check.

Firewalls have to route, and most of the time the routing is static, so if someone forgets to add the routes you need, traffic does not forward.

Also, depending on the vendor, firewalls will have TCP session timers (to prevent certain DDoS attacks). In our case some sessions were timing out because the firewall had a different timer than the application.

They may also block UDP or ICMP by default (Cisco traceroute uses UDP but Microsoft tracert uses ICMP), so a traceroute from your PC works but one from a network device does not.

Finally, some firewalls will switch traffic in hardware, but you have to TELL them to do that. In one case Citrix turned on the session recovery feature, which uses a different TCP port than normal, and the new port was not configured for hardware acceleration, so CPU hit 85% and response times got slow. To get out of this we had to enable the hardware acceleration and then kill all the active connections.

Like I said, permit any any is not the end of your firewall issues.

Friday, July 19, 2013

Forcing a port channel in NX-OS

We had RMAed an N7K-M132XP-12 and replaced it with an N7K-M132XP-12L card. We had ports that were part of a port channel, and we kept getting port inconsistent. TAC told us that because we had a mix of L and non-L cards we needed to do a channel-group force:
channel-group number [ force ] [ mode { active | on | passive }]
That worked, but it placed the active list of VLANs directly on the interface, which was not the case with the non-L cards. Not sure what happens if we add more VLANs to the trunk channel.

non L card
interface Ethernet1/9
channel-group 24
no shutdown

L card

interface Ethernet2/9
switchport trunk allowed vlan 1-3967,4048-4093
channel-group 24

Thursday, July 18, 2013

What is this?

I am doing this blog to record the various odd things that I encounter in my deployment and troubleshooting work. Much of this is for myself, but I hope some of it benefits others.

vpc peer switch

The doc below says do not use peer switch in a mixed environment:

Peer-switch feature is supported on networks that use vPC and STP-based redundancy is not supported. If the vPC peer-link fail in a hybrid peer-switch configuration, you can lose traffic. In this scenario, the vPC peers use the same STP root ID as well same bridge ID. The access switch traffic is split in two with half going to the first vPC peer and the other half to the second vPC peer. With the peer link failed, there is no impact on north/south traffic but east-west traffic will be lost (black-holed).

This sets the bridge ID on both switches to the system ID for EVERY VLAN, including the non-vPC ones. As long as you set a priority, the non-vPC switches at least work.

But on the interswitch routing link, where you often forget to set a root bridge (after all, it is only point to point), you get the exact same bridge IDs and one side of the VLAN goes Back BLK, which means that loopguard fired.

Removed peer-switch and all was well.

Saturday, July 6, 2013

The fun of address reuse

Sometimes you get address space from, say, an acquisition and you then migrate those servers to your infrastructure. Great, now I have some public address space I can use for my BYOD stuff, or the 'internet accessible' labs that people seem to want, and other things. BUT do not forget the DMZ you first set up for those acquired servers, which still has that address space configured inside it. You are real sure it's empty: all the servers are gone, nothing pings or is in the ARP table. But you are never sure about deleting the subnet, because you think there is always 1 application that only runs once a year or something like that. Then you reuse that address space and you push traffic into the DMZ that still has the address space configured on an interface. OOPS.

The net of this is: get a process to give people fair warning and clean up unused SVIs. One option is to do that cleanup every 6 months, build it into your change process, and actually do it.
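Part of that cleanup can be checked mechanically before the address space is reused. The sketch below, using Python's standard ipaddress module, scans a saved config for interfaces whose subnet overlaps the prefix you are about to hand out. The function name find_overlaps and the documentation prefixes used with it are made up for illustration. It assumes IPv4 only and "ip address" lines indented under an "interface ..." line, in either the address-plus-mask form or the address/length form.

```python
import ipaddress
import re

# Matches an indented 'ip address A.B.C.D M.M.M.M' or 'ip address A.B.C.D/len'.
ADDR_RE = re.compile(r"^\s+ip address (\S+)(?:\s+(\S+))?")

def find_overlaps(config_text, reused_prefix):
    """Return (interface, subnet) pairs whose configured subnet
    overlaps reused_prefix, e.g. '198.51.100.0/24'."""
    target = ipaddress.ip_network(reused_prefix)
    hits = []
    current = None  # name of the interface stanza we are inside
    for line in config_text.splitlines():
        if line.startswith("interface "):
            current = line.split(None, 1)[1]
            continue
        if line[:1] not in (" ", "\t"):
            current = None  # an unindented line ends the stanza
            continue
        m = ADDR_RE.match(line)
        if not (m and current):
            continue
        addr, mask = m.groups()
        text = addr if "/" in addr else f"{addr}/{mask}"
        try:
            subnet = ipaddress.ip_interface(text).network
        except ValueError:
            continue  # e.g. 'ip address dhcp'
        if subnet.overlaps(target):
            hits.append((current, str(subnet)))
    return hits
```

An empty result from every device that could route the prefix is a reasonable precondition for the change. Any hit is an SVI that still needs the fair-warning process.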

Friday, July 5, 2013

Beware of static routes on redundant interfaces and troubleshooting

Had a problem where someone went and put static routes on only 1 of 2 redundant interfaces. The problem was that the route was used to anchor a BGP network statement, so when that interface went down the route disappeared from the BGP table, making the destination unreachable. My guess is this was put in at the time of implementation as part of a troubleshooting deal and no one remembered to do the other side. This is not uncommon: when you have to do something fast, you only do the A side, intending to do the B side in an official change window, and that change window never comes. It is always a good idea to keep track of all the changes you make when troubleshooting so you remember to fix all the patches.
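One way to catch a forgotten B side is to compare the static route destinations in the two configs. This is a rough sketch, assuming the A and B sides are separate devices with saved configs, that only the global table matters (lines with "ip route vrf" are skipped), and that next hops legitimately differ between sides, so only the destination is compared. The function names are hypothetical.

```python
def static_destinations(config_text):
    """Collect the destination of every global 'ip route' line.
    Handles 'ip route 10.1.0.0 255.255.0.0 <next-hop>' and
    'ip route 10.1.0.0/16 <next-hop>'; VRF routes are ignored."""
    dests = set()
    for line in config_text.splitlines():
        tokens = line.split()
        if tokens[:2] != ["ip", "route"] or len(tokens) < 4:
            continue
        if tokens[2] == "vrf":
            continue  # out of scope for this sketch
        if "/" in tokens[2]:
            dests.add(tokens[2])                   # prefix/length form
        else:
            dests.add(f"{tokens[2]} {tokens[3]}")  # prefix + mask form
    return dests

def one_sided_routes(side_a_text, side_b_text):
    """Return destinations that have a static route on only one side."""
    a = static_destinations(side_a_text)
    b = static_destinations(side_b_text)
    return {"only_a": sorted(a - b), "only_b": sorted(b - a)}
```

Anything listed under only_a or only_b is a candidate for the change window that never came.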

More on hardware support

So as I was going to implement PBR, I was confused about what hardware support means. One level of support is in the DFC cards, meaning that traffic does not need to hit the supervisor card. But there is another level of hardware support, which is that the PFC is able to switch in its hardware (see the Cisco doc for the restrictions, but the net of it is match on IP address and set the next hop). The nice TAC engineer told me about the show platform hardware capacity pfc command to keep track of PFC utilization.

Wednesday, July 3, 2013

Hardware support

When you are deploying a new feature, often you will check for hardware support, but not always down to the card level. I deployed PBR on a 7600 and read that PBR is supported in hardware with DFC cards. Thing was, the DFC card bit did not register right away: I did not check whether I was connecting my firewall to DFC cards. While the firewalls are, the Ethernet cards that a lot of other things use are a mix: it turns out some are DFC cards and some are not. So CPU will become interesting when I deploy this.