Musings of ccie3555: 2013

Thursday, December 5, 2013

Ping component parts of a multilink PPP bundle

It is helpful to put IP addresses on the component T1s of an MLPPP bundle so you can test individual links. BUT if you just put the IP address on the link, it does not show up in the routing table. To make it work you have to add a static route pointing to the interface itself; see https://supportforums.cisco.com/thread/2068848. This appears to be normal behavior, so to get the serial interfaces to show in the routing table you have to add a static route on the router pointing to the interface, e.g.:

ip route 172.27.3.0 255.255.255.252 Serial0/0/0:0

Tuesday, December 3, 2013

strong host/weak host

One problem when you multihome and do selective routing (the default gateway is NOT configured on all interfaces), say with multiple VRFs, is a deal called strong host/weak host. It can prevent one of the NICs from responding to pings from the rest of the network even though it works inside its own VRF. See http://technet.microsoft.com/en-us/magazine/2007.09.cableguy.aspx. The default for Windows 2008 is strong host, which means that you cannot send a packet with a source IP address different from the IP address of the sending interface. Since we are trying to ping a backup VRF IP address from outside the backup VRF, the host has to send the ping reply out the public interface, and strong host will not allow that. Could be the setting got lost in the standard build. To fix:

netsh interface ipv4 set interface [InterfaceNameOrIndex] weakhostsend=enabled

Friday, November 8, 2013

remember routes have masks

In a couple of previous posts (bgp and old loopbacks never die), I hinted that routers allow routes with overlapping masks. By default, a sho ip route for an address displays the longest matching route that covers it, which is not necessarily the exact prefix you had in mind. For example:

sho ip route 10.40.0.97
Routing entry for 10.32.0.0/11
  Known via "bgp 65013", distance 20, metric 0

You see the /11 summary. However, you can add the subnet mask to get a more accurate view:

sho ip route 10.40.0.97 255.255.255.255
% Subnet not in table

It is important to closely examine the results of a display to see what route is REALLY there. As was described in the loopback post, routers will have a /30 and a /32 that overlap. So use the subnet mask in the route display if it is important.
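The difference between the two lookups can be sketched in a few lines of Python. This is only an illustration, not how the router does it: the table contents other than 10.32.0.0/11 are made up, and the function names are my own.

```python
import ipaddress

# Hypothetical routing table; only 10.32.0.0/11 comes from the post.
ROUTES = [
    ipaddress.ip_network("0.0.0.0/0"),
    ipaddress.ip_network("10.32.0.0/11"),
]

def longest_match(address, routes):
    """Most specific route covering the address, like 'sho ip route <addr>'."""
    addr = ipaddress.ip_address(address)
    covering = [route for route in routes if addr in route]
    return max(covering, key=lambda route: route.prefixlen, default=None)

def exact_match(prefix, routes):
    """The route only if that exact prefix and mask is in the table,
    like 'sho ip route <addr> <mask>'."""
    wanted = ipaddress.ip_network(prefix)
    return wanted if wanted in routes else None

print(longest_match("10.40.0.97", ROUTES))    # the covering /11 summary
print(exact_match("10.40.0.97/32", ROUTES))   # None, the /32 is not in the table
```

The first lookup happily returns the summary, the second tells you the specific route is not there.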

OSPF route redistribution gotcha

An often forgotten rule in OSPF is that in order to redistribute a route, the next hop must be an INTERNAL route in OSPF. If you point to an interface that does not have a network statement, or you did redistribute connected because you know that interface would not have any OSPF neighbors, the prefix would show up in the sho ip ospf data BUT would not have the routing bit set, so it would not be redistributed.

Getting BGP to send routes

While getting a route advertised in an IGP like OSPF is pretty simple, getting it into BGP or redistributed can be a little more tricky. Routing processes are event driven, that is, something has to happen to force the router (or L3 switch) to scan the tables. If you, say, add a network statement in BGP for a route that already exists, it may not go out to the rest of the network for a while. This can usually be fixed with a clear ip bgp soft out, but it is better to follow this rule: make sure that all the routing protocol configuration is in place BEFORE bringing up an interface or adding a default route. So if you have an ACL on your redistribution statement, update the ACL, add any network statements you need to the routing protocol configuration, and update route maps if needed. THEN bring up the interface, add the IP address to the SVI, or add the new static route. Those are the events that will drive the routing protocol. Finally, not everything in BGP is fixed by clearing a neighbor soft out; there are corner cases where the routing logic is not fully driven. In that case a shut/no shut, or a remove and re-add, is needed.

Monday, November 4, 2013

sho ip bgp will give you an entry for the default route

Was trying to troubleshoot a route distribution problem and would do a sho ip bgp for a prefix. Would see the BGP entry, BUT if there is a default route, it gives you the entry for the default route, not the prefix you might be looking for (see below). Note that you see an entry, but unless you notice that the entry is for 0.0.0.0/0 you think the prefix is in BGP when it's really not.

rdc-all-rt100> sho ip bgp 167.127.100.0
BGP routing table entry for 0.0.0.0/0, version 4693022
Paths: (5 available, best #4, table default, RIB-failure(17) - next-hop mismatch)
Multipath: eBGP
  Advertised to update-groups:
     5 10 41 47
  13979 64998
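If you collect this output in a script, the check is easy to automate. A rough sketch, assuming the output always carries a "BGP routing table entry for <prefix>," line as in the capture above; the function names are mine and the /24 in the usage note is a made-up example.

```python
import ipaddress
import re

# Matches the first line of the output, e.g.
# "BGP routing table entry for 0.0.0.0/0, version 4693022"
ENTRY_RE = re.compile(r"BGP routing table entry for ([\d.]+/\d+)")

SAMPLE = (
    "BGP routing table entry for 0.0.0.0/0, version 4693022\n"
    "Paths: (5 available, best #4, table default, "
    "RIB-failure(17) - next-hop mismatch)\n"
)

def entry_prefix(show_output):
    """Return the prefix the router actually answered for, or None."""
    match = ENTRY_RE.search(show_output)
    return ipaddress.ip_network(match.group(1)) if match else None

def is_real_entry(requested, show_output):
    """True only if the entry covers the address and is not the default route."""
    entry = entry_prefix(show_output)
    if entry is None or entry.prefixlen == 0:
        return False
    return ipaddress.ip_address(requested) in entry

print(entry_prefix(SAMPLE))                    # 0.0.0.0/0
print(is_real_entry("167.127.100.0", SAMPLE))  # False
```

Feeding it an entry line for a hypothetical 167.127.100.0/24 instead returns True, which is the case you were hoping for in the first place.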

Thursday, September 19, 2013

verify firewall rules with telnet

Often you need to check whether a firewall rule works. You can do this with telnet to the port number, but you have to remember that the source address used by telnet will be the IP address of the outgoing interface toward the next hop. In some versions of IOS you can put a /source in the telnet command; then, if your firewall rule covers an entire subnet, you can at least test TCP connections. See below for a working example:

woodridge1-mdf-rsw1>telnet 174.137.37.108 14002 /source vlan200
Trying 174.137.37.108, 14002 ... Open
myMethod=keepAlivemyMethod=keepAlivemyMethod=keepAlivemyMethod=keepAlivemyMethod=keepAlive^CmyMethod=keepAlivemyMethod=keepAlive^C

IP addresses on links that are part of an MLPPP bundle

You can put (with your carrier) IP addresses on the individual links in an MLPPP bundle. It is useful for testing those links for errors with pings. But they do not show up in the routing table as connected routes. This seems to be normal behavior, and perhaps you can put in static routes to the interface, at the cost of more stuff in the routing table that you may not use much. The IP address on the multilink interface is what shows up in routing. https://supportforums.cisco.com/thread/2068848 has the reference.

Wednesday, September 18, 2013

Nexus 7k drops on the M1 card are for the ENTIRE ASIC

Got a call that there was a spike in output drops on a Nexus 7K M1 card. The odd thing was that there were 3 port channels with the EXACT same number of drops. Looking further, there was only 1 port per port channel with drops, and again they were all the same. It then occurred to me to look at the port to ASIC chart, and all 3 ports were using the same ASIC. While we still have a case open with Cisco, it looks like output drops are counted at the ASIC level, not the port level. It also seemed like having a running monitor port in that ASIC had a lot to do with the drops in the first place. sho int counter error shows the deal:

--------------------------------------------------------------------------------
Port          Align-Err    FCS-Err   Xmit-Err    Rcv-Err  UnderSize OutDiscards
--------------------------------------------------------------------------------
mgmt0                 0          0         --         --         --          --
Eth1/1                0          0          0          0          0           0
Eth1/2                0          0          0          0          0 11285742206
Eth1/3                0          0          0          0          0           0
Eth1/4                0          0          0          0          0 11285742206
Eth1/5                0          0          0          0          0           0
Eth1/6                0          0          0          0          0 11285742206
Eth1/7                0          0          0          0          0           0
Eth1/8                0          0          0          0          0 11285742206
Eth1/9                0          0          0          0          0           0

Also the policy maps on the interfaces:

glic-core-rsw1# sho policy-map int e 1/4

Global statistics status : enabled

Ethernet1/4

  Service-policy (queuing) input: default-in-policy
    SNMP Policy Index: 301991953

    Class-map (queuing): in-q1 (match-any)
      queue-limit percent 50
      bandwidth percent 80
      queue dropped pkts : 0

    Class-map (queuing): in-q-default (match-any)
      queue-limit percent 50
      bandwidth percent 20
      queue dropped pkts : 565

  Service-policy (queuing) output: default-out-policy
    SNMP Policy Index: 301991962

    Class-map (queuing): out-pq1 (match-any)
      priority level 1
      queue-limit percent 16
      queue dropped pkts : 0

    Class-map (queuing): out-q2 (match-any)
      queue-limit percent 1
      queue dropped pkts : 0

    Class-map (queuing): out-q3 (match-any)
      queue-limit percent 1
      queue dropped pkts : 0

    Class-map (queuing): out-q-default (match-any)
      queue-limit percent 82
      bandwidth remaining percent 25
      queue dropped pkts : 11731496978

glic-core-rsw1#
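Spotting identical counters by eye is easy on 10 ports and painful on a full chassis. Here is a small Python sketch that groups ports by identical non-zero OutDiscards values. It assumes the seven-column layout of the counter display above with OutDiscards as the last column; the function name is made up and the sample is trimmed from the capture.

```python
from collections import defaultdict

SAMPLE = """\
Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
mgmt0 0 0 -- -- -- --
Eth1/1 0 0 0 0 0 0
Eth1/2 0 0 0 0 0 11285742206
Eth1/3 0 0 0 0 0 0
Eth1/4 0 0 0 0 0 11285742206
Eth1/6 0 0 0 0 0 11285742206
Eth1/8 0 0 0 0 0 11285742206
"""

def shared_discard_counts(text):
    """Map each non-zero OutDiscards value to the ports reporting it,
    keeping only values that show up on more than one port."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, separators and rows like mgmt0 that show "--".
        if len(fields) != 7 or not fields[-1].isdigit():
            continue
        discards = int(fields[-1])
        if discards:
            groups[discards].append(fields[0])
    return {count: ports for count, ports in groups.items() if len(ports) > 1}

print(shared_discard_counts(SAMPLE))
```

Any value that comes back with more than one port attached is a hint that the counter is shared, and worth checking against the port to ASIC chart.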

not all maps are alike

I use route maps for a lot of things, so I am very used to the "route-map fred permit 10, then what to do" syntax. BUT when you do an access-map for a VACL there is no permit or deny in the access-map statement, SO you always have to put in an ACL reference. That being said, I am of the mind to not be cute and just use a single extended ACL in 1 access-map statement, so you are not trapped in the route map logic and stay close to the traditional security ACL syntax.

Monday, September 9, 2013

PVST+ and HP switches

You can connect PVST switches to MST switches. What happens is that PVST BPDUs use their own MAC address that the HP switches do not listen to, so PVST BPDUs go all the way between the two PVST domains while the MST cloud ignores them. The issue is that the native VLAN must match on the PVST trunk interfaces. The question is, how do the switches know? As it turns out, PVST BPDUs include a TLV that carries the native VLAN number, and that is how a mismatch is detected. Net of this is: ensure that native VLANs match when you connect PVST switches over a 'cloud' of non-PVST switches, or better yet, do not do that at all. The reference is "Troubleshooting Spanning Tree PVID- and Type-Inconsistencies" at http://www.cisco.com/en/US/tech/tk389/tk621/technologies_tech_note09186a00801d11a0.shtml

Saturday, August 17, 2013

ACLs

One of the things people always think about when they add security ACLs is to ensure that telnet or ssh still works. But if you are adding ACLs to act as a lightweight firewall, you may forget something like TACACS, which can be really nasty if you use command authorization. Also don't forget logging, SNMP, TFTP and any other protocol you use to manage the box. If you happen to have a standard firewall rule set, use that; if you don't, create one. Net of this is: don't forget ALL the management protocols when you are adding ACLs. OR you can use an SVI that is not part of the address space you are protecting, or add a new SVI and change the default route to bypass your ACL when you manage the box. Just adding a new SVI does not help if the default route points to the original SVI.

Thursday, August 15, 2013

Old and new nexus cards don't like each other

It seems that when you have a mix of N7K-M132XP-12 and N7K-M132XP-12L cards you have to do a force command in order to have a port on the 12L join an etherchannel with the older 12 cards. You have to preconfigure all the VLANs (generally make the new interface look like the port channel) to get the thing to work. Clunky.

Thursday, August 1, 2013

old loopbacks never die.

Back in the old days many people would configure their loopbacks as /30s because we were told that /31 and /32 were not valid subnet masks. It is true that OSPF should have cured this (a loopback interface is always advertised as a /32 unless you change the OSPF network type). BUT one day I happened to see:

Jul 23 11:46:53 CDT: %OSPF-4-CONFLICTING_LSAID: Process 100 area dummy area: LSA
origination prevented by LSA with same LSID but a different mask
Existing Type 5 LSA: LSID 10.120.0.216/30
New Destination: 10.120.0.216/32

Turns out someone had configured a loopback address of 10.120.0.218/30, and on a different box 10.120.0.216/32, creating the situation above. I found the boxes by doing sho ip bgp <prefix> <mask> and then checking the AS number table. Now I just have to change the subnet mask of the 10.120.0.218 interface and all will be well.
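If you keep a list of configured loopbacks somewhere, this kind of overlap can be caught before OSPF complains. A minimal Python sketch using the two addresses from this post; the box labels and the function name are placeholders of mine.

```python
import ipaddress
from itertools import combinations

def overlapping_interfaces(configured):
    """configured maps a label to 'address/prefixlen' as typed on the interface.
    Returns the pairs of labels whose subnets overlap."""
    nets = {
        name: ipaddress.ip_interface(value).network
        for name, value in configured.items()
    }
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]

LOOPBACKS = {
    "boxA Loopback0": "10.120.0.218/30",  # boxA and boxB are hypothetical names
    "boxB Loopback0": "10.120.0.216/32",
}

print(overlapping_interfaces(LOOPBACKS))
```

The .218/30 interface sits in 10.120.0.216/30, which has the same network number as the /32 on the other box, so the pair is reported.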

Friday, July 26, 2013

The joy of supervisor replacement

Every once in a while you have to replace a supervisor card. The issue then is what code is on the replacement supervisor and whether you want to (or can) upgrade the code before using it. One thing you have to remember is that different versions of code support different line cards, so an old version of code may not support your newer cards.

The other issue (and I am going to do a script for this) is that ports default to admin down, and no shut does NOT appear in the config, only shutdown. So now you have to figure out which ports and VLANs were up before the sup swap and bring them up. I will do a perl script to read a config and insert a no shut command on any interface that does not have a shutdown in the config.
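Until the perl version exists, here is a rough sketch of the idea in Python instead. It assumes an IOS-style saved config where each stanza starts with an unindented "interface ..." line and its sub-commands are indented with a space; the function name is made up.

```python
def add_no_shutdown(config_text):
    """Append ' no shutdown' to every interface stanza that has neither
    'shutdown' nor 'no shutdown' in it."""
    result = []
    stanza = []  # lines of the interface stanza currently being collected

    def close_stanza():
        present = {line.strip() for line in stanza}
        if stanza and not present & {"shutdown", "no shutdown"}:
            stanza.append(" no shutdown")
        result.extend(stanza)
        stanza.clear()

    for line in config_text.splitlines():
        if line.startswith("interface "):
            close_stanza()
            stanza.append(line)
        elif stanza and line.startswith(" "):
            stanza.append(line)       # indented sub-command of the stanza
        else:
            close_stanza()            # '!' or any global command ends it
            result.append(line)
    close_stanza()
    return "\n".join(result)

CONFIG = "interface Gi1/1\n description uplink\n!\ninterface Gi1/2\n shutdown\n!\nend"
print(add_no_shutdown(CONFIG))
```

In the sample, Gi1/1 gets a no shutdown added and Gi1/2 is left alone because it was deliberately shut. Check the result against the old box before pasting anything in.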

Of course getting a sho int brief before and after and doing a windiff goes a long way to ensure that you are back to where you were.
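If windiff is not handy, the same before/after comparison can be sketched with Python's difflib. The interface lines in the example are invented and the function name is mine.

```python
import difflib

def changed_lines(before, after):
    """Return only the removed ('-') and added ('+') lines between
    two captures of the same show command."""
    diff = list(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile="before", tofile="after", lineterm="", n=0,
    ))
    # The first two lines are the '--- before' / '+++ after' headers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]

BEFORE = "Gi1/1 up up\nGi1/2 up up"
AFTER = "Gi1/1 up up\nGi1/2 down down"

for line in changed_lines(BEFORE, AFTER):
    print(line)
```

An empty result means the two captures match line for line.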

Monday, July 22, 2013

A firewall with permit any any doesn't always permit any any

Most of the time we consider firewalls to be packet filtering devices, so when we are bringing up new services/DMZs/security partitions and we hear that all traffic is permitted, we assume that all traffic will pass.

Of course this is not right; there are a few other things to consider.

Firewalls have to route, and most of the time the routing is static, so if the guys forget to add the routes you need, traffic does not forward.

Also, depending on vendor, firewalls will have TCP session timers (to prevent certain DDoS attacks). Here some sessions were timing out because the firewall had a different timer than the application.

They may also by default block UDP or ICMP (cisco traceroute uses UDP but microsoft trace route uses ICMP), so a trace route from your PC works but from a network device does not.

Finally, some firewalls will switch traffic in hardware, but you have to TELL them to do that. In one case Citrix turned on the session recovery feature, which uses a different TCP port than normal, and the new port was not configured for hardware acceleration, so CPU hit 85% and slow response times occurred. To get out of this we had to enable the hardware acceleration and then kill all the active connections.

Like I said, permit any any is not the end of your firewall issues.

Friday, July 19, 2013

Forcing a port channel in NX-OS

We had RMAed an N7K-M132XP-12 and replaced it with an N7K-M132XP-12L card. We had ports that were part of a port channel, and we kept getting port inconsistent. TAC told us that because we had a mix of L and non-L cards we needed to do a channel-group force:

channel-group number [ force ] [ mode { active | on | passive }]

That worked, but it placed the active list of VLANs directly on the interface, which was not the case with the non-L cards. Not sure what happens if we add more VLANs to the trunk channel.

non L card

interface Ethernet1/9
channel-group 24
no shutdown

L card

interface Ethernet2/9
switchport trunk allowed vlan 1-3967,4048-4093
channel-group 24

Thursday, July 18, 2013

What is this?

I am doing this blog to record the various odd things that I encounter in my deployment and troubleshooting work. Much of this is for myself, but I hope some of it benefits others.

vpc peer switch

The doc below says do not use peer switch in a mixed environment:

Peer-switch feature is supported on networks that use vPC and STP-based redundancy is not supported. If the vPC peer-link fail in a hybrid peer-switch configuration, you can lose traffic. In this scenario, the vPC peers use the same STP root ID as well same bridge ID. The access switch traffic is split in two with half going to the first vPC peer and the other half to the second vPC peer. With the peer link failed, there is no impact on north/south traffic but east-west traffic will be lost (black-holed).

This sets the bridge ID on both switches to the system ID for EVERY VLAN, including the non-vPC ones. Now as long as you set a priority, the non-vPC switches at least work.

But on the interswitch routing link, where you often forget to set a root bridge (after all it is only point to point), you get the exact same bridge IDs and one side of the VLAN goes back to BLK, which means that loopguard fired.

Removed peer-switch and all was well.

Saturday, July 6, 2013

The fun of address reuse

Sometimes you get address space from, say, an acquisition and you then migrate those servers to your infrastructure. Great, now I have some public address space I can use for my BYOD stuff or the 'internet accessible' labs that people seem to want, and other things. BUT do not forget the DMZ you first set up for those acquired servers, which has that address space configured inside of it. You are real sure it's empty: all the servers are gone, nothing pings or is in the ARP table. But you are never sure about deleting the subnet, because you think there is always 1 application that only runs once a year or something like that. Then you reuse that address space and you push traffic into the DMZ that still has the address space configured on an interface. OOPS.

Net of this is: get a process to give people fair warning and clean up unused SVIs. One option is to do that cleanup every 6 months, build it into your change process, and actually do it.

Friday, July 5, 2013

Beware of static routes on redundant interfaces and troubleshooting

Had a problem where someone went and put static routes on only 1 of 2 redundant interfaces. The problem was that the route was used to anchor a BGP network statement, so when that interface went down the route disappeared from the BGP table, making the destination unreachable. My guess is that this was put in at the time of implementation as part of a troubleshooting deal and no one remembered to do the other side. This is not uncommon: when you have to do something fast, you only do the A side, intending to do the B side in an official change window, and that change window never comes. It is always a good idea to keep track of all the changes you made while troubleshooting so you remember to fix all the patches.

More on hardware support

So as I was going to implement PBR, I was confused about what hardware support is. One level of support is in the DFC cards, meaning that traffic does not need to hit the supervisor card. But there is another level of hardware support, being that the PFC is able to switch in its hardware (see the Cisco doc for the restrictions, but the net of it is match on IP address and set the next hop). The nice TAC engineer told me about show platform hardware capacity pfc as a way to keep track of PFC utilization.

Wednesday, July 3, 2013

Hardware support

When you are deploying a new feature, often you will check for hardware support but not always down to the card level. I deployed PBR on a 7600 and read that PBR is supported in hardware with DFC cards. Thing was, the DFC card bit did not register right away: I did not check whether I was connecting my firewalls to DFC cards. While the firewalls are, the ethernet cards that a lot of other things use are a mix; turns out some are DFC and some are not. So CPU will become interesting when I deploy this.

Friday, June 14, 2013

Troubleshooting and 'target fixation'

I teach CCNP troubleshooting, and one thing that is not talked about is that people will very often focus on a single feature or configuration because that is where they expect the problem to be, when in fact it is not. Examples are the last change that was made, or the cause of the last problem that looked like the one before you. The issue is that you do not look at simpler causes. For example, students will focus on the feature the lab was about and spend an hour on debugs and stuff, when the issue was an SVI that had not been no shut. Another example: I brought up a new VPN router with HSRP and focused on the crypto, when in fact an ACL on the public interface blocked HSRP packets and it split brained.

The point is, if you can, have someone check just the basic stuff like pings, CDP, routing and packet drops on the interfaces, even if your change was to BGP.

Monday, June 3, 2013

Fun with traffic shaping

The largest exposure with traffic shaping at megabit speeds is the number of zeros that you have to type in and the fact that IOS does not support the use of commas. Thus it is very easy to set your shaping rate to 10meg when you mean 100meg. This will often not show up at the time of the change, because traffic is less than your shaping rate during your change window. This is an even bigger problem when you are upgrading bandwidth.

The problem will be visible in a sho policy-map int <your interface>

Class-map: class-default (match-any)
      38004728421 packets, 23186941391069 bytes
      30 second offered rate 0000 bps, drop rate 0000 bps
      Match: any
      Queueing
      queue limit 23500 packets
      (queue depth/total drops/no-buffer drops) 0/5907809/0
      (pkts output/bytes output) 37998852043/23183155616853
      shape (average) cir 94000000, bc 376000, be 376000
      target shape rate 94000000

Here you see the large number of total drops but 0 no-buffer drops, so traffic is being dropped by the shaping policy.

Moral of the story: always cut and paste these, so you can put your own commas in to check the number first.
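One way to avoid counting zeros by hand is to generate the number from a rate written with a unit and look at a comma-grouped copy before pasting. A small Python sketch; the helper names and the k/m/g suffix convention are my own, and the comma version is for the change plan only, since IOS will not take it.

```python
UNITS = {"k": 10**3, "m": 10**6, "g": 10**9}

def to_bps(rate):
    """Convert a rate like '94M' or '1.5m' to an integer bits per second."""
    rate = rate.strip().lower()
    if rate[-1] in UNITS:
        return round(float(rate[:-1]) * UNITS[rate[-1]])
    return int(rate)

def shape_command(rate):
    """Return the line to paste and a comma-grouped copy for eyeballing."""
    bps = to_bps(rate)
    return f"shape average {bps}", f"{bps:,} bps"

command, readable = shape_command("94M")
print(command)    # shape average 94000000
print(readable)   # 94,000,000 bps
```

A 10meg versus 100meg mistake jumps out at 10,000,000 versus 100,000,000 in a way that it does not in the raw digits.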

Friday, May 24, 2013

first OSPF flood war

Had an OSPF-4-FLOOD_WAR today. One would expect the problem to be duplicate router IDs, but in this case we had duplicate /30 point to point OSPF links.