The most fundamental fault detection for Ethernet interfaces is the same as for any system interface: whether the interface is operational. Changes in interface state trigger linkUp or linkDown traps. You can also poll ifOperStatus. For further details on this type of fault monitoring, refer to the section on system interfaces in Chapter 12, “Monitoring System Interfaces.” The rest of this section discusses Ethernet-specific fault management.
Ethernet errors come in a variety of flavors, but all represent a framing error of some sort. There is either an FCS error or the frame is too short or too long. Following are the basic Ethernet errors and a brief explanation of each:
CRC or FCS errors: The Frame Check Sequence (FCS) is a 32-bit cyclic redundancy check (CRC) appended to the end of each transmitted Ethernet frame. If the receiving station detects an error in the FCS, one or more bits in the frame are in error and the frame is discarded. CRC errors should occur very rarely. Ethernets (according to the IEEE 802.3 specification) should have a bit error rate no greater than 10^-8. On an Ethernet carrying 1500-byte packets at 100 percent load, that translates to roughly one errored frame in every 8,000 frames (10^8 bits divided by about 12,000 bits per frame). Cable problems, poor connections, or faulty interfaces can cause CRC errors. If you detect such errors, it is time to use Layer 1/Layer 2 test equipment, such as a cable tester, to determine whether the cables and connections are up to specification.
Alignment errors: The frame does not contain an integer number of octets and has a bad frame check sequence. These errors indicate a faulty transmitter or a cable problem.
Runts or fragments: Frames on the wire that are shorter than 64 bytes and that usually have an invalid FCS. Seeing some fragments is normal on a shared segment because collisions produce them.
Jabbers: A jabber is a frame longer than 1518 bytes with a bad FCS. Jabbers are usually due to a malfunctioning interface.
Collisions: These are not errors; they are normal occurrences on shared Ethernet segments. You do want to monitor them, however, because a higher-than-normal collision count can indicate a congested segment or other problems on the segment.
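The error categories above can be expressed as a small decision procedure. The following Python sketch is illustrative only: the function name, its parameters, and the returned labels are assumptions made for this example, and the split between "undersize" (short frame, good FCS) and "runt" (short frame, bad FCS) follows the stricter Catalyst counter definitions given later in this section.

```python
MIN_FRAME = 64     # bytes; shorter frames are runts/fragments (per the text)
MAX_FRAME = 1518   # bytes; longer frames are too long (per the text)

def classify_frame(length_bytes, fcs_ok, whole_octets=True):
    """Return the error category for a received frame, or "ok".

    length_bytes -- frame length in bytes (rounded down if not whole)
    fcs_ok       -- True if the received FCS matches the computed CRC
    whole_octets -- False if the frame ended partway through an octet
    """
    if length_bytes < MIN_FRAME:
        # Short frame: a runt if the FCS is bad, merely undersize otherwise.
        return "undersize" if fcs_ok else "runt"
    if length_bytes > MAX_FRAME:
        # Long frame: a jabber if the FCS is also bad.
        return "too_long" if fcs_ok else "jabber"
    if not fcs_ok:
        # Legal length but corrupted: an alignment error if the frame
        # is not a whole number of octets, a plain FCS error otherwise.
        return "fcs_error" if whole_octets else "alignment_error"
    return "ok"
```

For example, `classify_frame(60, False)` yields `"runt"`, and `classify_frame(512, False, whole_octets=False)` yields `"alignment_error"`.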
As with traffic monitoring, different sources can supply very similar data. The next few sections discuss various sources of fault and error data.
The following list shows the MIB objects you can poll to collect the statistics on Ethernet interface errors:
ifInErrors, from RFC 2233: For an Ethernet interface, this counter is the sum of alignment, giant, and FCS errors.
dot3StatsAlignmentErrors, dot3StatsFrameTooLongs, and dot3StatsFCSErrors, from RFC 2358: Summed together, these three variables equal ifInErrors.
etherStatsCRCAlignErrors and etherStatsJabbers, from RFC 1757: These variables are available only on the 2500 routers and Catalyst switches. Summed together, they equal ifInErrors.
For Ethernet, any framing error or data corruption is bad because it causes the MAC layer to discard the frame. Any request for retransmission of the lost frame must come after timeouts from the upper-layer protocols. Even very small numbers of framing errors can cause major degradation in performance.
Most framing errors and data corruptions are due to a physical layer problem such as a faulty interface or bad cable. In general, it is best to monitor ifInErrors because it is the sum of the main types of framing errors.
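Because these objects are ever-increasing counters, what matters is the change between two polls. The following Python sketch shows one way to compute those deltas and to check that the three dot3 counters account for the change in ifInErrors. It assumes you have already collected the values (for example, with an SNMP poller) into dictionaries keyed by object name; the function names and the shape of the result are hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32  # SNMP Counter32 values wrap to zero after 2^32 - 1

DOT3_ERROR_OBJECTS = (
    "dot3StatsAlignmentErrors",
    "dot3StatsFrameTooLongs",
    "dot3StatsFCSErrors",
)

def counter_delta(old, new, modulus=COUNTER32_MODULUS):
    """Increase between two samples of a wrapping counter (at most one wrap)."""
    return (new - old) % modulus

def input_error_deltas(prev, curr):
    """Compare the change in ifInErrors with the summed dot3 error counters.

    prev, curr -- dicts mapping object names to values from successive polls
    """
    by_type = {name: counter_delta(prev[name], curr[name])
               for name in DOT3_ERROR_OBJECTS}
    total = counter_delta(prev["ifInErrors"], curr["ifInErrors"])
    return {
        "ifInErrors": total,
        "by_type": by_type,
        # True when the per-type deltas add up to the ifInErrors delta.
        "consistent": sum(by_type.values()) == total,
    }
```

A nonzero ifInErrors delta tells you there is a problem; the per-type breakdown then hints at whether alignment, length, or FCS errors dominate.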
The following list shows the MIB objects you can poll to collect the statistics on Ethernet collisions:
dot3StatsSingleCollisionFrames and dot3StatsMultipleCollisionFrames, from RFC 2358: These counters are available on both switches and routers. As their names indicate, they count the frames that encountered a single collision or multiple collisions, respectively, before transmission succeeded.
etherStatsCollisions, from RFC 1757: The total number of collisions (single or multiple) on a given interface.
The RMON collision counter is easier to monitor because it is one object with the complete count of all collisions, but it is available only on 2500 routers and Catalyst switches. On the 2500 routers, however, the LANCE Ethernet chip detects collisions only when the router itself is transmitting, so do not use this counter on a 2500 router to get collision counts for the whole segment. For other routers, the dot3 MIB objects are the best choice. Remember that collisions are natural on shared or half-duplex Ethernet. The presence of collisions does not mean there is a problem. It is important to baseline the collision rate on a given segment and then watch for sudden, unexplained increases.
On a full-duplex segment, however, you should see no collisions. The presence of collisions on a full-duplex segment often means that one interface on the link is configured for half duplex and the other for full duplex.
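The baseline-and-compare approach for collisions can be sketched in a few lines of Python. This is only an illustration: the function name, the collisions-per-transmitted-frame ratio, and the `tolerance` multiplier are assumptions chosen for the example, and the right threshold depends on the baseline you measure on your own segment.

```python
def assess_collisions(collisions, frames_out, baseline_rate,
                      full_duplex, tolerance=2.0):
    """Assess collision activity over one polling interval.

    collisions    -- collisions counted during the interval
    frames_out    -- frames transmitted during the interval
    baseline_rate -- normal collisions-per-frame ratio for this segment
    full_duplex   -- True if the interface runs in full-duplex mode
    tolerance     -- hypothetical multiple of the baseline treated as abnormal
    """
    if full_duplex:
        # Full-duplex links should never see collisions.
        return "possible duplex mismatch" if collisions > 0 else "ok"
    if frames_out == 0:
        return "no data"
    rate = collisions / frames_out
    return "above baseline" if rate > baseline_rate * tolerance else "ok"
```

With a baseline of 0.01 collisions per frame, 50 collisions in 1000 transmitted frames on a half-duplex port would be reported as `"above baseline"`, while any collisions at all on a full-duplex port would be reported as `"possible duplex mismatch"`.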
For routers, the best show command for examining Ethernet interface errors is the show interface command. It shows the current state of the interface in detail. Example 13-4 provides sample output for show interface.
nms-7010a#sh int fa0/0
FastEthernet0/0 is up, line protocol is up
Hardware is cyBus FastEthernet Interface, address is 0060.5490.f800 (bia 0060.5490.f800)
MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec, rely 255/255, load 1/255
Encapsulation ARPA, loopback not set, keepalive set (10 sec), fdx, 100BaseTX/FX
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:01, output 00:00:01, output hang never
Last clearing of "show interface" counters 00:43:40
Queueing strategy: fifo
Output queue 0/40, 0 drops; input queue 0/75, 0 drops
5 minute input rate 2376 bits/sec, 27 packets/sec
5 minute output rate 1000 bits/sec, 7 packets/sec
3653980 packets input, 269895525 bytes, 0 no buffer
Received 3499 broadcasts, 0 runts (A), 0 giants (B)
1119 input errors (D), 1119 CRC (C), 540 frame (G), 0 overrun, 0 ignored, 0 abort
0 watchdog, 744 multicast
0 input packets with dribble condition detected (H)
1507771 packets output, 161101301 bytes, 0 underruns
0 output errors, 0 collisions (E), 2 interface resets (F)
0 babbles, 0 late collision (I), 0 deferred (J)
0 lost carrier, 0 no carrier
0 output buffer failures, 0 output buffers swapped out
The annotated information in Example 13-4 is as follows:
A runts: The number of input packets discarded because they were less than 64 bytes long.
B giants: Equivalent to jabbers, the number of packets discarded because they were greater than 1518 bytes long.
C CRC: The number of input frames where the checksum calculated by the router does not match the checksum at the end of the frame.
D input errors: The total number of input errors of all types.
E collisions: The number of frames that had to be retransmitted because of a collision.
F interface resets: The number of times the interface has been reset—either by an internal error condition or through an administrative shutdown.
G frame: The number of frames received with a CRC error and a non-integral number of octets. Could be the result of a collision or a faulty interface.
H dribble condition: The device received a frame that was slightly too long, but the frame is accepted and forwarded. The counter is for information only.
I late collision: An error indicating that something in the Ethernet is out of specification. Either the cable is too long or perhaps there are too many repeaters.
J deferred: The number of frames whose transmission was delayed because the medium was busy (carrier was asserted) when the interface was ready to send.
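If you collect show interface output in a script, the counters described above can be pulled out with regular expressions. The following Python sketch is one possible approach, not a complete parser: the function name and the list of counters are choices made for this example, the sample text repeats the relevant lines of Example 13-4 without the annotation letters, and the exact wording of the output can vary between software releases.

```python
import re

# Counter labels as they appear in the output; each follows its value.
COUNTER_NAMES = ("runts", "giants", "input errors", "CRC", "frame",
                 "collisions", "interface resets", "late collision",
                 "deferred")

SAMPLE_OUTPUT = """\
Received 3499 broadcasts, 0 runts, 0 giants
1119 input errors, 1119 CRC, 540 frame, 0 overrun, 0 ignored, 0 abort
0 output errors, 0 collisions, 2 interface resets
0 babbles, 0 late collision, 0 deferred
"""

def parse_interface_errors(output):
    """Extract selected error counters from 'show interface' text."""
    counters = {}
    for name in COUNTER_NAMES:
        # Match "<number> <label>" with the label ending at a word boundary.
        match = re.search(r"(\d+) " + re.escape(name) + r"\b", output)
        if match:
            counters[name] = int(match.group(1))
    return counters

errors = parse_interface_errors(SAMPLE_OUTPUT)
```

Here `errors["CRC"]` is 1119 and `errors["frame"]` is 540, matching the sample output. Counters that do not appear in the text are simply absent from the result.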
For Catalyst switches, the best command to examine Ethernet interface errors is the show port counters command, as illustrated in Example 13-5.
nms-5505a (enable) show port counters 1/1
Port  Align-Err(A) FCS-Err(B)   Xmit-Err(C)  Rcv-Err(D)   UnderSize(E)
----- ------------ ------------ ------------ ------------ ------------
1/1   0            0            0            0            0
Port  Single-Col(F) Multi-Coll(G) Late-Coll(H) Excess-Col(I) Carri-Sen Runts(J) Giants(K)
----- ------------- ------------- ------------ ------------- --------- -------- ---------
1/1   0             0             0            0             0         0        -
Last-Time-Cleared
--------------------------
Thu Jan 21 1999, 16:55:02
A Align-Err: The number of frames that do not have an integer number of octets and have an incorrect frame check sequence.
B FCS-Err: The number of frames with an incorrect frame check sequence.
C Xmit-Err: Incremented when the internal transmit buffer is full.
D Rcv-Err: Incremented when the internal receive buffer is full.
E UnderSize: Frames smaller than 64 bytes with a good FCS.
F Single-Col: The number of times the port had a single collision before transmitting the frame.
G Multi-Coll: The number of times the port had more than one collision before transmitting the frame. Note that this counter does not count how many actual collisions occurred trying to transmit the frame—only that it was more than once.
H Late-Coll: An error indicating that the Ethernet is out of specification. A cable is too long or there are too many repeaters.
I Excess-Col: The number of frames that were dropped because the port saw 16 sequential collisions attempting to transmit that one frame.
J Runts: Frames less than 64 bytes long and with a bad FCS.
K Giants: Frames greater than 1518 bytes long and with a bad FCS—the same as a jabber.
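The tabular show port counters output can also be read by a script. The following Python sketch assumes the layout shown in Example 13-5 (a header line beginning with Port, a line of dashes, then one row per port) with the annotation letters removed, and it treats a "-" value as not applicable. The function name and the returned structure are hypothetical.

```python
SAMPLE_OUTPUT = """\
Port  Align-Err  FCS-Err    Xmit-Err   Rcv-Err    UnderSize
----- ---------- ---------- ---------- ---------- ---------
1/1   0          0          0          0          0
Port  Single-Col Multi-Coll Late-Coll  Excess-Col Carri-Sen Runts     Giants
----- ---------- ---------- ---------- ---------- --------- --------- ---------
1/1   0          0          0          0          0         0         -
Last-Time-Cleared
--------------------------
Thu Jan 21 1999, 16:55:02
"""

def parse_port_counters(output):
    """Parse 'show port counters' text into {port: {column: value}}."""
    ports = {}
    rows = [line.split() for line in output.splitlines() if line.strip()]
    i = 0
    while i < len(rows):
        is_header = (rows[i][0] == "Port" and i + 1 < len(rows)
                     and set(rows[i + 1][0]) == {"-"})
        if not is_header:
            i += 1
            continue
        header = rows[i][1:]
        i += 2  # skip the header line and the dashes beneath it
        # Data rows have exactly one value per header column.
        while (i < len(rows) and rows[i][0] != "Port"
               and len(rows[i]) == len(header) + 1):
            port, values = rows[i][0], rows[i][1:]
            entry = ports.setdefault(port, {})
            for name, value in zip(header, values):
                entry[name] = None if value == "-" else int(value)
            i += 1
    return ports

counters = parse_port_counters(SAMPLE_OUTPUT)
```

After parsing, `counters["1/1"]["Late-Coll"]` is 0 and `counters["1/1"]["Giants"]` is `None`. Any nonzero value in the Late-Coll or Excess-Col columns is worth investigating.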
Table 13-2 outlines several common System Error messages and their general causes. Each message is specific to the chipset used on the Ethernet interface. Please refer to the “Cisco IOS Software System Error Messages” guide for details on each individual message.
Message | Explanation |
---|---|
AMDP2-1-MEMERR, AMDP2_FE-3-SPURIDON, AMDP2_FE-1-DISCOVER, AMDP2_FE-1-INITFAIL, AMDP2_FE-3-UNDERFLO, DEC21140-1-DISCOVER, DEC21140-3-ERRINT, LANCE-4-BABBLE, LANCE-3-BADCABLE, LANCE-1-MEMERR | These messages usually indicate a problem on the device, such as faulty interface hardware, software problems, or memory problems. |
AMDP2_FE-5-COLL, AMDP2_FE-5-LATECOLL, DEC21140-5-COLL, ETHERNET-1-TXERR, LANCE-5-COLL | These messages are most likely the result of a duplex mismatch or general congestion on the line. A sudden flurry of these messages may also indicate cabling problems. |
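If you process syslog output from your devices, the two groupings in Table 13-2 can be turned into a simple lookup. The following Python sketch is only an illustration: the function name, the category labels, and the pattern used to find the FACILITY-SEVERITY-MNEMONIC tag are assumptions made for this example, and the message lists contain only the entries from the table.

```python
import re

DEVICE_MESSAGES = {
    "AMDP2-1-MEMERR", "AMDP2_FE-3-SPURIDON", "AMDP2_FE-1-DISCOVER",
    "AMDP2_FE-1-INITFAIL", "AMDP2_FE-3-UNDERFLO", "DEC21140-1-DISCOVER",
    "DEC21140-3-ERRINT", "LANCE-4-BABBLE", "LANCE-3-BADCABLE",
    "LANCE-1-MEMERR",
}
LINK_MESSAGES = {
    "AMDP2_FE-5-COLL", "AMDP2_FE-5-LATECOLL", "DEC21140-5-COLL",
    "ETHERNET-1-TXERR", "LANCE-5-COLL",
}

# Assumed tag shape: FACILITY-SEVERITY-MNEMONIC, for example LANCE-5-COLL.
TAG_PATTERN = re.compile(r"\b([A-Z][A-Z0-9_]*-\d-[A-Z0-9_]+)")

def classify_message(text):
    """Map a system error message to the general cause from Table 13-2."""
    match = TAG_PATTERN.search(text)
    if not match:
        return "unknown"
    tag = match.group(1)
    if tag in DEVICE_MESSAGES:
        return "device problem"
    if tag in LINK_MESSAGES:
        return "duplex mismatch or congestion"
    return "unknown"
```

A message that carries the LANCE-5-COLL tag would be classified as `"duplex mismatch or congestion"`; tags that are not in the table are reported as `"unknown"` rather than guessed at.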