Tips4Tiks – What are FCS errors?
Recently we had a support question about FCS ERROR alerts in the Admiral Dashboard.
In a MikroTik, each interface tracks tons of data about its link, traffic and errors, and if you’ve opted in to notifications the Admiral dashboard provides alerts on FCS errors. You can see if any of your MikroTiks is having interface errors by double clicking the interface you want to view and clicking on RX and TX stats. FCS errors are a type of error that’s related to RX, or when the interface RECEIVES packets or frames.
So what is an FCS error? And what should you do about it?
FCS stands for Frame Check Sequence, which is a mechanism for ensuring that what a sender sends is the same as what a receiver receives. Let’s break that down a little bit…
Remember that childhood game, where you sit in a circle and whisper something to your neighbor, and then it goes all around the circle whispering neighbor to neighbor until it gets back to you? If the message you whispered to start things off isn’t the same as the message whispered back to you, then you know the message was CORRUPTED, meaning that it changed during the sending and receiving process. As kids, it felt like it was impossible to get the same message at the end, as the one that you started with.
Fortunately with computers, the message is digital instead of analog – it’s all 1s and 0s instead of whispers, and it’s incredibly easy for a computer to mathematically identify if any corruption has occurred on the message. Computers are insanely good at exact copies of messages.
So what happens is that when creating the string of 1s and 0s to transmit the data, the sender runs a predetermined algorithm against the block of data and comes up with a mathematical answer, called a checksum (on Ethernet, it’s a 32-bit cyclic redundancy check, or CRC). So the sender says, here’s a FRAME worth of data and at the end of it, here’s my math answer to running this FRAME’S worth of data through the FCS algorithm. It’s like taking a test question and writing down the answer after it. Then, the receiver takes that stream of data that includes the FRAME of 1s and 0s and the CHECKSUM provided by the sender, and he runs the exact same FCS math algorithm against the frame’s 1s and 0s, and if no corruption has occurred, he’ll get the same answer!
FCS is there to verify that what YOU sent and YOUR answer to the test question is the same as what I received and MY answer to the same test question. 99% of the time, the FCS of the sender and receiver matches and that means that the 1s and 0s that were SENT were EXACTLY the same as the 1s and 0s that were RECEIVED. Yesssss!
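To make the "test question and answer" idea concrete, here's a minimal sketch in Python. Ethernet's FCS is a 32-bit CRC, and Python's built-in zlib.crc32 uses the same CRC-32 polynomial, so it makes a decent stand-in. The function names add_fcs and fcs_matches are made up for this illustration, and a real frame also has headers and on-the-wire bit ordering that this sketch ignores.

```python
import zlib

def add_fcs(payload: bytes) -> bytes:
    """Sender side: compute a CRC-32 over the data and append it as 4 bytes."""
    fcs = zlib.crc32(payload)
    return payload + fcs.to_bytes(4, "little")

def fcs_matches(frame: bytes) -> bool:
    """Receiver side: redo the same math and compare with the sender's answer."""
    payload, trailer = frame[:-4], frame[-4:]
    return zlib.crc32(payload) == int.from_bytes(trailer, "little")

# The sender builds the frame, the receiver checks it.
frame = add_fcs(b"hello from ether2")
print(fcs_matches(frame))  # True when nothing changed in transit
```

The sender and receiver never compare the data itself, only their answers to the same math problem.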
So what about those rare occasions when FCS doesn’t match?
When the FCS values don’t match up, we drop that frame – throw it away and increment a counter of discards. It’s a dropped packet at the interface that has received it (RX), because guess what? It’s BOGUS data. Somewhere along the way, one or more 1s and 0s got all screwed up and now we can’t trust that data anymore! So we might as well throw it away.
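Here's a rough sketch of that drop-and-count behavior in Python. The RxInterface class, its rx_fcs_error counter, and the make_frame and flip_bit helpers are all hypothetical names for this illustration, not how RouterOS is actually implemented. The idea is to flip a single bit "in transit" and watch the receiver throw the frame away.

```python
import zlib

class RxInterface:
    """Hypothetical receive side of a port with an FCS error counter."""

    def __init__(self) -> None:
        self.rx_fcs_error = 0   # frames discarded because the FCS didn't match
        self.delivered = []     # payloads that passed the check

    def receive(self, frame: bytes) -> bool:
        payload, trailer = frame[:-4], frame[-4:]
        if zlib.crc32(payload) != int.from_bytes(trailer, "little"):
            self.rx_fcs_error += 1  # bogus data: count it and drop it
            return False
        self.delivered.append(payload)
        return True

def make_frame(payload: bytes) -> bytes:
    """Append the sender's CRC-32 to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "little")

def flip_bit(frame: bytes, bit_index: int) -> bytes:
    """Simulate corruption by flipping one bit of the frame."""
    damaged = bytearray(frame)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)

good = make_frame(b"some payload")
bad = flip_bit(good, 3)  # one bit changed somewhere along the way

port = RxInterface()
port.receive(good)
port.receive(bad)
print(port.rx_fcs_error, len(port.delivered))
```

A CRC-32 catches every single-bit error, so the damaged frame is always counted and dropped while the clean one is delivered.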
But wait, you’re asking…
Won’t throwing away frames and packets be bad for the network?
Wouldn’t a user notice that? The answer to that question is, it depends.
If we throw away 1 frame out of every thousand, or maybe one frame every couple of days, things won’t be very bad. Almost all applications use TCP – or Transmission Control Protocol which is a stateful and reliable protocol – and that’s also the topic of another discussion – but rest assured that 99% of the time, losing one packet here or there will not be perceptible to a user. Even if that traffic is UDP, real-time or otherwise unreliable, most applications are OK with a tiny amount of loss.
BUT – what if we increase FCS errors every hour? New errors every minute? What about EVERY SECOND!? The more frames that our switches and routers discard, the worse the user experience is. If we get up to 1% packet loss, meaning throwing away 1 or more out of every 100 frames, it’s GUARANTEED that a user will notice that the network is incredibly slow as applications keep asking for retransmission of lost packets, and real-time applications like voice and video will get choppy with tiling, artifacting and even those robot voices you may have sometimes heard.
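If you want to put a number on "how bad is it," you can compare two readings of an interface's counters and work out the error rate in between. This is a minimal sketch in Python. The function name and parameters are hypothetical, the sample values are invented, and it assumes the frame counter includes the frames that were dropped.

```python
def fcs_error_rate(prev_errors: int, prev_frames: int,
                   curr_errors: int, curr_frames: int) -> float:
    """Fraction of frames received between two samples that failed the FCS check.

    Assumption: the frame counters count every frame seen on the wire,
    including the ones that were discarded.
    """
    new_errors = curr_errors - prev_errors
    new_frames = curr_frames - prev_frames
    if new_frames <= 0:
        return 0.0  # no traffic (or a counter reset) between samples
    return new_errors / new_frames

# Invented sample: 1,500 new FCS errors across 150,000 new frames.
rate = fcs_error_rate(10, 100_000, 1_510, 250_000)
print(f"{rate:.2%}")  # 1.00%
```

What matters is the change between samples, not the lifetime total. A counter that is big but no longer incrementing points to a problem that has already gone away.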
Got it, so what can I do?
So now what? On the plus side, FCS errors are almost always an indicator of a physical layer problem, and they’re always a problem you should take action on if your counters are incrementing.
Here’s what you can do
- First and most likely, maybe you just have a bad cable – Replace the cable or connector (doesn’t matter whether it’s copper, fiber, or even wireless)
- If you made the cable yourself, you could try re-terminating the connector because it could be just the connector itself is bad and the cable is still good
- If it’s a fiber optic cable, a heavy bend or sag when cable management isn’t at its best can be the cause of FCS errors and just adding a tension loop or relaxing the bend on the cable with some additional support can prevent FCS loss.
- Next, if you’re using Small Form-factor Pluggable modules, or SFPs – you could try replacing the SFP module
- Up next, maybe you’ve got a bad port – you can change your configuration and move traffic from ether2 to ether3, for example.
- Here’s a TIP: If moving traffic fixes your FCS problem, make sure to mark the old port as bad so that you don’t plug something else into it and start this rabbit hunt all over again.
- Update the firmware. Your device could just have a bug in the interface processing software driver, so make sure that the firmware is updated.
- Next up, it could be a speed/duplex mismatch – you could try forcing the speed/duplex to match at both ends of the connection as appropriate. Use care here: if you pick a speed and duplex that the device doesn’t actually support, you could lock yourself out until you have local access to the UI while directly attached.
- Also, it could be a faulty entire MikroTik, motherboard or switch, so if you get this far, you could try replacing the whole MikroTik itself (or the other device connected to the MikroTik at the other end of the cable)
- One thing to note: in case you haven’t noticed, FCS errors will NOT tell you exactly what the problem is. They can’t tell you whether the sender is sending you garbage, whether the receiver is making garbage out of a perfectly good message, or whether something along the transmission path – like that copper or fiber cable (or even wireless) between networking devices – is totally messing with our stream of 1s and 0s on the way between the sender and receiver.
- Finally, it could be something completely external to networking that’s interfering with the transmission. For example, if it is a copper cable like a cat7 patch cord and it’s run next to a strong power source, like some bright lights or a line to an appliance like a MikroWave or whatnot, RF interference can be introduced from the power lines into the data lines.
- Coincidentally, this is the reason that inside the sleeve of that cat7 cable there are 8 wires twisted together in four pairs – called TWISTED PAIR. Because the two wires in each pair are wound tightly around each other, outside interference hits both of them about equally and largely cancels out, which helps to prevent RF interference and crosstalk! So cool!
One example from my past: an access point that was very close to a microwave oven would see FCS errors whenever the microwave was running, because the previous homeowner had run the power to the microwave and the ethernet in the same conduit. It could also be that some microwave ovens leak a little bit of 2.4 GHz frequency, creating WiFi interference. In any case, I moved the AP into another room and my FCS errors went away.
Hopefully this information has helped you understand what FCS errors are, how they happen and what you can do about them. If you learned something and enjoyed this, be sure to like and subscribe to our channel for more content.
Can Admiral Platform help?
Here’s a screenshot of my Admiral account where one of my Data Center (DC) switches probably has a bad cable! With Admiral, my dashboard shows me important events like FCS errors and I can even opt-in to emails, text messages or other alerts and notifications for stuff like this as well as speed/duplex changes, light level on fiber and tons more. I’d say it’s much better for a network monitoring system like Admiral to tell you that you’re having errors that you should investigate as opposed to logging in to each MikroTik and opening up each interface and clicking on every RX stats tab to see if you have any with errors. (I’ve done it manually in the past and let me tell you, it’s very boring and tedious!)
If you operate a MikroTik network, you should definitely check out admiralplatform.com where we help centralize visibility, control and automation for MikroTik networks of any size and scale.
See you next time and keep routing the world!