A quick note on watchdog timers

A watchdog is essentially a timer-based mechanism that periodically checks whether the system is in a healthy state and, if it is deemed not to be, reboots it.

This is achieved by setting up a (kernel) timer with, say, a 60-second timeout. If all is well, a watchdog daemon process consistently disarms the timer before it expires and then re-enables (arms) it; this is known as petting the dog. If the daemon fails to disarm the watchdog timer in time (because something has gone badly wrong), the timer expires; the watchdog is annoyed and reboots the system.

A daemon is a system background process; more on daemons in Appendix B, Daemon Processes.
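
The arm, pet, and expire cycle described above can be sketched as a small simulation. This is only a toy user-space model under stated assumptions, not how a kernel or hardware watchdog is actually implemented, and it does not touch any real watchdog device. The names SoftwareWatchdog, FakeClock, and first_expiry are hypothetical, introduced purely for illustration; the 60-second timeout is the one used in the text, while the 30-second petting interval is an assumption.

```python
import time


class SoftwareWatchdog:
    """Toy user-space model of a watchdog timer (illustrative only)."""

    def __init__(self, timeout_s=60.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock
        self._deadline = clock() + timeout_s  # arm the timer

    def pet(self):
        """Disarm and re-arm: push the deadline a full timeout into the future."""
        self._deadline = self._clock() + self.timeout_s

    def expired(self):
        """True once the deadline has passed without a pet (a reboot would follow)."""
        return self._clock() >= self._deadline


class FakeClock:
    """Hypothetical manually advanced clock, so the example runs instantly."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


def first_expiry(timeout_s, pet_times, horizon_s):
    """Step a simulated clock one second at a time.

    Returns the simulated second at which the watchdog expires,
    or None if it has not expired by the end of the horizon.
    """
    clock = FakeClock()
    dog = SoftwareWatchdog(timeout_s, clock)
    pets = set(pet_times)
    for t in range(1, horizon_s + 1):
        clock.now = float(t)
        if t in pets:        # the daemon is alive and pets the dog
            dog.pet()
        if dog.expired():    # nobody petted in time: the system would reboot
            return t
    return None


if __name__ == "__main__":
    # Healthy daemon: pets every 30 simulated seconds (assumed interval).
    print("healthy daemon:", first_expiry(60, range(30, 300, 30), 300))
    # Hung daemon: pets at 30 s and 60 s, then goes silent.
    print("hung daemon:", first_expiry(60, [30, 60], 300))
```

In this model, a daemon that pets every 30 seconds never lets the 60-second timer run out, whereas one that goes silent after its pet at the 60-second mark lets the timer expire 60 seconds later, at the 120-second mark. Injecting the clock is a design choice that keeps the sketch deterministic; by default the class uses time.monotonic.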

A pure software watchdog implementation is not protected against kernel bugs and faults; a hardware watchdog (which latches into the board-reset circuitry), being independent of the software, can reboot the system as and when required.

Watchdog timers are very often used in embedded systems, especially deeply embedded ones (or those unreachable by a human for whatever reason); in a worst-case scenario, the system can reboot and, hopefully, move along with its designated tasks again. A famous example of a watchdog timer causing reboots is the Pathfinder robot that NASA sent to the Martian surface back in 1997 (yes, the one that encountered the priority inversion concurrency bug while on Mars; we shall explore this a little in Chapter 15, Multithreading with Pthreads Part II - Synchronization, on multithreading and concurrency). And, yes, that's the very same Pathfinder robot that is given a role in the superb movie The Martian! More on this in the Further reading section on the GitHub repository.
