Day 17. Hardware Troubleshooting, Part 1

CompTIA A+ 220-901 Exam Topics

Objective 4.1: Given a scenario, troubleshoot common problems related to motherboards, RAM, CPU, and power with appropriate tools.

Objective 4.2: Given a scenario, troubleshoot hard drives and RAID arrays with appropriate tools.

Objective 4.3: Given a scenario, troubleshoot common video, projector, and display issues.

Key Topics

Today, we will use different tools to troubleshoot motherboard, CPU, power supply, and RAM issues. We will also discuss how to troubleshoot hard drives and RAID arrays. Finally, we will explore how to troubleshoot video and display problems.

Troubleshooting Motherboards

Any part in a computer can fail, but when troubleshooting a computer problem, signs rarely point to the motherboard. The symptoms of a motherboard problem are varied and often intermittent, usually increasing in frequency over time. Heat, manufacturing defects, electrostatic discharge (ESD), electrical shorts, and physical damage from replacing components are the most common causes of motherboard failures.

An important step in troubleshooting the motherboard is to check all of its connections to other components. Disconnect and reconnect the cables both inside and outside the case. Remove and reinstall any expansion cards. Also check to make sure there are no shorts from the motherboard to the case or any other device. Look for loose screws below the motherboard and remove any that are found.

Perhaps the most common cause of an apparent motherboard failure is a BIOS/UEFI issue. A new CPU or adapter card might not be supported by the current BIOS/UEFI version, in which case the BIOS/UEFI must be updated. Also, if the computer boots straight to the BIOS/UEFI or has an incorrect time and date, replace the CMOS battery with a fresh one to correct these problems.

To come to the conclusion that the motherboard is at fault, first test and—if necessary—replace the power supply, RAM, expansion cards, hard drive, and CPU with known good components. If the reported problem is still present, the motherboard is most likely faulty. It is common for a technician to have many components on hand to change them out, but the motherboard isn’t usually one of them. Trying to find a motherboard that is compatible with the current components can be challenging.

ESD damage can cause any number of symptoms, from a blue screen of death (BSOD) to random reboots. Clear the Automatically restart check box under System failure in the Startup and Recovery settings of System Properties so that the STOP error message and number on the BSOD can be read and researched. This often helps determine the likely cause of the problem.

Component failures on motherboards also can happen. These are rare and hard to diagnose. Check POST settings to ensure that all the devices are detected and operating correctly. Also verify that the computer is operating within the acceptable temperature range, and replace any fans that are not functioning properly. In the early 2000s, many motherboard manufacturers used defective capacitors. Check that there are no distended or leaking capacitors anywhere on the motherboard. Any such capacitors will need to be replaced by a skilled technician to fix the motherboard.

When a computer is completely unresponsive, a device failure might be preventing the POST process from completing. Without beep codes or error messages on the screen, the problem is nearly impossible to diagnose. A POST card can be used to determine where the POST process is failing. The card is inserted into an adapter slot or USB port and will display an error code that indicates what component is causing the problem.

Another tool used to troubleshoot motherboards is the loopback plug. This plug is used to check the basic functionality of the RJ45 port. To use the plug, connect it to the port and use a command window to issue a ping command to the loopback address of 127.0.0.1. If the ping is not successful, the port might be broken or TCP/IP might not be functioning correctly.
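The ping step described above can be scripted. The following is a minimal Python sketch rather than a definitive tool. The function names are hypothetical, and it assumes that a ping executable is on the PATH and that the echo count is set with -n on Windows and -c on Linux and macOS.

```python
import platform
import subprocess

LOOPBACK_ADDRESS = "127.0.0.1"


def build_ping_command(address=LOOPBACK_ADDRESS, count=4, system=None):
    """Return the ping command line for the given OS as a list of arguments."""
    system = system or platform.system()
    # Windows ping uses -n for the echo count; Linux and macOS use -c.
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), address]


def loopback_ping_succeeds(count=4):
    """Run the ping and report success based on the process exit code."""
    result = subprocess.run(
        build_ping_command(count=count),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # Assumption: an exit code of 0 means the echo requests were answered.
    return result.returncode == 0
```

Calling loopback_ping_succeeds() returns True when the ping is answered. A False result corresponds to the case above in which the port might be broken or TCP/IP might not be functioning correctly.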

Troubleshooting RAM

Unexpected shutdowns, system lockups, continuous reboots, BSOD, and intermittent device failure are all symptoms of memory failure. This does not mean that these are the only symptoms or that they apply only to memory errors, but they are common indicators. It is easier to test the RAM than it is to test a motherboard when symptoms point to both components.

Errors such as parity errors, page faults, and error checking and correction (ECC) messages sound like they are related directly to memory, but this is not always the case. Parity errors that show up at the same hexadecimal location each time, for example, indicate a real problem with a RAM stick. If the location changes or the system does not use ECC, the problem might be caused by a failing power supply, heat, poorly written software, or driver problems. In this case, troubleshoot the power supply first.
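The rule of thumb about error locations can be expressed as a short decision function. This is only an illustrative Python sketch. The function name and the returned messages are hypothetical, and it assumes the error addresses have already been collected by hand from the error messages.

```python
def classify_parity_errors(addresses):
    """Suggest where to look first, given the addresses of observed parity errors."""
    if len(addresses) < 2:
        # A single error cannot show whether the location repeats.
        return "not enough data"
    if len(set(addresses)) == 1:
        return "same location each time: suspect the RAM stick"
    return "location varies: troubleshoot the power supply first"
```

For example, classify_parity_errors([0x1A2B3C, 0x1A2B3C, 0x1A2B3C]) points to the RAM stick, while a list of differing addresses points to the power supply, heat, software, or drivers.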

After you have determined that the RAM is at fault, first check to ensure that the correct RAM is installed properly in the correct slot(s). Next, check the settings in the BIOS/UEFI to make sure they are correct.

After checking hardware and configuration settings, there are basically three ways to test the RAM sticks:

Removal/replacement: If the computer has more than one stick, remove all but one stick and boot the computer. If errors occur, replace the bad stick. If they do not, try booting with a different stick, and replace it if errors occur. It is possible that more than one stick is faulty, so test them all.

Software tester: This program can be downloaded and run from bootable media or directly from some operating system (OS) versions. Windows offers the Memory Diagnostic tool to test RAM as well. The software writes data to the sticks in many different patterns to determine whether data is being stored properly. Any failures indicate a bad RAM stick that must be replaced.

Hardware tester: These expensive devices physically test the sticks to ensure they are working properly.
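To illustrate how a software tester finds bad memory by writing patterns and reading them back, here is a minimal Python sketch. It does not touch real hardware. SimulatedStick is a hypothetical stand-in for a RAM stick, and the four patterns are an assumption chosen for illustration; real testers use many more.

```python
PATTERNS = (0x00, 0xFF, 0x55, 0xAA)  # all zeros, all ones, alternating bits


class SimulatedStick:
    """Hypothetical stand-in for a RAM stick; real testers access hardware."""

    def __init__(self, size, stuck_bits=None):
        self.cells = bytearray(size)
        # Maps an address to a bit mask that always reads back as 1.
        self.stuck_bits = stuck_bits or {}

    def write(self, address, value):
        self.cells[address] = value

    def read(self, address):
        return self.cells[address] | self.stuck_bits.get(address, 0)


def find_bad_addresses(stick, size):
    """Write each pattern to every cell and return addresses that read back wrong."""
    bad = set()
    for pattern in PATTERNS:
        for address in range(size):
            stick.write(address, pattern)
        for address in range(size):
            if stick.read(address) != pattern:
                bad.add(address)
    return sorted(bad)
```

A stick with no faults yields an empty list. A stick created with SimulatedStick(64, {10: 0x04}) has one bit stuck at address 10, so the all-zeros pattern reads back incorrectly there and the address is reported.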

Troubleshooting CPUs

After installation, CPUs rarely fail. The most common cause of CPU failure is overheating. Manufacturing defects also can cause failure, but this is rare due to extensive testing before CPUs leave the factory. Another cause of CPU failure is improper installation. The CPU must be oriented and seated properly, and the heatsink must be correctly installed with thermal paste.

If the computer does nothing when it is turned on, or if the computer locks up soon after startup, the CPU may have been improperly installed. First, ensure that the CPU fan is plugged in to the correct socket and operating properly. Make sure the fan is free of dust, hair, or any other debris.

Check that there is enough thermal paste between the heatsink and CPU, but not too much. Either too little or too much paste can cause the CPU to heat up very quickly. Check that the heatsink is flush with the CPU and attached to the motherboard properly. The heatsink needs to be clamped to the CPU evenly.

When a problem exists with the CPU, RAM, or motherboard, the BIOS/UEFI might detect the error and emit a series of beeps to indicate what the problem is. Count the number of beeps and consult the manufacturer of the motherboard and BIOS/UEFI to determine what the problem is. Things like RAM failures, motherboard failures, display failures, or even a low CPU fan speed can be indicated with beep codes. Some common beep codes are shown in Table 17-1.

Table 17-1 Common Beep Codes

Troubleshooting Power

The power supply is a common point of failure in computers. It may suddenly stop working or exhibit intermittent problems or failures, which often increase in frequency over time. The first check when diagnosing power supply problems is to ensure that all connections are correct and that the AC outlet is supplying power. Following this, check the voltages in the BIOS/UEFI settings if the computer will start.

If the computer will not start, check each connector with a power supply tester to make sure they are all functioning properly, as shown in Figure 17-1. Additionally, you can use a multimeter to check each pin of each connector. If the power supply does not register the correct voltages, it will need to be replaced.
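Deciding whether a multimeter reading counts as a correct voltage can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a substitute for the manufacturer's specifications. The tolerances of plus or minus 5 percent for the positive rails and plus or minus 10 percent for the -12 V rail are the commonly cited ATX limits and are not taken from this chapter. The function names are hypothetical.

```python
# Nominal rail voltages and allowed deviation as a fraction of nominal.
# Assumption: +/-5% for positive rails and +/-10% for the -12 V rail, the
# commonly cited ATX limits; confirm against the power supply documentation.
RAIL_TOLERANCES = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
}


def rail_in_tolerance(rail, measured):
    """Return True if the measured voltage is within the rail's tolerance."""
    nominal, tolerance = RAIL_TOLERANCES[rail]
    return abs(measured - nominal) <= abs(nominal) * tolerance


def out_of_tolerance_rails(readings):
    """Return the rails whose multimeter readings fall outside tolerance."""
    return [
        rail
        for rail, measured in readings.items()
        if not rail_in_tolerance(rail, measured)
    ]
```

Passing a dictionary of readings such as {"+3.3V": 3.31, "+5V": 4.6, "+12V": 12.1} reports the +5 V rail, because 4.6 V is more than 5 percent below nominal.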

Figure 17-1 Testing a Power Supply with a Power Supply Tester

Remember that the power supply must be connected to a motherboard to start. A power supply tester can be used in place of a motherboard to send the proper signal to the power supply to start. If the power supply is plugged into a good outlet and connected properly to the power supply tester and it still will not start, the power supply is most likely faulty. If it starts on the tester but not when installed in the computer, the computer case power switch might be broken. You can use a screwdriver to briefly jumper the pins to which the power button is connected on the motherboard. This will send the power-on signal to the motherboard.

When the computer exhibits intermittent problems, the easiest way to rule out the power supply is to replace it with a known-good power supply. Intermittent problems can be lockups, error codes, or BSODs. A multimeter or power supply tester might not show bad voltage, but if replacing the power supply solves the problem, the original power supply was most likely at fault.

If a power supply fails with a pop, the fuse might have blown. The power supply will need to be replaced. Also, if the power supply smells like it is burning or if there is any smoke, it will need to be replaced.

Troubleshooting Hard Drives

Hard disk drives are mechanical in nature, with parts that move and platters that spin. Eventually, they will fail. It is uncommon for a hard drive to fail with no notice; most hard drives die slowly over time, often destroying data in the process. If a hard drive produces unusual noise, becomes unavailable in the Windows/File Explorer or Disk Management console, or produces read or write failures, the drive will need to be replaced.

Before replacement, check that the power and data cables are attached properly. Exchange the drive with a known good drive to ensure the cables and ports are not damaged. Examine the BIOS/UEFI to see whether the drive is detected properly. Also examine the Self-Monitoring, Analysis, and Reporting Technology (S.M.A.R.T.) data. S.M.A.R.T. errors can indicate that a drive is failing.
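As a rough illustration of how S.M.A.R.T. data can flag a failing drive, the Python sketch below compares each attribute's normalized value with its threshold. It assumes the usual convention that an attribute is considered failed when its normalized value is at or below a nonzero threshold. The SmartAttribute class and any sample attribute names or numbers are hypothetical and would normally come from a S.M.A.R.T. reporting tool.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    name: str
    value: int      # normalized current value reported by the drive
    threshold: int  # vendor-defined failure threshold


def failing_attributes(attributes):
    """Return the names of attributes at or below their failure threshold."""
    return [
        attribute.name
        for attribute in attributes
        # A threshold of 0 is treated as informational only (assumption).
        if attribute.threshold > 0 and attribute.value <= attribute.threshold
    ]
```

Given a list of SmartAttribute records, any name returned by failing_attributes() is a reason to back up the data and plan a replacement.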

After replacement, the format command can be used to wipe the old hard drive. The /p:count parameter zeros every sector on the volume and then overwrites the volume count more times with random numbers, so /p:0 writes zeros only. Use diskpart to partition the new drive and mark a partition as active.

As a hard drive fails, it may corrupt or even delete critical OS files that are needed to start the operating system. Use the chkdsk tool or the Windows error checking tool to fix logical file system errors and mark any bad sectors on the disk surface. If the boot sector becomes damaged, use the Bootrec tool to write a new one to the system partition.

Hard drives make some noise during normal operation. Loud clicking, constant squealing, or grinding is an indication that the hard drive will fail soon. A failed hard drive will lock up the computer if it contains the OS. When restarting the computer, a No Boot Device Present, OS Not Found, or Failure to Boot error message will be displayed. The computer also might show a BSOD or pinwheel when the hard drive fails.

If a hard drive is suspected of failing, remove the drive from the host and connect it to another computer to back up any accessible data. Third-party file recovery software might be necessary for recovering lost files or data. You can use an external enclosure to connect the drive to another computer. These enclosures can be connected through USB or eSATA.

Over time, hard drives might exhibit slowed performance. This does not necessarily indicate a failing drive. The drive may have become fragmented, or it might be too full to use the page file (swap file) or infected by a virus. Analyze the drive and run disk cleanup or defragment the drive if necessary. Finally, scan the drive for any virus infections and remove them.
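Checking whether a drive is too full can be automated. The following Python sketch uses the standard library call shutil.disk_usage. The 15 percent minimum free space is an assumed rule of thumb, not a figure from this chapter, and the function names are hypothetical.

```python
import shutil


def is_low_on_space(total_bytes, free_bytes, minimum_free=0.15):
    """Return True if the free fraction of the drive is below minimum_free."""
    # minimum_free=0.15 is an assumed rule of thumb; adjust as needed.
    return free_bytes / total_bytes < minimum_free


def drive_is_low_on_space(path, minimum_free=0.15):
    """Check the drive that holds the given path, for example 'C:\\' on Windows."""
    usage = shutil.disk_usage(path)
    return is_low_on_space(usage.total, usage.free, minimum_free)
```

If drive_is_low_on_space() returns True, running disk cleanup is a reasonable first step before defragmenting or scanning for viruses.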

Troubleshooting RAID Arrays

Just like a single hard drive, a redundant array of independent disks (RAID) can fail. Often, the cause is a loose cable or loose RAID card. In some cases, RAID will stop working due to a drive failure. The BIOS/UEFI or OS might not be able to find the RAID array. Sometimes a disk failure in a RAID array will cause the computer to perform very slowly. The disk will need to be replaced and the RAID rebuilt.

A disk failure in a RAID array can prevent the OS from being found. Disk failure in a RAID 0 array will cause a BSOD or pinwheel, a system lockup, or an OS Not Found error on boot. Should this happen, the drive will need to be replaced and the data restored from backup. Other RAID levels will most likely produce an error message so that you can replace the drive and rebuild the array.
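The difference between RAID 0 and the fault-tolerant levels can be summarized in a short Python sketch. This is illustrative only. The text above names only RAID 0 and "other RAID levels", so covering RAID 1, 5, and 10 specifically is an assumption, and the function names are hypothetical.

```python
def tolerated_failures(level, disk_count):
    """Return how many disk failures the array is guaranteed to survive."""
    if level == 0:
        return 0               # striping only, no redundancy
    if level == 1:
        return disk_count - 1  # every disk holds a full copy
    if level == 5:
        return 1               # single distributed parity
    if level == 10:
        return 1               # guaranteed minimum; more if in different mirrors
    raise ValueError(f"RAID level {level} is not covered by this sketch")


def array_status(level, disk_count, failed_disks):
    """Describe the array state and the recovery action for a failure count."""
    if failed_disks == 0:
        return "healthy"
    if failed_disks <= tolerated_failures(level, disk_count):
        return "degraded: replace the failed drive and rebuild the array"
    return "failed: replace the drive and restore the data from backup"
```

For example, array_status(0, 2, 1) reports a failed array that must be restored from backup, while array_status(5, 3, 1) reports a degraded array that can be rebuilt.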

In some cases, the RAID array will not recognize a replaced drive. Check all cable connections and the card connections. Ensure that the motherboard BIOS/UEFI or RAID card firmware is up to date. Also make sure that the drives are connected to the right ports on the motherboard when using built-in RAID.

Troubleshooting Video, Projector, and Display Issues

The video subsystem of the computer is made up of the video card with its drivers and the display. Either of these components can cause video problems and errors. When beginning the troubleshooting process, check all connections first. Ensure that the display is connected and plugged in. Also make sure the video card is installed properly, with any extra power connected. Finally, check that the video cable is securely connected at both ends.

If a driver has become corrupted, the computer will either boot into VGA mode or simply produce no image on the screen. A corrupt driver also can cause distorted geometry and images or oversized images and icons on the screen. Boot the computer into Safe Mode and reinstall the driver.

If the fan on the video card fails or excess heat is not being properly vented from the case, the card can overheat and shut down. Heat can cause the display to show strange artifacts or distorted images. Make sure all the fans are connected, clean, and operating properly.

The monitor might have problems that cannot be fixed with software or hardware tools. Dead pixels are a common occurrence, but they usually have no cure. Monitors are allowed to have a limited number of dead pixels right from the factory. Image burn-in can sometimes be cured by power cycling the monitor.

If a monitor has incorrect color patterns or a dim image, you often can calibrate it using the menu on the monitor. Windows Display settings can be used to make adjustments. The backlight might need replacement if the brightness of the screen cannot be improved. It also may need to be replaced if the panel shows flickering images.

Activity 17-1: Match the Troubleshooting Method to the Component

Refer to the Digital Study Guide to complete this activity.

Study Resources

For today’s exam topics, refer to the following resources for more study.

Check Your Understanding

Refer to the Digital Study Guide to take a quiz covering the content of this day.
