Chapter 2: Troubleshooting Procedures and Guidelines

Exam Objectives

check.png Identifying uses and purpose for standard troubleshooting tools

check.png Understanding and applying troubleshooting techniques to isolate and resolve problems for major computer system components

In the preceding chapter, I show you preventive maintenance routines for your system’s devices. Still, no matter how much preventive maintenance you perform, “something” will eventually go wrong. This chapter continues with the topic of troubleshooting devices — after those “somethings” go wrong.

As a CompTIA A+ Certified Professional, you will be required to repair components from many areas of the computer system. This chapter covers the hardware and software tools of the trade that will help you complete troubleshooting tasks quickly. You will also review the major components found in computer systems and address any troubleshooting steps related to those components.

Identifying Troubleshooting Tools

A good troubleshooting arsenal contains many weapons, both hardware and software. After all, not every computer problem is related to the hardware. And even when it is, software tools can sometimes help with the diagnosis. If you are doing field support, create a troubleshooting kit in a handy carrying case with all the tools you use most often.

Hardware tools

To properly troubleshoot equipment-related issues, you want to use the right tool for the job. This section takes a look at the hardware tools you should have, and what jobs you will perform with them.

Multimeter

You could buy a meter to measure voltage (volts; V), a meter to measure resistance (ohms; Ω), and a meter to measure continuity and current (milliamps [mA] and amps [A]). Or you could just buy a multimeter, which is a combination of all of these different types of meters.

fortheexam.eps Make sure that you know what the multimeter is capable of measuring. You will need to identify the proper times to use one, although not details of its operation.

You can find both digital and analog multimeters, and your choice is based solely on personal preference. Many people find digital multimeters easier to read. Also, when testing resistance in circuits, digital meters use only 1.5V rather than 9V, and the lesser voltage is less likely to damage the circuit.

Both digital and analog multimeters are shown in Figure 2-1. The meter usually has a dial that lets you choose what you want to measure and also the scale that you are measuring on, such as 20VDC or 200VDC.

remember.eps Always use a scale higher than the reading that you are taking to avoid damaging the multimeter. The closer the scale is to the value you are attempting to measure, the more accurate your reading will be.
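If it helps to see the scale rule spelled out, here is a minimal Python sketch. The function name and the list of scales are assumptions made for illustration (only the 20VDC and 200VDC values come from the dial example above), so check the ranges printed on your own meter.

```python
def pick_scale(expected_reading, scales):
    """Return the lowest scale that is still higher than the expected reading."""
    usable = [s for s in scales if s > expected_reading]
    if not usable:
        raise ValueError("reading exceeds every scale on this meter")
    return min(usable)

# Hypothetical DC voltage scales on a meter dial
DC_SCALES = [2, 20, 200, 1000]

print(pick_scale(12, DC_SCALES))   # 20
print(pick_scale(150, DC_SCALES))  # 200
```

Picking the lowest usable scale follows both halves of the rule: the scale stays above the reading, and it is as close to the reading as the dial allows.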

technicalstuff.eps Make sure you return the dial to the off position when you are done with the multimeter, or else you will find a dead battery the next time you go to use it.

Some other features to look for in a multimeter are

diamonds.jpg Size

Large: Large meters are usually more feature-rich.

Small: For computer work, a smaller meter will likely be more convenient to transport with your troubleshooting gear.

diamonds.jpg Overload protection: Protects you when voltage or current exceeds what the meter is capable of handling

diamonds.jpg Auto ranging: Automatically selects the appropriate scale but usually allows for manual override

diamonds.jpg Detachable probe leads: Allows for greater flexibility by substituting different leads for different jobs

diamonds.jpg Audible continuity test: Allows you to perform the test without having to look at the meter, as it beeps when there is continuity

diamonds.jpg Automatic power off: Conserves battery life in the meter

tip.eps It’s a pain when you discover that the meter was left on and that the battery died.

diamonds.jpg Automatic display hold: Keeps the latest stable reading visible on the display, making it easier to record the value if you cannot see the meter when you are using the probes

diamonds.jpg Minimum and maximum trap: Records the highest and lowest readings in the same manner as the automatic display hold

warning_bomb.eps When measuring resistance, always remove the power from the device you are checking, unlike when you’re checking voltage, which is done when the device has power.

In “Power supplies and batteries,” later in this chapter, you see how to use a multimeter to verify voltages coming from power supply hard drive connectors.

Figure 2-1: Analog and digital multimeters can both be used to test many system components.

Antistatic mat and strap

As I discuss in the preceding chapter's coverage of electrostatic discharge (ESD), one of the best ways to avoid ESD is to use a grounding or bonding strap, as shown in Figure 2-2. In addition to wrist straps, you can get antistatic mats for your workbench or floor. Both mats and wrist straps are made of conductive (static-dissipative) material and can be grounded to prevent charge build-up.

fortheexam.eps Taking ESD reduction steps, such as using a wrist strap, is always required when working on computer equipment.

Figure 2-2: An antistatic wrist strap with a removable clip, allowing the plug to be used with some hardware.

Dealing with screws

This section introduces you to the screws that are common to computer systems, and also talks about problems that you might encounter with screws (such as dropping them).

Many types of pickup tools can be helpful when that screw, jumper, or other component slips from your fingertips. In some cases, you have no other means of retrieving that component where it fell. The two varieties of pickup tools — mechanical and magnetic — are inexpensive and have their uses. As long as you have one in your kit, you will be well off.

warning_bomb.eps Some magnetic pickup tools, such as floor sweepers, have extremely high-power magnets that could damage hard drives. And although most small pickup tools use low-power magnets that are safe near hard drives, avoid placing them next to floppy disk media.

To go along with the pickup tool for small bits and pieces, tweezers and needle-nosed pliers are beneficial when it comes to removing and placing small elements, such as jumpers.

And even though a screwdriver is not really a glamorous tool, it will be your most-used tool. In addition to a standard screwdriver, you need a multibit screwdriver as well as a set of security bits, which have a small hole in their center, allowing you to easily remove security screws that have small posts in their centers. These screws are another anti-tamper step that some manufacturers have introduced to help corporations reduce theft of expensive computer components, but they can cause a repair to come to a grinding halt as you try to figure out how to get into the case.

Because you will find two common sizes of hex nuts on computers, having both 3⁄16" and 1⁄4" nut drivers handy will make working with them much easier. You use 3⁄16" nuts for case standoffs (the small posts that support the motherboard and keep it off the surface of the case) and for the posts that mount most of the ports on the back of the computer, such as the monitor and printer ports. You use 1⁄4" nuts for expansion slots, power supplies, and case panels.

With all the screws on computers, you would think that you would be able to find one when you need it, but that is not always the situation, so having a supply of standard screws on hand (listed in Table 2-1) will be a great help to you. Compaq (and now HP) computers have standardized the two hex screw sizes for optical drives and hard drives, and they use the hard drive screws for everything else. They have also been nice enough to usually provide a few spares for adding extra drives, which you will usually find screwed into the inside of the case somewhere.

Table 2-1 Standard Screws

Description                Thread          Length
Hard drive                 6-32            5/32" (4mm)
Case                       6-32            3/16" (4.8mm)
Optical and floppy drive   M3              1/4" (6.35mm)
Cable plates and ports     4-40            3/16" (4.8mm)
Case fan                   Self-tapping    7/16" (11mm)

When disassembling systems (especially laptops), you should keep track of where screws and other small parts come from, even going to the point of making a quick diagram.

Miscellaneous tools

You need a variety of other hardware tools in your troubleshooting kit. This list presents a hodge-podge of miscellaneous tools of all sorts.

diamonds.jpg Digital infrared thermometers: Use these devices to find the temperature of CPUs, memory, and other internal devices without having to come into contact with them.

diamonds.jpg Mirrors: It is surprising how many times you need to read information on components that are already mounted inside the computer or need to see exactly where something has fallen. A small mirror can be a big help, and one mounted on an extension arm can be even better. This is useful when you want to see jumper settings or read hidden serial or model numbers. Again, for the cost, mirrors are well worth the investment.

diamonds.jpg Flashlight: The inside of computer cases can be dark, and when you are working on a computer under a desk, the inside of the computer is even darker. Having a small flashlight in your set of tools is often more convenient than moving the computer to a location with more light.

diamonds.jpg Case cracker: Some cases come apart easily after removing all the screws; others do not. The original Macintosh computer was one of those that did not come apart easily, and by design. A case cracker helps you separate the pieces of a case that do not easily come apart, without causing the damage that a flathead screwdriver would cause. This device, which comes in a range of sizes and shapes, is inserted in the gap of the case and then used to separate the sections. Case crackers were commonly used for Macintosh computers and monitors, which have a groove where the two halves of the case come apart.

diamonds.jpg Monitor adjustment tools: Vertical and horizontal hold isn’t an issue with today’s monitors. In the past, though, it was not uncommon to make adjustments to this element of the monitor. The adjustment knobs — usually located inside the case to keep users from playing with them — had to be reached with special tools. Monitor adjustment tools are shown amongst the tools in Figure 2-3 (from left to right: multibit screwdriver with case cracker tool, various hex nut bits, a 3⁄16" hex nut bit in screwdriver adapter, various screwdriver bits, flashlight, needle-nosed pliers, diagonal cutter, long-nosed tweezers, extension pickup magnet, extension mirror, and two monitor adjustment tools at the top).

Figure 2-3: Keep these tools in your troubleshooting kit.

diamonds.jpg Loopback plugs: Loopback plugs serve two purposes: as a diagnostic tool and to eliminate error messages. When you test parallel and serial ports, it is advantageous to have the system think that there is something connected to the port so that you can test sending and receiving data through the port. Rather than testing a serial port using a modem or null modem cable, and a remote computer to communicate with, you can use a loopback plug to test the port without any outside devices. Many multiport gigabit network adapters ship with loopback plugs, which usually look like a short network cable, an inch or two long, with only one connector on it. This connector makes the network port think that it is plugged into a switch or hub, thus eliminating error messages from the operating system regarding errors with that port.

diamonds.jpg Wire cutters and strippers: There are always cables and cable ties that need to be cut, stripped, or connected within the computer. Having a wire cutter, diagonal cutter, or wire stripper handy makes that job much easier. To go along with that, a knife or a box cutter can be a useful tool when making detailed cuts.

diamonds.jpg Electrical tape: After your cutting is done, use electrical tape to wrap any exposed wires.

diamonds.jpg Slot covers: All slots should be covered, so you need a few spare slot covers with you for those cases in which you are removing a card from a system.

diamonds.jpg Drive faceplates: Similar to slot covers, drive faceplates should be replaced when you remove a drive from a computer. These tend to not be standard across systems, so having spares that actually fit might be a little more challenging.

diamonds.jpg Mounting kit/adapter: When mounting 1.8", 2.5", or 3.5" drives in a system, you may choose to mount them in a bay that is larger than the drive, such as a 5.25" bay, but this requires using a mounting kit or adapter. Typically, this kit consists of a couple of side rails or a bracket to make the drive wider.

diamonds.jpg Extra cables: Spare internal cables, especially the new 80-wire ATA-100/ATA-133 cables, are useful to have in your kit, as well as other standard internal and network cables. These help you diagnose cable or device issues. If you can replace a questionable cable with a known good cable, you can rule out another element that might be causing the problem.

diamonds.jpg Crimper: Network cables, telephone cables, and television cables all have ends that are installed using a crimping tool. As an A+ professional, if you use a crimper, it will likely be for replacing RJ-45 connectors on the ends of network data cables. Crimpers resemble a pair of pliers, with an opening for the RJ-45 connector.

diamonds.jpg Punch-down tools: When connecting network or telephone cables to the back of a patch panel, you must insert the individual wires in the cable into small plastic slots that contain metal connectors. To do this quickly and easily, punch-down tools were created. Depending on the type of slot into which you are punching cables, a specific type of tool should be used.

diamonds.jpg Cable testers: Network and other cable testers can be used to verify not only continuity, but also configuration of cables. In many cases, when people crimp their own network cables, errors can be made with pin configurations.
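To picture the kind of pin-configuration mistake a cable tester catches, here is a rough Python sketch that compares the wire order at the two ends of a straight-through network cable. It assumes the T568B color order (the text above does not name a wiring standard), and the function name is made up for illustration. A real tester checks electrical continuity on each pin rather than wire colors.

```python
# T568B wire order, pin 1 through pin 8 (an assumption; the text names no standard)
T568B = [
    "white-orange", "orange", "white-green", "blue",
    "white-blue", "green", "white-brown", "brown",
]

def miswired_pins(end_a, end_b):
    """Return the pin numbers (1-8) where the two ends of a straight-through cable differ."""
    if len(end_a) != 8 or len(end_b) != 8:
        raise ValueError("each end must list exactly 8 wires")
    return [pin for pin, (a, b) in enumerate(zip(end_a, end_b), start=1) if a != b]

# A hypothetical hand-crimped end with the wires on pins 1 and 2 swapped
bad_end = T568B.copy()
bad_end[0], bad_end[1] = bad_end[1], bad_end[0]

print(miswired_pins(T568B, T568B))    # []
print(miswired_pins(T568B, bad_end))  # [1, 2]
```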

fortheexam.eps The “Hardware tools” section should be reviewed to a point where you can identify when you would want to use each type of tool.

Diagnostic software

In addition to various hardware tools that are available to help you troubleshoot a system, various software tools can help you out, too.

Boot/rescue disk

To deal with many hard drive issues, you want a boot disk at your disposal. Just booting your system with a Windows boot disk lets you determine whether you can access your hard drive. But to really be able to accomplish a troubleshooting task, or to recover a system, you need a little more power — which you can find in many third-party boot disks and bootable CDs. Most of these solutions include a variety of testing and troubleshooting tools:

diamonds.jpg Knoppix: www.knoppix.net

diamonds.jpg BartPE: www.nu2.nu/pebuilder

diamonds.jpg System Rescue: www.sysresccd.org

diamonds.jpg Microsoft Diagnostics and Recovery Toolset (DaRT): www.microsoft.com/windows/enterprise/products/dart.aspx

Software diagnostic tools

Diagnostic tools are available with which you can test several major components in your computer, including (but not limited to) drives, processors, memory, serial and parallel ports, keyboards and mice, and network adapters. These testing tools typically verify integrity of components or stress components by performing multiple random or sequential reads and writes on the system.

BIOS and hard drive self-test

In addition to other software solutions, many BIOS routines include built-in testing software. These routines usually can test disks, RAM, processors, and other system components. Like the diagnostic software, these tests will usually perform random or sequential reads and writes to verify the integrity of the components that are being tested.

fortheexam.eps In addition to these tools, all the monitoring-performance tools listed in Book VI, Chapter 3 can also be used for troubleshooting. Make sure you know the appropriate uses for each tool in both of these chapters.

The Art of Troubleshooting

In this section, you take a look at some general factors and then perform an overview of troubleshooting specific components. This chapter covers these components in the same manner as in the preventive maintenance chapter, Book IV, Chapter 1, by touching on each major component one at a time.

Troubleshooting basics

Computer components used to be very expensive. In today’s market, though, most components have been turned into commodities and can be purchased very cheaply. Because so many elements are so cheap, replacing components is now more common than repairing them. Because of the reduced cost, you can easily have a small supply of spare components and test components by swapping in new and reliable components.

Most failures in components occur either near installation or around the expected wear-out period; very few components fail during the normal use period. With technology improvements proceeding at their current rates, technology commonly becomes functionally obsolete prior to hitting the wear-out period for that device. This can be illustrated by CD and DVD drives over the last few years, where most CD drives that fail will be replaced with a CD-RW, CD-RW/DVD combo, or DVD-RW drive.

Physical environment

To troubleshoot and repair computer systems, you need a large, clean work surface and enough available power connections to power the equipment you are testing as well as your diagnostic tools. You should also have antistatic straps and mats for working on equipment, and a place nearby to organize your tools.

When doing remote repairs at a client site, you want your travel tools, an antistatic wrist strap, and a clean space to work, although the latter is sometimes difficult to achieve in a cluttered cube farm.

Audio and visual troubleshooting

After you conduct your interview to get a list of symptoms (as suggested in Book I, Chapter 2), you will often start the first stage of active troubleshooting by looking for audio and visual cues that could be the cause of the described symptoms. This can come in the form of listening to POST (Power-on Self-Test) errors or beep codes or an examination of the physical environment.

POST errors

Each BIOS manufacturer has its own diagnostic codes that identify specific errors. You need to consult documentation for the specific beep codes for your BIOS. Many motherboard manufacturers use codes similar to the original IBM POST codes, which are summarized in Table 2-2. If you get only one beep, all is good. In some cases, these beeps are also accompanied by a diagnostic code, which you also have to look up in the BIOS documentation.

Table 2-2 IBM POST Beep Codes

Beep Code                       Description
One short beep                  Normal POST; system is okay
Two short beeps                 POST error; error code shown onscreen
No beep                         Power supply or system board problem
Continuous beep                 Power supply, system board, or keyboard problem
Repeating short beeps           Power supply or system board problem
One long and one short beep     System board problem
One long and two short beeps    Display adapter problem (MDA, CGA)
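If you like to keep quick references in script form, the beep codes in Table 2-2 fit naturally into a lookup table. The following Python sketch is just one way to do that; the short pattern keys and the function name are my own labels, and your BIOS vendor's codes may differ from these IBM-style ones.

```python
# Beep patterns from Table 2-2, keyed by a short label for each pattern
IBM_BEEP_CODES = {
    "one short": "Normal POST; system is okay",
    "two short": "POST error; error code shown onscreen",
    "none": "Power supply or system board problem",
    "continuous": "Power supply, system board, or keyboard problem",
    "repeating short": "Power supply or system board problem",
    "one long, one short": "System board problem",
    "one long, two short": "Display adapter problem (MDA, CGA)",
}

def describe_beeps(pattern):
    """Look up a beep pattern; unknown patterns point back to the documentation."""
    key = pattern.strip().lower()
    return IBM_BEEP_CODES.get(key, "Unknown pattern; check the BIOS documentation")

print(describe_beeps("One long, two short"))  # Display adapter problem (MDA, CGA)
print(describe_beeps("three long"))           # Unknown pattern; check the BIOS documentation
```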

In addition to the beep codes, you may also be presented with a numeric code representing the error. You need to consult the documentation for the diagnostic codes for your BIOS, but the general breakdown of the code categories is as follows:

diamonds.jpg 100–199: Motherboard error

diamonds.jpg 200–299: Memory error

diamonds.jpg 300–399: Keyboard error

diamonds.jpg 600–699: Floppy drive error

diamonds.jpg 1400–1499: Printer error

diamonds.jpg 1700–1799: Hard drive error
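The category breakdown above amounts to a range lookup, which you could sketch in Python as follows. The ranges come straight from the list; the function name and the fallback message are assumptions, and codes outside the listed ranges still need the BIOS documentation.

```python
# (low, high, category) ranges taken from the list above
POST_CODE_RANGES = [
    (100, 199, "Motherboard error"),
    (200, 299, "Memory error"),
    (300, 399, "Keyboard error"),
    (600, 699, "Floppy drive error"),
    (1400, 1499, "Printer error"),
    (1700, 1799, "Hard drive error"),
]

def categorize_post_code(code):
    """Return the general category for a numeric POST code."""
    for low, high, description in POST_CODE_RANGES:
        if low <= code <= high:
            return description
    return "Unlisted code; check the BIOS documentation"

print(categorize_post_code(201))   # Memory error
print(categorize_post_code(1701))  # Hard drive error
print(categorize_post_code(450))   # Unlisted code; check the BIOS documentation
```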

Connectors and ventilation

As part of the visual troubleshooting, you will also likely check that intake/exhaust ports and vents are free of obstructions and that cables and connectors are securely attached. As part of the audio troubleshooting, you want to verify that you hear typical sounds coming from your computer, such as fans running properly and hard drives spinning. If the computer is too quiet, that may point you to a fan failure or another source of your problem.

Warning or indicator lights

Many devices inside of or attached to your computer will come with status or warning lights. Servers typically have a series of status lights letting you know the state of hard drives, power supplies, RAM, CPU, and temperature. Pay attention to these lights, as they will let you know whether issues exist with components or with the device itself. Some devices will have a limited number of lights or LEDs, and may use patterns to indicate the specific error condition; in these cases, you will need to consult the documentation for that device to identify what the issue is.

More sounds

You have already seen POST beep codes and noises made by fans as potential notifications of problems with a device. In addition to these sounds, pay attention to other noises that might come from your electronic devices. Hums, grinding sounds, high-pitched whines, or other unusual sounds from a device may indicate that a current or imminent problem exists with the device. These may be loud or quiet noises, and your ability to hear them may vary from other people’s ability. I often ask my wife, “How long has the car been making that noise?” to which I get a “What noise?” Some people are better at picking out abnormal sounds, and while you should not let examining every abnormal sound consume your life, you should at least do some cursory investigation, or acknowledge a deviation from normal.

Using your other senses

Other senses besides sight and hearing can be used to troubleshoot issues with computer systems. These include smell and touch. I will leave taste out as I don’t expect that you will want to chew on any of your computing equipment.

When it comes to smell, you may notice smoke or a burning smell coming from the electronics when running. This is typically not a good thing. Valid reasons may exist, such as replaced parts that generate friction burning off shipping oils; but this is more common in the automotive world, rather than in electronics. It is usually safe to err on the side of caution; if you witness smoke or an odor of burning, you have a problem. That problem is likely related to the locations that generate heat or friction; focus on those first.

Heat from electrical current, processing, or friction can generate smoke or a burning smell, but prior to that, you will have heat. All electronics tend to generate some level of heat when operating, and certain problems can cause excessive heat to be generated. Testing for excessive heat is another way to see where the problem may be. Thermometers and touch can be used to identify excessive heat. Many server rooms are equipped with thermometers that can be monitored, and operators can be sent alerts when the temperature is outside a normal range; small-scale heat identification can be used as well.
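The idea of alerting when a temperature leaves its normal range can be sketched in a few lines of Python. Everything specific here is hypothetical: the sensor names, the readings, and the 18 to 35 degree Celsius limits are placeholders, and real limits should come from the documentation for your equipment.

```python
def out_of_range(readings, low_c, high_c):
    """Return the sensors whose temperature (in Celsius) falls outside low_c..high_c."""
    return {name: temp for name, temp in readings.items()
            if not low_c <= temp <= high_c}

# Hypothetical readings and limits, for illustration only
readings = {"rack 1 intake": 22.5, "rack 2 intake": 24.0, "rack 2 exhaust": 41.0}
alerts = out_of_range(readings, low_c=18.0, high_c=35.0)

for name, temp in alerts.items():
    print(f"ALERT: {name} is at {temp} C")
```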

If you are servicing a computer, excessive dust and lint inside the computer can insulate components, preventing them from staying cool. This lint, like the lint that builds up on electric baseboard heaters over the summer, can cause a burning smell when heated. Some components that are nearing a failure condition will generate a hot spot, which will eventually overheat and fail; this condition is sometimes seen on stove elements right before failure. The key point to remember here is that unexpected heat is a concern, and that heat may cause smells that lead you to the source of the heating problem.

CMOS and BIOS

Modern CMOS settings are numerous and easy to access. If you have a computer that requires specific nondefault settings to operate correctly, you want to verify that default settings have not been loaded back onto the system. This can happen if certain jumpers have been moved on the motherboard, the CMOS battery fails, or firmware has been upgraded, and it is worth checking whenever you note a strange failure within the system. Improper changes to the CMOS settings (such as boot order, video memory, power options, disk detection, swapping floppy drive letters, and CPU timing) can render the computer unbootable until the settings are corrected.

In some cases, you will have a problem with a specific BIOS or firmware version, and the motherboard or device vendor documents a fix through a newer version of the BIOS or firmware. If this is the case, refer to the update procedures and follow them. If they give you an option of backing up the current version of the BIOS, perform that step as a recovery option.

Sometimes BIOS and firmware updates can cause more problems than they fix. In some cases, the update is one-way, and you won’t be able to revert to older versions. After some upgrades, custom settings need to be reapplied.

Power-saving features in the BIOS may allow you to reduce the speed of the CPU to save power and reduce heat. For some situations, this reduction is used as thermal protection when the heat sink fan has failed. If you’re looking at performance issues, this is one area to examine.

Motherboard

Motherboards are made up of many integrated components, and management of these components is done through CMOS settings or by changing jumper settings. You can find specific settings in your motherboard documentation. In addition, you can perform a visual inspection for damage, such as broken connections or capacitor damage, as shown in Figure 2-4. If you are looking for a failure in a component related to the motherboard, you should also verify that all relevant cables are connected to the motherboard correctly.

Figure 2-4: Some of these capacitors have damage. Note the residue on the top of the one that has leaked.

Processor/memory

Processor failures are rare but not unheard of. In some cases, if failure occurs immediately after installation, the issue may be related to seating of the processor. When the processor is added to the system, a cooling mechanism is added as well, such as a heat sink and a fan. When you install the cooling mechanism, you should use either a thermal pad or thermal gel to improve heat transfer between the processor and the heat sink. If fan or heat sink fan failure occurs, many processors reduce their speed to help reduce heat generation.

Memory errors are often identified during the POST process, when memory is tested. It should be noted that soft reboots (pressing Ctrl+Alt+Del) usually skip POST tests, so full power cycles (using the power button) should be conducted from time to time. Fan and ventilation problems can also result in memory overheating and generating failures. Other errors occur when technicians unknowingly mix different types of memory, such as ECC and non-ECC memory (covered in Book II, Chapter 3), or memory that runs at different speeds.

Floppy drive

Many issues can affect floppy drives, but the most common is head alignment. Back when these drives cost hundreds of dollars, head misalignment was corrected by having a technician manually realign the heads. These drives are rarely used today (USB flash drives have become much more popular), so replacing faulty floppy drives is typically more cost effective than repairing them.

Other floppy drive errors that can occur include damaged cables and connectors, improperly seated connectors for data or power, dirty heads, and failures in the media. Improper media selection can also be the culprit with some reported errors; there are still some single-density and double-density disks floating around with the high-density disks. Some users with older computers might have used high-density disks in their double-density drives, making them unreadable in high-density drives. Configuration settings in the CMOS (covered in Book II, Chapter 4) could also cause a floppy drive to not function.

Hard drives

When you’re troubleshooting hard drives, don’t overlook the physical problems. Many people dive straight to the software before checking all the hardware. If the drive is not detected or has data errors, you should check the connectors on the drive and motherboard and verify that the cable doesn’t show any signs of damage. Sometimes just replacing the drive cable does the trick.

See whether anything might have recently happened that could have caused a shock to the hard drive, such as dropping the computer. Other causes for errors and drive failure include heat problems, so verify drive placement, airflow, and heat dissipation for the drive. Other issues that can cause detection problems are the jumper settings for the drive. If the drive is set for Master, Slave, or Cable Select operation, you could have problems if the setup is wrong for your situation. In some cases, you might have multiple jumper configurations for each setting (as shown in Figure 2-5), and all should be tried if there is a problem. For complete information about hard drive configuration, read through Book II, Chapter 5.

If you encounter problems with disk size detection, verify LBA (Logical Block Addressing) mode, if available, or the use of drive-overlay programs. Drive-overlay programs are not used very much anymore because they add another layer of code to translate disk locations, which leads to slower access times, but they do let you use drives that are larger than what your system BIOS actually supports. Common disk size limits of some computer BIOS versions are summarized in Table 2-3. When possible, avoid drive-overlay programs because of the added complexity and reduced compatibility when working with that drive. Drive-mounted LEDs can be used to identify disk activity.

Figure 2-5: Some drives have multiple jumper configurations for the same setting.

Table 2-3 Common Disk Size Limits

Size                     Limit Reason
10MB                     Early PC/XT limit
16MB                     FAT12 limit
32MB                     DOS 3.x limit
128MB                    DOS 4.x limit
528MB                    Limit of CHS (cylinders, heads, sectors) mode, which has a limit of 1,024 cylinders; fixed by LBA or INT13H translation
2.1GB                    Limited by manufacturer-imposed LBA 4,095-cylinder limit or 22-bit LBA translation space
4.2GB                    CMOS extended CHS addressing limit of 8,191 cylinders (which was not widely used) or 23-bit LBA translation space
8.4GB                    BIOS-imposed INT13H 16,383-cylinder limit or 24-bit LBA translation space
33.8GB                   BIOS-imposed 26-bit LBA translation space
136.9GB                  Limit of ATA/66/100 BIOS 28-bit translation space
150,994,944GB (144PB)    Limit of UDMA/133, ATA/133 “Big Drives 2001 specification” drives 48-bit translation space
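To see where a few of these limits come from, you can work through the arithmetic yourself. The Python sketch below assumes 512-byte sectors and, for the CHS case, 16 heads and 63 sectors per track; those geometry values are not stated in the table, so treat them as assumptions. The results land close to the table's figures rather than exactly on them, because published limits vary with rounding and with the geometry each BIOS assumed.

```python
SECTOR_BYTES = 512  # assumed sector size

def chs_limit_bytes(cylinders, heads, sectors_per_track):
    """Largest disk addressable with the given CHS geometry."""
    return cylinders * heads * sectors_per_track * SECTOR_BYTES

def lba_limit_bytes(address_bits):
    """Largest disk addressable with an LBA address of the given width."""
    return (2 ** address_bits) * SECTOR_BYTES

# 1,024 cylinders x 16 heads x 63 sectors (assumed geometry): the 528MB CHS limit
print(chs_limit_bytes(1024, 16, 63))  # 528482304
# 28-bit LBA: roughly 137 billion bytes
print(lba_limit_bytes(28))            # 137438953472
# 48-bit LBA: roughly 144 quadrillion bytes (144PB)
print(lba_limit_bytes(48))            # 144115188075855872
```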

technicalstuff.eps The ATA/ATAPI-6, or “Big Drives” specification, was passed in 2001 and was incorporated into ATA/133 drives. This system was backward compatible with older drives and involved a change in the basic IDE circuitry. This also involved an introduction of the 80-wire, 40-pin ATA/133 cable. To ensure compatibility, the newer cables can be used for all drives.

On the software side, you can use chkdsk.exe or a similar tool on a rescue disk to test for file system and disk errors if the problem is with unreadable files or an inaccessible disk. If the problem is performance based, checking for disk errors is still a valid step, but you can also check for disk contention on the data bus and possibly move multiple drives to separate IDE/ATA channels or purchase an additional controller. Running Performance Monitor and monitoring disk counters will tell you whether the disk is being overused and is suffering from contention issues, which can be solved by moving some of the data files that are heavily accessed to an additional hard drive. To choose which data files to move to a new hard drive, look at the actual applications running on the drive.

Other issues affecting drive performance include the type of bus that is being used. To improve disk I/O performance, ATA/66 controllers and drives can be upgraded to ATA/100 or ATA/133 components; SATA 150 controllers and drives can be upgraded to SATA 300; and 5400-rpm drives can be upgraded with 7200-rpm, 10,000 rpm, or SSD (solid state disk) drives. When selecting a new drive, rotational speed, onboard cache, access time, and seek time should all be considered.
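One way to see why rotational speed matters is to estimate average rotational latency, which is the time the platter takes to make half a revolution. This formula is standard but does not appear in the text above, so take the following Python sketch as a side calculation rather than exam material. It does not apply to SSDs, which have no spinning platters.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: the time for half a revolution, in milliseconds."""
    return 0.5 * 60_000 / rpm

for rpm in (5400, 7200, 10_000):
    print(f"{rpm} rpm: {avg_rotational_latency_ms(rpm):.2f} ms")
# 5400 rpm: 5.56 ms
# 7200 rpm: 4.17 ms
# 10000 rpm: 3.00 ms
```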

CD/DVD-ROM

As with hard drives, the connectors, cables, power, and jumper settings affect how CD/DVD-ROM drives work. In addition, problems with optics and alignment can be a factor with optical drives, and a major shock can knock the drive mechanism out of alignment, requiring that you replace the drive. These drives also support analog audio through a cable that runs directly between your CD/DVD drive and your sound card, allowing audio CDs to be played without transferring the data through your ATA data cable. Verify the pin configuration of this analog audio cable at both ends.

Media problems can occur because of media storage or abuse. As I describe in the preceding chapter, some scratches may be repaired to a point that you can recover some or all data from damaged disks. Disk damage from glues, chemicals, or heat may be mostly or entirely unrecoverable.

Keyboard and mouse

Sometimes, keyboard problems are easy to fix. For example, the user may have enabled Scroll Lock or Num Lock, which can be verified by the keyboard’s LEDs. Most of the time, though, the problems are a result of damage from abuse or an accident, with the most common accident being spilled liquid. For the most part, these devices can be considered disposable because they cost so little to replace, but they still might require a few troubleshooting steps to identify where the problem lies.

If the issue involves liquids or debris, such as food stuck under the keys, you might be able to recover this device with little effort. For example, food and other debris can be blown out with compressed air. However, if the liquid contaminant contains suspended particles (such as sugar or salt) or is acidic, the keyboard is likely a write-off, especially if the liquid caused a short circuit or corrosion has set in. Cables may be damaged by keyboard trays, and both the cables and connectors should be checked for damage. For wireless devices, the batteries should be checked; failing batteries can cause a variety of usage problems.

If you have an urge to recover a keyboard with liquid or debris damage, you can soak it in deionized water, which is free of particles and is both nonconductive and pH neutral.

warning_bomb.eps If you soak the keyboard in deionized water, the keyboard should be hung to dry for 24 to 48 hours. Do not apply heat to it to dry it faster. Adding heat can cause even further damage to the keyboard.

You can disassemble some keyboard models by removing the keycaps and also the case screws on the bottom of the keyboard. When disassembling a keyboard, remember that it contains a lot of little pieces, as shown in Figure 2-6, and they all have to go back in the correct places. Some keyboards also have springs under each keycap. And in addition to all the small pieces for all the keys, don’t forget the circuit board, which can be cleaned with an isopropyl alcohol–based cleaner and cotton swabs or lint-free cloths.

If the dirt is just on the keycaps, remove the keycaps with a special removal tool that has two wires that fit under the key and pull the key directly up. After the keycaps are removed, clean them in a mild cleaning solution and replace them.

In general, keyboard repairs are not cost effective because of the time it takes to make repairs.

Tracking problems are the most common ones that occur with mice; optical mice are fussy about the gloss level of the surfaces that they are used on, and internal rollers of mechanical mice can build up a layer of dirt. For an overview of maintenance of mice, see Book IV, Chapter 1.

Figure 2-6: Keyboards can be taken apart and cleaned, but they do contain a lot of little components.

Sound card/audio

Sound card and audio problems are often related to the connections to the external or internal speakers. Some sound cards and motherboards have additional connections for digital audio using an S/PDIF (Sony/Philips Digital Interface Format) connection, and others support 5.1- and 6.1-speaker surround sound. When using these types of connections, you should carefully verify the connections against the manufacturer’s guide. When working with external speakers or surround sound systems, note that all of them require power to allow for amplification or separation of audio signals, so make sure that the power is plugged in.

Monitor/video

Some common and easy problems to fix with monitors involve user errors, such as misadjusting the picture with the knobs or buttons on the front of the monitor. There’s nothing quite as annoying as going to a user’s desk to look at a “broken” monitor only to find that the user has reduced the contrast and brightness to zero, leaving the image faint or nonexistent. In addition to contrast and brightness, monitors usually offer horizontal and vertical size and position, curvature, and keystone adjustments.

When monitor problems are not related to these settings, they might be related to power input or video cable problems (which are easy to correct) or to the internal power supply, electron gun, or picture tube. Because internal components such as the CRT hold high-voltage charges, repairs of the internal components should be left to a good repair shop.

Users report many monitor symptoms. Here are some of the most common:

diamonds.jpg Incorrect colors may be caused by front panel adjustments, bent or damaged pins on the video cable connectors (which can be easily fixed), or a damaged electron gun (which involves a trip to the repair shop or simply replacing the monitor).

diamonds.jpg Burnt or damaged pixels are irreparable, so after the number is sufficient to affect the person using the monitor, the monitor should be replaced or rotated to a user who is more tolerant of the missing pixels. I have had users extremely bothered by one burnt pixel in the top-left corner of their screen, but other users with a dozen random burnt pixels on their screen hardly even notice the dots are missing. In some cases, the damaged pixel will be visible only when displaying one of the primary colors.

diamonds.jpg Blurry images are usually an issue related to the monitor design and specification (such as dot pitch) or a damaged or misaligned electron gun. In the instance of poor monitor design, do not buy any more of that make or model of monitor. If you suspect a defect with the monitor, contact the manufacturer to see whether the monitor can be replaced or repaired.

diamonds.jpg Screen flicker might be related to the internal power supply but is more often related to trying to run the monitor above specifications for color depth, resolution, and refresh frequency. Or, the refresh frequency is just set too low. These are OS-level settings, so they can easily be resolved in the Display control panel.

diamonds.jpg Image skew or discoloration is often caused by electromagnetic interference (EMI) and can often be corrected by moving the monitor as little as a few inches. The interference is usually traced back to nearby power cables or other sources of EMI.

Video problems can also be caused by your video card, which might not support the resolution, color depth, and refresh settings that your monitor needs. If you have slow screen redraws, which can show up as slow screen refresh or screen flicker, the video card might not have enough processing power or video RAM. If your computer has an integrated video card, you might be able to allocate more RAM to be used as video RAM. The last resort for video card problems is to replace the video card.
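To see why video RAM can limit resolution and color depth, the memory needed to hold a single frame can be estimated as width times height times bits per pixel, divided by 8. The following is a simplified sketch with hypothetical function names; real cards need additional memory for double buffering, textures, and other data, so treat the result as a lower bound.

```python
# Sketch: minimum video RAM needed to hold one frame at a given mode.
# Function names are illustrative; this ignores double buffering and textures.


def framebuffer_bytes(width, height, bits_per_pixel):
    """Return the bytes needed to store one frame."""
    return width * height * bits_per_pixel // 8


def fits_in_video_ram(width, height, bits_per_pixel, video_ram_mb):
    """Return True if one frame fits in the given amount of video RAM."""
    return framebuffer_bytes(width, height, bits_per_pixel) <= video_ram_mb * 1024 * 1024


if __name__ == "__main__":
    needed = framebuffer_bytes(1024, 768, 32)
    print(f"1024x768 at 32-bit color needs {needed:,} bytes per frame")
    print("Fits in 2MB:", fits_in_video_ram(1024, 768, 32, 2))
    print("Fits in 2MB at 16-bit color:", fits_in_video_ram(1024, 768, 16, 2))
```

If a mode does not fit, lowering the color depth or the resolution is usually the first thing to try before replacing the card.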

fortheexam.eps All video problems that are reported by a user will have to do with the video card or the monitor or both, and will likely have to do with trying to get more out of the system than it is capable of.

Modem

With the proliferation of broadband Internet connections and companies being connected to the Internet, modem use has fallen off, but modems are still used for special purposes, such as pager modems. As with keyboards and other components, the low cost of low-end modems has made the repair/replace evaluation lean toward replacement. Common modem problems include driver issues, line issues, and speed issues.

For several years, software modems (Winmodems) were popular because they had become so inexpensive compared with the old hardware modems. The migration from hardware modems to OS-level software modems meant that if the software drivers were not installed correctly, the modem would not work properly.

Phone line issues occur with your phone company’s network and can often be identified by testing the phone line with another device or by using modem or OS diagnostic features (found in the modem’s properties). A common line issue is with call-waiting, when the tone interrupts the modem’s signal. You can disable call-waiting per call by adding a disable command to the beginning of the number you are dialing. Speed issues are often caused by improper configuration or by using both hardware and software compression (which can greatly reduce transfer speed).
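As an illustration of the per-call disable command, here is a small sketch that builds a dial string. It assumes the *70 prefix, which is common in North America but varies by phone company, and the comma that Hayes-compatible modems treat as a short pause. The function name and the sample number are hypothetical.

```python
# Sketch: prepend a call-waiting disable code to a number before dialing.
# "*70" is an assumption (common in North America); check with your carrier.


def dial_string(number, disable_call_waiting=False, prefix="*70", pause=","):
    """Return the string to dial, optionally disabling call-waiting first."""
    digits = number.strip()
    if disable_call_waiting:
        # The pause gives the phone network time to accept the code.
        return f"{prefix}{pause}{digits}"
    return digits


if __name__ == "__main__":
    print(dial_string("555-0100", disable_call_waiting=True))
```

The same prefix can usually be entered in the dialing properties of the OS so that it is applied to every call placed by the modem.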

Serial and parallel ports

Serial and parallel port problems can be caused by connectors, OS settings, and CMOS settings. Here are a few troubleshooting tips if you’re having a problem with one of these ports:

diamonds.jpg Check connectors for pin damage and check cables for damage at the ends or along the cables themselves. Often cables running along the floor get run over by users’ office chairs.

diamonds.jpg Watch cable length. Both serial and parallel specifications have maximum limits to the overall cable length, and exceeding these limits can cause intermittent communication problems with devices. Parallel devices may support a standard Centronics interface (cable length limit of 15 feet [5 meters]), or they may support IEEE 1284 (cable length limit of 30 feet [10 meters]). Serial (RS-232) cables, on the other hand, have a length limit of 50 feet (15 meters) when communicating at 19,200 bps, but that limit increases to 3,000 feet (914 meters) if you communicate at only 2,400 bps.

diamonds.jpg Serial and/or parallel ports might be disabled in the Device Manager. Right-click My Computer, choose Properties, click the Device Manager button on the Hardware tab of the System Properties dialog box, and make sure the ports are enabled. If you are using Windows Vista, then right-click Computer in the Start menu, choose Properties, and click the Device Manager link on the left pane of the System screen. There’s nothing worse than spending a few hours trying to install a printer only to find out that the parallel port was disabled.

In addition to being disabled at the OS level, the port may be disabled in the CMOS. Or, in the case of parallel ports, you may be using a standard port when you need to support the bidirectional features of an ECP or EPP port. You can switch between SPP, ECP, and EPP standards for your printer port by changing your system CMOS settings.

diamonds.jpg Verify the connection speed and data characteristics for both serial ports and modems, and verify that they match those of the device you are communicating with. These are specified by speed, data bits, parity, stop bits, and flow control.

• The speed is measured in bps (bits per second) and ranges from 110 bps to 115,200 bps.

technicalstuff.eps Speed used to be measured as baud rate (named after Jean-Maurice-Émile Baudot), which is a modulation rate, or the number of state changes per second. Baud used to match up with bps (bits per second). With current modulation, encoding, and compression techniques, the bps rate is substantially higher than the phone line’s baud rate (2,400 baud). Many people incorrectly use the term baud when referring to bps.

• Data bits are either 7 bits or 8 bits.

• Parity may be set to odd or even and is used to verify the contents of each unit of data. The parity bit is set to 0 or 1 so that the total number of 1 bits is odd or even, depending on the parity setting. With 7 data bits, the parity bit occupies the eighth bit of the byte.

• Stop bits are sent after the data to signify that the data byte has finished. Stop bits will be 1 or 2 bits long.

• Flow control manages the rate of data transfer between the devices, and it allows for transfer at higher rates. Without flow control, the sender can send data only at a rate that guarantees that the receiver can process the data that it is receiving. When using flow control, if the receiver is about to be overwhelmed or needs a pause to clear a backlog of processing, it can use the agreed system to signal a pause or to request more data.
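The serial settings above can be made concrete with a short sketch. It computes the parity bit for a character and shows how the framing bits reduce the number of characters actually sent per second. The function names are illustrative, and two details are assumptions not stated above: the single start bit that asynchronous serial links send before each character, and a "none" parity option.

```python
# Sketch: parity and framing overhead on an asynchronous serial link.
# Assumes one start bit per character; names are illustrative.


def parity_bit(value, data_bits=7, mode="even"):
    """Return the parity bit (0 or 1) for the low data_bits of value."""
    mask = (1 << data_bits) - 1
    ones = bin(value & mask).count("1")
    if mode == "even":
        return ones % 2          # makes the total count of 1 bits even
    if mode == "odd":
        return (ones + 1) % 2    # makes the total count of 1 bits odd
    raise ValueError("mode must be 'even' or 'odd'")


def frame_bits(data_bits=8, parity="none", stop_bits=1):
    """Return the bits on the wire per character, including the start bit."""
    parity_bits = 0 if parity == "none" else 1
    return 1 + data_bits + parity_bits + stop_bits


def chars_per_second(bps, data_bits=8, parity="none", stop_bits=1):
    """Return how many characters per second fit at the given bit rate."""
    return bps / frame_bits(data_bits, parity, stop_bits)


if __name__ == "__main__":
    print("Even parity bit for 'A':", parity_bit(ord("A"), 7, "even"))
    print("Bits per character at 8-N-1:", frame_bits(8, "none", 1))
    print("Characters per second at 115,200 bps:", chars_per_second(115200))
```

If the two ends disagree on any of these settings, the receiver frames the bits differently from the sender, which typically shows up as garbled characters or parity errors.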

USB

Some common problems that you can encounter with USB include BIOS support, OS support, driver issues, version incompatibilities, and power requirements for devices. Here are some USB troubleshooting tips:

diamonds.jpg As with all the peripherals that I have covered, check all cables and connectors to ensure that they are properly connected and that the cables are not showing signs of damage or exceeding distance limitations for the technology.

diamonds.jpg Not every system BIOS supports USB, or supports it adequately, so if you are encountering problems, check with the motherboard manufacturer for a firmware update that addresses your issue.

diamonds.jpg Just like serial and parallel ports, USB ports can be disabled in the CMOS or through the OS, and this should be checked if the ports don’t seem to be working.

diamonds.jpg If the USB ports are enabled but are not working, the issue could be hardware related. You can look at doing a repair; or, in the case of a desktop computer, adding a separate USB controller via PCI or PCI Express.

diamonds.jpg If the devices are detected but are not functioning, you might want to ensure that correct drivers are available to the OS.

diamonds.jpg Although most devices are backward compatible, you might encounter performance or other errors when mixing USB 3.0, 2.0, and 1.x devices and controllers.

remember.eps USB 3.0 or USB 2.0 devices plugged into USB 1.x ports will perform slower because of the restrictions of the USB 1.x specification. If devices are not performing at the expected level, check for this situation. Plug each device into a port of the matching USB version to get optimal performance from the device.

diamonds.jpg Finally, because most USB devices are powered from the bus, some devices may have power-related problems. Although the USB specification provides up to 500mA of power to a single device, some USB controllers enforce the specification’s low-power-mode startup at 100mA and then expect the device to request the necessary amount of power in 100mA increments.

If the device is dumb, such as a USB lamp, it does not have circuitry to communicate with the USB controller. Because it cannot communicate with the controller, it will take whatever power it can get, so it might not get the power that it requires to function properly. Other devices, like some USB hard drives, require up to 1A of power, and this can be supplied by special double-connector USB cables or cables to draw power from other USB ports. Figure 2-7 shows the back of a USB hard drive, with a thick, silver USB cable and a thin, black power cable to be used for additional power. The connector on the end of the power cable is a pass-through connector, which would allow for another device to share the port that is being used to supply power to this drive. This sort of cable is required because some controllers will not allow devices to exceed the USB-specified power per port.
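The power behavior described above can be modeled with a simplified sketch: a strict controller starts every device at 100mA and grants more only in 100mA steps, up to 500mA, when the device asks for it. The constants come from the figures above; the function names and the model itself are illustrative assumptions, not how any particular controller is implemented.

```python
# Sketch: simplified model of per-port USB power on a strict controller.
# Names are illustrative; real negotiation happens during device enumeration.

LOW_POWER_MA = 100   # start-up allowance
MAX_PORT_MA = 500    # per-port ceiling
STEP_MA = 100        # power is requested in 100mA increments


def granted_current_ma(required_ma, negotiates=True):
    """Return the current the port supplies for a device needing required_ma."""
    if not negotiates:
        # A dumb device never asks for more than the start-up allowance.
        return LOW_POWER_MA
    steps = -(-required_ma // STEP_MA)  # round up to the next 100mA step
    return min(max(steps * STEP_MA, LOW_POWER_MA), MAX_PORT_MA)


def is_adequately_powered(required_ma, negotiates=True):
    """Return True if the granted current covers what the device needs."""
    return granted_current_ma(required_ma, negotiates) >= required_ma


if __name__ == "__main__":
    print("250mA device is granted:", granted_current_ma(250), "mA")
    print("1A drive on one port OK?", is_adequately_powered(1000))
    print("250mA lamp that cannot negotiate OK?", is_adequately_powered(250, negotiates=False))
```

In this model, a 1A drive cannot be satisfied by a single port, which is why the double-connector cable draws from a second port.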

All the troubleshooting items for USB devices can also be applied to FireWire devices.

Figure 2-7: Here is a device powered by two USB ports, using a pass-through connector for the second port.

Power supplies and batteries

When troubleshooting power supplies, you first need to do the following two things:

diamonds.jpg Country: Make sure the power supply is configured for use in the country where the computer operates. Different countries have different standards when it comes to supplying power, and the computer’s power source needs to match that.

diamonds.jpg Flow: Make sure the power supply is actually getting power. Checking this is easy enough — just see whether the fan is running.

In addition, ensure that all power supply connectors are correctly attached to devices, including the motherboard. You can test the power coming from the wall receptacle with

diamonds.jpg An inexpensive receptacle tester: Recommended

diamonds.jpg A noncontact voltage indicator: Recommended

diamonds.jpg A multimeter: Not recommended because of the risk involved at the voltage level being tested

remember.eps Power cords can be tested for continuity by using a continuity tester or a multimeter.

Testing power supplies

Power supplies can be tested with inexpensive testers, as shown in Figure 2-8. The tester will ensure that the power supply works and provides the proper voltages on each pin of the connectors. The tester in Figure 2-8 supports 20/24-pin power, 12V, peripheral, floppy drive, and SATA connectors. If you suspect a problem with just one of the power connectors or one of the lines leading from the power supply, you can use a multimeter to test the connector, as described later in this section.

Figure 2-8: A power supply tester can save time when testing voltage supplied by power supplies.

In some cases, your power supply might be functioning correctly, but the problem might be that you are attempting to power too many devices, exceeding the total output of the power supply. Power supply testers cannot verify that your total draw does not exceed the amount supplied by the power supply; for that, you need to add up the power usage by hand. For information regarding the draw by devices, see Book II, Chapter 7. If you suspect that you are exceeding the power output of the power supply or the device bus, you can disconnect or unplug some of the devices and see whether the system returns to normal operation (with the exception of the missing devices).
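Adding up the power usage by hand amounts to a simple sum compared against the rating of the power supply. Here is a minimal sketch; the device names and wattages are made-up placeholders for illustration, not measured or published figures, and the function names are hypothetical.

```python
# Sketch: compare the total device draw against the power supply rating.
# All wattages below are placeholder values for illustration only.


def total_draw_watts(devices):
    """Return the combined draw of all devices in watts."""
    return sum(devices.values())


def headroom_watts(psu_watts, devices):
    """Return the spare capacity; a negative value means the supply is overloaded."""
    return psu_watts - total_draw_watts(devices)


if __name__ == "__main__":
    example_devices = {
        "motherboard and CPU": 150,
        "video card": 120,
        "hard drives (2)": 20,
        "optical drive": 25,
        "fans": 10,
    }
    print("Total draw:", total_draw_watts(example_devices), "W")
    print("Headroom on a 300W supply:", headroom_watts(300, example_devices), "W")
```

A negative headroom suggests unplugging devices to confirm the diagnosis, or moving to a higher-rated power supply.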

I was recently troubleshooting a system with three out of five PCIe device slots occupied, and a Host Bus Adapter (HBA). We had sporadic failures on the HBA connecting to the SAN, and had the same issue on five identical servers. It turned out that one of the cards on the server was drawing four times the power of a typical PCIe card; the HBA was the last card on the bus and was being starved for power. Upon removing the high-draw card, all other devices started working properly. Power issues can be tricky to locate and resolve. In this case, the problem was not the power supply, but rather the power limits of the PCIe bus.

Unexpected shutdowns, continuous reboots, and no power

If the power supply is not functioning or does not have enough juice to power all the components of your system, replace it. Other issues that are often a result of defective power supplies include

diamonds.jpg Power-up or system startup failures or lockups

diamonds.jpg Intermittent rebooting or lockups during normal operation

diamonds.jpg Intermittent memory errors

diamonds.jpg Intermittent device failures

diamonds.jpg Continual rebooting

diamonds.jpg Hard disk and fan simultaneously failing to run because of a shortage of current

diamonds.jpg Thermal failures and overheating because of fan failure

diamonds.jpg Electric shocks received when touching system case or connectors

diamonds.jpg System operations halted or rebooting from static discharges

Because ATX power supplies use a soft power switch, they need to be connected to a motherboard to be turned on. So, to test the main power connector — the one that connects to a motherboard — you need to

diamonds.jpg Have the power-on pin shorted: Not recommended

diamonds.jpg Use a power supply tester: Recommended and preferred; refer to Figure 2-8

diamonds.jpg Back-probe the connector when the computer is running: Not preferred, but acceptable

Most Molex-type power connectors can be back-probed, which is done by having your multimeter’s black probe connected to ground and your multimeter’s red probe inserted into the back of the connector (which, in most situations, has sufficient space to allow the probe; see Figure 2-9). In this figure, the black probe is grounded, and the red probe is testing the 12V lead on the Molex peripheral connector. The reading is valid because it is within ±5% of the nominal voltage, which is the tolerance that the specification allows for power connectors. There is a minimal risk of damaging equipment if this procedure is done correctly, and it allows you to see on the live system what the power issues might be. Read Book II, Chapter 7 to see what the appropriate voltages are for each pin.
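The 5% check on a multimeter reading can be expressed as a short sketch. The function name and the sample readings are illustrative; the nominal values are the 12V and 5V levels that internal devices are expected to receive.

```python
# Sketch: decide whether a measured voltage is within 5% of its nominal value.
# Sample readings are made up for illustration.


def within_tolerance(measured_volts, nominal_volts, tolerance=0.05):
    """Return True if the reading is within the given fraction of nominal."""
    allowed = abs(nominal_volts) * tolerance
    return abs(measured_volts - nominal_volts) <= allowed


if __name__ == "__main__":
    for measured, nominal in ((11.8, 12.0), (11.3, 12.0), (5.2, 5.0)):
        verdict = "OK" if within_tolerance(measured, nominal) else "out of tolerance"
        print(f"{measured}V on the {nominal}V lead: {verdict}")
```

For a 12V lead, this allows readings from about 11.4V to 12.6V; a reading outside that window points to the power supply or to an overloaded line.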

Figure 2-9: Back-probing power connectors is a standard method for testing power in systems.

fortheexam.eps Power supply problems can result in problems with other systems. Check the question for keywords that suggest an issue with the power supply, in addition to the other device(s) mentioned.

Getting an A+

In this chapter, you examine

diamonds.jpg Using a multimeter to test power supplies

diamonds.jpg Using antistatic mats and straps to reduce the chance of ESD damage

diamonds.jpg Third-party disks and tools that are usually helpful when troubleshooting

diamonds.jpg How each major component inside the computer system has unique troubleshooting steps

Prep Test

1 What is not typically measured by a multimeter?

A checkbox.jpg Current

B checkbox.jpg Capacity

C checkbox.jpg Voltage

D checkbox.jpg Continuity

2 What is the purpose of a network card loopback plug?

A checkbox.jpg To return signals from your network card back to your computer to verify accuracy and pin configuration of network cable

B checkbox.jpg To allow use of 127.0.0.1 TCP/IP address range

C checkbox.jpg To eliminate network error messages from the operating system

D checkbox.jpg To capture network traffic for future analysis

3 Which issue doesn’t indicate a possibly failing power supply?

A checkbox.jpg CPU thermal speed reduction

B checkbox.jpg Power-up failure

C checkbox.jpg System halt

D checkbox.jpg Internal devices receiving 12V and 5V

4 Why would you use an extension magnet?

A checkbox.jpg To erase magnetic media in a drive

B checkbox.jpg To suspend floppy drives inside metal cases

C checkbox.jpg To pick up dropped metal objects

D checkbox.jpg To remove Molex filings from inside of the connector

5 What components can be tested by BIOS self-tests? (Select all that apply.)

A checkbox.jpg Hard drives

B checkbox.jpg CPU temperature deviations

C checkbox.jpg Magnetic media

D checkbox.jpg Memory

6 How does the system BIOS report major startup errors or configuration issues?

A checkbox.jpg Screen flashes

B checkbox.jpg Beep errors

C checkbox.jpg Error message dialog boxes

D checkbox.jpg Audio interruptions

7 What can be the cause of reduced CPU performance?

A checkbox.jpg Overuse of CPU cycles

B checkbox.jpg Addition of RAM

C checkbox.jpg Heat sink fan failure attributable to power supply failure

D checkbox.jpg Removal of the J7 motherboard jumper as defined by the ATX 1.2 standard

8 What is the purpose of a security bit?

A checkbox.jpg To reduce the chance of users opening system cases

B checkbox.jpg To secure TCP/IP data packets

C checkbox.jpg Eight of them create a security byte

D checkbox.jpg To allow for tracking of sensitive data

Answers

1 B. Capacity is not measured by a multimeter. Voltage, current, continuity, and resistance are commonly measured with a multimeter. See “Multimeter.”

2 C. Network loopback plugs are used to reduce error messages related to unplugged network cables. Review “Miscellaneous tools.”

3 D. Power levels for internal devices are supposed to be 12V and 5V. Peruse “Power supplies and batteries.”

4 C. Extension magnets are used to pick up dropped metal items. Take a look at “Miscellaneous tools.”

5 A, D. The system BIOS on some systems allows for testing of IRQs, memory, hard disks, CPU processing ability, and most internal components. Peek at “CMOS and BIOS.”

6 B. Beep errors are the standard way that major system startup failures are reported to the user. Look over “POST errors.”

7 C. Reduced CPU performance is often caused by the temperature of the CPU exceeding limits, at which point the CPU reduces its own power to bring the temperature down. Excess temperatures usually occur when the heat sink fan fails. Study “Processor/memory.”

8 A. Security bits are paired with security screws to prevent users from opening system cases. Refer to “Miscellaneous tools.”
