Appendix 1

Understanding basic equipment specifications

The performance of an audio system may be measured to determine its effect on a sound signal passed through it, and it may also be assessed subjectively (in other words, by listening to it). Theoretically, if a system introduces an audible modification to the sound signal then one should also be able to measure it, provided that the right test can be devised and suitable equipment is available. The difficulty of achieving this ideal keeps increasing, however, as the audible differences between systems become ever smaller and the absolute fidelity of recording and reproduction improves. Digital recording and processing have brought with them the possibility of recording and transmission with zero degradation, and there is still considerable debate about just what can be heard and what cannot. The following sections are not intended to enter the debates which always rage in hi-fi circles concerning minute differences in sound quality between systems (which do of course exist, but are often explained badly). Rather, they aim to give the reader an insight into the most commonly encountered system specifications and what they mean, and to describe the audible effects of different distortions on sound signals.

Frequency response – technical

The most commonly quoted specification for a piece of audio equipment is its frequency response. It is a parameter which describes the frequency range covered by the device – that is, the range of frequencies which it can record or reproduce. To take a simple view, for high-quality reproduction the device would normally be expected to cover the whole audio-frequency range, which was defined earlier as being from 20 Hz to 20 kHz, although some have argued that a response which extends above the human hearing range has audible benefits. It is not enough, though, simply to consider the range of frequencies reproduced, since this says nothing about the relative levels of different frequencies or the amplitude of signals at the extremes of the range. If further qualification was not given then a frequency response specification of 20 Hz–20 kHz could mean virtually anything. It is important to compare devices’ specifications on the same grounds, since otherwise little useful information can be gained.

The ideal frequency response is one which is ‘flat’ – that is, with all frequencies treated equally and none amplified more than others. Technically, this means that the gain of the system should be the same at all frequencies, and this could be verified by plotting the amplitude of the output signal on a graph, over the given frequency range, assuming a constant-level input signal. An example of this is shown in Figure A1.1(a), and it will be seen that the graph of output level versus frequency is a straight horizontal line between the limits of 20 Hz and 20 kHz – that is, a flat frequency response. Also shown in Figure A1.1 are examples of non-flat responses, and it will be seen that these boost some frequencies and cut others, affecting the balance between different parts of the sound spectrum (the audible effects of which are discussed in Fact file A1.1). Typically, frequency response is quoted with reference to the response at 1 kHz. This means that the output level at 1 kHz is chosen as the level against which all other frequencies are compared, and would be given a relative level of 0 dB for this purpose. If the response at 5 kHz was said to be +3 dB ref. 1 kHz, this would mean that signals of 5 kHz would be amplified 3 dB more than signals at 1 kHz.


Figure A1.1   (a) Plot of a flat frequency response from 20 Hz to 20 kHz. (b) Examples of two non-flat frequency responses

Although a graph gives the most detail about frequency response, since it shows what happens at every point in the range, often only figures are given in specifications. It is common to quote frequency response as the upper and lower limits of the frequency range handled by the device, giving two frequencies at which the response is ‘3 dB down’ – in other words, where the response is 3 dB lower than the response at 1 kHz. It is implied in such a case that the response between these points is moderately flat, although this cannot be taken for granted in practice. Thus a response of 45 Hz–17 kHz (−3 dB) suggests that the device handles a frequency range between 45 and 17 000 Hz, at which extremes the response is 3 dB down. Below 45 Hz and above 17 000 Hz one would expect the response to fall off even further.

A more accurate way of specifying frequency response in figures, and one which leaves less room for misinterpretation, is to state a tolerance for allowed variations in level over the specified range. Thus a specification of 45 Hz–17 kHz (±3 dB ref. 1 kHz) states that the response at any frequency between the limits will not deviate from the response at 1 kHz by more than 3 dB upwards or downwards.
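The idea of quoting a response relative to 1 kHz, and of checking it against a stated tolerance, can be illustrated with a short sketch. The measured voltages below are hypothetical values invented for illustration, and the function names are not taken from any standard. The sketch assumes that levels are voltages, so that a ratio is converted to decibels using 20 log10.

```python
import math

def relative_response_db(levels, ref_freq=1000.0):
    """Return {frequency: level in dB relative to the level at ref_freq}."""
    ref = levels[ref_freq]
    return {f: 20.0 * math.log10(v / ref) for f, v in levels.items()}

def within_tolerance(response_db, f_low, f_high, tol_db):
    """True if every measured point between f_low and f_high lies within +/- tol_db."""
    return all(abs(db) <= tol_db
               for f, db in response_db.items()
               if f_low <= f <= f_high)

# Hypothetical output voltages (volts RMS) measured with a constant-level input.
measured = {45.0: 0.71, 100.0: 0.95, 1000.0: 1.00, 5000.0: 1.41, 17000.0: 0.71}

response = relative_response_db(measured)
for freq in sorted(response):
    print(f"{freq:>8.0f} Hz  {response[freq]:+.1f} dB ref. 1 kHz")

# Check the hypothetical device against a 45 Hz-17 kHz (+/-3 dB ref. 1 kHz) specification.
print(within_tolerance(response, 45.0, 17000.0, 3.0))
```

With these invented figures the 5 kHz point comes out close to +3 dB ref. 1 kHz and the two extremes close to 3 dB down, so the device would just meet a ±3 dB specification but not a ±1 dB one.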

Frequency response – practical examples

Some practical examples may help to illustrate the above discussion of frequency response, and a table of typical specifications for a selection of devices is given in Table A1.1, so that they can be compared.

Firstly, purely electronic devices tend to have a flatter response than devices which involve a recording or reproduction process, since the latter usually incorporate mechanical, magnetic or optical processes which are more prone to distortions of all kinds. An amplifier is an example of the former case, and it is unusual to find a well-designed amplifier which does not have a flat frequency response these days – flat often to within a fraction of a decibel from 5 Hz up to perhaps 60 kHz. In the other category, though, there are LP turntables, tape recorders, loudspeakers and microphones, to name but a few, and these are all much more difficult to design for a flat response.

Table A1.1   Examples of typical frequency responses of audio systems

| Device | Typical frequency response |
|---|---|
| Telephone system | 300 Hz–3 kHz |
| AM radio | 50 Hz–6 kHz |
| Consumer cassette machine | 40 Hz–15 kHz (±3 dB) |
| Professional analogue tape recorder | 30 Hz–25 kHz (±1 dB) |
| CD player | 20 Hz–20 kHz (±0.5 dB) |
| Good-quality small loudspeaker | 60 Hz–20 kHz (−6 dB) |
| Good-quality large loudspeaker | 35 Hz–20 kHz (−6 dB) |
| Good-quality power amplifier | 6 Hz–60 kHz (±3 dB) |
| Good-quality omni microphone | 20 Hz–20 kHz (±3 dB) |

Fact file A1.1   Frequency response – subjective

Subjectively, deviations from a flat frequency response will affect sound quality. If the aim is to carry through the original signal without modifying it, then a flat response will ensure that the original amplitude relationships between different parts of the frequency spectrum are not changed. If, say, low frequencies are boosted with respect to high frequencies, then the original sound will be modified, making it sound more bass heavy. It is important not to be side-tracked by the fact that the human ear’s frequency response is not flat (see Chapter 2), since this fact has no bearing on the need for a flat response in audio equipment. In audio equipment the important factor is that sounds come out of a system as they went in.

Some forms of modification to the ideal flat response are more acceptable than others. For example, a gentle roll-off at the high-frequency (HF) end of the range often goes unnoticed, since there is not much sound energy at this point. Domestic cassette machines and FM radio receivers, for example, tend to have upper limits of around 15 kHz, but are relatively flat below this, and thus do not sound unpleasant. Frequency responses which deviate wildly from ‘flat’ over the audio-frequency range, on the other hand, sound much worse, even if the overall range of reproduction is wider than that of FM radio.

If the frequency response of a system rises at high frequencies then the sibilant components of the sound will be emphasised, music will sound very ‘bright’ and ‘scratchy’, and any background hiss will be emphasised. If the response is down at high frequencies then the sound will become dull and muffled, and any background hiss may appear to be reduced. If the frequency response rises at low frequencies then the sound will be more ‘boomy’, and bass notes will be emphasised. If low frequencies are missing, the sound will be very ‘thin’ and ‘tinny’. A rise in the middle frequency range will result in a somewhat ‘nasal’ sound, perhaps having a rather harsh quality, depending on the exact frequency range concerned.

Concerning the effects of very low and very high frequencies, close to the limits of human hearing, it can be shown that the reproduction of sounds below 20 Hz does sometimes offer an improved listening experience, since it can cause realistic vibrations of the surroundings. Also, the ear’s frequency response does not cut off suddenly at the extremes, but gradually decreases, and thus it is not true that one hears nothing below 20 Hz and above 20 kHz – one simply hears much less. Similarly, extended HF responses can sometimes help sound quality, and a gentle HF roll-off usually implies less steep filtering of the signal which in turn may result in improved quality.

Devices which convert sound into electricity or vice versa (in other words, transducers) are most prone to frequency response errors, and some loudspeakers have been known to exhibit deviations of some 10 dB or more from ‘flat’. Since such devices are also affected by the acoustics of their surroundings it is difficult to divorce a discussion of their own response from a discussion of the way in which they interact with their surroundings. The room in which a loudspeaker is placed has a significant effect on the perceived response, since the room will resonate at certain frequencies, creating pressure peaks and troughs throughout the room. Depending on the location of the listener, some frequencies may be emphasised more than others, and this therefore makes it difficult to say what is the fault of the room and what is the fault of the speaker. A loudspeaker’s response can be measured in so-called ‘anechoic’ conditions, where the room is totally absorbent and cannot produce significant effects of its own, although other methods now exist which do not require the use of such a room. A good loudspeaker will have a response which covers the majority of the audio-frequency range, with a tolerance of perhaps ± 3 dB, but the LF end is less easy to extend than the HF end. Smaller loudspeakers will only go down to perhaps 50 or 60 Hz.

Analogue magnetic tape recorders use a number of equalisation processes to ensure that the frequency response is as flat as possible, but this can usually only be achieved with one of a few specified tape types and formulations. An unsuitable tape may result in a non-flat response, unless the machine can be realigned for the new tape. Cassette machines have a number of different tape-type settings to ensure that the frequency response and other parameters are optimum for each tape formulation. The frequency response of an analogue tape machine is likely to vary with recording level, since at higher recording levels the high frequencies become ‘compressed’ because the tape is incapable of retaining them. For this reason the response of such a machine is usually quoted at a relatively low recording level, perhaps 20 dB below reference level (see ‘Magnetic recording levels’, Chapter 6).

LP records are equalised before they are cut, intentionally to give them a non-flat response, but this is re-equalised before reproduction to restore the correct frequency balance. The reason for this is explained in ‘RIAA equalisation’, Appendix 2. If the RIAA equaliser in the amplifier which reproduces the recording is not properly designed then it will not restore the correct frequency balance, and this can occur in cheap hi-fi equipment.

Microphones vary enormously in their characteristics, and their frequency response depends a lot on their polar pattern and design (see Chapter 3). Cheap consumer microphones may have a response which only extends up to 10 or 12 kHz, whereas professional mics may cover a range at least up to 20 kHz. The LF end of the spectrum is equally variable, with omnidirectional microphones having a much more extended LF response than other pickup patterns. A microphone’s response often varies with the angle of incidence of the source.

Harmonic distortion – technical

Harmonic distortion is another common parameter used in the specifications of audio systems. Such distortion is the result of so-called ‘non-linearity’ within a device – in other words, what comes out of the device is not exactly what went in. There are a number of types of non-linearity, but here it is the type which affects the shape of the waveform that is referred to. In Chapter 1 it was shown that only simple sinusoidal waveforms are completely ‘pure’, consisting only of one frequency without harmonics. More complex repetitive waveforms can be analysed into a set of harmonic components based on the fundamental frequency of the wave. Harmonic distortion in audio equipment arises when the shape of the sound waveform is changed slightly between input and output, such that harmonics are introduced into the signal which were not originally present, thus modifying the sound to some extent (see Figure A1.2). It is virtually impossible to avoid a small amount of harmonic distortion, since no device carries through a signal entirely unmodified, but it can be reduced to extremely low levels in amplifiers.


Figure A1.2   A sine wave input signal is subject to harmonic distortion in the device under test. The waveform at the output is a different shape to that at the input, and its equivalent line spectrum contains components at harmonics of the original sine-wave frequency

Harmonic distortion is normally quoted as a percentage of the signal which caused it (e.g.: THD 0.1 per cent @ 1 kHz), but, as with frequency response, it is important to be specific about what type of harmonic distortion is being quoted, and under what conditions. One should distinguish between third-harmonic distortion and total harmonic distortion, and unfortunately both can be abbreviated to ‘THD’ (although THD most often refers to total harmonic distortion). Total harmonic distortion is the sum of the contributions from all the harmonic components introduced by the device, assuming that the original wave has been filtered out, and is normally measured by introducing a 1 kHz sine wave into the device and measuring the resulting distortion at a recognised input level. The level and frequency of the sine wave used depend very much on the type of device and the test standard used. Third-harmonic distortion is a measurement of the amplitude of the third harmonic of the input frequency only, and is commonly found in tape recorder tests since the third harmonic is the most prominent in magnetic recording systems.

It is important to be specific about the level and frequency at which the distortion specification is made, since in many audio devices distortion varies enormously with these parameters.
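The difference between total and third-harmonic distortion can be shown with a small calculation. The sketch below assumes that the harmonic contributions are combined as a root-sum-of-squares of their RMS amplitudes, which is a common convention but is not stated explicitly in the text above. The amplitudes are invented for illustration.

```python
import math

def thd_percent(fundamental, harmonics):
    """Total harmonic distortion as a percentage of the fundamental.

    harmonics holds the RMS amplitudes of the 2nd, 3rd, ... harmonics,
    combined here by root-sum-of-squares (an assumed convention).
    """
    return 100.0 * math.sqrt(sum(h * h for h in harmonics)) / fundamental

def third_harmonic_percent(fundamental, harmonics):
    """Third-harmonic distortion only; harmonics[1] is the 3rd harmonic."""
    return 100.0 * harmonics[1] / fundamental

# Hypothetical analysis of a 1 kHz test tone: 1 V fundamental,
# 0.6 mV at 2 kHz (2nd harmonic) and 0.8 mV at 3 kHz (3rd harmonic).
fundamental = 1.0
harmonics = [0.0006, 0.0008]

print(f"Total harmonic distortion: {thd_percent(fundamental, harmonics):.2f} per cent")
print(f"Third-harmonic distortion: {third_harmonic_percent(fundamental, harmonics):.2f} per cent")
```

For these invented figures the total comes to 0.1 per cent while the third harmonic alone accounts for 0.08 per cent, which shows why a bare ‘THD’ figure needs to say which of the two is meant.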

Harmonic distortion – practical examples

In electrical devices such as amplifiers, distortion percentage does not usually change much with input level, but may vary slightly with frequency. With tape recorders distortion is very much a function of recording level and frequency, and can vary widely. In transducers, distortion usually remains at a moderately constant percentage with input level variation, but cheaper transducers can introduce fairly high levels of distortion.

Table A1.2 shows some typical quoted THD percentages for different audio devices, and it will be seen that they vary widely, with amplifiers and digital audio equipment having the lowest typical figures.

The distortion characteristics of digital audio systems are discussed in more detail in Chapter 8, and will not be covered further here. Analogue tape machine distortion performance is often quoted in the form of a maximum output level or MOL figure, which is the recording level at which third-harmonic distortion reaches a certain percentage, considered to be the maximum sensible recording level. In professional recorders this is 3 per cent at 1 kHz, and in consumer cassette machines it is 5 per cent at 315 Hz. Typically, one might expect this distortion percentage to be reached at a recording level of around 10–12 dB above reference level in professional recorders with a good modern tape, and around 4–8 dB above reference level in cassette machines. Analogue tape machines and reference levels are discussed in more detail in ‘Magnetic recording levels’, Chapter 6.

Fact file A1.2   Harmonic distortion – subjective

Harmonic distortion is not always unpleasant, indeed many people find it quite satisfying and link it with such subjective parameters as ‘warmth’ and ‘fullness’ in reproduced sound, calling sound which has less distortion ‘clinical’ and ‘cold’. Since the distortion is harmonically related to the signal which caused it, the effect may not be unmusical and may serve to reinforce the pitch of the fundamental in the case of even-harmonic distortion.

The sound of third-harmonic distortion is easy to detect on pure tones, but less easy on music, and can be heard when recording a tone on to a tape recorder at high level whilst comparing the sound ‘off tape’ with the input signal. The tone no longer sounds ‘pure’, but has an edge to it. It contains a component one octave and a fifth above the fundamental tone.
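The musical interval of the third harmonic can be checked with a line or two of arithmetic. The following sketch converts a frequency ratio into equal-tempered semitones; the function name is chosen purely for illustration.

```python
import math

def interval_semitones(ratio):
    """Size of a frequency ratio in equal-tempered semitones."""
    return 12.0 * math.log2(ratio)

# The third harmonic lies at three times the fundamental frequency.
third_harmonic = interval_semitones(3.0)

# One octave (12 semitones) plus a perfect fifth (7 semitones).
octave_and_fifth = 12 + 7

print(f"Third harmonic: {third_harmonic:.2f} semitones above the fundamental")
print(f"Octave plus fifth: {octave_and_fifth} semitones")
```

The ratio of 3:1 comes out within a few hundredths of a semitone of 19 semitones, that is, an octave and a fifth.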

Because distortion tends to increase gradually with increasing recording level in tape recorders, the onset of distortion is less noticeable than it is when an amplifier ‘clips’, for example, and many analogue tape recordings contain large percentages of harmonic distortion which have been deemed acceptable. Amplifier clipping, on the other hand, is very sudden and results in a ‘squaring-off’ of the audio waveform when it exceeds a certain level, at which point the distortion becomes severe. This effect can be heard when the batteries are going flat on a transistor radio, or when a hi-fi loudspeaker is driven exceedingly hard from a low-powered amplifier, and sounds like a serious breaking up of the sound on peaks of the signal. If tested with sine-wave sources, the result is as shown in Fact File 5.8.

Table A1.2   Typical THD percentages

| Device | % THD |
|---|---|
| Good power amplifier @ rated power | < 0.05% (20 Hz–20 kHz) |
| 16 bit digital recorder (via own convertors) | < 0.05% (−15 dB input level) |
| Loudspeaker | < 1% (25 W, 200 Hz) |
| Professional analogue tape recorder | < 1% (ref. level, 1 kHz) |
| Professional capacitor microphone | < 0.5% (1 kHz, 94 dB SPL) |

Dynamic range and signal-to-noise ratio

Dynamic range and signal-to-noise (S/N) ratio are often considered to be interchangeable terms for the same thing. This may be true, but depends on how the figures are arrived at. S/N ratio is normally considered to be the number of decibels between the ‘reference level’ and the noise floor of the system (see Figure A1.3). The noise floor may be weighted according to one of the standard curves which attempts to account for the potential ‘annoyance’ of the noise by amplifying some parts of the frequency spectrum and attenuating others (see Fact Files 1.4 and A1.3). Dynamic range may be the same thing, or it may be the number of decibels between the peak level and the noise floor, indicating the ‘maximum-to-minimum’ range of signal levels which may be handled by the system. Either parameter quoted without qualification is difficult to interpret.

For example, the specification ‘Dynamic range = 68 dB’ for a tape recorder means very little, since there is no indication of the reference points or weightings, whereas ‘S/N ratio, CCIR 468-3 (ref. 1 kHz, 320 nWb m⁻¹) = 68 dB’ tells the reader virtually all that is required. It says that the noise has been measured to the CCIR 468-3 weighting standard, and that it measures at 68 dB below the level of a 1 kHz tone recorded at a magnetic level of 320 nWb m⁻¹. This could at least be compared with another machine measured in the same way, even if the reference level was different, provided that the difference between the two reference levels was taken into account. It is difficult, though, to compare S/N ratios between devices measured using different weighting curves.
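Taking the difference between two reference levels into account amounts to a simple decibel correction. The sketch below is illustrative only: it assumes that both figures use the same weighting curve and test frequency, and that magnetic flux levels compare as amplitude quantities (20 log10 of their ratio). The function names are hypothetical.

```python
import math

def level_difference_db(new_ref, old_ref):
    """Difference in dB between two reference flux levels (same units)."""
    return 20.0 * math.log10(new_ref / old_ref)

def rereference_sn(sn_db, old_ref, new_ref):
    """Re-express an S/N ratio relative to new_ref instead of old_ref."""
    return sn_db + level_difference_db(new_ref, old_ref)

# S/N of 68 dB quoted relative to 320 nWb/m, re-expressed relative to 200 nWb/m.
sn_at_200 = rereference_sn(68.0, old_ref=320.0, new_ref=200.0)
print(f"{sn_at_200:.1f} dB ref. 200 nWb/m")
```

Because 200 nWb m⁻¹ is roughly 4 dB below 320 nWb m⁻¹, the same noise floor appears about 4 dB closer to the lower reference level, so the quoted S/N figure falls by that amount.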


Figure A1.3   Signal-to-noise ratio is often quoted as the number of decibels between the reference level and the noise floor. Available dynamic range may be greater than this, and is often quoted as the difference between the peak level and the noise floor

Fact file A1.3   Noise weighting curves

As discussed in Fact File 1.4, weighting filters are used when measuring noise to produce a figure which more closely represents the subjective annoyance value of the noise.

Some examples of regularly used weighting curves are shown in the diagram, and it will be seen that they are similar but not the same. Here 0 dB on the vertical axis represents the point at which the gain of the filter is ‘unity’, that is where it neither attenuates nor amplifies the signal. The ‘A’ curve is not normally used for measuring audio equipment noise, since it was designed for measuring acoustic background noise in buildings. The various DIN and CCIR curves are more commonly used in audio equipment specifications.

[Diagram: examples of noise weighting curves, with 0 dB on the vertical axis representing unity gain]

In analogue tape recorders, dynamic range is sometimes quoted as the number of decibels between the 3 per cent MOL (see ‘Harmonic distortion – practical examples’, above) and the weighted noise floor. This gives an idea of the available recording ‘window’, since the MOL is often well above the reference level. In digital recorders, the peak recording level is really also the reference level, since there is no point in recording above this point due to the sudden clipping of the signal, and thus dynamic range and S/N ratio are often referred to this point, although some manufacturers have chosen to refer to a level 15 dB below it.
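The relationship between the two figures can be put as simple arithmetic: dynamic range is the S/N ratio plus however far the peak level (or MOL) lies above the reference level. The sketch below uses values quoted elsewhere in this appendix purely as illustrations, and assumes that the same weighted noise floor is used for both figures.

```python
def dynamic_range_db(sn_ratio_db, peak_above_ref_db):
    """Dynamic range = S/N ratio (ref. level to noise floor) + headroom above ref. level."""
    return sn_ratio_db + peak_above_ref_db

# Professional analogue machine: 65 dB S/N, with the 3 per cent MOL assumed
# to lie about 10 dB above reference level.
tape = dynamic_range_db(65.0, 10.0)

# Digital recorder: peak level is also the reference level, so headroom is zero.
digital = dynamic_range_db(94.0, 0.0)

print(f"Analogue tape: {tape:.0f} dB, digital: {digital:.0f} dB")
```

In the digital case the two figures coincide, which is why dynamic range and S/N ratio are more nearly interchangeable there than for analogue tape.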

Table A1.3 gives some typical values for S/N ratio found in audio equipment.

Wow and flutter

Wow and flutter are names used to describe speed (pitch) variations of a tape machine or turntable. Wow is applied to slow variations in speed and flutter is applied to faster variations in speed. The figures depend very much on the mechanical quality of the device and its state of wear and cleanliness. Again a weighting filter (usually to the DIN standard) is used when measuring to produce a figure which closely correlates with one’s perception of the annoyance of speed variations. Specifications are usually quoted as WRMS (Weighted Root-Mean Square, a form of average), but occasionally peak figures may be used which will be worse than the RMS figures. Long-term speed accuracy is also quoted in many cases – this being the anticipated overall drift in speed of the machine over a reel of tape. These days speed drift is less of a problem than it used to be, with machines remaining stable to within hundredths of a per cent over the length of a reel. Drift is only really a problem if two machines are to be synchronised.
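The reason why drift matters mainly for synchronisation can be seen by converting a percentage speed error into a timing error. The sketch below is an approximation which assumes a constant speed error over the whole reel; the 30-minute reel length is an invented example.

```python
def drift_seconds(duration_s, speed_error_percent):
    """Approximate timing error accumulated over duration_s at a constant speed error."""
    return duration_s * speed_error_percent / 100.0

# A hypothetical 30-minute reel on a machine whose speed is out by 0.02 per cent.
reel_length_s = 30 * 60
error = drift_seconds(reel_length_s, 0.02)

print(f"Timing error at the end of the reel: {error:.2f} s")
```

An error of a few tenths of a second is inaudible as a change of pitch on a single machine, but it would be an obvious slip between two machines that were supposed to run in step.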

Table A1.3   Typical values for CCIR weighted S/N ratio

| Device | S/N ratio |
|---|---|
| Consumer cassette machine (without noise reduction) | 50 dB (ref. 315 Hz, 200 nWb m⁻¹) |
| Professional analogue tape machine (without noise reduction) @ 15 ips (38 cm s⁻¹) | 65 dB (ref. 1 kHz, 320 nWb m⁻¹) |
| 16 bit digital audio recorder | 94 dB (ref. peak level) |
| Professional power amplifier | 108 dB (ref. max. output) |

A typical figure for a good analogue tape machine would be better than 0.02 per cent WRMS, and good cassette machines can approach this figure also. Cheap cassette machines coupled with poor tapes can cause the figure to rise considerably, with some examples approaching 0.5 per cent or more. A good LP turntable may achieve 0.02 per cent results, but again cheaper models will be worse. Digital audio recorders and CD players do not suffer from wow and flutter in the same way as analogue transports, since the audio data from the tape or disc is first passed through a so-called timebase corrector, prior to conversion, which removes any speed variations resulting from mechanical instability in the transport.

A machine with poor W&F results will sound most unpleasant, with either uncomfortable ‘wowing’ in the pitch of notes, or fast flutter which gives rise to a ‘roughness’ in the sound aptly described by the word ‘flutter’, and possibly some intermodulation distortion (see next section).

Intermodulation (IM) distortion

IM distortion results when two or more signals are passed through a non-linear device. Since all audio equipment has some non-linearity there will always be small amounts of IM distortion, but these can be very low.

Unlike harmonic distortion, IM distortion may not be harmonically related to the frequency of the signals causing the distortion, and thus it is audibly more unpleasant. If two sine-wave tones are passed through a non-linear device, sum and difference tones may arise between them (see Figure A1.4). For example, a tone at f1 = 1000 Hz and a tone at f2 = 1100 Hz might give rise to IM products at f2 − f1 = 100 Hz, and also at f1 + f2 = 2100 Hz, as well as subsidiary products at 2f1 − f2 and so on. The dominant components will depend on the nature of the non-linearity.

IM distortion can also arise when speed variations of a tape transport or LP turntable modulate the signals reproduced from them. For example, a tape transport with speed variations at 25 Hz modulating a reproduced signal at 1000 Hz could give rise to IM products at 975 Hz and 1025 Hz.
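The sum-and-difference arithmetic in the two examples above (1000 Hz with 1100 Hz, and 1000 Hz with a 25 Hz speed variation) can be sketched as follows. The function lists candidate product frequencies only; it says nothing about their levels, which depend on the nature of the non-linearity. The function name and the ‘order’ parameter are illustrative.

```python
def im_products(f1, f2, max_order=3):
    """Candidate IM product frequencies |m*f1 - n*f2| and m*f1 + n*f2.

    m and n are positive integers with m + n <= max_order.
    """
    products = set()
    for m in range(1, max_order):
        for n in range(1, max_order - m + 1):
            products.add(abs(m * f1 - n * f2))
            products.add(m * f1 + n * f2)
    return sorted(products)

# Two tones through a non-linear device (second- and third-order products).
print(im_products(1000, 1100))

# A 1000 Hz signal modulated by a 25 Hz speed variation (second-order only).
print(im_products(1000, 25, max_order=2))
```

The first call lists the 100 Hz difference tone and the 2100 Hz sum tone together with the third-order products such as 2f1 − f2 = 900 Hz. The second gives the 975 Hz and 1025 Hz components on either side of the 1000 Hz signal.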

Low IM distortion figures are an important mark of a high-quality system, since such distortion is a major contributor to poor sound quality, but it is less often quoted than THD (see ‘Harmonic distortion – practical examples’, above).

Crosstalk

Crosstalk figures describe the expected amount of break-through from one channel of a device to another. For example, in a stereo tape recorder crosstalk may arise between the left and right channels, or in a multitrack recorder between adjacent tracks. In general, crosstalk is undesirable. It is usually quoted either as negative decibels relative to the causatory signal (e.g.: -53 dB), or as decibels channel separation (e.g.: 53 dB).
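The two ways of quoting the same measurement differ only in sign, as the following sketch shows. It assumes that crosstalk is measured as the voltage appearing on the unwanted channel relative to the voltage of the signal causing it; the voltages are invented for illustration.

```python
import math

def crosstalk_db(leak_voltage, source_voltage):
    """Crosstalk in dB relative to the signal causing it (a negative number)."""
    return 20.0 * math.log10(leak_voltage / source_voltage)

def channel_separation_db(leak_voltage, source_voltage):
    """The same measurement expressed as channel separation (a positive number)."""
    return -crosstalk_db(leak_voltage, source_voltage)

# Hypothetical measurement: 1 V on the driven channel, 2.24 mV breaking through.
source = 1.0
leak = 0.00224

print(f"Crosstalk: {crosstalk_db(leak, source):.0f} dB")
print(f"Channel separation: {channel_separation_db(leak, source):.0f} dB")
```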

Crosstalk may arise within the electronics of a device (such as by electromagnetic induction between tracks on a printed-circuit board), magnetically (by induction within the heads of a tape machine), or externally (say between cables in a duct). For good performance in a multitrack tape machine and mixer it is essential that crosstalk is very low, since the operator will not want components of one channel being audible on another. For stereo recorders and reproducers the requirement is not so stringent.


Figure A1.4   Intermodulation distortion between two input signals in a non-linear device results in low level sum-and-difference components in the output signal

A typical figure for a good multitrack analogue recorder in reproduce mode is between 40 and 50 dB, but this is usually much worse between tracks in record and adjacent tracks in sync reproduce (see Fact File 6.3). In digital equipment the crosstalk between channels is exceptionally low (around -90 dB), since crosstalk is rejected as part of the replay decoding process. In analogue LP cartridges separation is quite poor, as it is also in analogue stereo FM radio and TV broadcasting (around 25–30 dB), but is normally adequate for maintaining stereo separation.
