Chapter 1

What is sound?

A vibrating source

Sound is produced when an object (the source) vibrates and causes the air around it to move. Consider the sphere shown in Figure 1.1. It is a pulsating sphere which could be imagined as something like a squash ball, and it is pulsating regularly so that its size oscillates between being slightly larger than normal and then slightly smaller than normal. As it pulsates it will alternately compress and then rarefy the surrounding air, resulting in a series of compressions and rarefactions travelling away from the sphere, rather like a three-dimensional version of the ripples which travel away from a stone dropped into a pond. These are known as longitudinal waves since the air particles move in the same dimension as the direction of wave travel. The alternative to longitudinal wave motion is transverse wave motion (see Figure 1.2), such as is found in vibrating strings, where the motion of the string is at right angles to the direction of apparent wave travel.

Characteristics of a sound wave

The rate at which the source oscillates is the frequency of the sound wave it produces, and is quoted in hertz (Hz) or cycles per second (cps). 1000 hertz is termed 1 kilohertz (1 kHz). The amount of compression and rarefaction of the air which results from the sphere’s motion is the amplitude of the sound wave, and is related to the loudness of the sound when it is finally perceived by the ear (see Chapter 2). The distance between two adjacent peaks of compression or rarefaction as the wave travels through the air is the wavelength of the sound wave, and is often represented by the Greek letter lambda (λ). The wavelength depends on how fast the sound wave travels, since a fast-travelling wave would result in a greater distance between peaks than a slow-travelling wave, given a fixed time between compression peaks (i.e.: a fixed frequency of oscillation of the source).

As shown in Figure 1.3, the sound wave’s characteristics can be represented on a graph, with amplitude plotted on the vertical axis and time plotted on the horizontal axis. It will be seen that both positive and negative ranges are shown on the vertical axis: these represent compressions (+) and rarefactions (–) of the air. This graph represents the waveform of the sound. For a moment, a source vibrating in a very simple and regular manner is assumed, in so-called simple harmonic motion, the result of which is a simple sound wave known as a sine wave. The most simple vibrating systems oscillate in this way, such as a mass suspended from a spring, or a swinging pendulum (see also ‘Phase’ below). It will be seen that the frequency (f) is the inverse of the time between peaks or troughs of the wave (f = 1/t). So the shorter the time between oscillations of the source, the higher the frequency. The human ear is capable of perceiving sounds with frequencies between approximately 20 Hz and 20 kHz (see ‘Frequency perception’, Chapter 2); this is known as the audio frequency range or audio spectrum.
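The relationship f = 1/t can be illustrated with a few lines of Python (the function name is purely illustrative, not from any standard library):

```python
# A minimal sketch of the relationship f = 1/t.

def frequency_from_period(t_seconds):
    """Return the frequency in hertz of a wave with period t_seconds."""
    return 1.0 / t_seconds

# A period of 1 ms corresponds to 1 kHz; a period of 50 ms corresponds
# to 20 Hz, the bottom of the audio-frequency range.
print(frequency_from_period(0.001))  # -> 1000.0
print(frequency_from_period(0.05))   # -> 20.0
```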

Images

Figure 1.1   (a) A simple sound source can be imagined as like a pulsating sphere radiating spherical waves. (b) The longitudinal wave thus created is a succession of compressions and rarefactions of the air

Images

Figure 1.2   In a transverse wave the motion of any point on the wave is at right angles to the apparent direction of motion of the wave

Images

Figure 1.3   A graphical representation of a sinusoidal sound waveform. The period of the wave is represented by t, and its frequency by 1/t

How sound travels in air

Air is made up of gas molecules and has an elastic property (imagine putting a thumb over the end of a bicycle pump and compressing the air inside – the air is springy). Longitudinal sound waves travel in air in somewhat the same fashion as a wave travels down a row of up-ended dominoes after the first one is pushed over. The half-cycle of compression created by the vibrating source causes successive air particles to be moved in a knock-on effect, and this is normally followed by a balancing rarefaction which causes a similar motion of particles in the opposite direction.

It may be appreciated that the net effect of this is that individual air particles do not actually travel – they oscillate about a fixed point – but the result is that a wave is formed which appears to move away from the source. The speed at which it moves away from the source depends on the density and elasticity of the substance through which it passes, and in air the speed is relatively slow compared with the speed at which sound travels through most solids. In air the speed of sound is approximately 340 metres per second (m s−1), although this depends on the temperature of the air. At freezing point the speed is reduced to nearer 330 m s−1. In steel, to give an example of a solid, the speed of sound is approximately 5100 m s−1.

The frequency and wavelength of a sound wave are related very simply if the speed of the wave (usually denoted by the letter c) is known:

c = f λ or λ = c/f

To show some examples, the wavelength of sound in air at 20 Hz (the low-frequency or LF end of the audio spectrum), assuming normal room temperature, would be:

λ = 340/20 = 17 metres

whereas the wavelength of 20 kHz (at the high-frequency or HF end of the audio spectrum) would be 1.7 cm. Thus it is apparent that the wavelength of sound ranges from being very long in relation to most natural objects at low frequencies, to quite short at high frequencies. This is important when considering how sound behaves when it encounters objects – whether the object acts as a barrier or whether the sound bends around it (see Fact File 1.5).
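These example calculations can be reproduced with a short Python sketch, assuming the room-temperature speed of sound of 340 m s−1 used above:

```python
# Sketch of the relationship lambda = c/f, assuming c = 340 m/s
# (the speed of sound in air at roughly room temperature).

SPEED_OF_SOUND = 340.0  # metres per second

def wavelength(frequency_hz, c=SPEED_OF_SOUND):
    """Return the wavelength in metres for a given frequency in hertz."""
    return c / frequency_hz

print(wavelength(20))     # -> 17.0 metres (LF end of the audio spectrum)
print(wavelength(20000))  # -> 0.017 m, i.e. 1.7 cm (HF end)
```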

Simple and complex sounds

In the foregoing example, the sound had a simple waveform – it was a sine wave or sinusoidal waveform – the type which might result from a very simple vibrating system such as a weight suspended on a spring. Sine waves have a very pure sound because they consist of energy at only one frequency, and are often called pure tones. They are not heard very commonly in real life (although they can be generated electrically) since most sound sources do not vibrate in such a simple manner. A person whistling or a recorder (a simple wind instrument) produces a sound which approaches a sinusoidal waveform. Most real sounds are made up of a combination of vibration patterns which result in a more complex waveform. The more complex the waveform, the more like noise the sound becomes, and when the waveform has a highly random pattern the sound is said to be noise (see ‘Frequency spectra of non-repetitive sounds’, below).

The important characteristic of sounds which have a definite pitch is that they are repetitive: that is, the waveform, no matter how complex, repeats its pattern in the same way at regular intervals. All such waveforms can be broken down into a series of components known as harmonics, using a mathematical process called Fourier analysis (after the mathematician Joseph Fourier). Some examples of equivalent line spectra for different waveforms are given in Figure 1.4. This figure shows another way of depicting the characteristics of the sound graphically – that is, by drawing a so-called line spectrum which shows frequency along the horizontal axis and amplitude up the vertical axis. The line spectrum shows the relative strengths of different frequency components which make up a sound. Where there is a line there is a frequency component. It will be noticed that the more complex the waveform the more complex the corresponding line spectrum.

For every waveform, such as that shown in Figure 1.3, there is a corresponding line spectrum: waveforms and line spectra are simply two different ways of showing the characteristics of the sound. Figure 1.3 is called a time-domain plot, whilst the line spectrum is called a frequency-domain plot. Unless otherwise stated, such frequency-domain graphs in this book will cover the audio-frequency range, from 20 Hz at the lower end to 20 kHz at the upper end.

In a reversal of the above breaking-down of waveforms into their component frequencies it is also possible to construct or synthesise waveforms by adding together the relevant components.
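As an illustration of such synthesis, the square wave of Figure 1.4(c) can be approximated by summing its odd harmonics, each with an amplitude inversely proportional to its harmonic number. The following Python sketch (function name illustrative) is one way of doing this:

```python
import math

def square_wave_partial_sum(t, fundamental_hz, num_harmonics):
    """Approximate a square wave at time t by summing odd harmonics of
    the fundamental, each scaled by 1/n (the Fourier series of a
    square wave)."""
    total = 0.0
    for k in range(num_harmonics):
        n = 2 * k + 1  # odd multiples only: 1, 3, 5, ...
        total += math.sin(2 * math.pi * n * fundamental_hz * t) / n
    return (4 / math.pi) * total

# With many harmonics the sum settles close to +1 in the middle of the
# positive half-cycle of a 100 Hz square wave (t = 2.5 ms).
print(round(square_wave_partial_sum(0.0025, 100, 500), 2))  # -> 1.0
```

Listening to such a partial sum as more harmonics are added is a classic demonstration of how the timbre sharpens from a pure tone towards a buzzy square wave.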

Images

Figure 1.4   Equivalent line spectra for a selection of simple waveforms. (a) The sine wave consists of only one component at the fundamental frequency f. (b) The sawtooth wave consists of components at the fundamental and its integer multiples, with amplitudes steadily decreasing. (c) The square wave consists of components at odd multiples of the fundamental frequency

Frequency spectra of repetitive sounds

As will be seen in Figure 1.4, the simple sine wave has a line spectrum consisting of only one component at the frequency of the sine wave. This is known as the fundamental frequency of oscillation. The other repetitive waveforms, such as the square wave, have a fundamental frequency as well as a number of additional components above the fundamental. These are known as harmonics, but may also be referred to as overtones or partials.

Harmonics are frequency components of a sound which occur at integer multiples of the fundamental frequency, that is at twice, three times, four times and so on. Thus a sound with a fundamental of 100 Hz might also contain harmonics at 200 Hz, 300 Hz and 400 Hz. The reason for the existence of these harmonics is that most simple vibrating sources are capable of vibrating in a number of harmonic modes at the same time. Consider a stretched string, as shown in Figure 1.5. It may be made to vibrate in any of a number of modes, corresponding to integer multiples of the fundamental frequency of vibration of the string (the concept of ‘standing waves’ is introduced below). The fundamental corresponds to the mode in which the string moves up and down as a whole, whereas the harmonics correspond to modes in which the vibration pattern is divided into points of maximum and minimum motion along the string (these are called antinodes and nodes). It will be seen that the second mode involves two peaks of vibration, the third mode three peaks, and so on.
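A trivial sketch makes the point numerically: the mode frequencies are simply integer multiples of the fundamental (the 100 Hz fundamental here is just an example value):

```python
# Sketch: the frequencies of a string's harmonic modes are integer
# multiples of the fundamental frequency.

def mode_frequencies(fundamental_hz, num_modes):
    """Return the first num_modes harmonic frequencies, starting with
    the fundamental itself (the first harmonic)."""
    return [n * fundamental_hz for n in range(1, num_modes + 1)]

print(mode_frequencies(100, 5))  # -> [100, 200, 300, 400, 500]
```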

Images

Figure 1.5   Modes of vibration of a stretched string. (a) Fundamental. (b) Second harmonic. (c) Third harmonic

In accepted terminology, the fundamental is also the first harmonic, and thus the next component is the second harmonic, and so on. Confusingly, the second harmonic is also known as the first overtone. For the waveforms shown in Figure 1.4, the fundamental has the highest amplitude, and the amplitudes of the harmonics decrease with increasing frequency, but this will not always be the case with real sounds since many waveforms have line spectra which show the harmonics to be higher in amplitude than the fundamental. It is also quite feasible for there to be harmonics missing in the line spectrum, and this depends entirely on the waveform in question.

It is also possible for there to be overtones in the frequency spectrum of a sound which are not related in a simple integer-multiple fashion to the fundamental. These cannot strictly be termed harmonics, and are more properly referred to as overtones or inharmonic partials. They tend to arise in vibrating sources which have a complicated shape, and which do not vibrate in simple harmonic motion but have a number of repetitive modes of vibration. Their patterns of oscillation are often unusual, such as might be observed in a bell or a percussion instrument. It is still possible for such sounds to have a recognisable pitch, but this depends on the strength of the fundamental. In bells and other such sources, one often hears the presence of several strong inharmonic overtones.

Frequency spectra of non-repetitive sounds

Non-repetitive waveforms do not have a recognisable pitch and sound noise-like. Their frequency spectra are likely to consist of a collection of components at unrelated frequencies, although some frequencies may be more dominant than others. The analysis of such waves to show their frequency spectra is more complicated than with repetitive waves, but is still possible using a mathematical technique called Fourier transformation, the result of which is a frequency-domain plot of a time-domain waveform.

Single, short pulses can be shown to have continuous frequency spectra which extend over quite a wide frequency range, and the shorter the pulse the wider its frequency spectrum but usually the lower its total energy (see Figure 1.6). Random waveforms will tend to sound like hiss, and a completely random waveform in which the frequency, amplitude and phase of components are equally probable and constantly varying is called white noise. A white noise signal’s spectrum is flat, when averaged over a period of time, right across the audio-frequency range (and theoretically above it). White noise has equal energy in any fixed bandwidth (for example, per hertz), whereas another type of noise, known as pink noise, has equal energy per octave. For this reason white noise subjectively sounds as though it has more high-frequency energy than pink noise.
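The distinction between the two noise types can be illustrated numerically. Modelling white noise as a flat spectral density of one unit per hertz (a simplifying assumption), the energy in each octave band is simply its bandwidth, which doubles from one octave to the next:

```python
# Sketch: energy per octave band for white noise, modelled as a flat
# spectral density of 1 unit per Hz (a simplifying assumption).

def octave_band_energy_white(lower_hz):
    upper_hz = 2 * lower_hz     # an octave spans a 2:1 frequency ratio
    return upper_hz - lower_hz  # flat density of 1 => energy = bandwidth

for lo in (20, 40, 80, 160):
    print(lo, octave_band_energy_white(lo))
# The energy doubles with each successive octave; pink noise, by
# definition, would show the same energy in every band.
```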

Images

Figure 1.6   Frequency spectra of non-repetitive waveforms. (a) Pulse. (b) Noise

Phase

Two waves of the same frequency are said to be ‘in phase’ when their compression (positive) and rarefaction (negative) half-cycles coincide exactly in time and space (see Figure 1.7). If two in-phase signals of equal amplitude are added together, or superimposed, they will sum to produce another signal of the same frequency but twice the amplitude. Signals are said to be out of phase when the positive half-cycle of one coincides with the negative half-cycle of the other. If these two signals are added together they will cancel each other out, and the result will be no signal.

Clearly these are two extreme cases, and it is entirely possible to superimpose two sounds of the same frequency which are only partially in phase with each other. The resultant wave in this case will be a partial addition or partial cancellation, and the phase of the resulting wave will lie somewhere between that of the two components (see Figure 1.7(c)).

Phase differences between signals can be the result of time delays between them. If two identical signals start out at sources equidistant from a listener at the same time as each other then they will be in phase by the time they arrive at the listener. If one source is more distant than the other then it will be delayed, and the phase relationship between the two will depend upon the amount of delay (see Figure 1.8). A useful rule-of-thumb is that sound travels about 30 cm (1 foot) per millisecond, so if the second source in the above example were 1 metre (just over 3 ft) more distant than the first it would be delayed by just over 3 ms. The resulting phase relationship between the two signals, it may be appreciated, would depend on the frequency of the sound, since at a frequency of around 330 Hz the 3 ms delay would correspond to one wavelength and thus the delayed signal would be in phase with the undelayed signal. If the delay had been half this (1.5 ms) then the two signals would have been out of phase at 330 Hz.

Phase is often quoted as a number of degrees relative to some reference, and this must be related back to the nature of a sine wave. A diagram is the best way to illustrate this point, and looking at Figure 1.9 it will be seen that a sine wave may be considered as a graph of the vertical position of a rotating spot on the outer rim of a disc (the amplitude of the wave), plotted against time. The height of the spot rises and falls regularly as the circle rotates at a constant speed. The sine wave is so called because the spot’s height is directly proportional to the mathematical sine of the angle of rotation of the disc, with zero degrees occurring at the origin of the graph and at the point shown on the disc’s rotation in the diagram. The vertical amplitude scale on the graph goes from minus one (maximum negative amplitude) to plus one (maximum positive amplitude), passing through zero at the halfway point. At 90° of rotation the amplitude of the sine wave is maximum positive (the sine of 90° is +1), and at 180° it is zero (sin 180° = 0). At 270° it is maximum negative (sin 270° = −1), and at 360° it is zero again. Thus in one cycle of the sine wave the circle has passed through 360° of rotation.
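The cardinal points of the rotation described above can be checked directly, remembering that Python's maths functions work in radians (360° = 2π radians):

```python
import math

# The amplitude of a sine wave at a given phase angle is the sine of
# that angle. Values at the cardinal points of the rotation:
for degrees in (0, 90, 180, 270, 360):
    amplitude = math.sin(math.radians(degrees))
    print(degrees, round(amplitude, 6) + 0.0)  # +0.0 tidies "-0.0" to "0.0"
# 0 and 360 degrees give 0 (to within floating-point rounding),
# 90 degrees gives +1, 180 gives 0, and 270 gives -1.
```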

Images

Figure 1.7   (a) When two identical in-phase waves are added together, the result is a wave of the same frequency and phase but twice the amplitude. (b) Two identical out-of-phase waves add to give nothing. (c) Two identical waves partially out of phase add to give a resultant wave with a phase and amplitude which is the point-by-point sum of the two

Images

Figure 1.8   If the two loudspeakers in the drawing emit the same wave at the same time, the phase difference between the waves at the listener’s ear will be directly related to the delay t2 − t1

It is now possible to go back to the phase relationship between two waves of the same frequency. If each cycle is considered as corresponding to 360°, then one can say just how many degrees one wave is ahead of or behind another by comparing the 0° point on one wave with the 0° point on the other (see Figure 1.10). In the example wave 1 is 90° out of phase with wave 2. It is important to realise that phase is only a relevant concept in the case of continuous repetitive waveforms, and has little meaning in the case of impulsive or transient sounds where time difference is the more relevant quantity. It can be deduced from the foregoing discussion that (a) the higher the frequency, the greater the phase difference which would result from a given time delay between two signals, and (b) it is possible for there to be more than 360° of phase difference between two signals if the delay is great enough to delay the second signal by more than one cycle. In the latter case it becomes difficult to tell how many cycles of delay have elapsed unless a discontinuity arises in the signal, since a phase difference of 360° is indistinguishable from a phase difference of 0°.
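Point (a) above can be sketched numerically: the phase difference produced by a delay is 360° multiplied by the number of cycles elapsed during that delay (the function name is illustrative):

```python
# Sketch: phase difference (in degrees) produced by a time delay,
# wrapped to a single cycle, since a 360 degree phase difference is
# indistinguishable from 0 degrees.

def phase_difference_deg(frequency_hz, delay_s):
    return (360.0 * frequency_hz * delay_s) % 360.0

# Half a period of delay at 500 Hz (1 ms) puts two signals fully out
# of phase; a full period of delay (2 ms) wraps back to roughly zero.
print(phase_difference_deg(500, 0.001))  # -> 180.0
print(phase_difference_deg(500, 0.002))  # ~0 (or equivalently ~360)
```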

Images

Figure 1.9   The height of the spot varies sinusoidally with the angle of rotation of the wheel. The phase angle of a sine wave can be understood in terms of the number of degrees of rotation of the wheel

Images

Figure 1.10   The lower wave is 90° out of phase with the upper wave

Sound in electrical form

Although the sound that one hears is due to compression and rarefaction of the air, it is often necessary to convert sound into an electrical form in order to perform operations on it such as amplification, recording and mixing. As detailed in Fact File 3.1 and Chapter 3, it is the job of the microphone to convert sound from an acoustical form into an electrical form. The process of conversion will not be described here, but the result is important because if it can be assumed for a moment that the microphone is perfect then the resulting electrical waveform will be exactly the same shape as the acoustical waveform which caused it.

The equivalent of the amplitude of the acoustical signal in electrical terms is the voltage of the electrical signal. If the voltage at the output of a microphone were to be measured whilst the microphone was picking up an acoustical sine wave, one would measure a voltage which changed sinusoidally as well. Figure 1.11 shows this situation, and it may be seen that an acoustical compression of the air corresponds to a positive-going voltage, whilst an acoustical rarefaction of the air corresponds to a negative-going voltage. (This is the norm, although some sound reproduction systems introduce an absolute phase reversal in the relationship between acoustical phase and electrical phase, such that an acoustical compression becomes equivalent to a negative voltage. Some people claim to be able to hear the difference.)

The other important quantity in electrical terms is the current flowing down the wire from the microphone. Current is the electrical equivalent of the air particle motion discussed in ‘How sound travels in air’, above. Just as the acoustical sound wave was carried in the motion of the air particles, so the electrical sound wave is carried in the motion of tiny charge carriers which reside in the metal of a wire (these are called electrons). When the voltage is positive the current moves in one direction, and when it is negative the current moves in the other direction. Since the voltage generated by a microphone is repeatedly alternating between positive and negative, in sympathy with the sound wave’s compression and rarefaction cycles, the current similarly changes direction each half cycle. Just as the air particles in ‘Characteristics of a sound wave’, above, did not actually go anywhere in the long term, so the electrons carrying the current do not go anywhere either – they simply oscillate about a fixed point. This is known as alternating current or AC.

Images

Figure 1.11   A microphone converts variations in acoustical sound pressure into variations in electrical voltage. Normally a compression of the air results in a positive voltage and a rarefaction results in a negative voltage

A useful analogy to the above (both electrical and acoustical) exists in plumbing. If one considers water in a pipe fed from a header tank, as shown in Figure 1.12, the voltage is equivalent to the pressure of water which results from the header tank, and the current is equivalent to the rate of flow of water through the pipe. The only difference is that the diagram is concerned with a direct current situation in which the direction of flow is not repeatedly changing. The quantity of resistance should be introduced here, and is analogous to the diameter of the pipe. Resistance impedes the flow of water through the pipe, as it does the flow of electrons through a wire and the flow of acoustical sound energy through a substance. For a fixed voltage (or water pressure in this analogy), a high resistance (narrow pipe) will result in a small current (a trickle of water), whilst a low resistance (wide pipe) will result in a large current. The relationship between voltage, current and resistance was established by Ohm, in the form of Ohm’s law, as described in Fact File 1.1. There is also a relationship between power and voltage, current and resistance.

Images

Figure 1.12   There are parallels between the flow of water in a pipe and the flow of electricity in a wire, as shown in this drawing

Fact file 1.1   Ohm’s law

Ohm’s law states that there is a fixed and simple relationship between the current flowing through a device (I), the voltage across it (V), and its resistance (R), as shown in the diagram:

Images

V = IR

or:

I = V/R

or:

R = V/I

Thus if the resistance of a device is known, and the voltage dropped across it can be measured, then the current flow may be calculated, for example.

There is also a relationship between the parameters above and the power in watts (W) dissipated in a device:

W = I²R = V²/R

In AC systems, resistance is replaced by impedance, a complex term which contains both resistance and reactance components. The reactance part varies with the frequency of the signal; thus the impedance of an electrical device also varies with the frequency of a signal. Capacitors (basically two conductive plates separated by an insulator) are electrical devices which present a high impedance to low-frequency signals and a low impedance to high-frequency signals. They will not pass direct current. Inductors (basically coils of wire) are electrical devices which present a high impedance to high-frequency signals and a low impedance to low-frequency signals. Capacitance is measured in farads, inductance in henrys.
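The relationships in this Fact File can be expressed as a small Python sketch (function names illustrative):

```python
# Sketch of Ohm's law and the associated power relationships.

def current(v, r):
    """I = V/R: current in amps from voltage and resistance."""
    return v / r

def power_from_current(i, r):
    """W = I^2 * R: power dissipated, from current and resistance."""
    return i ** 2 * r

def power_from_voltage(v, r):
    """W = V^2 / R: power dissipated, from voltage and resistance."""
    return v ** 2 / r

# 10 V across 5 ohms: I = 2 A, and both power formulas agree on 20 W.
i = current(10, 5)
print(i, power_from_current(i, 5), power_from_voltage(10, 5))  # 2.0 20.0 20.0
```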

Displaying the characteristics of a sound wave

Two devices can be introduced at this point which illustrate graphically the various characteristics of sound signals so far described. It would be useful to (a) display the waveform of the sound, and (b) display the frequency spectrum of the sound. In other words (a) the time-domain signal and (b) the frequency-domain signal.

Images

Figure 1.13   (a) An oscilloscope displays the waveform of an electric signal by means of a moving spot which is deflected up by a positive signal and down by a negative signal. (b) A spectrum analyser displays the frequency spectrum of an electrical waveform in the form of lines representing the amplitudes of different spectral components of the signal

An oscilloscope is used for displaying the waveform of a sound, and a spectrum analyser is used for showing which frequencies are contained in the signal and their amplitudes. Examples of such devices are pictured in Figure 1.13. Both devices accept sound signals in electrical form and display their analyses of the sound on a screen. The oscilloscope displays a moving spot which scans horizontally at one of a number of fixed speeds from left to right and whose vertical deflection is controlled by the voltage of the sound signal (up for positive, down for negative). In this way it plots the waveform of the sound as it varies with time. Many oscilloscopes have two inputs and can plot two waveforms at the same time, and this can be useful for comparing the relative phases of two signals (see ‘Phase’, above).

The spectrum analyser works in different ways depending on the method of spectrum analysis. A real-time analyser displays a constantly updating line spectrum, similar to those depicted earlier in this chapter, and shows the frequency components of the input signal on the horizontal scale together with their amplitudes on the vertical scale.

The decibel

The unit of the decibel is used widely in sound engineering, often in preference to other units such as volts, watts, or other such absolute units, since it is a convenient way of representing the ratio of one signal’s amplitude to another’s. It also results in numbers of a convenient size which approximate more closely to one’s subjective impression of changes in the amplitude of a signal, and it helps to compress the range of values between the maximum and minimum sound levels encountered in real signals. For example, the range of sound intensities (see next section) which can be handled by the human ear covers about fourteen powers of ten, from 0.000 000 000 001 Wm−2 to around 100 Wm−2, but the equivalent range in decibels is only from 0 to 140 dB.

Some examples of the use of the decibel are given in Fact File 1.2. The relationship between the decibel and human sound perception is discussed in more detail in Chapter 2. Operating levels in recording equipment are discussed further in ‘Metering systems’, Chapter 5 and ‘Magnetic recording levels’, Chapter 6.

Fact file 1.2   The decibel

Basic decibels

The decibel is based on the logarithm of the ratio between two numbers. It describes how much larger or smaller one value is than the other. It can also be used as an absolute unit of measurement if the reference value is fixed and known. Some standardised references have been established for decibel scales in different fields of sound engineering (see below).

The decibel is strictly ten times the logarithm to the base ten of the ratio between the powers of two signals:

dB = 10 log₁₀ (P₁/P₂)

For example, the difference in decibels between a signal with a power of 1 watt and one of 2 watts is 10 log (2/1) = 3 dB.

If the decibel is used to compare values other than signal powers, the relationship to signal power must be taken into account. Voltage has a square relationship to power (from Ohm’s law: W = V²/R); thus to compare two voltages:

dB = 10 log (V₁²/V₂²), or 10 log (V₁/V₂)², or 20 log (V₁/V₂)

For example, the difference in decibels between a signal with a voltage of 1 volt and one of 2 volts is 20 log (2/1) = 6 dB. So a doubling in voltage gives rise to an increase of 6 dB, and a doubling in power gives rise to an increase of 3 dB. A similar relationship applies to acoustical sound pressure (analogous to electrical voltage) and sound power (analogous to electrical power).
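Both formulas can be confirmed with a couple of lines of Python:

```python
import math

# Sketch of the two decibel formulas: power ratios use 10 log10,
# voltage (or sound pressure) ratios use 20 log10.

def db_power(p1, p2):
    return 10 * math.log10(p1 / p2)

def db_voltage(v1, v2):
    return 20 * math.log10(v1 / v2)

print(round(db_power(2, 1), 2))    # doubling power: ~3.01 dB
print(round(db_voltage(2, 1), 2))  # doubling voltage: ~6.02 dB
```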

Decibels with a reference

If a signal level is quoted in decibels, then a reference must normally be given, otherwise the figure means nothing; e.g.: ‘Signal level = 47 dB’ cannot have a meaning unless one knows that the signal is 47 dB above a known point. ‘+8 dB ref. 1 volt’ has a meaning since one now knows that the level is 8 dB higher than 1 volt, and thus one could calculate the voltage of the signal.

There are exceptions in practice, since in some fields a reference level is accepted as implicit. Sound pressure levels (SPLs) are an example, since the reference level is defined worldwide as 2 × 10−5 Nm−2 (20 μPa). Thus to state ‘SPL = 77 dB’ is probably acceptable, although confusion can still arise due to misunderstandings over such things as weighting curves (see Fact File 1.4). In sound recording, 0 dB or ‘zero level’ is a nominal reference level used for aligning equipment and setting recording levels, often corresponding to 0.775 volts (0 dBu) although this is subject to variations in studio centres in different locations. (Some studios use +4 dBu as their reference level, for example.) ‘0 dB’ does not mean ‘no signal’, it means that the signal concerned is at the same level as the reference.

Often a letter is placed after ‘dB’ to denote the reference standard in use (e.g.: ‘dBm’), and a number of standard abbreviations are in use, some examples of which are given below. Sometimes the suffix denotes a particular frequency weighting characteristic used in the measurement of noise (e.g.: ‘dBA’).

Abbrev.   Ref. level
dBV       1 volt
dBu       0.775 volt (Europe)
dBv       0.775 volt (USA)
dBm       1 milliwatt (see Chapter 12)
dBA       dB SPL, A-weighted response

A full listing of suffixes is given in CCIR Recommendation 574-1, 1982.

Useful decibel ratios to remember (voltages or SPLs)

It is more common to deal in terms of voltage or SPL ratios than power ratios in audio systems. Here are some useful dB equivalents of different voltage or SPL relationships and multiplication factors:

 dB    Multiplication factor
  0          1
 +3         √2
 +6          2
+20         10
+60       1000

Decibels are not only used to describe the ratio between two signals, or the level of a signal above a reference, but they are also used to describe the voltage gain of a device. For example, a microphone amplifier may have a gain of 60 dB, which is the equivalent of multiplying the input voltage by a factor of 1000, as shown in the example below:

20 log 1000/1 = 60 dB
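The table above can be checked by converting dB values back into voltage ratios, using ratio = 10^(dB/20):

```python
# Sketch: converting decibel values back to voltage (or SPL) ratios.

def db_to_voltage_ratio(db):
    return 10 ** (db / 20)

for db in (0, 6, 20, 60):
    print(db, round(db_to_voltage_ratio(db), 2))
# 0 -> 1.0, 6 -> ~2.0, 20 -> 10.0, 60 -> 1000.0; and +3 dB gives
# ~1.41, i.e. approximately the square root of 2.
```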

Sound power and sound pressure

A simple sound source, such as the pulsating sphere used at the start of this chapter, radiates sound power omnidirectionally – that is, equally in all directions, rather like a three-dimensional version of the ripples moving away from a stone dropped in a pond. The sound source generates a certain amount of power, measured in watts, which is gradually distributed over an increasingly large area as the wavefront travels further from the source; thus the amount of power per square metre passing through the surface of the imaginary sphere surrounding the source gets smaller with increasing distance (see Fact File 1.3). For practical purposes the intensity of the direct sound from a source drops by 6 dB for every doubling in distance from the source (see Figure 1.14).

The amount of acoustical power generated by real sound sources is surprisingly small, compared with the number of watts of electrical power involved in lighting a light bulb, for example. An acoustical source radiating 20 watts would produce a sound pressure level close to the threshold of pain if a listener was close to the source. Most everyday sources generate fractions of a watt of sound power, and this energy is eventually dissipated into heat by absorption (see below). The amount of heat produced by the dissipation of acoustic energy is relatively insignificant – the chances of increasing the temperature of a room by shouting are slight, at least in the physical sense.

Acoustical power is sometimes confused with the power output of an amplifier used to drive a loudspeaker, and audio engineers will be familiar with power outputs from amplifiers of many hundreds of watts. It is important to realise that loudspeakers are very inefficient devices – that is, they only convert a small proportion of their electrical input power into acoustical power. Thus, even if the input to a loudspeaker was to be, say, 100 watts electrically, the acoustical output power might only be perhaps 1 watt, suggesting a loudspeaker that is only 1 per cent efficient. The remaining power would be dissipated as heat in the voice coil.

Sound pressure is the effect of sound power on its surroundings. To use a central heating analogy, sound power is analogous to the heat energy generated by a radiator into a room, whilst sound pressure is analogous to the temperature of the air in the room. The temperature is what a person entering the room would feel, but the heat-generating radiator is the source of power. Sound pressure level (SPL) is measured in newtons per square metre (Nm−2). A convenient reference level is set for sound pressure and intensity measurements, this being referred to as 0 dB. This level of 0 dB is approximately equivalent to the threshold of hearing (the quietest sound perceivable by an average person) at a frequency of 1 kHz, and corresponds to an SPL of 2 × 10−5 Nm−2, which in turn is equivalent to an intensity of approximately 10−12 Wm−2 in the free field (see below).

Fact file 1.3   The inverse-square law

The law of decreasing power per unit area (intensity) of a wavefront with increasing distance from the source is known as the inverse-square law, because intensity drops in proportion to the inverse square of the distance from the source. Why is this? It is because the sound power from a point source is spread over the surface area of a sphere (S), which from elementary maths is given by:

S = 4πr2

where r is the distance from the source or the radius of the sphere, as shown in the diagram.

If the original power of the source is W watts, then the intensity, or power per unit area (I) at distance r is:

I = W/4πr2

For example, if the power of a source was 0.1 watt, the intensity at 4 m distance would be:

I = 0.1 ÷ (4 × 3.14 × 16) ≈ 0.0005 Wm−2

The sound intensity level (SIL) of this signal in decibels can be calculated by comparing it with the accepted reference level of 10−12 Wm−2 :

SIL(dB) = 10 log((5 × 10−4) ÷ (10−12))

≈ 87 dB
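The worked example, and the 6 dB-per-doubling rule mentioned earlier, can be checked with a few lines of Python (an illustrative sketch, not part of the text):

```python
import math

def intensity(power_watts, distance_m):
    """Power per unit area of a point source: I = W / (4 * pi * r^2)."""
    return power_watts / (4 * math.pi * distance_m ** 2)

def sil_db(i, i_ref=1e-12):
    """Sound intensity level in dB relative to the 10^-12 W/m^2 reference."""
    return 10 * math.log10(i / i_ref)

i = intensity(0.1, 4)                 # ~0.0005 W/m^2, as in the example
print(round(sil_db(i)))               # 87

# Doubling the distance (4 m -> 8 m) drops the level by about 6 dB:
print(round(sil_db(intensity(0.1, 4)) - sil_db(intensity(0.1, 8)), 1))  # 6.0
```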

Sound pressure levels are often quoted in dB (e.g.: SPL = 63 dB means that the SPL is 63 dB above 2 × 10−5 Nm−2). The SPL in dB may not accurately represent the loudness of a sound, and thus a subjective unit of loudness has been derived from research data, called the phon. This is discussed further in Chapter 2. Some methods of measuring sound pressure levels are discussed in Fact File 1.4.

Figure 1.14   The sound power which passes through 1 m2 of space at distance r from the source will pass through 4 m2 at distance 2r, and thus will have one quarter of the intensity

Free and reverberant fields

The free field in acoustic terms is an acoustical area in which there are no reflections. Truly free fields are rarely encountered in reality, because there are nearly always reflections of some kind, even if at a very low level. If the reader can imagine the sensation of being suspended out-of-doors, way above the ground, away from any buildings or other surfaces, then he or she will have an idea of the experience of a free-field condition. The result is an acoustically ‘dead’ environment. Acoustic experiments are sometimes performed in anechoic chambers, which are rooms specially treated so as to produce almost no reflections at any frequency – the surfaces are totally absorptive – and these attempt to create near free-field conditions.

In the free field all the sound energy from a source is radiated away from the source and none is reflected; thus the inverse-square law (Fact File 1.3) entirely dictates the level of sound at any distance from the source. Of course the source may be directional, in which case its directivity factor must be taken into account. A source with a directivity factor of 2 on its axis of maximum radiation radiates twice as much power in this direction as it would have if it had been radiating omnidirectionally. The directivity index of a source is measured in dB, giving the above example a directivity index of 3 dB. If calculating the intensity at a given distance from a directional source (as shown in Fact File 1.3), one must take into account its directivity factor on the axis concerned by multiplying the power of the source by the directivity factor before dividing by 4πr2.
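As a sketch of that adjustment (illustrative Python, with hypothetical values), the directivity factor Q simply scales the source power before the inverse-square division:

```python
import math

def directional_intensity(power_watts, distance_m, directivity_factor=1.0):
    """On-axis intensity of a directional source: I = Q * W / (4 * pi * r^2)."""
    return directivity_factor * power_watts / (4 * math.pi * distance_m ** 2)

def directivity_index_db(q):
    """Directivity index in dB: DI = 10 log10(Q)."""
    return 10 * math.log10(q)

# The example from the text: a directivity factor of 2 gives a 3 dB index
print(round(directivity_index_db(2), 1))  # 3.0

# On its main axis the directional source delivers twice the intensity
# of an omnidirectional source of the same power:
print(directional_intensity(0.1, 4, 2.0) / directional_intensity(0.1, 4))  # 2.0
```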

In a room there is both direct and reflected sound. At a certain distance from a source contained within a room the acoustic field is said to be diffuse or reverberant, since reflected sound energy predominates over direct sound. A short time after the source has begun to generate sound a diffuse pattern of reflections will have built up throughout the room, and the reflected sound energy will become roughly constant at any point in the room. Close to the source the direct sound energy is still at quite a high level, and thus the reflected sound makes a smaller contribution to the total. This region is called the near field. (It is popular in sound recording to make use of so-called ‘near-field monitors’, which are loudspeakers mounted quite close to the listener, such that the direct sound predominates over the effects of the room.)

The exact distance from a source at which the sound field becomes dominated by reverberant energy depends on the reverberation time of the room, which in turn depends on the amount of absorption in the room and the room's volume (see Fact File 1.5). Figure 1.15 shows how the SPL changes with increasing distance from a source in three different rooms. Clearly, in the acoustically ‘dead’ room the conditions approach those of the free field (with sound intensity dropping at close to the expected 6 dB per doubling of distance), since the amount of reverberant energy is very small; the critical distance, at which the contribution from direct sound equals that from reflected sound, is further from the source than it is in a very reverberant room. In the reverberant room the sound pressure level does not change much with distance from the source, because reflected sound energy predominates after only a short distance. This is important in room design: although a short reverberation time may be desirable in a recording control room, for example, it has the disadvantage that the change in SPL with distance from the loudspeakers will be quite severe, requiring highly powered amplifiers and heavy-duty loudspeakers to provide the necessary level. A slightly longer reverberation time makes the room less disconcerting to work in, and relieves the requirement on loudspeaker power.

Fact file 1.4   Measuring SPLs

Typically a sound pressure level (SPL) meter is used to measure the level of sound at a particular point. It is a device that houses a high quality omnidirectional (pressure) microphone (see ‘Omnidirectional pattern’, Chapter 3) connected to amplifiers, filters and a meter (see diagram).

Weighting filters

The microphone’s output voltage is proportional to the SPL incident upon it, and the weighting filters may be used to attenuate low and high frequencies according to a standard curve such as the ‘A’-weighting curve, which corresponds closely to the sensitivity of human hearing at low levels (see Chapter 2). SPLs quoted simply in dB are usually unweighted – in other words all frequencies are treated equally – but SPLs quoted in dBA will have been A-weighted and will correspond more closely to the perceived loudness of the signal. A-weighting was originally designed to be valid up to a loudness of 55 phons, since the ear’s frequency response becomes flatter at higher levels; between 55 and 85 phons the ‘B’ curve was intended to be used; above 85 phons the ‘C’ curve was used. The ‘D’ curve was devised particularly for measuring aircraft engine noise at very high level.

Now most standards suggest that the ‘A’ curve may be used for measuring noise at any SPL, principally for ease of comparability of measurements, but there is still disagreement in the industry about the relative merits of different curves. The ‘A’ curve attenuates low and high frequencies and will therefore under-read quite substantially for signals at these frequencies. This is an advantage in some circumstances and a disadvantage in others. The ‘C’ curve is recommended in the USA and Japan for aligning sound levels using noise signals in movie theatres, for example. This only rolls off the very extremes of the audio spectrum and is therefore quite close to an unweighted reading. Some researchers have found that the ‘B’ curve produces results that more closely relate measured sound signal levels to subjective loudness of those signals.
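The ‘A’ curve has a standard analytic form; the pole frequencies in the sketch below are the commonly published IEC 61672 values, rounded, so treat this as an approximation rather than a reference implementation. It shows the 0 dB point at 1 kHz and the heavy low-frequency attenuation described above:

```python
import math

def a_weighting_db(f):
    """Approximate A-weighting in dB at frequency f (Hz).

    Uses the rounded pole frequencies from the IEC 61672 analytic curve;
    the +2.0 dB offset normalises the response to 0 dB at 1 kHz.
    """
    f2 = f * f
    ra = (12194.0 ** 2 * f2 * f2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20 * math.log10(ra) + 2.0

print(round(a_weighting_db(1000), 1))  # 0.0 at the 1 kHz reference
print(round(a_weighting_db(100)))      # about -19: LF signals under-read heavily
```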

Noise criterion or rating (NC or NR)

Noise levels are often measured in rooms by comparing the level of the noise across the audible range with a standard set of curves called the noise criteria (NC) or noise rating (NR) curves. These curves set out how much noise is acceptable in each of a number of narrow frequency bands for the noise to meet a certain criterion. The noise criterion is then that of the nearest curve above which none of the measured results rises. NC curves are used principally in the USA, whereas NR curves are used principally in Europe. They allow considerably higher levels in low-frequency bands than in middle- and high-frequency bands, since the ear is less sensitive at low frequencies.

In order to measure the NC or NR of a location it is necessary to connect the measuring microphone to a set of filters or a spectrum analyser which is capable of displaying the SPL in one octave or one-third octave bands.

Further reading

British Standard 5969. Specification for sound level meters.

British Standard 6402. Sound exposure meters.

Fact file 1.5   Absorption, reflection and RT

Absorption

When a sound wave encounters a surface some of its energy is absorbed and some reflected. The absorption coefficient of a substance describes, on a scale from 0 to 1, how much energy is absorbed. An absorption coefficient of 1 indicates total absorption, whereas 0 represents total reflection. The absorption coefficient of substances varies with frequency.

The total amount of absorption present in a room can be calculated by multiplying the absorption coefficient of each surface by its area and then adding the products together. All of the room’s surfaces must be taken into account, as must people, chairs and other furnishings. Tables of the performance of different substances are available in acoustics references (see Recommended further reading). Porous materials tend to absorb high frequencies more effectively than low frequencies, whereas resonant membrane- or panel-type absorbers tend to be better at low frequencies. Highly tuned artificial absorbers (Helmholtz absorbers) can be used to remove energy in a room at specific frequencies. The trends in absorption coefficient are shown in the diagram below.

Reflection

The size of an object in relation to the wavelength of a sound is important in determining whether the sound wave will bend round it or be reflected by it. When an object is large in relation to the wavelength it will act as a partial barrier to the sound, whereas when it is small the sound will bend or diffract around it. Since sound wavelengths in air range from approximately 17 metres at the lowest audible frequencies to under 2 cm at the highest, most commonly encountered objects will tend to act as barriers to sound at high frequencies but will have little effect at low frequencies.
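These figures follow directly from the relationship λ = c/f. A quick Python check (assuming c = 344 m/s; the exact values depend on temperature):

```python
SPEED_OF_SOUND = 344.0  # m/s in air at roughly 20 degrees Celsius

def wavelength_m(freq_hz):
    """Wavelength in metres: lambda = c / f."""
    return SPEED_OF_SOUND / freq_hz

# Across the audio band: about 17 m at 20 Hz down to about 1.7 cm at 20 kHz
for f in (20, 1000, 20000):
    print(f"{f:>5} Hz -> {wavelength_m(f):.4f} m")
```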

Reverberation time

W. C. Sabine developed a simple and fairly reliable formula for calculating the reverberation time (RT60) of a room, assuming that absorptive material is distributed evenly around the surfaces. It relates the volume of the room (V) and its total absorption (A) to the time taken for the sound pressure level to decay by 60 dB after a sound source is turned off.

RT60 = (0.16V)/A seconds

In a large room where a considerable volume of air is present, and where the distance between surfaces is large, the absorption of the air becomes more important, in which case an additional component must be added to the above formula:

RT60 = (0.16V)/(A + xV) seconds

where x is the absorption factor of air, given at various temperatures and humidities in acoustics references.

The Sabine formula has been subject to modifications by such people as Eyring, in an attempt to make it more reliable in extreme cases of high absorption, and it should be realised that it can only be a guide.
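As a worked illustration of the Sabine calculation (an illustrative Python sketch; the room dimensions and absorption coefficients below are hypothetical, chosen only for the example):

```python
def total_absorption(surfaces):
    """Sum of (area * absorption coefficient) over all surfaces, in m^2 sabins."""
    return sum(area * alpha for area, alpha in surfaces)

def rt60_sabine(volume_m3, absorption, air_absorption_factor=0.0):
    """Sabine reverberation time: RT60 = 0.16 V / (A + xV) seconds."""
    return 0.16 * volume_m3 / (absorption + air_absorption_factor * volume_m3)

# Hypothetical 6 x 5 x 3 m room with assumed absorption coefficients:
surfaces = [
    (30.0, 0.10),  # floor (6 x 5 m)
    (30.0, 0.60),  # absorptive ceiling
    (66.0, 0.15),  # walls (2 * (6 + 5) * 3 m^2)
]
A = total_absorption(surfaces)          # 30.9 m^2 sabins
print(round(rt60_sabine(6 * 5 * 3, A), 2))  # 0.47 seconds
```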

Figure 1.15   As the distance from a source increases direct sound level drops but reverberant sound level remains roughly constant. The resultant sound level experienced at different distances from the source depends on the reverberation time of the room, since in a reverberant room the level of reflected sound is higher than in a ‘dead’ room

Standing waves

The wavelength of sound varies considerably over the audible frequency range, as indicated in Fact File 1.5. At high frequencies, where the wavelength is small, it is appropriate to consider a sound wavefront rather like light, as a ray. Similar rules apply; for example, the angle of incidence of a sound wave at a wall equals the angle of reflection. At low frequencies, where the wavelength is comparable with the dimensions of the room, it is necessary to consider other factors, since the room behaves more as a complex resonator, having certain frequencies at which strong pressure peaks and dips are set up in various locations.

Figure 1.16   When a standing wave is set up between two walls of a room there arise points of maximum and minimum pressure. The first simple mode or eigentone occurs when half the wavelength of the sound equals the distance between the boundaries, as illustrated, with pressure maxima at the boundaries and a minimum in the centre

Standing waves or eigentones (sometimes also called room modes) may be set up when half the wavelength of the sound, or a multiple of it, is equal to one of the dimensions of the room (length, width or height). In such a case (see Figure 1.16) the reflected wave from the two surfaces involved is in phase with the incident wave and a pattern of summations and cancellations is set up, giving rise to points in the room at which the sound pressure is very high, and other points where it is very low. For the first mode (pictured), there is a peak at the two walls and a trough in the centre of the room. It is easy to experience such modes by generating a low-frequency sine tone into a room from an oscillator connected to an amplifier and loudspeaker placed in a corner. At selected low frequencies the room will resonate strongly and the pressure peaks may be experienced by walking around the room. There are always peaks towards the boundaries of the room, with troughs distributed at regular intervals between them. The positions of these depend on whether the mode has been created between the walls or between the floor and ceiling. The frequencies (f) at which the strongest modes occur are given by:

f = (c/2) × (n/d)

where c is the speed of sound, d is the dimension involved (distance between walls or floor and ceiling), and n is the number of the mode.
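For instance (an illustrative Python sketch, assuming c = 344 m/s), the first few axial modes along a 6 m room length all fall below 100 Hz:

```python
def axial_mode_hz(dimension_m, n, c=344.0):
    """Frequency of the n-th axial mode along one dimension: f = (c/2) * (n/d)."""
    return (c / 2) * (n / dimension_m)

# First three axial modes between walls 6 m apart:
print([round(axial_mode_hz(6.0, n), 1) for n in (1, 2, 3)])  # [28.7, 57.3, 86.0]
```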

A more complex formula can be used to predict the frequencies of all the modes in a room, including the secondary modes formed by reflections between four surfaces (tangential modes) and between six surfaces (oblique modes). The secondary modes typically have lower amplitudes than the primary (axial) modes, since they experience greater absorption. The formula is:

f = (c/2)√((p/L)2 + (q/W)2 + (r/H)2)

where p, q and r are the mode numbers for each dimension (1, 2, 3 …) and L, W and H are the length, width and height of the room. For example, to calculate the first axial mode involving only the length, make p = 1, q = 0 and r = 0. To calculate the first tangential mode involving all four walls, make p = 1, q = 1, r = 0, and so on.
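The full formula lends itself to a quick computational survey of a room's low modes. The sketch below (illustrative Python; the 6 × 5 × 3 m room and c = 344 m/s are assumptions) enumerates mode numbers up to 2 in each dimension. Because this room's length is exactly twice its height, two modes coincide near 57.3 Hz, an example of the degenerate modes discussed below:

```python
import math

def mode_freq_hz(p, q, r, L, W, H, c=344.0):
    """Room mode frequency: f = (c/2) * sqrt((p/L)^2 + (q/W)^2 + (r/H)^2)."""
    return (c / 2) * math.sqrt((p / L) ** 2 + (q / W) ** 2 + (r / H) ** 2)

# Enumerate all modes with p, q, r in 0..2 for a hypothetical 6 x 5 x 3 m room:
L, W, H = 6.0, 5.0, 3.0
modes = sorted(
    (mode_freq_hz(p, q, r, L, W, H), (p, q, r))
    for p in range(3) for q in range(3) for r in range(3)
    if (p, q, r) != (0, 0, 0)
)
for f, pqr in modes[:5]:
    print(f"{pqr} {f:.1f} Hz")  # note (2,0,0) and (0,0,1) share ~57.3 Hz
```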

Some quick sums will show, for a given room, that the modes are widely spaced at low frequencies and become more closely spaced at high frequencies. Above a certain frequency, there arise so many modes per octave that it is hard to identify them separately. As a rule-of-thumb, modes tend only to be particularly problematical up to about 200 Hz. The larger the room the more closely spaced the modes. Rooms with more than one dimension equal will experience so-called degenerate modes in which modes between two dimensions occur at the same frequency, resulting in an even stronger resonance at a particular frequency than otherwise. This is to be avoided.

Since low-frequency room modes cannot be avoided, except by introducing total absorption, the aim in room design is to reduce their effect by adjusting the ratios between dimensions to achieve an even spacing. A number of ‘ideal’ mode-spacing criteria have been developed by acousticians, but there is not the space to go into these in detail here. Larger rooms are generally more pleasing than small rooms, since the mode spacing is closer at low frequencies, and individual modes tend not to stick out so prominently, but room size has to be traded off against the target reverberation time. Making walls non-parallel does not prevent modes from forming (since oblique and tangential modes are still possible); it simply makes their frequencies more difficult to predict.

Fact file 1.6   Echoes and reflections

Early reflections

Early reflections are those echoes from nearby surfaces in a room which arise within the first few milliseconds (up to about 50 ms) of the direct sound arriving at a listener from a source (see the diagram). It is these reflections which give the listener the greatest clue as to the size of a room, since the delay between the direct sound and the first few reflections is related to the distance of the major surfaces in the room from the listener. Artificial reverberation devices allow for the simulation of a number of early reflections before the main body of reverberant sound decay, and this gives different reverberation programs the characteristic of different room sizes.

Echoes

Echoes may be considered as discrete reflections of sound arriving at the listener after about 50 ms from the direct sound. These are perceived as separate arrivals, whereas those up to around 50 ms are normally integrated by the brain with the first arrival, not being perceived consciously as echoes. Such echoes are normally caused by more distant surfaces which are strongly reflective, such as a high ceiling or distant rear wall. Strong echoes are usually annoying in critical listening situations and should be suppressed by dispersion and absorption.
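The 50 ms figure corresponds to a concrete path difference, easily checked (illustrative Python, assuming c = 344 m/s):

```python
SPEED_OF_SOUND = 344.0  # m/s in air

def path_difference_m(delay_ms):
    """Extra path length a reflection travels for a given delay after the direct sound."""
    return SPEED_OF_SOUND * delay_ms / 1000.0

# A reflection arriving 50 ms after the direct sound has travelled
# roughly 17 m further, e.g. via a distant rear wall or high ceiling:
print(round(path_difference_m(50), 1))  # 17.2
```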

Flutter echoes

A flutter echo is sometimes set up when two parallel reflective surfaces face each other in a room, whilst the other surfaces are absorbent. It is possible for a wavefront to become ‘trapped’ into bouncing back and forth between these two surfaces until it decays, and this can result in a ‘buzzing’ or ‘ringing’ effect on transients (at the starts and ends of impulsive sounds such as hand claps).

The practical difficulty with room modes results from the unevenness in sound pressure throughout the room at mode frequencies. Thus a person sitting in one position might experience a very high level at a particular frequency whilst other listeners might hear very little. A room with prominent LF modes will ‘boom’ at certain frequencies, and this is unpleasant and undesirable for critical listening. The response of the room modifies the perceived frequency response of a loudspeaker, for example, such that even if the loudspeaker’s own frequency response may be acceptable it may become unacceptable when modified by the resonant characteristics of the room.

Room modes are not the only results of reflections in enclosed spaces, and some other examples are given in Fact File 1.6.

Recommended further reading

General acoustics

Alton Everest, F. (2000) The Master Handbook of Acoustics, 4th edn. McGraw-Hill

Benade, A. H. (1991) Fundamentals of Musical Acoustics. Oxford University Press

Campbell, M. and Greated, C. (2001) The Musician’s Guide to Acoustics. Oxford University Press

Eargle, J. (1995) Music, Sound, Technology, 2nd edn. Van Nostrand Reinhold

Egan, M. D. (1988) Architectural Acoustics. McGraw-Hill

Hall, D. E. (2001) Musical Acoustics, 3rd edn. Brooks/Cole Publishing Co.

Howard, D. and Angus, J. (2000) Acoustics and Psychoacoustics, 2nd edn. Focal Press

Rettinger, M. (1988) Handbook of Architectural Acoustics and Noise Control. TAB Books

Rossing, T. D. (2001) The Science of Sound, 3rd edn. Addison-Wesley
