2
Digital processing principles


 

 

In this chapter the basics of digital processing will be considered. Starting from elements such as logic gates, the chapter builds steadily to an explanation of how computers and digital signal processors work.

2.1 Introduction

However complex a digital process, it can be broken down into smaller stages until finally one finds that there are really only two basic types of element in use, and these can be combined in some way and supplied with a clock to implement virtually any process. Figure 2.1 shows that the first type is a logic element. This produces an output which is a logical function of the input with minimal delay. The second type is a storage element which samples the state of the input(s) when clocked and holds or delays that state. The strength of binary logic is that the signal has only two states, and considerable noise and distortion of the binary waveform can be tolerated before the state becomes uncertain. At every logic element, the signal is compared with a threshold, and can thus pass through any number of stages without being degraded. In addition, the use of a storage element at regular locations throughout logic circuits eliminates time variations or jitter. Figure 2.1 shows that if the inputs to a logic element change, the output will not change until the propagation delay of the element has elapsed. However, if the output of the logic element forms the input to a storage element, the output of that element will not change until the input is sampled at the next clock edge. In this way the signal edge is aligned to the system clock and the propagation delay of the logic becomes irrelevant. The process is known as reclocking.

image

Figure 2.1 Logic elements have a finite propagation delay between input and output and cascading them delays the signal an arbitrary amount. Storage elements sample the input on a clock edge and can return a signal to near coincidence with the system clock. This is known as reclocking. Reclocking eliminates variations in propagation delay in logic elements.

2.2 Logic elements

The two states of the signal when measured with an oscilloscope are simply two voltages, usually referred to as high and low. The actual voltage levels will depend on the type of logic family in use, and on the supply voltage used. Supply voltages have tended to fall as designers seek to reduce power consumption. Within logic, the exact levels are not of much consequence, and it is only necessary to know them when interfacing between different logic families or when driving external devices. The pure logic designer is not interested at all in these voltages, only in their meaning.

Just as the electrical waveform from a microphone represents sound velocity, so the waveform in a logic circuit represents the truth of some statement. As there are only two states, there can only be true or false meanings. The true state of the signal can be assigned by the designer to either voltage state. When a high voltage represents a true logic condition and a low voltage represents a false condition, the system is known as positive logic, or high true logic. This is the usual system, but sometimes the low voltage represents the true condition and the high voltage represents the false condition. This is known as negative logic or low true logic. Provided that everyone is aware of the logic convention in use, both work equally well.

image

Figure 2.2 The basic logic gates compared.

In logic systems, all logical functions, however complex, can be configured from combinations of a few fundamental logic elements or gates. It is not profitable to spend too much time debating which are the truly fundamental ones, since most can be made from combinations of others. Figure 2.2 shows the important simple gates and their derivatives, and introduces the logical expressions to describe them, which can be compared with the truth-table notation. The figure also shows the important fact that when negative logic is used, the OR gate function interchanges with that of the AND gate.
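The interchange is easily verified exhaustively. The following short sketch in Python (an illustration added here, not part of the original figures) drives a physical AND gate with every input combination and then re-reads the same voltage levels under the negative-logic convention, whereupon the gate is found to perform the OR function:

    # Demonstrate that a physical AND gate performs the OR function when
    # its voltage levels are read using the negative-logic convention.
    # 'H' and 'L' stand for the high and low voltage levels.

    def and_gate(a, b):
        # Positive-logic AND gate: output high only when both inputs are high
        return 'H' if a == 'H' and b == 'H' else 'L'

    def truth(level, convention):
        # Map a voltage level to a truth value under the chosen convention
        return (level == 'H') if convention == 'positive' else (level == 'L')

    for a in ('L', 'H'):
        for b in ('L', 'H'):
            out = and_gate(a, b)
            # Under negative logic the same hardware computes the OR function
            assert truth(out, 'negative') == (truth(a, 'negative') or truth(b, 'negative'))
            print(a, b, '->', out)

The assertion holds for all four input combinations, which is the interchange shown in Figure 2.2.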

If numerical quantities need to be conveyed down the two-state signal paths described here, then the only appropriate numbering system is binary, which has only two symbols, 0 and 1. Just as positive or negative logic could be used for the truth of a logical binary signal, it can also be used for a numerical binary signal. Normally, a high voltage level will represent a binary 1 and a low voltage will represent a binary 0, described as a ‘high for a one' system. Clearly a ‘low for a one' system is just as feasible. Decimal numbers have several columns, each of which represents a different power of ten; in binary the column position specifies the power of two.

Several binary digits or bits are needed to express the value of a binary sample. These bits can be conveyed at the same time by several signals to form a parallel system, which is most convenient inside equipment or for short distances because it is inexpensive, or one at a time down a single signal path, which is more complex, but convenient for cables between pieces of equipment because the connectors require fewer pins. When a binary system is used to convey numbers in this way, it can be called a digital system.

2.3 Storage elements

The basic memory element in logic circuits is the latch, which is constructed from two gates as shown in Figure 2.3(a), and which can be set or reset. A more useful variant is the D-type latch shown at (b) which remembers the state of the input at the time a separate clock either changes state, for an edge-triggered device, or after it goes false, for a level-triggered device. A shift register can be made from a series of latches by connecting the Q output of one latch to the D input of the next and connecting all the clock inputs in parallel. Data are delayed by the number of stages in the register. Shift registers are also useful for converting between serial and parallel data formats.

Where large numbers of bits are to be stored, cross-coupled latches are less suitable because they are more complicated to fabricate inside integrated circuits than dynamic memory, and consume more current.

image

Figure 2.3 Digital semiconductor memory types. In (a), one data bit can be stored in a simple set-reset latch, which has little application because the D-type latch in (b) can store the state of the single data input when the clock occurs. These devices can be implemented with bipolar transistors or FETs, and are called static memories because they can store indefinitely. They consume a lot of power. In (c), a bit is stored as the charge in a potential well in the substrate of a chip. It is accessed by connecting the bit line with the field effect from the word line. The single well where the two lines cross can then be written or read. These devices are called dynamic RAMs because the charge decays, and they must be read and rewritten (refreshed) periodically.

In large random access memories (RAMs), the data bits are stored as the presence or absence of charge in a tiny capacitor as shown in Figure 2.3(c). The capacitor is formed by a metal electrode, insulated by a layer of silicon dioxide from a semiconductor substrate, hence the term MOS (metal oxide semiconductor). The charge will suffer leakage, and the value would become indeterminate after a few milliseconds. Where the delay needed is less than this, decay is of no consequence, as data will be read out before they have had a chance to decay. Where longer delays are necessary, such memories must be refreshed periodically by reading the bit value and writing it back to the same place. Most modern MOS RAM chips have suitable circuitry built in. Large RAMs store many megabits, and it is clearly impractical to have a connection to each one. Instead, the desired bit has to be addressed before it can be read or written. The size of the chip package restricts the number of pins available, so that large memories use the same address pins more than once. The bits are arranged internally as rows and columns, and the row address and the column address are specified sequentially on the same pins.

Just like recording devices, electronic data storage devices come in many varieties. The basic volatile RAM will lose data if power is interrupted. However, there are also non-volatile RAMs or NVRAMs which retain the data in the absence of power. A type of memory which is written once is called a read-only memory or ROM. Some of these are programmed by using a high current which permanently vaporizes conductors in each location so that the data are fixed. Other types can be written electrically, but cannot be erased electrically. These need to be erased by exposure to ultraviolet light and are called UVROMs. Once erased they can be reprogrammed with new data. Another type of ROM can be rewritten electrically a limited number of times. These are known as electrically alterable ROMs or EAROMs.

2.4 Binary coding

In many cases a binary code is used to represent a sample of an audio or video waveform. Practical digital hardware places a limit on the wordlength which in turn limits the range of values available. In the eight-bit samples used in much digital video equipment, there are 256 different numbers, whereas in the sixteen-bit codes common in digital audio, there are 65 536 different numbers.

Figure 2.4(a) shows the result of counting upwards in binary with a fixed wordlength. When the largest possible value of all ones is reached, adding a further one to the LSB causes it to become zero with a carry-out. This carry is added to the next bit which becomes zero with a carry-out and so on. The carry will ripple up the word until the MSB becomes zero and produces a carry-out. This carry-out represents the setting of a bit to the left of the MSB, which is not present in the hardware and is thus lost. Consequently when the highest value is reached, further counting causes the value to reset to zero and begin again. This is known as an overflow. Counting downwards will achieve the reverse. When zero is reached, subtracting one will cause an underflow where a borrow should be taken from a bit to the left of the MSB, which does not exist, the result being that the bits which do exist take the value of all ones, being the highest possible code value.

image

Figure 2.4 Counting up in a fixed wordlength system leads to overflow (a) where the high-order bit is lost. A binary counter can be made by cascading divide-by-two stages (b). Overflow results in wraparound as shown in (c).

Storage devices such as latches can be configured so that they count pulses. Figure 2.4(b) shows such an arrangement. The pulses to be counted are fed to the clock input of a D-type latch, whose input is connected to its complemented output. This configuration will change state at every input pulse, so that it will be in a true state after every other pulse. This is a divide-by-two counter. If the output is connected to the clock input of another stage, this will divide by four. A series of divide-by-two stages can be cascaded indefinitely in this way to count up to arbitrarily high numbers. Note that when the largest possible number is reached, i.e. when all latches are in the high state, the next pulse will result in all latches going to a low state, corresponding to the count of zero. This is the overflow condition described above.

Counters often include reset inputs which can be used to force the count to zero. Some are presettable so that a specific value can be loaded into each latch before counting begins.

As a result of the fixed wordlength, underflow and overflow, the infinite range of real numbers is mapped onto the limited range of a binary code of finite wordlength. Figure 2.4(c) shows that the overflow makes the number scale circular and it is as if the real number scale were rolled around it so that a binary code could represent any of a large possible number of real values, positive or negative. This is why the term wraparound is sometimes used to describe the result of an overflow condition.
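The wraparound behaviour is easily modelled in software by masking every result to the wordlength. The short Python sketch below (an illustration, not from the original text) reproduces the overflow and underflow of a four-bit system:

    # Four-bit arithmetic: masking to the wordlength discards the carry
    # out of the MSB, so the count wraps around the circle of numbers.
    BITS = 4
    MASK = (1 << BITS) - 1          # 0b1111

    value = MASK                    # all ones, the highest possible code
    print((value + 1) & MASK)       # 0: overflow resets the count to zero

    value = 0
    print((value - 1) & MASK)       # 15: underflow gives all ones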

Mathematically the pure binary mapping of Figure 2.4(c) from an infinite scale to a finite scale is known as modulo arithmetic. The four-bit example shown expresses real numbers as Modulo-16 codes. Modulo arithmetic will be considered further in section 2.7.

In a practical ADC, each number represents a different analog signal voltage, and the hardware is arranged such that voltages outside the finite range do not overflow but instead result in one or other limit codes being output. This is the equivalent of clipping in analog systems. In Figure 2.5(a) it will be seen that in an eight-bit pure binary system, the number range goes from 00 hex, which represents the smallest voltage and all those voltages below it, through to FF hex, which represents the largest positive voltage and all voltages above it.

image

Figure 2.5 The unipolar quantizing range of an eight-bit pure binary system is shown at (a). The analog input must be shifted to fit into the quantizing range. In component, sync pulses are not digitized, so the quantizing intervals can be smaller as at (b). An offset of half scale is used for colour difference signals (c).

In some computer graphics systems these extremes represent black and peak white respectively. In television systems the traditional analog video waveform must be accommodated within this number range. Figure 2.5(b) shows how this is done for a broadcast standard luminance signal. As digital systems only handle the active line, the quantizing range is optimized to suit the gamut of the unblanked luminance and the sync pulses go off the bottom of the scale. There is a small offset in order to handle slightly misadjusted inputs. Additionally the codes at the extremes of the range are reserved for synchronizing and are not available to video values.

Colour difference video signals (see Chapter 7) are bipolar and so blanking is in the centre of the signal range. In order to accommodate colour difference signals in the quantizing range, the blanking voltage level of the analog waveform has been shifted as in Figure 2.5(c) so that the positive and negative voltages in a real signal can be expressed by binary numbers which are only positive. This approach is called offset binary and has the advantage that the codes of all ones and all zeros are still at the ends of the scale and can continue to be used for synchronizing.

image

Figure 2.6 0 dBFs is defined as the level of the largest sinusoid which will fit into the quantizing range without clipping.

Figure 2.6 shows that analog audio signal voltages are referred to midrange. The level of the signal is measured by how far the waveform deviates from midrange, and attenuation, gain and mixing all take place around that level. Digital audio mixing is achieved by adding sample values from two or more different sources, but unless all the quantizing intervals are of the same size and there is no offset, the sum of two sample values will not represent the sum of the two original analog voltages. Thus sample values which have been obtained by non-uniform or offset quantizing cannot readily be processed because the binary numbers are not proportional to the signal voltage.

If two offset binary sample streams are added together in an attempt to perform digital mixing, the result will be that the offsets are also added and this may lead to an overflow. Similarly, if an attempt is made to attenuate by, say, 6.02 dB by dividing all the sample values by two, Figure 2.7 shows that the offset is also divided and the waveform suffers a shifted baseline. This problem can be overcome with digital luminance signals simply by subtracting the offset from each sample before processing as this results in positive-only numbers truly proportional to the luminance voltage. This approach is not suitable for audio or colour difference signals because negative numbers would result when the analog voltage goes below blanking and pure binary coding cannot handle them.

The problem with offset binary is that it works with reference to one end of the range. What is needed is a numbering system which operates symmetrically with reference to the centre of the range.

In the two's complement system, the mapping of real numbers onto the finite range of a binary word is modified. Instead of the mapping of Figure 2.8(a), in which only positive numbers are mapped, in Figure 2.8(b) the upper half of the pure binary number range has been redefined to represent negative quantities. In two's complement, the range represented by the circle of numbers does not start at zero, but starts on the diametrically opposite side of the circle such that zero is now in the centre of the number range. All numbers clockwise from zero are positive and have the MSB reset. All numbers anticlockwise from zero are negative and have the MSB set. The MSB is thus the equivalent of a sign bit where 1 = minus. Two's complement notation differs from pure binary in that the most significant bit is inverted in order to achieve the half-circle rotation.

image

Figure 2.7 The result of an attempted attenuation in pure binary code is an offset. Pure binary cannot be used for digital video processing.

Figure 2.9 shows how a real ADC is configured to produce two's complement output. At (a) an analog offset voltage equal to one half the quantizing range is added to the bipolar analog signal in order to make it unipolar as at (b). The ADC produces positive-only numbers at (c) which are proportional to the input voltage. This is actually an offset binary code. The MSB is then inverted at (d) so that the all-zeros code moves to the centre of the quantizing range. The analog offset is often incorporated into the ADC as is the MSB inversion. Some convertors are designed to be used in either pure binary or two's complement mode. In this case the designer must arrange the appropriate DC conditions at the input. The MSB inversion may be selectable by an external logic level. In the broadcast digital video interface standards the colour difference signals use offset binary because the codes of all zeros and all ones are at the end of the range and can be reserved for synchronizing. A digital vision mixer simply inverts the MSB of each colour difference sample to convert it to two's complement.
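The conversion is easily followed numerically. The sketch below (illustrative, using a four-bit wordlength for brevity) models the analog offset and the MSB inversion:

    # Offset binary and two's complement in a four-bit quantizing range.
    BITS = 4
    HALF = 1 << (BITS - 1)          # half the quantizing range
    MASK = (1 << BITS) - 1

    def adc_offset_binary(bipolar):
        # The analog offset makes the bipolar input unipolar, so the
        # ADC outputs positive-only (offset binary) codes
        return (bipolar + HALF) & MASK

    def msb_invert(code):
        # Inverting the MSB moves the all-zeros code to the range centre
        return code ^ HALF

    for x in (-8, -1, 0, 1, 7):
        ob = adc_offset_binary(x)
        print(f"{x:+d}: offset binary {ob:04b}, two's complement {msb_invert(ob):04b}")

Zero emerges as 1000 in offset binary but 0000 in two's complement, which is why two's complement values can be mixed by simple addition.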

image

Figure 2.8 In pure binary (a) this mapping of real numbers is used. In two’s complement an alternative mapping (b) is used. See text.

The two’s complement system allows two sample values to be added, or mixed in audio and video parlance, and the result will be referred to the system midrange; this is analogous to adding analog signals in an operational amplifier.

Figure 2.10 illustrates how adding two's complement samples simulates a bipolar mixing process. The waveform of input A is depicted by solid black samples, and that of B by samples with a solid outline. The result of mixing is the linear sum of the two waveforms obtained by adding pairs of sample values. The dashed lines depict the output values. Beneath each set of samples is the calculation which will be seen to give the correct result. Note that the calculations are pure binary. No special arithmetic is needed to handle two's complement numbers.

image

Figure 2.9 A two’s complement ADC. At (a) an analog offset voltage equal to one-half the quantizing range is added to the bipolar analog signal in order to make it unipolar as at (b). The ADC produces positive only numbers at (c), but the MSB is then inverted at (d) to give a two’s complement output.

image

Figure 2.10 Using two's complement arithmetic, single values from two waveforms are added together with respect to midrange to give a correct mixing function.

It is interesting to see why the two's complement adding process works. Effectively both two's complement numbers to be added contain an offset of half full scale. When they are added, the two offsets add to produce a sum offset which has a value of full scale. As adding full scale to a code consists of moving one full rotation round the circle of numbers, the offset has no effect and is effectively eliminated. It is sometimes necessary to phase reverse or invert a digital signal. The process of inversion in two's complement is simple. All bits of the sample value are inverted to form the one's complement, and one is added. This can be checked by mentally inverting some of the values in Figure 2.8(b). The inversion is transparent and performing a second inversion gives the original sample values. Using inversion, signal subtraction can be performed using only adding logic.
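The inversion can be sketched in a few lines (Python, four-bit wordlength, illustrative):

    # Two's complement inversion: complement all bits, then add one.
    BITS = 4
    MASK = (1 << BITS) - 1

    def negate(code):
        return ((code ^ MASK) + 1) & MASK   # one's complement plus one

    five = 0b0101
    minus_five = negate(five)
    print(f"{minus_five:04b}")              # 1011, the code for -5
    assert negate(minus_five) == five       # a second inversion is transparent

    # Subtraction using adding logic only: 7 - 5 = 7 + (-5)
    print((0b0111 + minus_five) & MASK)     # 2 (the carry-out is discarded)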

Two's complement numbers can have a radix point and bits below it just as pure binary numbers can. It should, however, be noted that in two's complement, if a radix point exists, numbers to the right of it are added. For example, 1100.1 is not –4.5, it is –4 + 0.5 = –3.5.

The circuitry necessary for adding pure binary or two's complement binary numbers is shown in Figure 2.11. Addition in binary requires two bits to be taken at a time from the same position in each word, starting at the least significant bit. Should both be ones, the output is zero, and there is a carry-out generated. Such a circuit is called a half adder, shown in Figure 2.11(a) and is suitable for the least-significant bit of the calculation. All higher stages will require a circuit which can accept a carry input as well as two data inputs. This is known as a full adder (Figure 2.11(b)). Such a device is also convenient for inverting a two's complement number, in conjunction with a set of inverters. The adder has one set of inputs taken to a false state, and the carry-in permanently held true, such that it adds one to the one's complement number from the invertor.
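The behaviour of these circuits can be expressed directly. In the sketch below (illustrative), the full adder is written as the gate equations of Figure 2.11(b) and chained into a ripple-carry adder; complementing one input while holding the carry-in true performs the two's complement inversion described above:

    # A full adder as gate equations, chained into a ripple-carry adder.
    def full_adder(a, b, c):
        s = a ^ b ^ c                        # sum bit
        carry = (a & b) | (a & c) | (b & c)  # carry-out
        return s, carry

    def ripple_add(x, y, bits=4, carry_in=0):
        result, carry = 0, carry_in
        for i in range(bits):                # start at the least significant bit
            s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
            result |= s << i
        return result                        # the carry out of the MSB is lost

    print(ripple_add(0b0101, 0b0011))        # 8, i.e. 5 + 3
    # Inversion: one's complement at one input, carry-in held true
    print(ripple_add(0b0101 ^ 0b1111, 0, carry_in=1))   # 11, the code for -5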

When mixing by adding sample values, care has to be taken to ensure that if the sum of the two sample values exceeds the number range the result will be clipping rather than overflow. In two's complement, the action necessary depends on the polarities of the two signals. Clearly if one positive and one negative number are added, the result cannot exceed the number range. If two positive numbers are added, the symptom of positive overflow is that the most significant bit sets, causing an erroneous negative result, whereas a negative overflow results in the most significant bit clearing. The overflow control circuit will be designed to detect these two conditions, and override the adder output. If the MSB of both inputs is zero, the numbers are both positive, thus if the sum has the MSB set, the output is replaced with the maximum positive code (0111...). If the MSB of both inputs is set, the numbers are both negative, and if the sum has no MSB set, the output is replaced with the maximum negative code (1000...). These conditions can also be connected to warning indicators. Figure 2.11(c) shows this system in hardware. The resultant clipping on overload is sudden, and sometimes a PROM is included which translates values around and beyond maximum to soft-clipped values below or equal to maximum.
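The overflow control of Figure 2.11(c) behaves in outline as in the following sketch (illustrative, four-bit wordlength):

    # Two's complement addition with clipping instead of wraparound.
    BITS = 4
    MASK = (1 << BITS) - 1
    SIGN = 1 << (BITS - 1)

    def clipped_add(a, b):
        s = (a + b) & MASK
        if not (a & SIGN) and not (b & SIGN) and (s & SIGN):
            return SIGN - 1      # 0111: maximum positive code
        if (a & SIGN) and (b & SIGN) and not (s & SIGN):
            return SIGN          # 1000: maximum negative code
        return s

    print(f"{clipped_add(0b0110, 0b0101):04b}")   # 0111: +6 +5 clips at +7
    print(f"{clipped_add(0b1010, 0b1001):04b}")   # 1000: -6 -7 clips at -8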

image

Figure 2.11 (a) Half adder; (b) full-adder circuit and truth table; (c) comparison of sign bits prevents wraparound on adder overflow by substituting clipping level.

image

Figure 2.12 Two configurations which are common in processing. In (a) the feedback around the adder adds the previous sum to each input to perform accumulation or digital integration. In (b) an invertor allows the difference between successive inputs to be computed. This is differentiation.

A storage element can be combined with an adder to obtain a number of useful functional blocks which will crop up frequently in digital signal processing. Figure 2.12(a) shows that a latch is connected in a feedback loop around an adder. The latch contents are added to the input each time it is clocked. The configuration is known as an accumulator in computation because it adds up or accumulates values fed into it. In filtering, it is known as a discrete-time integrator. If the input is held at some constant value, the output increases by that amount on each clock. The output is thus a sampled ramp.

Figure 2.12(b) shows that the addition of an invertor allows the difference between successive inputs to be obtained. This is digital differentiation. The output is proportional to the slope of the input.
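Both configurations reduce to a line or two of code each, as the following sketch (illustrative, in Python) shows; a constant input to the accumulator produces the sampled ramp, and differentiating that ramp recovers the constant:

    # Digital integration and differentiation, as in Figure 2.12.
    def accumulate(samples):
        total, out = 0, []
        for x in samples:
            total += x              # latch fed back around the adder
            out.append(total)
        return out

    def differentiate(samples):
        prev, out = 0, []
        for x in samples:
            out.append(x - prev)    # adder with an inverted, delayed input
            prev = x
        return out

    ramp = accumulate([3] * 5)
    print(ramp)                     # [3, 6, 9, 12, 15]: a sampled ramp
    print(differentiate(ramp))      # [3, 3, 3, 3, 3]: the slope of the input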

2.5 Gain control

When processing digital audio or image data the gain of the system will need to be variable so that mixes and fades can be performed. Gain is controlled in the digital domain by multiplying each sample value by a coefficient. If that coefficient is less than one attenuation will result; if it is greater than one, amplification can be obtained.

Multiplication in binary circuits is difficult. It can be performed by repeated adding, but this is too slow to be of any use. In fast multiplication, one of the inputs will be simultaneously multiplied by one, two, four, etc., by hard-wired bit shifting. Figure 2.13 shows that the other input bits will determine which of these powers will be added to produce the final sum, and which will be neglected. If multiplying by five, the process is the same as multiplying by four, multiplying by one, and adding the two products. This is achieved by adding the input to itself shifted two places. As the wordlength of such a device increases, the complexity grows rapidly, roughly with the square of the wordlength.
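The principle can be written out as follows (an illustrative sketch; the hardware of Figure 2.13 forms all the partial products in parallel rather than in a loop):

    # Fast multiplication by shifting and adding.
    def multiply(a, b):
        product, shift = 0, 0
        while b:
            if b & 1:                   # this power of two is present in B
                product += a << shift   # add A times 1, 2, 4, 8, ...
            b >>= 1
            shift += 1
        return product

    # Multiplying by five = multiplying by four plus multiplying by one:
    print(multiply(23, 5))              # 115 = (23 << 2) + 23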

In a given application, all that matters is that the output has the correct numerical value. It does not matter if this is achieved using dedicated hardware or using software in a general-purpose processor. It should be clear that if it is wished to simulate analog gain control in the digital domain by multiplication, the samples to be multiplied must have been uniformly quantized. If the quantizing is non-uniform the binary numbers are no longer proportional to the original parameter and multiplication will not give the correct result.

image

Figure 2.13 Structure of fast multiplier: the input A is multiplied by 1, 2, 4, 8, etc., by bit shifting. The digits of the B input then determine which multiples of A should be added together by enabling AND gates between the shifters and the adder. For long wordlengths, the number of gates required becomes enormous, and the device is best implemented in a chip.

In audio, uniform quantizing is universal in production systems. However, in video it is not, owing to the widespread use of gamma which will be discussed in Chapter 6. Strictly speaking, video signals with gamma should be returned to the uniformly quantized domain before processing but this is seldom done in practice.

2.6 Floating-point coding

Computers operate on data words of fixed length and if binary or two's complement coding is used, this limits the range of the numbers. For this reason many computers use floating-point coding which allows a much greater range of numbers with a penalty of reduced accuracy.

Figure 2.14 shows that in pure binary, numbers which are significantly below the full scale value have a number of high-order bits which are all zero. Instead of handling these bits individually, as they are all zero it is good enough simply to count them. Figure 2.14(b) shows that every time a leading zero is removed, the remaining bits are shifted left one place and this has the effect in binary of multiplying by two. Two shifts multiply by four, three shifts by eight and so on. In order to re-create the number with the right magnitude, the power of two by which the number was multiplied must also be sent. This value is known as the exponent.

image

Figure 2.14 Small numbers in a long wordlength system have inefficient leading zeros (a). Floating-point coding (b) is more efficient, but can lead to inaccuracy.

In order to convert a binary number of arbitrary value with an arbitrarily located radix point into floating-point notation, the position of the most significant or leading one and the position of the radix point are noted. The number is then multiplied or divided by powers of two until the radix point is immediately to the right of the leading one. This results in a value known as the mantissa (plural: mantissae) which always has the form 1.XXX.... where X is 1 or 0 (known in logic as ‘don't care').

The exponent is a two's complement code which determines whether the mantissa has to be multiplied by positive powers of two which will shift it left and make it bigger, or whether it has to be multiplied by negative powers of two which will shift it right and make it smaller.

In floating-point notation, the range of the numbers and the precision are independent. The range is determined by the wordlength of the exponent. For example, a six-bit exponent having 64 values allows a range from 1.XX × 2³¹ to 1.XX × 2⁻³². The precision is determined by the length of the mantissa. As the mantissa is always in the format 1.XXX it is not necessary to store the leading one so the actual stored value is in the form .XXX. Thus a ten-bit mantissa has eleven-bit precision. It is possible to pack a ten-bit mantissa and a six-bit exponent in one sixteen-bit word.
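Such a packing might be sketched as below (illustrative only; practical formats such as IEEE 754 differ in details of bias, rounding and the handling of zero):

    # Pack a positive value into sixteen bits: a six-bit two's complement
    # exponent and ten stored mantissa bits (the leading one is implied).
    import math

    def pack(value):
        exponent = math.floor(math.log2(value))    # locate the leading one
        mantissa = value / 2.0 ** exponent         # now in the form 1.XXX
        stored = round((mantissa - 1.0) * (1 << 10))
        return ((exponent & 0x3F) << 10) | (stored & 0x3FF)

    def unpack(word):
        exponent = (word >> 10) & 0x3F
        if exponent >= 32:                         # six-bit two's complement
            exponent -= 64
        return (1.0 + (word & 0x3FF) / (1 << 10)) * 2.0 ** exponent

    print(unpack(pack(1000.0)))    # 1000.0: this value happens to be exact
    print(unpack(pack(0.1)))       # 0.0999755859375: precision is limited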

Although floating-point operation extends the number range of a computer, the user must constantly be aware that floating point has limited precision. Floating point is the computer's equivalent of lossy compression. In trying to get more for less, there is always a penalty.

In some signal-processing applications, floating-point coding is simply not accurate enough. For example, in an audio filter, if the stopband needs to be, say, 100 dB down, this can only be achieved if the entire filtering arithmetic has adequate precision. 100 dB is one part in 100 000 and needs more than sixteen bits of resolution. The poor quality of a good deal of digital audio equipment is due to the unwise adoption of floating-point processing of inadequate precision.

Computers of finite wordlength can operate on larger numbers without the inaccuracy of floating-point coding by using techniques such as double precision. For example, thirty-two-bit precision data words can be stored in two adjacent memory locations in a sixteen-bit machine, and the processor can manipulate them by operating on the two halves at different times. This takes longer, or needs a faster processor.

2.7 Modulo-n arithmetic

Conventional arithmetic which is in everyday use relates to the real world of counting actual objects, and to obtain correct answers the concepts of borrow and carry are necessary in the calculations.

There is an alternative type of arithmetic which has no borrow or carry which is known as modulo arithmetic. In modulo-n no number can equal or exceed n. If it does, n or whole multiples of n are subtracted until it does not. Thus 25 modulo-16 is 9 and 12 modulo-5 is 2. The count shown in Figure 2.4 is from a four-bit device which overflows when it reaches 1111 because the carry-out is ignored. If a number of clock pulses m are applied from the zero state, the state of the counter will be given by m Mod.16. Thus modulo arithmetic is appropriate to digital systems in which there is a fixed wordlength and this means that the range of values the system can have is restricted by that wordlength. A number range which is restricted in this way is called a finite field.

Modulo-2 is a numbering scheme which is used frequently in digital processes such as error correction, encryption and spectrum spreading in recording and transmission. Figure 2.15 shows that in modulo-2 the conventional addition and subtraction are replaced by the XOR function such that:

(A + B) Mod.2 = A XOR B

When multi-bit values are added Mod.2, each column is computed quite independently of any other. This makes Mod.2 circuitry very fast in operation as it is not necessary to wait for the carries from lower-order bits to ripple up to the high-order bits.

image

Figure 2.15(a) As a fixed wordlength counter cannot hold the carry-out bit, it will resume at zero. Thus a four-bit counter expresses every count as a modulo-16 number.

image

Figure 2.15(b) In modulo-2 calculations, there can be no carry or borrow operations and conventional addition and subtraction become identical. The XOR gate is a modulo-2 adder.

Modulo-2 arithmetic is not the same as conventional arithmetic and takes some getting used to. For example, adding something to itself in Mod.2 always gives the answer zero.
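These properties follow directly from the XOR function, as the short sketch below shows (Python, illustrative):

    # Modulo-2 addition: every bit position is computed independently.
    a, b = 0b1101, 0b1011

    print(f"{a ^ b:04b}")      # 0110: (A + B) Mod.2 = A XOR B, no carries
    print(f"{a ^ a:04b}")      # 0000: adding a value to itself gives zero
    assert (a ^ b) ^ b == a    # addition and subtraction are the same operation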

2.8 The Galois field

Figure 2.16 shows a simple circuit consisting of three D-type latches which are clocked simultaneously. They are connected in series to form a shift register. At (a) a feedback connection has been taken from the output to the input and the result is a ring counter where the bits contained will recirculate endlessly. At (b) one XOR gate is added so that the output is fed back to more than one stage. The result is known as a twisted-ring counter and it has some interesting properties. Whenever the circuit is clocked, the left-hand bit moves to the right-hand latch, the centre bit moves to the left-hand latch and the centre latch becomes the XOR of the two outer latches. The figure shows that whatever the starting condition of the three bits in the latches, the same state will always be reached again after seven clocks, except if zero is used.

image

Figure 2.16 The circuit shown is a twisted-ring counter which has an unusual feedback arrangement. Clocking the counter causes it to pass through a series of non-sequential values. See text for details.

The states of the latches form an endless ring of non-sequential numbers called a Galois field after the French mathematical prodigy Évariste Galois who discovered them. The states of the circuit form a maximum length sequence because there are as many states as are permitted by the wordlength. As the states of the sequence have many of the characteristics of random numbers, yet are repeatable, the result can also be called a pseudo-random sequence (prs). As the all-zeros case is disallowed, the length of a maximum length sequence generated by a register of m bits cannot exceed (2ᵐ – 1) states. The Galois field, however, includes the zero term. It is useful to explore the bizarre mathematics of Galois fields which use modulo-2 arithmetic. Familiarity with such manipulations is helpful when studying error correction, particularly the Reed–Solomon codes which will be treated in Chapter 10. They will also be found in processes which require pseudo-random numbers such as digital dither, treated in Chapter 4, and randomized channel codes used in, for example, DVB.

The circuit of Figure 2.16 can be considered as a counter and the four points shown will then be representing different powers of 2 from the MSB on the left to the LSB on the right. The feedback connection from the MSB to the other stages means that whenever the MSB becomes 1, two other powers are also forced to one so that the code of 1011 is generated.

Each state of the circuit can be described by combinations of powers of x, such as

        x² = 100

        x = 010

x² + x = 110, etc.

The fact that three bits have the same state because they are connected together is represented by the Mod.2 equation:

x³ + x + 1 = 0

Let x = a, which is a primitive element.

Now

a³ + a + 1 = 0    (2.1)

In modulo-2

a + a = a² + a² = 0

a = x = 010

a² = x² = 100

a³ = a + 1 = 011 from (2.1)

a⁴ = a³ × a = a(a + 1) = a² + a = 110

a⁵ = a² + a + 1 = 111

a⁶ = a × a⁵ = a(a² + a + 1)
   = a³ + a² + a = a + 1 + a² + a
   = a² + 1 = 101

a⁷ = a(a² + 1) = a³ + a
   = a + 1 + a = 1 = 001

In this way it can be seen that the complete set of elements of the Galois field can be expressed by successive powers of the primitive element. Note that the twisted-ring circuit of Figure 2.16 simply raises a to higher and higher powers as it is clocked; thus the seemingly complex multibit changes caused by a single clock of the register become simple to calculate using the correct primitive and the appropriate power.
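This behaviour is easy to reproduce in software. The sketch below (illustrative) multiplies the register contents by a on each step, using a³ = a + 1 to remove any overflow into the fourth bit, and steps through the same sequence of states as the hardware of Figure 2.16:

    # Generate GF(8) as successive powers of the primitive element a,
    # where a^3 + a + 1 = 0, i.e. a^3 = a + 1 in modulo-2.
    state = 0b010                        # start at a = x = 010
    for power in range(1, 8):
        print(f"a^{power} = {state:03b}")
        overflow = (state >> 2) & 1      # coefficient of x^2 before the shift
        state = (state << 1) & 0b111     # multiply by a (shift left)
        if overflow:
            state ^= 0b011               # replace x^3 by x + 1 (modulo-2)

After a⁷ = 001 the next state is 010 and the ring repeats, just as in the hardware.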

The numbers produced by the twisted-ring counter are not random; they are completely predictable if the equation is known. However, the sequences produced are sufficiently similar to random numbers that in many cases they will be useful. They are thus referred to as pseudo-random sequences. The feedback connection is chosen such that the expression it implements will not factorize. Otherwise a maximum-length sequence could not be generated because the circuit might sequence around one or other of the factors depending on the initial condition. A useful analogy is to compare the operation of a pair of meshed gears. If the gears have a number of teeth which is relatively prime, many revolutions are necessary to make the same pair of teeth touch again. If the number of teeth have a common multiple, far fewer turns are needed.

image

Figure 2.17 The PRS generator of DVB.

Figure 2.17 shows the pseudo-random sequence generator used in DVB. Its purpose is to modify the transmitted spectrum so that the amount of energy transmitted is as uniform as possible across the channel.

2.9 The phase-locked loop

All sampling systems need to be clocked at the appropriate rate in order to function properly. Whilst a clock may be obtained from a fixed frequency oscillator such as a crystal, many operations in video require genlocking or synchronizing the clock to an external source. The phase-locked loop excels at this job, and many others, particularly in connection with recording and transmission.

In phase-locked loops, the oscillator can run at a range of frequencies according to the voltage applied to a control terminal. This is called a voltage-controlled oscillator or VCO. Figure 2.18 shows that the VCO is driven by a phase error measured between the output and some reference. The error changes the control voltage in such a way that the error is reduced, such that the output eventually has the same frequency as the reference. A low-pass filter is fitted in the control voltage path to prevent the loop becoming unstable. If a divider is placed between the VCO and the phase comparator, as in the figure, the VCO frequency can be made to be a multiple of the reference. This also has the effect of making the loop more heavily damped, so that it is less likely to change frequency if the input is irregular.

image

Figure 2.18 A phase-locked loop requires these components as a minimum. The filter in the control voltage serves to reduce clock jitter.

In digital video, the frequency multiplication of a phase-locked loop is extremely useful. Figure 2.19 shows how the 13.5 MHz clock of component digital video is obtained from the sync pulses of an analog reference by such a multiplication process.

image

Figure 2.19 In order to obtain 13.5 MHz from input syncs, a PLL with an appropriate division ratio is required.
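The required division ratio follows from the line rate of the reference. As a short check (Python; the line rates are those of the 625/50 and 525/59.94 scanning standards):

    # Division ratios which lock 13.5 MHz to analog line syncs.
    line_rate_625 = 15625.0          # Hz, 625/50 systems
    line_rate_525 = 4.5e6 / 286      # Hz, 525/59.94 systems (about 15734.27)

    print(round(13.5e6 / line_rate_625))    # 864 sample periods per total line
    print(round(13.5e6 / line_rate_525))    # 858 sample periods per total line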

Figure 2.20 shows the NLL or numerically locked loop. This is similar to a phase-locked loop, except that the two phases concerned are represented by the state of a binary number. The NLL is useful to generate a remote clock from a master. The state of a clock count in the master is periodically transmitted to the NLL which will recreate the same clock frequency. The technique is used in MPEG transport streams.

2.10 Timebase correction

In Chapter 1 it was stated that a strength of digital technology is the ease with which delay can be provided. Accurate control of delay is the essence of timebase correction, necessary whenever the instantaneous time of arrival or rate from a data source does not match the destination. In digital video, the destination will almost always have perfectly regular timing, namely the sampling rate clock of the final DAC. Timebase correction consists of aligning jittery signals from storage media or transmission channels with that stable reference.

image

Figure 2.20 The numerically locked loop (NLL) is a digital version of the phase-locked loop.

A further function of timebase correction is to reverse the time compression applied prior to recording or transmission. As was shown in section 1.8, digital recorders compress data into blocks to facilitate editing and error correction as well as to permit head switching between blocks in rotary-head machines. Owing to the spaces between blocks, data arrive in bursts on replay, but must be fed to the output convertors in an unbroken stream at the sampling rate.

In computer hard-disk drives, which are used in digital video workstations and file servers, time compression is also used, but a converse problem also arises. Data from the disk blocks arrive at a reasonably constant rate, but cannot necessarily be accepted at a steady rate by the logic because of contention for the use of networks by the different parts of the system. In this case the data must be buffered by a relative of the timebase corrector which is usually referred to as a silo.

Although delay is easily implemented, it is not possible to advance a data stream. Most real machines cause instabilities balanced about the correct timing: the output jitters between too early and too late. Since the information cannot be advanced in the corrector, only delayed, the solution is to run the machine in advance of real time. In this case, correctly timed output signals will need a nominal delay to align them with reference timing. Early output signals will receive more delay, and late output signals will receive less delay.

image

Figure 2.21 Most TBCs are implemented as a memory addressed by a counter which periodically overflows to give a ring structure. The memory allows the read and write sides to be asynchronous.

Section 2.3 showed the principles of digital storage elements which can be used for delay purposes. The shift-register approach and the RAM approach to delay are very similar, as a shift register can be thought of as a memory whose address increases automatically when clocked. The data rate and the maximum delay determine the capacity of the RAM required. Figure 2.21 shows that the addressing of the RAM is by a counter that overflows endlessly from the end of the memory back to the beginning, giving the memory a ring-like structure. The write address is determined by the incoming data, and the read address is determined by the outgoing data. This means that the RAM has to be able to read and write at the same time.
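In outline, such a ring memory behaves as in the sketch below (illustrative): both address counters wrap modulo the memory size, and the spacing between the write and read pointers is the delay which absorbs the timing differences.

    # A ring-memory timebase corrector in outline.
    SIZE = 8
    ram = [0] * SIZE
    write_addr = read_addr = 0

    def write_sample(sample):
        global write_addr
        ram[write_addr] = sample
        write_addr = (write_addr + 1) % SIZE    # counter wraps to the start

    def read_sample():
        global read_addr
        sample = ram[read_addr]
        read_addr = (read_addr + 1) % SIZE
        return sample

    # Jittery writes lead the stable reads by a nominal delay:
    for s in (10, 20, 30, 40):
        write_sample(s)
    print([read_sample() for _ in range(4)])    # [10, 20, 30, 40]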

The switching between read and write involves not only a data multiplexer but also an address multiplexer. In general the arbitration between read and write will be done by signals from the stable side of the TBC as Figure 2.22 shows. In the replay case the stable clock will be on the read side. The stable side of the RAM will read a sample when it demands, and the writing will be locked out for that period. The input data cannot be interrupted in many applications, however, so a small buffer silo is installed before the memory, which fills up as the writing is locked out, and empties again as writing is permitted. Alternatively, the memory will be split into blocks, such that when one block is reading a different block will be writing and the problem does not arise.

In many digital video applications, the sampling rate exceeds the rate at which economically available RAM chips can operate. The solution is to arrange several video samples into one longer word, known as a superword, and to construct the memory so that it stores superwords in parallel.

image

Figure 2.22 In a RAM-based TBC, the RAM is reference synchronous, and an arbitrator decides when it will read and when it will write. During reading, asynchronous input data back up in the input silo, asserting a write request to the arbitrator. The arbitrator will then cause a write cycle between read cycles.

Figure 2.23 shows the operation of a FIFO chip, colloquially known as a silo because the data are tipped in at the top on delivery and drawn off at the bottom when needed. Each stage of the chip has a data register and a small amount of logic, including a data-valid or V bit. If the input register does not contain data, the first V bit will be reset, and this will cause the chip to assert ‘input ready'. If data are presented at the input, and clocked into the first stage, the V bit will set, and the ‘input ready' signal will become false. However, the logic associated with the next stage sees the V bit set in the top stage, and if its own V bit is clear, it will clock the data into its own register, set its own V bit, and clear the input V bit, causing ‘input ready' to reassert, when another word can be fed in. This process then continues as the word moves down the silo, until it arrives at the last register in the chip. The V bit of the last stage becomes the ‘output ready' signal, telling subsequent circuitry that there are data to be read. If this word is not read, the next word entered will ripple down to the stage above. Words thus stack up at the bottom of the silo.

image

Figure 2.23 Structure of FIFO or silo chip. Ripple logic controls propagation of data down silo.

When a word is read out, an external signal must be provided which resets the bottom V bit. The ‘output ready' signal now goes false, and the logic associated with the last stage now sees valid data above, and loads down the word, whereupon ‘output ready' becomes true again. The last register but one will now have no V bit set, and will see data above itself and bring that down. In this way a reset V bit propagates up the chip while the data ripple down, rather like a hole in a semiconductor going the opposite way to the electrons.
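The ripple logic can be modelled as follows (an illustrative sketch): on each step every empty stage loads from the stage above it, so words fall to the bottom while holes rise to the top.

    # A FIFO (silo) modelled stage by stage with data-valid (V) bits.
    STAGES = 4
    data = [None] * STAGES       # stage 0 is the input, stage 3 the output
    valid = [False] * STAGES     # the V bit of each stage

    def clock():
        for i in range(STAGES - 1, 0, -1):
            if not valid[i] and valid[i - 1]:      # empty stage, data above
                data[i], valid[i] = data[i - 1], True
                valid[i - 1] = False               # the hole moves up

    def push(word):              # only permitted while 'input ready' is true
        assert not valid[0]
        data[0], valid[0] = word, True

    def pop():                   # only permitted while 'output ready' is true
        assert valid[-1]
        valid[-1] = False        # the external read signal resets the V bit
        return data[-1]

    push('A'); clock(); clock(); clock()
    push('B'); clock(); clock()
    print(pop())                 # A
    clock()                      # B ripples down into the freed output stage
    print(pop())                 # B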

When used in a hard-disk system, a silo will allow data to and from the disk, which is turning at constant speed. When reading the disk, Figure 2.24(a) shows that the silo starts empty, and if there is bus contention, the silo will start to fill. Where the bus is free, the disk controller will attempt to empty the silo into the memory. The system can take advantage of the interblock gaps on the disk, containing headers, preambles and redundancy, for in these areas there are no data to transfer, and there is some breathing space to empty the silo before the next block. In practice the silo need not be empty at the start of every block, provided it never becomes full before the end of the transfer. If this happens some data are lost and the function must be aborted. The block containing the silo overflow will generally be reread on the next revolution. In sophisticated systems, the silo has a kind of dipstick, and can interrupt the CPU if the data get too deep. The CPU can then suspend some bus activity to allow the disk controller more time to empty the silo.

image

Figure 2.24 The silo contents during read functions (a) appear different from those during write functions (b). In (a), the control logic attempts to keep the silo as empty as possible; in (b) the logic prefills the silo and attempts to keep it full until the memory word count overflows.

When the disk is to be written, as in Figure 2.24(b), a continuous data stream must be provided during each block, as the disk cannot stop. The silo will be pre-filled before the disk attempts to write, and the disk controller attempts to keep it full. In this case all will be well if the silo does not become empty before the end of the transfer. Figure 2.25 shows the silo of a typical disk controller with the multiplexers necessary to put it in the read data stream or the write data stream.

image

Figure 2.25 In order to guarantee that the drive can transfer data in real time at regular intervals (determined by disk speed and density) the silo provides buffering to the asynchronous operation of the memory access process. At (a) the silo is configured for a disk read. The same silo is used at (b) for a disk write.

2.11 Programmers

Figure 2.26 shows a simple system in which a counter is driven by a clock and counts steadily. At each state of the count, a different address is generated by the counter which is fed to a ROM. At each count state, the ROM is programmed with codes which determine what should happen. These are a simple form of instruction. This simple system assumes that each instruction takes the same amount of time and that the same sequence of events is needed come what may. This is not suitable for any but the simplest applications. Even a device like a washing machine takes a variable time to fill up and to heat its water.

image

Figure 2.26 A simple sequencer which consists of a counter driving a ROM.

image

Figure 2.27 Part of the ROM output in each state selects the source of the next clock pulse.

Figure 2.27 shows that variable instruction time is easily arranged by increasing the wordlength of the ROM. Now, part of the instruction drives a source selector or multiplexer which chooses the source of the next clock pulse. To continue with the example of the washing machine, if the instruction were to fill up with water, the clock selector would choose the water level switch. Upon reaching the required level, the counter would be clocked and move on to the next instruction. This might be to heat the water, and the clock selector might then choose the thermostat.

Such a system is still only capable of performing the same sequence, and it cannot select a new sequence in response to changing conditions.

Figure 2.28 shows a slightly more complex arrangement. Here the instruction word has been further extended. At each state, in addition to an instruction and a clock source selector, there is an additional field which forms the address of the next state. Instead of a counter, the ROM is addressed by a latch which holds the address from the ROM when clocked. Now, the program does not need to be written in sequential ROM locations, because each location contains the address of the next instruction. However, more importantly, it is possible to modify the sequence by interfering with the next address. Figure 2.28 shows that in the connection from the ROM to the address latch, there is a gate which allows an external signal to modify one bit of the address. The next address will have two possible values depending on the external signal.

image

Figure 2.28 An external input can modify the sequence of states by changing the next address.

This allows the system to react to external circumstances. The term conditional branching is used when the normal sequence of instructions is altered by an external event.

A device of the type found in Figure 2.28 is known as a microsequencer and the codes in the ROM are called microinstructions. Inside a processor, many steps are needed to carry out one instruction. Each of these steps is controlled by a microinstruction.
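A minimal model of such a microsequencer appears below (the addresses and mnemonics are invented for illustration); each ROM word holds an instruction and a next address, and the external condition is gated into one bit of that address:

    # A microsequencer with a next-address field and conditional branching.
    ROM = {
        0b000: ('WASH',  0b001),   # (instruction, next address)
        0b001: ('RINSE', 0b000),
        0b101: ('SPIN',  0b111),   # reached only via the modified address
        0b111: ('STOP',  0b111),
    }

    def step(address, condition):
        instruction, next_address = ROM[address]
        if condition:
            next_address |= 0b100  # external signal modifies one address bit
        return instruction, next_address

    address = 0b000
    for condition in (False, False, True, False, False):
        instruction, address = step(address, condition)
        print(instruction)         # WASH RINSE WASH SPIN STOP

While the condition is false the sequence loops; when it becomes true, the modified next address diverts the sequence to a different part of the ROM.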

2.12 The computer

The computer is now a vital part of convergent systems, being used both for control purposes and to process audio and video signals as data. In control, the computer finds applications in database management, automation, editing, and in electromechanical systems such as tape drives and robotic cassette handling. Now that processing speeds have advanced sufficiently, computers are able to manipulate certain types of digital video in real time. Where very complex calculations are needed, real-time operation may not be possible and instead the computation proceeds as fast as it can in a process called rendering. The rendered data are stored so that they can be viewed in real time from a storage medium when the rendering is complete.

The computer is a programmable device in that operation is not determined by its construction alone, but instead by a series of instructions forming a program. The program is supplied to the computer one instruction at a time so that the desired sequence of events takes place.

Programming of this kind has been used for over a century in electromechanical devices, including automated knitting machines and street organs which are programmed by punched cards. However, the computer differs from these devices in that the program is not fixed, but can be modified by the computer itself. This possibility led to the creation of the term software to suggest a contrast to the constancy of hardware.

Computer instructions are binary numbers each of which is interpreted in a specific way. As these instructions don't differ from any other kind of data, they can be stored in RAM. The computer can change its own instructions by accessing the RAM. Most types of RAM are volatile, in that they lose data when power is removed. Clearly if a program is entirely stored in this way, the computer will not be able to recover from a power failure. The solution is that a very simple starting or bootstrap program is stored in non-volatile ROM, containing instructions which bring in the main program from a storage system such as a disk drive after power is applied. As programs in ROM cannot be altered, they are sometimes referred to as firmware to indicate that they are classified between hardware and software.

Making a computer do useful work requires more than simply a program which performs the required computation. There is also a lot of mundane activity which does not differ significantly from one program to the next. This includes deciding which part of the RAM will be occupied by the program and which by the data, producing commands to the storage disk drive to read the input data from a file and to write back the results. It would be very inefficient if all programs had to handle these processes themselves. Consequently the concept of an operating system was developed. This manages all the mundane decisions and creates an environment in which useful programs or applications can execute.

The ability of the computer to change its own instructions makes it very powerful, but it also makes it vulnerable to abuse. Programs exist which are deliberately written to do damage. These viruses are generally attached to plausible messages or data files and enter computers through storage media or communications paths.

There is also the possibility that programs contain logical errors such that in certain combinations of circumstances the wrong result is obtained. If this results in the unwitting modification of an instruction, the next time that instruction is accessed the computer will crash. In consumer grade software, written for the vast personal computer market, this kind of thing is unfortunately accepted.

For critical applications, software must be verified. This is a process which can prove that a program can recover from absolutely every combination of circumstances and keep running properly. This is a nontrivial process, because the number of combinations of states a computer can get into is staggering. As a result most software is unverified.

It is of the utmost importance that networked computers which can suffer virus infection or computers running unverified software are never used in a life-support or critical application.

Figure 2.29 shows a simple computer system. The various parts are linked by a bus which allows binary numbers to be transferred from one place to another. This will generally use tri-state logic so that when one device is sending to another, all other devices present a high impedance to the bus.

A typical bus is shown in Figure 2.30. There are three kinds of signal in a bus. These are addresses, data and control/status signals. The control signals are sent by a controlling device to cause some action such as writing data. Status signals are sent by a controlled device to indicate that it has complied, or in the case of a fault, cannot comply. The address is asserted by the controlling device to determine where the data transfer is to take place. Most of the addresses relate to individual locations in memory, but a further unique address range relates to peripheral devices.

image

Figure 2.29 A simple computer system. All components are linked by a single data/address/control bus. Although cheap and flexible, such a bus can only make one connection at a time, so it is slow.

image

Figure 2.30 Structure of a typical computer bus.

The ROM stores the startup program, the RAM stores the operating system, applications programs and the data to be processed. The disk drive stores large quantities of data in a non-volatile form. The RAM only needs to be able to hold part of one program as other parts can be brought from the disk as required. A program executes by fetching one instruction at a time from the RAM to the processor along the bus.

The bus also allows keyboard/mouse inputs and outputs to the display and printer. Inputs and outputs are generally abbreviated to I/O. Finally a programmable timer will be present which acts as a kind of alarm clock for the processor.

2.13 The processor

The processor or CPU (central processing unit) is the heart of the system. Figure 2.31 shows a simple example of a CPU. The CPU has a bus interface which allows it to generate bus addresses and input or output data. Sequential instructions are stored in RAM at contiguously increasing locations so that a program can be executed by fetching instructions from a RAM address specified by the program counter (PC) to the instruction register in the CPU. As each instruction is completed, the PC is incremented so that it points to the next instruction. Because the PC only advances when the current instruction completes, the time taken to execute each instruction is free to vary.

image

Figure 2.31 The data path of a simple CPU.

The processor is notionally divided into data paths and control paths. Figure 2.31 shows the data path. The CPU contains a number of general-purpose registers or scratchpads which can be used to store partial results in complex calculations. Pairs of these registers can be addressed so that their contents go to the ALU (arithmetic logic unit).

The ALU is a programmable device which performs various functions as a result of a control word supplied to it from a microsequencer. These functions include arithmetic (add, subtract, increment, etc.) or logical (AND, OR, etc.) operations on the input data, and conditional functions which allow the program to branch according to earlier results, such as a calculation being zero or negative. The output of the ALU may be routed back to a register or output. By reversing this process it is possible to get data into the registers from the RAM.

Which function the ALU performs and which registers are involved are determined by the instruction currently in the instruction register, which is decoded in the control path. One pass through the ALU can be completed in one cycle of the processor's clock. Instructions vary in complexity, as does the number of clock cycles needed to complete them. Incoming instructions are decoded and used to access a look-up table which converts them into microinstructions, one of which controls the CPU at each clock cycle.
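The fetch-decode-execute sequence can be sketched in software. The following toy processor is a minimal sketch with an invented instruction set, implying no real CPU: it fetches each instruction from RAM at the address in the PC, decodes it, and performs ALU operations on registers. The conditional branch shows the PC being loaded with a new value, which is also the basis of the program loops described in the next section.

# A toy fetch-decode-execute loop. The instruction set is invented
# for illustration and does not correspond to any real processor.

ram = {0: ("LOAD", "R0", 10),   # R0 := 10  (loop counter)
       1: ("LOAD", "R1", 0),    # R1 := 0   (accumulator)
       2: ("ADD",  "R1", "R0"), # R1 := R1 + R0
       3: ("DEC",  "R0"),       # R0 := R0 - 1
       4: ("BNZ",  "R0", 2),    # branch back to address 2 while R0 != 0
       5: ("HALT",)}

regs = {"R0": 0, "R1": 0}
pc = 0

while True:
    instruction = ram[pc]       # fetch from the address in the PC
    op = instruction[0]         # decode (a real CPU uses a look-up table
                                # of microinstructions here)
    if op == "LOAD":
        regs[instruction[1]] = instruction[2]
    elif op == "ADD":           # an ALU arithmetic function
        regs[instruction[1]] += regs[instruction[2]]
    elif op == "DEC":
        regs[instruction[1]] -= 1
    elif op == "BNZ":           # an ALU conditional function: branch if not zero
        if regs[instruction[1]] != 0:
            pc = instruction[2] # jump: the PC is loaded with a new value
            continue
    elif op == "HALT":
        break
    pc += 1                     # otherwise the PC simply increments

print(regs["R1"])               # 10 + 9 + ... + 1 = 55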

2.14 Interrupts

Ordinarily instructions are executed in the order in which they are stored in RAM. However, some instructions direct the processor to jump to a new memory location. If this is a jump to an earlier instruction, the program will enter a loop. The loop must increment a count in a register each time it is executed, and contain a conditional instruction called a branch which allows the processor to jump out of the loop when a predetermined count is reached.

However, it is often required that the sequence of execution should be changeable by some external event. This might be the changing of some value due to a keyboard input. Events of this kind are handled by interrupts, which are created by devices needing attention. Figure 2.32 shows that in addition to the PC, the CPU contains another dedicated register called the stack pointer. Figure 2.33 shows how this is used. At the end of every instruction the CPU checks to see if an interrupt is asserted on the bus.

If it is, a different set of microinstructions are executed. The PC is incremented as usual, but the next instruction is not executed. Instead, the contents of the PC are stored so that the CPU can resume execution when it has handled the current event. The PC state is stored in a reserved area of RAM known as the stack, at an address determined by the stack pointer.

image

Figure 2.32 Normally the program counter (PC) increments each time an instruction is completed in order to select the next instruction. However, an interrupt may cause the PC state to be stored in the stack area of RAM prior to the PC being forced to the start address of the interrupt subroutine. Afterwards the PC can get its original value back by reading the stack.

image

Figure 2.33 How an interrupt is handled. See text for details.

Once the PC is stacked, the processor can handle the interrupt. It issues a bus interrupt acknowledge, and the interrupting device replies with a unique code identifying itself. This is known as a vector, which steers the processor to a RAM address containing a new program counter. This is the RAM address of the first instruction of the subroutine which is the program that will handle the interrupt. The CPU loads this address into the PC and begins execution of the subroutine.

At the end of the subroutine there will be a return instruction. This causes the CPU to use the stack pointer as a memory address in order to read the return PC state from the stack. With this value loaded into the PC, the CPU resumes execution where it left off.

The stack exists so that subroutines can themselves be interrupted. If a subroutine is executing when a higher-priority interrupt occurs, the subroutine can be suspended by incrementing the stack pointer and storing the current PC in the next location in the stack.

When the second interrupt has been serviced, the stack pointer allows the PC of the first subroutine to be retrieved. Whenever a stack PC is retrieved, the stack pointer decrements so that it always points to the PC of the next item of unfinished business.
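The stacking mechanism can be modelled as below (a schematic sketch with invented names; a real CPU does this in microcode rather than in software). Nesting falls out automatically: each interrupt pushes the current PC, and each return pops the most recent one.

# A sketch of PC stacking during nested interrupts. All names invented.

stack = []          # the reserved stack area of RAM
pc = 100            # main program executing at address 100

def interrupt(vector_address):
    """Suspend current execution and enter an interrupt subroutine."""
    global pc
    stack.append(pc)        # stack pointer increments; PC state stored
    pc = vector_address     # PC loaded from the vector the device supplied

def return_from_interrupt():
    """The return instruction at the end of a subroutine."""
    global pc
    pc = stack.pop()        # stack pointer decrements; PC state restored

interrupt(500)              # a device interrupts the main program
interrupt(800)              # a higher-priority device interrupts the subroutine
return_from_interrupt()     # back to the first subroutine at 500
return_from_interrupt()     # back to the main program at 100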

2.15 Programmable timers

Ordinarily processors have no concept of time and simply execute instructions as fast as their clock allows. This is fine for general-purpose processing, but not for time-critical processes such as video. One way in which the processor can be made time conscious is to use programmable timers. These are devices which reside on the computer bus and which run from a clock. The CPU can set up a timer by loading it with a count. When the count is reached, the timer will interrupt. To give an example, if the count were to be equal to one frame period, there would be one interrupt per frame, and this would result in the execution of a subroutine once per frame, provided, of course, that all the instructions could be executed in one frame period.
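As a worked example (the 27 MHz timer clock is an assumed figure, chosen because it is common in video systems, and a 25 Hz frame rate is assumed):

# Programming a timer for one interrupt per video frame.
# The clock and frame rate below are assumptions for illustration.
timer_clock_hz = 27_000_000
frame_rate_hz = 25
count = timer_clock_hz // frame_rate_hz   # 1 080 000 clocks per frame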

2.16 Memory management

The amount of memory a computer can have is determined by the wordlength of the memory address bus. A processor with a sixteen-bit address wordlength, for example, can only distinguish 2^16 = 65 536 (64 K) different addresses. Most computers have more memory than this, which leads to the question: how can the processor address large memories?

The solution is the use of memory management or mapping. Figure 2.34 shows that the program counter (PC) of the processor does not address the memory directly. Instead it passes through the memory management unit which is capable of adding constants to the address from the PC. As a result the actual memory location addressed (the physical address) is not what the processor thinks. This leads to the state of the processor PC being called the virtual address.

image

Figure 2.34 Using memory mapping, the processor produces only a virtual address which is converted to a physical address by the addition of an offset.

The constants are loaded into the memory management unit by the operating system. When a program is loaded from, for example, the disk drive, the operating system determines where in the memory the program will reside. This will obviously be a different area from the program currently executing. When the new program is to be executed, the existing program is halted and the memory mapping offsets are changed so that the same virtual address range now addresses the new program.
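A minimal sketch of the offset addition follows (all values invented): every virtual address issued by the processor has a constant added before it reaches the memory, so each program can believe it starts at virtual address zero.

# Memory mapping: physical address = virtual address + offset.
# Offsets are loaded by the operating system; the values are invented.

mapping_offset = 0x4000      # where the OS chose to load this program

def physical_address(virtual_address):
    return virtual_address + mapping_offset

# The processor 'thinks' it is executing at address 0x0100,
# but the RAM actually sees 0x4100.
assert physical_address(0x0100) == 0x4100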

Correct use of memory management means that user programs cannot access peripheral devices, so faulty user code cannot corrupt them. When a user program needs to obtain data from a peripheral, it must do so via a request to the operating system. As a result the most frequent mode change in the processor will be between running the user program and running the operating system. In addition to requiring a change of memory mapping constants, a mode change also requires that all of the processor's register contents are stored in memory so that they can be restored when the mode reverts. This takes time.

In some processor designs the CPU has more than one register set connected to the ALU and the buses. One of these is called the user set and the other is called the kernel set which is used by the operating system. This approach allows faster mode changes because the processor registers do not have to be saved and reloaded. Instead the processor simply switches register set.

Having kernel hardware makes computers much more secure and reliable. For example, if something goes wrong with a user program, it may crash and the user registers can be left in total chaos, but the error will alert the operating system, which will simply switch to the kernel register set and carry on working normally. If the memory management registers can only be changed from kernel mode, then there is no possible way in which a user program can change any of the computer's key settings or affect the peripherals, because user virtual addresses will never be mapped to peripheral or control register addresses. This makes computers virus-resistant.

It should be clear from the virus-prone, crash-riddled reputation of the PC that it lacks such features. PCs are consumer devices and their sole advantage is low cost. They are not suitable for critical applications.

2.17 The human interface

In many processes, direct interaction with the human operator is required. The system will present data to the human operator and the operator will present data to the system in the form of movements of the controls. The relative importance of the two data flows depends somewhat on the application. The skilled operator of a digital image production workstation will produce significantly more data than the viewer of a digital television set or a web browser, and this is reflected in the type of controls made available. The digital television can be controlled by a simple remote handset with a few buttons. The PC can be controlled with a keyboard and a mouse.

However, in applications which require speed and convenience, a greater number and variety of controls are needed. Professional operators must be able to adjust parameters immediately without having to descend through nested menus. The large number of controls on professional audio and video mixers is a direct consequence of that.

Figure 2.35 shows how a mouse works. The ball is turned in two dimensions as the mouse is moved over a flat surface and a pair of rollers mounted orthogonally (at 90°) operate pulse generators which are sensitive to direction. These may be optical and consist of slotted vanes. Two suitably positioned light beams falling on photocells will produce outputs in quadrature. The relative phase determines the direction and the frequency is proportional to speed. The pulses from the mouse move the cursor across the display screen until it is over one of the available functions. This function can then be selected by pressing a key on the mouse. A trackball is basically an inverted mouse where the operator rotates the ball in two dimensions with the fingertips.

image

Figure 2.35 A mouse converts motion of the ball in two axes into a series of pulses.
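The quadrature decoding can be sketched as follows (a simplified software model with invented names; a real mouse does this in hardware). On each edge of one channel, the state of the other channel reveals the direction.

# Decoding quadrature pulses from a mouse roller. Simplified model:
# channel A and channel B are square waves 90 degrees apart.

def decode_step(prev_a, a, b):
    """Return +1 or -1 on a rising edge of channel A, else 0."""
    if prev_a == 0 and a == 1:       # rising edge of A
        return +1 if b == 0 else -1  # the state of B gives the direction
    return 0

# Forward motion: B lags A by 90 degrees.
samples = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0), (1, 0)]
position = 0
prev_a = samples[0][0]
for a, b in samples[1:]:
    position += decode_step(prev_a, a, b)
    prev_a = a
print(position)   # 2: two rising edges of A, both in the forward direction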

An alternative is the touch-screen. This is a display which can sense when and where it has been touched by a fingertip. The mechanism may be ultrasonic, optical or capacitive, but it allows a large number of selections to be made quickly. The disadvantage of the touch-screen is that it is tiring to use for long periods, as the operator has to extend an arm for each touch.

In a digital audio or video mixer, the gain of each signal is controlled by hand-operated faders, just as in earlier analog machines. Analog faders may be retained and used to produce a varying voltage which is converted to a digital code or gain coefficient in an ADC, but it is also possible to obtain position codes directly in digital faders. Digital faders are a form of displacement transducer in which the mechanical position of the control is converted directly to a digital code. The position of other controls, such as jog wheels on VTRs or editors, will also need to be digitized. Controls can be linear or rotary, and absolute or relative. In an absolute control, the position of the knob determines the output directly. In a relative control, the knob can be moved to increase or decrease the output, but its absolute position is meaningless.

image

Figure 2.36 An absolute linear fader uses a number of light beams which are interrupted in various combinations according to the position of a grating. A Gray code shown in Figure 2.37 must be used to prevent false codes.

Figure 2.36 shows an absolute linear fader. A grating is moved with respect to several light beams, one for each bit of the coefficient required. The interruption of the beams by the grating determines which photocells are illuminated. It is not possible to use a pure binary pattern on the grating because mechanical tolerances result in transient false codes. Figure 2.37 shows some examples. On moving the fader from 3 (011) to 4 (100), if the MSB goes true slightly before the middle bit goes false, a momentary false value of 4 + 2 = 6 is read between 3 and 4. The solution is to use a code in which only one bit ever changes in going from one value to the next. One such code is the Gray code, which was devised to overcome timing hazards in relay logic but is now used extensively in position encoders. Gray code can be converted to binary in a suitable PROM, gate array or as a look-up table in software.

image

Figure 2.37 (a) Binary cannot be used for position encoders because mechanical tolerances cause false codes to be produced. (b) In Gray code, only one bit (arrowed) changes in between positions, so no false codes can be generated.
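The conversion can also be computed rather than looked up. In the reflected binary (Gray) code, each Gray bit is the exclusive-OR of a binary bit with the bit above it, so the conversions can be written as in this sketch:

# Converting between binary and Gray code (reflected binary code).

def binary_to_gray(n):
    return n ^ (n >> 1)

def gray_to_binary(g):
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

for value in range(8):
    gray = binary_to_gray(value)
    assert gray_to_binary(gray) == value
    # successive Gray codes differ in exactly one bit:
    if value:
        assert bin(gray ^ binary_to_gray(value - 1)).count("1") == 1

The assertions confirm the defining property: successive codes differ in exactly one bit, so no false intermediate code can occur.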

Note that in audio faders, the relationship between the gain and the fader position is logarithmic because this is how the ear senses loudness. The analog volume control in a transistor radio has a tapering resistance track to give this characteristic. In digital faders the logarithmic response can be obtained as an additional function of the Gray code look-up process.
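A sketch of how the logarithmic law can be folded into the same look-up process follows (the 60 dB range and 7-bit position code are assumptions for illustration):

# Building a look-up table which converts a 7-bit fader position code
# directly into a linear gain coefficient with a logarithmic (dB) law.
# The 60 dB range is an assumption for illustration.

db_range = 60.0
positions = 128                      # 7-bit position code

gain_table = []
for p in range(positions):
    if p == 0:
        gain_table.append(0.0)       # fader fully down: silence
    else:
        db = -db_range * (1 - p / (positions - 1))
        gain_table.append(10 ** (db / 20))

# position 127 (fader fully up) gives unity gain:
assert abs(gain_table[127] - 1.0) < 1e-9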

Figure 2.38 shows a rotary incremental encoder. This produces a sequence of pulses whose number is proportional to the angle through which it has been turned. The rotor carries a radial grating over its entire perimeter. This turns over a second fixed radial grating whose bars are not parallel to those of the first grating. The resultant moiré fringes travel inward or outward depending on the direction of rotation. These fringes can be detected by a pair of light beams and sensors, whose relative phase will indicate direction. The encoder outputs can be connected to a counter whose contents will increase or decrease according to the direction the rotor is turned. The counter provides the position output.

2.18 DSP

Although general-purpose computers can be programmed to process digital audio or image data, they are not ideal for the following reasons:

1 The number of arithmetic operations, particularly multiplications, is far higher than in data processing.

2 Processing is required in real time; data processors do not generally work in real time.

3 The program needed generally remains constant for the duration of a session, or changes slowly, whereas a data processor rapidly jumps between many programs.

4 Data processors can suspend a program on receipt of an interrupt; audio and image processors must work continuously for long periods.

5 Data processors tend to be I/O (input–output) limited, in that their operating speed is constrained by the problems of moving large quantities of data and instructions into the CPU.

Audio processors in contrast have a relatively small input and output rate, but compute intensively, whereas image processors also compute intensively but tend to outstrip the I/O capabilities of conventional computer architectures.

image

Figure 2.38 The fixed and rotating gratings produce moiré fringes which are detected by two light paths as quadrature sinusoids. The relative phase determines the direction, and the frequency is proportional to speed of rotation.

A common video process is spatial interpolation used for resizing or oversampling. Spatial filters compute each output pixel value as a function of all input pixel values over a finite-sized window. The windows for the output pixels have extensive overlap. In a conventional CPU, shortage of internal registers means that a filter algorithm would have to fetch the input pixel values within the window from memory for every output pixel to be calculated. With an 8 × 8 window size, one input pixel falls within 64 different windows with the result that the conventional processor would have to fetch the same value from the same location 64 times, whereas in principle it only needs to be fetched once.
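The scale of the redundancy is easily computed (the image size below is an assumption):

# Counting memory fetches for an 8 x 8 windowed filter over an image.
# With too few registers, every output pixel re-fetches its whole window.

width, height = 720, 576        # an assumed image size
window = 8 * 8                  # 64 input pixels per output pixel

naive_fetches = width * height * window   # about 26.5 million fetches
ideal_fetches = width * height            # each pixel fetched once
print(naive_fetches // ideal_fetches)     # 64 times more traffic than necessary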

This is sufficient justification for the development of specialized digital signal processors (DSPs). These units are equipped with more internal registers than data processors to facilitate implementation of, for example, multi-point filter algorithms. The arithmetic unit will be designed to offer high-speed multiply/accumulate using techniques such as pipelining, which allows operations to overlap. The functions of the register set and the arithmetic unit are controlled by a microsequencer which interprets the instructions in the program. Figure 2.39 shows the interior structure of a DSP chip.

image

Figure 2.39 A DSP (digital signal processor) is a specialized form of computer. (Courtesy of Texas Instruments.)
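The core operation such devices accelerate is the multiply/accumulate (MAC). A four-point FIR filter, for example, reduces to the inner loop below, which a DSP typically executes at one pipelined instruction per tap; the coefficients are invented for illustration.

# The multiply/accumulate (MAC) kernel of a 4-point FIR filter.
# On a DSP the loop body is typically one pipelined instruction per tap.

coefficients = [0.1, 0.4, 0.4, 0.1]    # invented filter coefficients

def fir(samples):
    outputs = []
    for n in range(len(coefficients) - 1, len(samples)):
        acc = 0.0                      # the accumulator register
        for k, c in enumerate(coefficients):
            acc += c * samples[n - k]  # multiply/accumulate
        outputs.append(acc)
    return outputs

# An impulse returns the coefficients: [0.1, 0.4, 0.4, 0.1]
print(fir([0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]))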

Where a DSP is designed specifically for image processing, it is possible to incorporate one CPU per pixel. With a massively parallel approach such as this, the speed of each CPU can be very slow and it can be implemented serially, making it trivially easy to optimize the wordlength of the calculation to the accuracy requirement.

DSPs are used in many other industries where waveforms which were originally analog need to be manipulated in the digital domain. In fact this is probably the best definition of DSP which distinguishes it from computation in general. Equipment intended for convergent audio/video systems can take advantage of DSP devices designed for applications such as synthetic aperture radar and pattern recognition.

image

Figure 2.40 (a) A simple mixer built conventionally. (b) The same mixer implemented with DSP. The instructions at (c) operate the DSP.

Figure 2.40(a) shows a simple digital mixer which accepts two PCM inputs, sets the gain of each and then mixes (adds) the two together. The sum will have increased in wordlength and must be digitally dithered prior to rounding to the required output wordlength.

Figure 2.40(b) shows a simple DSP system which is designed to do the same job. The hardware is trivial; a few ports and a DSP chip (known colloquially as an ‘engine’). The program which is needed to operate the DSP is shown in (c). This has been written in English rather than in DSP language, which is incomprehensible to most humans. If all the steps in the program are executed in turn, the output value ought to be the same as if the hardware of (a) had been used.
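A sketch of that program in real code follows (Python here for readability rather than a DSP instruction set; the gains and wordlength are invented, and the rectangular dither is an illustrative choice, not the only possibility):

# The mixer of Figure 2.40 sketched in software. Gains and wordlengths
# are invented; the rectangular dither is an illustrative choice.

import random

def mix_sample(a, b, gain_a, gain_b, out_bits=16):
    """Scale two PCM samples by their gain coefficients, add, dither, round."""
    mixed = a * gain_a + b * gain_b          # wordlength grows here
    mixed += random.uniform(-0.5, 0.5)       # dither before shortening wordlength
    result = int(round(mixed))               # round to the output wordlength
    limit = 2 ** (out_bits - 1)
    return max(-limit, min(limit - 1, result))  # clamp to the legal range

out = mix_sample(a=1000, b=-2000, gain_a=0.5, gain_b=0.25)  # 500 - 500 = 0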

One problem is that the DSP engine is designed to run as fast as its technology allows, whereas in PCM systems results are required at the signal sampling rate. This is solved by using interrupts. The interrupt signal can occur at any time with respect to the processor clock without causing difficulty, as it will only be examined when an instruction has been completed, prior to executing another one. The normal program is suspended, and a different program, known as a subroutine, is executed instead. When the subroutine is completed, the normal program resumes. In a PCM DSP application, the normal program may simply be an idling program which does nothing useful, or which rotates the lights on the front panel. The sample calculation is contained in the subroutine. The master sampling rate clock from a phase-locked loop is then used to generate interrupts to the DSP just after input samples have been made available.

Figure 2.41 shows that if this is done the subroutine is executed at the sampling rate with idling periods between. In practice this is only true if the subroutine is short enough to be executed within the sample period.

image

Figure 2.41 Synchronizing a signal processor with real time using interrupts. The processing is carried out by a subroutine.

If it can't, a more elegant program or a more powerful ‘engine’ must be sought.
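The overall structure is then as below, a schematic sketch in which each pass of the loop stands for one sampling-rate interrupt:

# Schematic of Figure 2.41: an idling program interrupted at the
# sampling rate, with the sample calculation in a subroutine.

def subroutine(n):
    """Compute one output sample; must finish within one sample period."""
    return n                        # stands in for e.g. the mixer calculation

def run(total_samples):
    for n in range(total_samples):  # each pass represents one interrupt
        subroutine(n)
        # ...then idle (or rotate the front-panel lights) until the next one

run(48000)                          # one second's worth at an assumed 48 kHz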

2.19 Multiplexing principles

Multiplexing is used where several signals are to be transmitted down the same channel. The channel bit rate must be the same as or greater than the sum of the source bit rates. Figure 2.42 shows that when multiplexing is used, the data from each source has to be time compressed. This is done by buffering source data in a memory at the multiplexer. It is written into the memory in real time as it arrives, but will be read from the memory with a clock which has a much higher rate. This means that the readout occurs in a smaller timespan. If, for example, the clock frequency is raised by a factor of ten, the data for a given signal will be transmitted in a tenth of the normal time, leaving time in the multiplex for nine more such signals.

image

Figure 2.42 Multiplexing requires time compression on each input.
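Time compression is easily visualized with some arithmetic (the bit rates are invented):

# Time compression for multiplexing: data written at the source rate
# is read out ten times faster, leaving room for nine more signals.

source_rate = 1_000_000           # 1 Mbit/s per signal (invented)
channel_rate = 10_000_000         # 10 Mbit/s multiplex (invented)

bits_per_block = 10_000
write_time_ms = 1000 * bits_per_block // source_rate   # 10 ms to fill the buffer
read_time_ms = 1000 * bits_per_block // channel_rate   # 1 ms to empty it
signals_in_multiplex = channel_rate // source_rate     # 10 signals fit

print(write_time_ms, read_time_ms, signals_in_multiplex)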

In the demultiplexer another buffer memory will be required. Only the data for the selected signal will be written into this memory at the bit rate of the multiplex. When the memory is read at the correct speed, the data will emerge with its original timebase.

In practice it is essential to have mechanisms to identify the separate signals to prevent them being mixed up and to convey the original signal clock frequency to the demultiplexer. In time-division multiplexing the timebase of the transmission is broken into equal slots, one for each signal. This makes it easy for the demultiplexer, but forces a rigid structure on all the signals such that they must all be locked to one another and have an unchanging bit rate. Packet multiplexing overcomes these limitations.

2.20 Packets

The multiplexer must switch between different time-compressed signals to create the bitstream, and this is much easier to organize if each signal is in the form of data packets of constant size. This approach is commonly used in networks such as Ethernet and ATM (see Chapter 12) as well as in MPEG transport streams. Figure 2.43 shows a packet multiplexing system.

image

Figure 2.43 Packet multiplexing relies on headers to identify the packets.

Each packet consists of two components: the header, which identifies the packet, and the payload, which is the data to be transmitted. The header will contain at least an identification code (ID) which is unique for each signal in the multiplex. The demultiplexer checks the ID codes of all incoming packets and discards those which do not have the wanted ID.

In complex systems it is common to have a mechanism to check that packets are not lost or repeated. This is the purpose of the packet continuity count which is carried in the header. For packets carrying the same ID, the count should increase by one from one packet to the next. Upon reaching the maximum binary value, the count overflows and recommences.
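A demultiplexer's inner loop can be sketched as follows; the header layout and field names are invented and imply no particular standard. Note that stuffing packets are rejected by the same ID check.

# A sketch of packet demultiplexing with a continuity-count check.
# Header fields are invented; no particular standard is implied.

COUNT_MODULUS = 16          # a 4-bit continuity count, for illustration

def demultiplex(packets, wanted_id):
    payloads = []
    expected = None
    for packet in packets:
        if packet["id"] != wanted_id:
            continue                      # discard unwanted (or stuffing) IDs
        if expected is not None and packet["count"] != expected:
            raise ValueError("packet lost or repeated")
        expected = (packet["count"] + 1) % COUNT_MODULUS
        payloads.append(packet["payload"])
    return payloads

stream = [{"id": 1, "count": 0, "payload": b"A"},
          {"id": 2, "count": 0, "payload": b"X"},   # a different signal
          {"id": 1, "count": 1, "payload": b"B"}]
print(demultiplex(stream, wanted_id=1))   # [b'A', b'B']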

2.21 Statistical multiplexing

Packet multiplexing has advantages over time-division multiplexing because it does not set the bit rate of each signal. A demultiplexer simply checks packet IDs and selects all packets with the wanted code. It will do this however frequently such packets arrive. Consequently it is practicable to have variable bit rate signals in a packet multiplex. The multiplexer has to ensure that the total bit rate does not exceed the rate of the channel, but that rate can be allocated arbitrarily between the various signals.

As a practical matter it is usually necessary to keep the bit rate of the multiplex constant. With variable-rate inputs this is done by creating null packets, generally called stuffing or packing. The headers of these packets contain a unique ID which the demultiplexer does not recognize, and so these packets are discarded on arrival.

In an MPEG environment, statistical multiplexing can be extremely useful because it allows for the varying difficulty of real program material. In a multiplex of several television programs, it is unlikely that all the programs will encounter difficult material simultaneously. When one program encounters a detailed scene or frequent cuts which are hard to compress, more data rate can be allocated at the allowable expense of the remaining programs which are handling easy material.

In a network using statistical multiplexing, such as ATM, efficient use of transmission capacity can be made by combining mixed data types with different priorities. If real-time MPEG video is being carried, clearly this must be given priority over non-real-time data to provide the required quality of service. However, when the instantaneous bit rate of the video falls, an ATM system can increase the amount of non-real-time data sent.

2.22 Networks

In the most general sense a network is a means of communication between a large number of places. Some networks deliver physical objects. If, however, we restrict the delivery to information only the result is a telecommunications network. The telephone system is a good example of a telecommunications network because it displays most of the characteristics of later networks.

It is fundamental in a network that any port can communicate with any other port. Figure 2.44 shows a primitive three-port network. Each port need only select one or other of the two remaining ports, which is a trivial switching requirement. However, if Figure 2.44 were redrawn with one hundred ports, each one would need a 99-way switch and the number of wires needed would be phenomenal. Another approach is needed.

Figure 2.45 shows that the common solution is to have an exchange, also known as a router, hub or switch, which is connected to every port by a single cable. In this case when a port wishes to communicate with another, it instructs the switch to make the connection. The complexity of the switch varies with its performance. The minimal case may be to install a single input selector and a single output selector. This allows any port to communicate with any other, but only one at a time. If more simultaneous communications are needed, further switching is needed. The extreme case is where every possible pair of ports can communicate simultaneously.

image

Figure 2.44 A simple three-port network has trivial switching requirements.

image

Figure 2.45 A network implemented with a router or hub.

The amount of switching logic needed to implement the extreme case is phenomenal and in practice it is unlikely to be needed. One fundamental property of networks is that they are seldom implemented with the extreme case supported. There will be an economic decision made balancing the number of simultaneous communications with the equipment cost. Most of the time the user will be unaware that this limit exists, until there is a statistically abnormal condition which causes more than the usual number of nodes to attempt communication. This is known as congestion in communications parlance.
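Some simple arithmetic shows the scale of the problem (taking the one hundred ports mentioned above):

# Why full any-to-any switching is rarely built: with N ports there are
# N*(N-1)/2 possible connections, but at most N/2 can be in use at once.

N = 100
possible_pairs = N * (N - 1) // 2   # 4950 connections the switch may be asked for
simultaneous = N // 2               # at most 50 can exist at any one time
print(possible_pairs, simultaneous)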

The phrase ‘the switchboard was jammed’ has passed into the language and stayed there despite the fact that manual switchboards are only seen in museums. This is a characteristic of networks. They generally only work up to a certain throughput and then there are problems. This doesn't mean that networks aren't useful, far from it. What it means is that with care, networks can be very useful, but without care they can be a nightmare.

There are two key factors to address in a network. The first is that it must have enough throughput, bandwidth or connectivity to handle the anticipated usage and the second is that a priority system or algorithm is chosen which has appropriate behaviour during congestion. These two characteristics are quite different, but often come as a pair in a network corresponding to a particular standard.

image

Figure 2.46 Radial network at (a) has one cable per node. TDM network (b) shares time slots on a single cable.

Where each device is individually cabled, the result is a radial network shown in Figure 2.46(a). It is not necessary to have one cable per device and several devices can co-exist on a single cable if some form of multiplexing is used. This might be time-division multiplexing (TDM) or frequency division multiplexing (FDM). In TDM, shown in (b), the time axis is divided into steps which may or may not be equal in length. In Ethernet, for example, these are called frames, whereas ATM calls them cells. During each time step or frame a pair of nodes have exclusive use of the cable. At the end of the time step another pair of nodes can communicate. Rapidly switching between steps gives the illusion of simultaneous transfer between several pairs of nodes. In FDM, simultaneous transfer is possible because each message occupies a different band of frequencies in the cable. This approach can also be used in optical fibres where light of several different wavelengths can be used. Each node has to ‘tune’ to the correct signal. In practice it is possible to combine FDM and TDM. Each frequency band can be time multiplexed in some applications.

Data networks originated to serve the requirements of computers and it is a simple fact that most computer processes don't need to be performed in real time or indeed at a particular time at all. Networks tend to reflect that background as many of them, particularly the older ones, are asynchronous.

Asynchronous means that the time taken to deliver a given quantity of data is unknown. A TDM system may chop the data into several different transfers and each transfer may experience delay according to what other transfers the system is engaged in. Ethernet and most storage system buses are asynchronous. For broadcasting purposes an asynchronous delivery system is no use at all, but for copying a video data file between two storage devices an asynchronous system is perfectly adequate.

The opposite extreme is the synchronous system in which the network can guarantee a constant delivery rate and a fixed and minor delay. An AES/EBU digital audio router or an SDI digital video router is a synchronous network.

In between asynchronous and synchronous networks reside the isochronous approaches which cause a fixed moderate delay. These can be thought of as sloppy synchronous networks or more rigidly controlled asynchronous networks.

These three different approaches are needed for economic reasons. Asynchronous systems are very efficient because as soon as one transfer completes another one can begin. This can only be achieved by making every device wait with its data in a buffer so that transfer can start immediately. Asynchronous systems also make it possible for low bit rate devices to share a network with high bit rate ones. The low bit rate device will only need a small buffer and will send few cells, whereas the high bit rate device will send more cells.

Isochronous systems try to give the best of both worlds, generally by sacrificing some flexibility in block size. Modern networks are tending to be part isochronous and part asynchronous so that the advantages of both are available. Details of this approach will be found in Chapter 12.
