
INTRODUCTION

Starting with binary data, one can define a process to manipulate those data entirely mathematically. It is important to understand the process before considering the implementation. A large number of processing mechanisms exist today, all of which must give the same result if properly engineered. A fixed set of gates may be hardwired to implement a process where the highest speed is required, whereas a general-purpose processor can implement a process under software control at lower speed but with greater flexibility. Somewhere between those extremes, a field programmable gate array (FPGA) is a set of logic gates whose configuration can be externally programmed.

There will be a fixed number of bits in a PCM (Pulse Code Modulation) video sample, and this number determines the size of the quantizing range. In the 8-bit samples used in much digital video equipment, there are 256 different numbers. Each number represents a different analog signal voltage, and care must be taken during conversion to ensure that the signal does not go outside the convertor range, or it will be clipped. In Figure 3.2a it can be seen that in an 8-bit pure binary system, the number range goes from 00hex, which represents the smallest voltage, through to FFhex, which represents the largest positive voltage. The video waveform must be accommodated within this voltage range, and Figure 3.2b shows how this can be done for a PAL composite signal. A luminance signal is shown in Figure 3.2c. As component digital systems handle only the active line, the quantizing range is optimized to suit the gamut of the unblanked luminance. There is a small offset to handle slightly misadjusted inputs.

PURE BINARY CODE

For digital video use, the prime purpose of binary numbers is to express the values of the samples that represent the original analog video waveform. Figure 3.1 shows some binary numbers and their equivalent in decimal. The radix point has the same significance in binary: symbols to the right of it represent one-half, one-quarter, and so on. Binary is convenient for electronic circuits, which do not get tired, but numbers expressed in binary become very long, and writing them is tedious and error-prone. The octal and hexadecimal notations are both used for writing binary because conversion is so simple. Figure 3.1 also shows that a binary number is split into groups of three or four digits starting at the least significant end, and the groups are individually converted to octal or hexadecimal digits. Because 16 different symbols are required in hex, the letters A–F are used for the numbers above 9.
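As a brief illustration of this grouping, the following Python sketch converts a binary string to octal and hexadecimal in the manner of Figure 3.1; the function names are illustrative only.

```python
def group_bits(bits: str, size: int) -> list:
    """Split a binary string into groups of `size` bits, starting at the LSB end."""
    pad = (-len(bits)) % size            # pad the MSB end so every group is complete
    bits = "0" * pad + bits
    return [bits[i:i + size] for i in range(0, len(bits), size)]

def to_octal(bits: str) -> str:
    # Groups of 3 bits each become one octal digit, 0-7.
    return "".join(str(int(g, 2)) for g in group_bits(bits, 3))

def to_hex(bits: str) -> str:
    # Groups of 4 bits each become one hex digit, 0-F.
    return "".join("0123456789ABCDEF"[int(g, 2)] for g in group_bits(bits, 4))

print(to_octal("11010110"))   # 326 (octal)
print(to_hex("11010110"))     # D6  (hex)
```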

image

FIGURE 3.1

(a) Binary and decimal.

image

FIGURE 3.1

(b) In octal, groups of 3 bits make one symbol, 0–7. (c) In hex, groups of 4 bits make one symbol, 0–F. Note how much shorter the number is in hex.

Colour difference signals are bipolar and so blanking is in the centre of the signal range. To accommodate colour difference signals in the quantizing range, the blanking voltage level of the analog waveform has been shifted as in Figure 3.2d so that the positive and negative voltages in a real signal can be expressed by binary numbers that are only positive. This approach is called offset binary. Strictly speaking both the composite and the luminance signals are also offset binary, because the blanking level is partway up the quantizing scale.

Offset binary is perfectly acceptable where the signal has been digitized only for recording or transmission from one place to another, after which it will be converted directly back to analog. Under these conditions it is not necessary for the quantizing steps to be uniform, provided both the ADC (analog-to-digital convertor) and the DAC (digital-to-analog convertor) are constructed to the same standard. In practice, it is the requirements of signal processing in the digital domain that make both nonuniform quantizing and offset binary unsuitable.

image

FIGURE 3.2

(a) The unipolar quantizing range of an 8-bit pure binary system. (b) The analog input must be shifted to fit into the quantizing range, as shown for PAL. (c) In component, sync pulses are not digitized, so the quantizing intervals can be smaller. (d) An offset of half-scale is used for colour difference signals.

Figure 3.3 shows that analog video signal voltages are referred to blanking. The level of the signal is measured by how far the waveform deviates from blanking, and attenuation, gain, and mixing all take place around blanking level. Digital vision mixing is achieved by adding sample values from two or more different sources, but unless all the quantizing intervals are of the same size and there is no offset, the sum of two sample values will not represent the sum of the two original analog voltages. Thus sample values that have been obtained by nonuniform or offset quantizing cannot readily be processed because the binary numbers are not proportional to the signal voltage.

image

FIGURE 3.3

All video signal voltages are referred to blanking and must be added with respect to that level.

image

FIGURE 3.4

The result of an attempted attenuation in pure binary code is an offset. Pure binary cannot be used for digital video processing.

If two offset binary sample streams are added together in an attempt to perform digital mixing, the result will be that the offsets are also added and this may lead to an overflow. Similarly, if an attempt is made to attenuate by, say, 6.02 dB by dividing all the sample values by 2, Figure 3.4 shows that the offset is also divided and the waveform suffers a shifted baseline. This problem can be overcome with digital luminance signals simply by subtracting the offset from each sample before processing to obtain numbers truly proportional to the luminance voltage. This approach is not suitable for colour difference or composite signals because negative numbers would result when the analog voltage goes below blanking and pure binary coding cannot handle them. The problem with offset binary is that it works with reference to one end of the range. Chapter 1 introduced the two's complement numbering scheme, which operates symmetrically with reference to the centre of the range.
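The baseline shift of Figure 3.4 can be illustrated numerically. The sketch below assumes 8-bit samples with blanking placed at 128 in the offset binary case; the sample values are arbitrary.

```python
# Offset binary: blanking sits at 128, so halving the samples also halves the offset
# and the baseline shifts from 128 down to 64.
offset_samples = [128, 160, 192, 160, 128]          # a small excursion above blanking
print([s // 2 for s in offset_samples])             # [64, 80, 96, 80, 64] -> shifted baseline

# Two's complement: blanking is 0, so halving leaves the baseline where it was.
twos_samples = [0, 32, 64, 32, 0, -32]
print([s // 2 for s in twos_samples])               # [0, 16, 32, 16, 0, -16]
```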

DIGITAL PROCESSING

However complex a digital process, it can be broken down into smaller stages until finally one finds that there are really only two basic types of element in use, and these can be combined in some way and supplied with a clock to implement virtually any process. Figure 3.5 shows that the first type is a logic element. This produces an output that is a logical function of the input with minimal delay. The second type is a storage element that samples the state of the input(s) when clocked and holds or delays that state. The strength of binary logic is that the signal has only two states, and considerable noise and distortion of the binary waveform can be tolerated before the state becomes uncertain. At every logic element, the signal is compared with a threshold and can thus pass through any number of stages without being degraded. In addition, the use of a storage element at regular locations throughout logic circuits eliminates time variations or jitter. Figure 3.5 shows that if the inputs to a logic element change, the output will not change until the propagation delay of the element has elapsed. However, if the output of the logic element forms the input to a storage element, the output of that element will not change until the input is sampled at the next clock edge. In this way the signal edge is aligned to the system clock and the propagation delay of the logic becomes irrelevant. The process is known as reclocking.

image

FIGURE 3.5

Logic elements have a finite propagation delay between input and output and cascading them delays the signal an arbitrary amount. Storage elements sample the input on a clock edge and can return a signal to near coincidence with the system clock. This is known as reclocking. Reclocking eliminates variations in propagation delay in logic elements.

image

FIGURE 3.6

Using an open-collector drive, several signal sources can share one common bus. If negative logic is used, the bus drivers turn off their output transistors with a false input, allowing another driver to control the bus. This will not happen with positive logic.

LOGIC ELEMENTS

In logic systems, all logical functions, however complex, can be configured from combinations of a few fundamental logic elements or gates. It is not profitable to spend too much time debating which are the truly fundamental ones, because most can be made from combinations of others. Figure 3.7 shows the important simple gates and their derivatives and introduces the logical expressions to describe them, which can be compared with the truth table notation. The figure also shows the important fact that when negative logic is used, the OR gate function interchanges with that of the AND gate. Sometimes schematics are drawn to reflect which voltage state represents the true condition. In the so-called intentional logic scheme, a negative logic signal always starts and ends at an inverting “bubble.” If an AND function is required between two negative logic signals, it will be drawn as an AND symbol with bubbles on all the terminals, even though the component used will be a positive logic OR gate. Opinions vary on the merits of intentional logic.

image

FIGURE 3.7

The basic logic gates compared.

The two states of the signal when measured with an oscilloscope are simply two voltages, usually referred to as high and low. The actual voltage levels will depend on the type of logic family in use and on the supply voltage used. Within logic, these levels are not of much consequence, and it is necessary to know them only when interfacing between different logic families or when driving external devices. The pure logic designer is not interested at all in these voltages, only in their meaning. Just as the electrical waveform from a microphone represents sound velocity, so the waveform in a logic circuit represents the truth of some statement. As there are only two states, there can only be true or false meanings. The true state of the signal can be assigned by the designer to either voltage state. When a high voltage represents a true logic condition and a low voltage represents a false condition, the system is known as positive, or high, true logic. This is the usual system, but sometimes the low voltage represents the true condition and the high voltage represents the false condition. This is known as negative or low true logic. Provided that everyone is aware of the logic convention in use, both work equally well.

Negative logic is often found in the TTL (transistor transistor logic) family, because in this technology it is easier to sink current to ground than to source it from the power supply. Figure 3.6 shows that if it is necessary to connect several logic elements to a common bus so that any one can communicate with any other, an open collector system is used, in which high levels are provided by pull-up resistors and the logic elements pull only the common line down. If positive logic were used, when no device was operating the pull-up resistors would cause the common line to take on an absurd true state, whereas if negative logic were used, the common line would pull up to a sensible false condition when there was no device using the bus. Whilst the open collector is a simple way of obtaining a shared bus system, it is limited in frequency of operation due to the time constant of the pull-up resistors charging the bus capacitance. In the so-called tristate bus systems, there are both active pull-up and active pull-down devices connected in the so-called totem-pole output configuration. Both devices can be disabled to a third state, in which the output assumes a high impedance, allowing some other driver to determine the bus state.

If numerical quantities need to be conveyed down the two-state signal paths described here, then the only appropriate numbering system is binary, which has only two symbols, 0 and 1. Just as positive or negative logic could represent the truth of a logical binary signal, the same choice is available to a numerical binary signal. Normally, a high voltage level will represent a binary 1 and a low voltage will represent a binary 0, described as a “high for a 1” system. Clearly a “low for a 1” system is just as feasible. Decimal numbers have several columns, each of which represents a different power of 10; in binary the column position specifies the power of 2.

STORAGE ELEMENTS

The basic memory element in logic circuits is the latch, which is constructed from two gates as shown in Figure 3.8a and which can be set or reset. A more useful variant is the D-type latch shown in Figure 3.8b, which remembers the state of the input either at the time a separate clock changes state for an edge-triggered device or after it goes false for a level-triggered device. D-type latches are commonly available with four or eight latches to the chip. A shift register can be made from a series of latches by connecting the Q output of one latch to the D input of the next and connecting all the clock inputs in parallel. Data are delayed by the number of stages in the register. Shift registers are also useful for converting between serial and parallel data transmissions.

Where large numbers of bits are to be stored, cross-coupled latches are less suitable because they are more complicated to fabricate inside integrated circuits than dynamic memory and consume more current.

image

FIGURE 3.8

Digital semiconductor memory types. In (a), one data bit can be stored in a simple set–reset latch, which has little application because the D-type latch in (b) can store the state of the single data input when the clock occurs. These devices can be implemented with bipolar transistors or FETs and are called static memories because they can store indefinitely. They consume a lot of power.

image

FIGURE 3.8

In (c), a bit is stored as the charge in a potential well in the substrate of a chip. It is accessed by connecting the bit line with the field effect from the word line. The single well where the two lines cross can then be written or read. These devices are called dynamic RAMs because the charge decays, and they must be read and rewritten (refreshed) periodically.

In large random access memories (RAMs), the data bits are stored as the presence or absence of charge in a tiny capacitor as shown in Figure 3.8c. The capacitor is formed by a metal electrode, insulated by a layer of silicon dioxide from a semiconductor substrate; hence the term MOS (metal oxide semiconductor). The charge will suffer leakage, and the value would become indeterminate after a few milliseconds. Where the delay needed is less than this, decay is of no consequence, as data will be read out before they have had a chance to decay. Where longer delays are necessary, such memories must be refreshed periodically by reading the bit value and writing it back to the same place. Most modern MOS RAM chips have suitable circuitry built in. Large RAMs store thousands of bits, and it is clearly impractical to have a connection to each one. Instead, the desired bit has to be addressed before it can be read or written. The size of the chip package restricts the number of pins available, so that large memories use the same address pins more than once. The bits are arranged internally as rows and columns, and the row address and the column address are specified sequentially on the same pins.

Several binary digits or bits are needed to express the value of a binary video sample. These bits can be conveyed at the same time by several signals to form a parallel system, which is most convenient inside equipment or for short distances because it is inexpensive, or one at a time down a single signal path, which is more complex, but convenient for cables between pieces of equipment because the connectors require fewer pins. When a binary system is used to convey numbers in this way, it can be called a digital system.

BINARY ADDITION

The circuitry necessary for adding pure binary or two's complement numbers is shown in Figure 3.9. Addition in binary requires 2 bits to be taken at a time from the same position in each word, starting at the least significant bit. Should both be 1’s, the output is 0, and there is a carry-out generated. Such a circuit is called a half-adder, shown in Figure 3.9a, and is suitable for the least-significant bit of the calculation. All higher stages will require a circuit that can accept a carry input as well as two data inputs. This is known as a full adder (Figure 3.9b). Multibit full adders are available in chip form and have carry-in and carry-out terminals to allow them to be cascaded to operate on long word lengths. Such a device is also convenient for inverting a two's complement number, in conjunction with a set of inverters. The adder chip has one set of inputs grounded, and the carry-in permanently held true, such that it adds 1 to the one's complement number from the inverter.
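The following Python sketch models the half adder, the full adder, and the ripple-carry cascade of Figure 3.9 at the level of individual bits; it is a behavioural illustration, not a hardware description.

```python
def half_adder(a: int, b: int):
    """Sum and carry of two single bits: sum = a XOR b, carry = a AND b."""
    return a ^ b, a & b

def full_adder(a: int, b: int, carry_in: int):
    """Two data bits plus a carry-in, as used in all stages above the LSB."""
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, c1 | c2

def ripple_add(a_bits, b_bits):
    """Cascade full adders from the LSB upward (both inputs LSB-first)."""
    carry, out = 0, []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry                      # carry is the final carry-out

# 0101 (5) + 0011 (3) = 1000 (8), expressed LSB first:
print(ripple_add([1, 0, 1, 0], [1, 1, 0, 0]))   # ([0, 0, 0, 1], 0)
```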

When mixing by adding sample values, care has to be taken to ensure that if the sum of the two sample values exceeds the number range the result will be clipping rather than wraparound. In two's complement, the action necessary depends on the polarities of the two signals. Clearly if one positive and one negative number are added, the result cannot exceed the number range. If two positive numbers are added, the symptom of positive overflow is that the most significant bit sets, causing an erroneous negative result, whereas a negative overflow results in the most significant bit clearing. The overflow control circuit will be designed to detect these two conditions and override the adder output. If the MSB (most significant bit) of both inputs is 0, the numbers are both positive, thus if the sum has the MSB set, the output is replaced with the maximum positive code (0111…). If the MSB of both inputs is set, the numbers are both negative, and if the sum has no MSB set, the output is replaced with the maximum negative code (1000…). These conditions can also be connected to warning indicators. Figure 3.9c shows this system in hardware. The resultant clipping on overload is sudden, and sometimes a PROM is included, which translates values around and beyond maximum to soft-clipped values below or equal to maximum.
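A minimal sketch of the overflow-control rule just described, assuming 8-bit two's complement samples (the function name is illustrative):

```python
def clip_add(a: int, b: int, bits: int = 8) -> int:
    """Two's complement addition with clipping instead of wraparound."""
    max_pos = (1 << (bits - 1)) - 1        # 0111... = +127 for 8 bits
    max_neg = -(1 << (bits - 1))           # 1000... = -128 for 8 bits
    s = a + b
    if a >= 0 and b >= 0 and s > max_pos:  # both inputs positive, sum overflows positive
        return max_pos
    if a < 0 and b < 0 and s < max_neg:    # both inputs negative, sum overflows negative
        return max_neg
    return s

print(clip_add(100, 60))    # 127, not the wrapped value -96
print(clip_add(-100, -60))  # -128
```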

A storage element can be combined with an adder to obtain a number of useful functional blocks, which will crop up frequently in digital video equipment.

image

image

FIGURE 3.9

(a) Half adder. (b) Full-adder circuit and truth table. (c) Comparison of sign bits prevents wraparound on adder overflow by substituting clipping level.

Figure 3.10a shows that a latch is connected in a feedback loop around an adder. The latch contents are added to the input each time it is clocked. The configuration is known as an accumulator in computation because it adds up or accumulates values fed into it. In filtering, it is known as a discrete time integrator. If the input is held at some constant value, the output increases by that amount on each clock. The output is thus a sampled ramp.
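Both configurations of Figure 3.10 can be sketched behaviourally as follows; word-length growth is ignored for clarity.

```python
def accumulate(samples):
    """Figure 3.10a behaviour: the latch feeds the previous sum back to the adder."""
    total, out = 0, []
    for x in samples:
        total += x          # latch contents added to the input on each clock
        out.append(total)
    return out

def differentiate(samples):
    """Figure 3.10b behaviour: the inverter makes each output the difference
    between successive inputs."""
    prev, out = 0, []
    for x in samples:
        out.append(x - prev)
        prev = x
    return out

print(accumulate([3, 3, 3, 3]))      # [3, 6, 9, 12] -> a sampled ramp
print(differentiate([0, 2, 5, 5]))   # [0, 2, 3, 0]  -> slope of the input
```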

image

FIGURE 3.10

Two configurations common in processing. In (a) the feedback around the adder adds the previous sum to each input to perform accumulation or digital integration. In (b) an inverter allows the difference between successive inputs to be computed. This is differentiation.

MODULO-N ARITHMETIC

Conventional arithmetic that is in everyday use relates to the real world of counting actual objects, and to obtain correct answers the concepts of borrow and carry are necessary in the calculations. There is an alternative type of arithmetic that has no borrow or carry, which is known as modulo arithmetic. In modulo-N no number can exceed N. If it does, N or whole multiples of N are subtracted until it does not. Thus 25 modulo-16 is 9 and 12 modulo-5 is 2. The count shown in Figure 3.11 is from a 4-bit device that overflows when it reaches 1111 because the carry-out is ignored. If a number of clock pulses M are applied from the zero state, the state of the counter will be given by M Mod.16. Thus modulo arithmetic is appropriate to systems in which there is a fixed word length and this means that the range of values the system can have is restricted by that word length. A number range that is restricted in this way is called a finite field.

Modulo-2 is a numbering scheme that is used frequently in digital processes. Figure 3.12 shows that in modulo-2 the conventional addition and subtraction are replaced by the XOR function such that

A + B Mod.2 = A XOR B.

When multibit values are added Mod.2, each column is computed quite independently of any other. This makes Mod.2 circuitry very fast in operation, as it is not necessary to wait for the carries from lower-order bits to ripple up to the higher-order bits. Modulo-2 arithmetic is not the same as conventional arithmetic and takes some getting used to. For example, adding something to itself in Mod.2 always gives the answer 0.
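In software the Mod.2 addition of multibit values is simply the bitwise XOR, as this short sketch shows.

```python
def mod2_add(a: int, b: int) -> int:
    """Modulo-2 addition of multibit values: each column is XORed independently,
    with no carries between columns."""
    return a ^ b

x = 0b1011
print(bin(mod2_add(x, 0b0110)))   # 0b1101
print(mod2_add(x, x))             # 0 -- adding a value to itself always gives zero
```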

image

FIGURE 3.11

As a fixed-word-length counter cannot hold the carry-out bit, it will resume at 0. Thus a 4-bit counter expresses every count as a modulo-16 number.

image

FIGURE 3.12

In modulo-2 calculations, there can be no carry or borrow operations and conventional addition and subtraction become identical. The XOR gate is a modulo-2 adder.

Figure 3.10b shows that the addition of an inverter allows the difference between successive inputs to be obtained. This is digital differentiation. The output is proportional to the slope of the input.

GAIN CONTROL BY MULTIPLICATION

When a digital recording is made, the gain of the analog input will usually be adjusted so that the quantizing range is fully exercised to make a recording of maximum signal-to-noise ratio. During postproduction, the recording may be played back and mixed with other signals, and the desired effect can be achieved only if the level of each can be controlled independently. Gain is controlled in the digital domain by multiplying each sample value by a coefficient. If that coefficient is less than 1, attenuation will result; if it is greater than 1, amplification can be obtained.

Multiplication in binary circuits is difficult. It can be performed by repeated adding, but this is too slow to be of any use. In fast multiplication, one of the inputs will simultaneously be multiplied by 1, 2, 4, etc., by hardwired bit shifting. Figure 3.13 shows that the other input bits will determine which of these powers will be added to produce the final sum and which will be neglected. If multiplying by 5, the process is the same as multiplying by 4, multiplying by 1, and adding the two products. This is achieved by adding the input to itself shifted two places. As the word length of such a device increases, the complexity increases rapidly, so this is a natural application for an integrated circuit. It is probably true that digital video would not have been viable without such chips.
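The shift-and-add principle of Figure 3.13 can be sketched in a few lines; in hardware the shifts are hardwired and the additions take place in parallel, whereas this illustrative Python loop performs them sequentially.

```python
def shift_add_multiply(a: int, b: int) -> int:
    """Figure 3.13 principle: A is shifted to give A*1, A*2, A*4, ...
    and the bits of B select which shifted copies reach the adder."""
    product = 0
    shift = 0
    while b:
        if b & 1:                    # this bit of B enables the corresponding multiple of A
            product += a << shift
        b >>= 1
        shift += 1
    return product

print(shift_add_multiply(13, 5))     # 65: (13 << 2) + (13 << 0), i.e. multiply by 4 and by 1
```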

THE COMPUTER

The computer is now a vital part of digital video systems, being used both for control purposes and to process video signals as data. In control, the computer finds applications in database management, automation, editing, and electromechanical systems such as tape drives and robotic cassette handling. Now that processing speeds have advanced sufficiently, computers are able to manipulate certain types of digital video in real time. Where very complex calculations are needed, real-time operation may not be possible and instead the computation proceeds as fast as it can in a process called rendering. The rendered data are stored so that they can be viewed in real time from a storage medium when the rendering is complete.

image

FIGURE 3.13

Structure of the fast multiplier: the input A is multiplied by 1, 2, 4, 8, etc., by bit shifting. The digits of the B input then determine which multiples of A should be added together by enabling AND gates between the shifters and the adder. For long word lengths, the number of gates required becomes enormous, and the device is best implemented in a chip.

The computer is a programmable device in that its operation is not determined by its construction alone, but instead by a series of instructions forming a program. The program is supplied to the computer one instruction at a time so that the desired sequence of events takes place.

Programming of this kind has been used for over a century in electromechanical devices, including automated knitting machines and street organs that are programmed by punched cards. However, the computer differs from these devices in that the program is not fixed, but can be modified by the computer itself. This possibility led to the creation of the term software to suggest a contrast to the constancy of hardware.

Computer instructions are binary numbers, each of which is interpreted in a specific way. As these instructions do not differ from any other kind of data, they can be stored in RAM. The computer can change its own instructions by accessing the RAM. Most types of RAM are volatile, in that they lose data when power is removed. Clearly if a program is stored entirely in this way, the computer will not be able to recover from a power failure. The solution is that a very simple starting or bootstrap program is stored in nonvolatile ROM, which will contain instructions to bring in the main program from a storage system such as a disk drive after power is applied. As programs in ROM cannot be altered, they are sometimes referred to as firmware to indicate that they are classified between hardware and software.

Making a computer do useful work requires more than simply a program that performs the required computation. There is also a lot of mundane activity that does not differ significantly from one program to the next. This includes deciding which part of the RAM will be occupied by the program and which by the data, and producing the commands to the disk drive to read the input data from a file and write back the results. It would be very inefficient if all programs had to handle these processes themselves. Consequently the concept of an operating system was developed. This manages all the mundane decisions and creates an environment in which useful programs or applications can execute.

The ability of the computer to change its own instructions makes it very powerful, but it also makes it vulnerable to abuse. Programs exist that are deliberately written to do damage. These viruses are generally attached to plausible messages or data files and enter computers through storage media or communications paths.

There is also the possibility that programs contain logical errors such that in certain combinations of circumstances the wrong result is obtained. If this results in the unwitting modification of an instruction, the next time that instruction is accessed the computer will crash. In consumer-grade software, written for the vast personal computer market, this kind of thing is unfortunately accepted.

For critical applications, software must be verified. This is a process that can prove that a program can recover from absolutely every combination of circumstances and keep running properly. This is a nontrivial process, because the number of combinations of states a computer can get into is staggering. As a result most software is unverified.

It is of the utmost importance that networked computers that can suffer virus infection or computers running unverified software are never used in a life-support or critical application.

image

FIGURE 3.14

A simple computer system. All components are linked by a single data/address/control bus. Although cheap and flexible, such a bus can make only one connection at a time, so it is slow.

Figure 3.14 shows a simple computer system. The various parts are linked by a bus, which allows binary numbers to be transferred from one place to another. This will generally use tristate logic (see Digital Processing) so that when one device is sending to another, all other devices present a high impedance to the bus.

The ROM stores the start-up program; the RAM stores the operating system, applications programs, and the data to be processed. The disk drive stores large quantities of data in a nonvolatile form. The RAM needs only to be able to hold part of one program as other parts can be brought from the disk as required. A program executes by fetching one instruction at a time from the RAM to the processor along the bus.

The bus also allows keyboard/mouse inputs and outputs to the display and printer. Inputs and outputs are generally abbreviated to I/O. Finally a programmable timer will be present, which acts as a kind of alarm clock for the processor.

INTERRUPTS

Ordinarily instructions are executed in the order that they are stored in RAM. However, some instructions direct the processor to jump to a new memory location. If this is a jump to an earlier instruction, the program will enter a loop. The loop must increment a count in a register each time, and contain a conditional instruction called a branch, which allows the processor to jump out of the loop when a predetermined count is reached.

THE CPU

The processor or CPU (central processing unit) is the heart of the system. Figure 3.15 shows a simple example of a CPU. The CPU has a bus interface, which allows it to generate bus addresses and input or output data. Sequential instructions are stored in RAM at contiguously increasing locations so that a program can be executed by fetching instructions from a RAM address specified by the program counter (PC) to the instruction register in the CPU. As each instruction is completed, the PC is incremented so that it points to the next instruction. In this way the time taken to execute the instruction can vary.

The processor is notionally divided into data paths and control paths. Figure 3.15 shows the data path. The CPU contains a number of general-purpose registers or scratchpads, which can be used to store partial results in complex calculations. Pairs of these registers can be addressed so that their contents go to the ALU (arithmetic logic unit). This performs various arithmetic (add, subtract, etc.) or logical (and, or, etc.) functions on the input data. The output of the ALU may be routed back to a register or output. By reversing this process it is possible to get data into the registers from the RAM. The ALU also outputs the conditions resulting from the calculation, which can control conditional instructions.

Which function the ALU performs and which registers are involved are determined by the instruction currently in the instruction register that is decoded in the control path. One pass through the ALU can be completed in one cycle of the processor's clock. Instructions vary in complexity as do the number of clock cycles needed to complete them. Incoming instructions are decoded and used to access a lookup table, which converts them into microinstructions, one of which controls the CPU at each clock cycle.
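The fetch-decode-execute behaviour described above can be illustrated with a toy example. The instruction set and its encoding below are invented purely for illustration and do not correspond to any real CPU.

```python
# Toy machine: each instruction is a tuple.  Entirely hypothetical encoding.
# ("load", reg, value), ("add", dst, src_a, src_b), ("halt",)
def run(program):
    pc = 0                       # program counter
    regs = [0] * 4               # general-purpose scratchpad registers
    while True:
        instr = program[pc]      # fetch from the address given by the PC
        pc += 1                  # PC now points to the next instruction
        op = instr[0]            # decode
        if op == "load":
            regs[instr[1]] = instr[2]
        elif op == "add":
            regs[instr[1]] = regs[instr[2]] + regs[instr[3]]   # one pass through the ALU
        elif op == "halt":
            return regs

print(run([("load", 0, 7), ("load", 1, 5), ("add", 2, 0, 1), ("halt",)]))
# [7, 5, 12, 0]
```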

image

FIGURE 3.15

The data path of a simple CPU. Under the control of an instruction, the ALU will perform some function on a pair of input values from the registers and store or output the result.

However, it is often required that the sequence of execution should be changeable by some external event. This might be the changing of some value due to a keyboard input. Events of this kind are handled by interrupts, which are created by devices needing attention. Figure 3.16 shows that in addition to the PC, the CPU contains another dedicated register called the stack pointer. Figure 3.17 shows how this is used. At the end of every instruction the CPU checks to see if an interrupt is asserted on the bus.

If it is, a different set of microinstructions is executed. The PC is incremented as usual, but the next instruction is not executed. Instead, the contents of the PC are stored so that the CPU can resume execution when it has handled the current event. The PC state is stored in a reserved area of RAM known as the stack, at an address determined by the stack pointer.

Once the PC is stacked, the processor can handle the interrupt. It issues a bus interrupt acknowledge, and the interrupting device replies with a unique code identifying itself. This is known as a vector, which steers the processor to a RAM address containing a new program counter. This is the RAM address of the first instruction of the subroutine that is the program that will handle the interrupt. The CPU loads this address into the PC and begins execution of the subroutine.

image

FIGURE 3.16

Normally the program counter (PC) increments each time an instruction is completed in order to select the next instruction. However, an interrupt may cause the PC state to be stored in the stack area of RAM prior to the PC being forced to the start address of the interrupt subroutine. Afterward the PC can get its original value back by reading the stack.

image

FIGURE 3.17

How an interrupt is handled. See text for details.

At the end of the subroutine there will be a return instruction. This causes the CPU to use the stack pointer as a memory address to read the return PC state from the stack. With this value loaded into the PC, the CPU resumes execution where it left off.

The stack exists so that subroutines can themselves be interrupted. If a subroutine is executing when a higher-priority interrupt occurs, the subroutine can be suspended by incrementing the stack pointer and by storing the current PC in the next location in the stack.

When the second interrupt has been serviced, the stack pointer allows the PC of the first subroutine to be retrieved. Whenever a stack PC is retrieved, the stack pointer decrements and thereby points to the PC of the next item of unfinished business.

DIGITAL SIGNAL PROCESSORS

Although general-purpose computers can be programmed to process digital audio or image data, they are not ideal for the following reasons:

  1. The number of arithmetic operations, particularly multiplications, is far higher than for data processing.
  2. Processing is required in real time; data processors do not generally work in real time.
  3. The program needed generally remains constant for the duration of a session, or changes slowly, whereas a data processor rapidly jumps between many programs.
  4. Data processors can suspend a program on receipt of an interrupt; audio and image processors must work continuously for long periods.
  5. Data processors tend to be I/O (input–output) limited, in that their operating speed is constrained by the problems of moving large quantities of data and instructions into the CPU.

Audio processors in contrast have a relatively small input and output rate, but compute intensively, whereas image processors also compute intensively but tend to outstrip the I/O capabilities of conventional computer architectures. A common video process is spatial interpolation used for resizing or oversampling. Spatial filters compute each output pixel value as a function of all input pixel values over a finite-sized window. The windows for the output pixels have extensive overlap. In a conventional CPU, shortage of internal registers means that a filter algorithm would have to fetch the input pixel values within the window from memory for every output pixel to be calculated. With an 8 × 8 window size, 1 input pixel falls within 64 different windows, with the result that the conventional processor would have to fetch the same value from the same location 64 times, whereas in principle it needs to be fetched only once.

THE PROGRAMMABLE TIMER

Ordinarily processors have no concept of time and simply execute instructions as fast as their clock allows. This is fine for general-purpose processing, but not for time-critical processes such as video. One way in which the processor can be made time conscious is to use programmable timers. These are devices that reside on the computer bus and run from a clock. The CPU can set up a timer by loading it with a count. When the count is reached, the timer will interrupt. To give an example, if the count were to be equal to one frame period, there would be one interrupt per frame, and this would result in the execution of a subroutine once per frame, provided, of course, that all the instructions could be executed in one frame period.

This is sufficient justification for the development of specialized digital signal processors (DSPs). These units are equipped with more internal registers than data processors to facilitate implementation of, for example, multipoint filter algorithms. The arithmetic unit will be designed to multiply/accumulate rapidly using techniques such as pipelining, which allows operations to overlap. The functions of the register set and the arithmetic unit are controlled by a micro-sequencer, which interprets the instructions in the program. Figure 3.18 shows the interior structure of a DSP chip.

Where a DSP is designed specifically for image processing, it is possible to incorporate one CPU per pixel. With a massively parallel approach such as this, the speed of each CPU can be very slow and it can be implemented serially, making it trivially easy to optimize the word length of the calculation to the accuracy requirement. DSPs are used in many other industries in which waveforms that were originally analog need to be manipulated in the digital domain. In fact this is probably the best definition of DSP, which distinguishes it from computation in general. Equipment intended for convergent audio/video systems can take advantage of DSP devices designed for applications such as synthetic aperture radar and pattern recognition.

Figure 3.19a shows a simple digital mixer that accepts two PCM inputs, sets the gain of each, and then mixes (adds) the two together. The sum will have increased in word length and must be digitally dithered prior to rounding to the required output word length. Figure 3.19b shows a simple DSP system that is designed to do the same job. The hardware is trivial: a few ports and a DSP chip (known colloquially as an “engine”). The program needed to operate the DSP is shown in (c). This has been written in English rather than in DSP language, which is incomprehensible to most humans. If all the steps in the program are executed in turn, the output value ought to be the same as if the hardware of Figure 3.19a had been used.

image

FIGURE 3.18

A DSP is a specialized form of computer (courtesy of Texas Instruments).

One problem is that the DSP engine is designed to run as fast as its technology allows, whereas in PCM results are required at the signal sampling rate. This is solved by using interrupts. The interrupt signal can occur at any time with respect to the processor clock without causing difficulty as it will be examined only when an instruction has been completed, prior to executing another one. The normal program is suspended, and a different program, known as a subroutine, is executed instead. When the subroutine is completed, the normal program resumes.

In a PCM DSP application, the normal program may be an idling program; i.e., it doesn't do anything useful or it may rotate the lights on the front panel. The sample calculation is contained in the subroutine. The master sampling rate clock from a phase-locked loop is then used to generate interrupts to the DSP just after input samples have been made available. Figure 3.20 shows that if this is done the subroutine is executed at the sampling rate with idling periods between. In practice this is true only if the subroutine is short enough to be executed within the sample period. If it cannot, a more elegant program or a more powerful “engine” must be sought.
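The program of Figure 3.19c is not reproduced here, but the per-sample work of such a subroutine might look like the following sketch; the gain coefficients, the rectangular dither, and the 8-bit output word length are arbitrary assumptions.

```python
import random

def mix_sample(a, b, coeff_a, coeff_b, out_bits=8):
    """Scale each input by its gain coefficient, add, then dither and
    round back to the output word length (executed once per interrupt)."""
    total = a * coeff_a + b * coeff_b           # word length grows here
    total += random.uniform(-0.5, 0.5)          # crude rectangular dither before rounding
    max_pos = (1 << (out_bits - 1)) - 1
    max_neg = -(1 << (out_bits - 1))
    return min(max(round(total), max_neg), max_pos)   # clip to the output range

print(mix_sample(100, -40, 0.5, 0.25))          # about 40, give or take the dither
```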

image

FIGURE 3.19

(a) A simple mixer built conventionally. (b) The same mixer implemented with DSP. The instructions in (c) operate the DSP.

image

FIGURE 3.20

Synchronising a signal processor with real time using interrupts. The processing is carried out by a subroutine.

TIME BASE CORRECTION

In Chapter 1 it was stated that a strength of digital technology is the ease with which delay can be provided. Accurate control of delay is the essence of time base correction, necessary whenever the instantaneous time of arrival or rate from a data source does not match the destination. In digital video, the destination will almost always have perfectly regular timing, namely the sampling rate clock of the final DAC. Time base correction consists of aligning jittery signals from storage media or transmission channels with that stable reference.

A further function of time base correction is to reverse the time compression applied prior to recording or transmission. As was shown in Chapter 1 under Time Compression and Packetising, digital recorders compress data into blocks to facilitate editing and error correction as well as to permit head switching between blocks in rotary-head machines. Owing to the spaces between blocks, data arrive in bursts on replay, but must be fed to the output convertors in an unbroken stream at the sampling rate.

In computer hard-disk drives, which are used in digital video workstations, time compression is also used, but a converse problem arises. Data from the disk blocks arrive at a reasonably constant rate, but cannot necessarily be accepted at a steady rate by the logic because of contention for the use of buses and memory by the different parts of the system. In this case the data must be buffered by a relative of the time base corrector, which is usually referred to as a silo.

Although delay is easily implemented, it is not possible to advance a data stream. Most real machines cause instabilities balanced about the correct timing: the output jitters between too early and too late. Because the information cannot be advanced in the corrector, only delayed, the solution is to run the machine in advance of real time. In this case, correctly timed output signals will need a nominal delay to align them with reference timing. Early output signals will receive more delay, and late output signals will receive less delay.

image

FIGURE 3.21

Most TBCs are implemented as a memory addressed by a counter, which periodically overflows to give a ring structure. The memory allows the read and write timing to be asynchronous.

Under Digital Processing the principles of digital storage elements, which can be used for delay purposes, were presented. The shift-register approach and the RAM approach to delay are very similar, as a shift register can be thought of as a memory whose address increases automatically when clocked. The data rate and the maximum delay determine the capacity of the RAM required. Figure 3.21 shows that the addressing of the RAM is done by a counter that overflows endlessly from the end of the memory back to the beginning, giving the memory a ring-like structure. The write address is determined by the incoming data, and the read address is determined by the outgoing data. This means that the RAM has to be able to read and write at the same time. The switching between read and write involves not only a data multiplexer but also an address multiplexer. In general, the arbitration between read and write will be done by signals from the stable side of the TBC as Figure 3.22 shows. In the replay case the stable clock will be on the read side. The stable side of the RAM will read a sample when it demands, and the writing will be locked out for that period. However, the input data cannot be interrupted in many applications, so a small buffer silo is installed before the memory, which fills up as the writing is locked out and empties again as writing is permitted. Alternatively, the memory will be split into blocks as was shown in Chapter 1, such that when one block is reading, a different block will be writing, and the problem does not arise.
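A behavioural sketch of the ring-addressed memory of Figure 3.21, with separate wrapping counters for writing (jittery input timing) and reading (stable reference timing); the class name and memory size are illustrative.

```python
class RingTBC:
    """A RAM addressed by counters that wrap around, so writes and reads
    can proceed with independent timing."""
    def __init__(self, size: int):
        self.ram = [0] * size
        self.size = size
        self.write_addr = 0
        self.read_addr = 0

    def write(self, sample: int) -> None:       # clocked by the incoming data
        self.ram[self.write_addr] = sample
        self.write_addr = (self.write_addr + 1) % self.size   # counter overflows to 0

    def read(self) -> int:                      # clocked by the stable reference clock
        sample = self.ram[self.read_addr]
        self.read_addr = (self.read_addr + 1) % self.size
        return sample

tbc = RingTBC(8)
for s in (10, 20, 30):                          # bursty arrival from the medium
    tbc.write(s)
print([tbc.read() for _ in range(3)])           # [10, 20, 30] read out at the reference rate
```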

image

FIGURE 3.22

In a RAM-based TBC, the RAM is reference synchronous, and an arbitrator decides when it will read and when it will write. During reading, asynchronous input data back up in the input silo, asserting a write request to the arbitrator. The arbitrator will then cause a write cycle between read cycles.

In most digital video applications, the sampling rate exceeds the rate at which economically available RAM chips can operate. The solution is to arrange several video samples into one longer word, known as a superword, and to construct the memory so that it stores superwords in parallel.

When used in a hard-disk system, a silo will allow data to and from the disk, which is turning at constant speed. When the disk is being read, Figure 3.23a shows that the silo starts empty, and if there is bus contention, the silo will start to fill. When the bus is free, the disk controller will attempt to empty the silo into the memory. The system can take advantage of the interblock gaps on the disk, containing headers, preambles, and redundancy, for in these areas there are no data to transfer, and there is some breathing space to empty the silo before the next block. In practice the silo need not be empty at the start of every block, provided it never becomes full before the end of the transfer. If this happens some data are lost and the function must be aborted. The block containing the silo overflow will generally be reread on the next revolution. In sophisticated systems, the silo has a kind of dipstick and can interrupt the CPU if the data get too deep. The CPU can then suspend some bus activity to allow the disk controller more time to empty the silo.

image

FIGURE 3.23

The silo contents during read functions (a) appear different from those during write functions (b). In (a), the control logic attempts to keep the silo as empty as possible; in (b) the logic prefills the silo and attempts to keep it full until the memory word count overflows.

When the disk is to be written, as in Figure 3.23b, a continuous data stream must be provided during each block, as the disk cannot stop. The silo will be prefilled before the disk attempts to write, and the disk controller attempts to keep it full. In this case all will be well if the silo does not become empty before the end of the transfer. Figure 3.24 shows the silo of a typical disk controller with the multiplexers necessary to put it in the read data stream or the write data stream.

MULTIPLEXING PRINCIPLES

Multiplexing is used where several signals are to be transmitted down the same channel. The channel bit rate must be the same as or greater than the sum of the source bit rates. Figure 3.25 shows that when multiplexing is used, the data from each source has to be time compressed. This is done by buffering source data in a memory at the multiplexer. It is written into the memory in real time as it arrives, but will be read from the memory with a clock that has a much higher rate. This means that the readout occurs in a smaller time span. If, for example, the clock frequency is raised by a factor of 10, the data for a given signal will be transmitted in a tenth of the normal time, leaving time in the multiplex for nine more such signals.

image

FIGURE 3.24

To guarantee that the drive can transfer data in real time at regular intervals (determined by disk speed and density) the silo provides buffering to the asynchronous operation of the memory access process. In (a) the silo is configured for a disk read. The same silo is used in (b) for a disk write.

image

FIGURE 3.25

Multiplexing requires time compression on each input.

PACKETS

The multiplexer must switch between different time-compressed signals to create the bitstream and this is much easier to organize if each signal is in the form of data packets of constant size. Figure 3.26 shows a packet multiplexing system.

Each packet consists of two components: the header, which identifies the packet, and the payload, which is the data to be transmitted. The header will contain at least an identification code (ID), which is unique for each signal in the multiplex. The demultiplexer checks the ID codes of all incoming packets and discards those that do not have the wanted ID.

In complex systems it is common to have a mechanism to check that packets are not lost or repeated. This is the purpose of the packet continuity count, which is carried in the header. For packets carrying the same ID, the count should increase by one from one packet to the next. Upon reaching the maximum binary value, the count overflows and recommences.
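A demultiplexer sketch combining the ID filtering and the continuity count check described above; the packet layout and the 4-bit count width are assumptions for illustration, as the real widths depend on the multiplex standard in use.

```python
def demultiplex(packets, wanted_id, count_bits=4):
    """Keep only packets with the wanted ID and check the continuity count."""
    expected = None
    payloads = []
    for pkt in packets:                       # pkt: {"id": ..., "count": ..., "payload": ...}
        if pkt["id"] != wanted_id:
            continue                          # discard packets belonging to other signals
        if expected is not None and pkt["count"] != expected:
            print("packet lost or repeated before count", pkt["count"])
        payloads.append(pkt["payload"])
        expected = (pkt["count"] + 1) % (1 << count_bits)   # overflow and recommence
    return payloads

stream = [{"id": 7, "count": 14, "payload": b"A"},
          {"id": 3, "count": 0,  "payload": b"x"},   # different signal, discarded
          {"id": 7, "count": 15, "payload": b"B"},
          {"id": 7, "count": 0,  "payload": b"C"}]   # counter wrapped correctly
print(demultiplex(stream, wanted_id=7))              # [b'A', b'B', b'C']
```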

image

FIGURE 3.26

Packet multiplexing relies on headers to identify the packets.

In the demultiplexer another buffer memory will be required. Only the data for the selected signal will be written into this memory at the bit rate of the multiplex. When the memory is read at the correct speed, the data will emerge with their original time base.

In practice it is essential to have mechanisms to identify the separate signals to prevent them being mixed up and to convey the original signal clock frequency to the demultiplexer. In time-division multiplexing the time base of the transmission is broken into equal slots, one for each signal. This makes it easy for the demultiplexer, but forces a rigid structure on all the signals such that they must all be locked to one another and have an unchanging bit rate. Packet multiplexing overcomes these limitations.

STATISTICAL MULTIPLEXING

Packet multiplexing has advantages over time-division multiplexing because it does not set the bit rate of each signal. A demultiplexer simply checks packet IDs and selects all packets with the wanted code. It will do this however frequently such packets arrive. Consequently it is practicable to have variable bit rate signals in a packet multiplex. The multiplexer has to ensure that the total bit rate does not exceed the rate of the channel, but that rate can be allocated arbitrarily between the various signals.

As a practical matter it is usually necessary to keep the bit rate of the multiplex constant. With variable rate inputs this is done by creating null packets, which are generally called stuffing or packing. The headers of these packets contain a unique ID, which the demultiplexer does not recognize, and so these packets are discarded on arrival.

In an MPEG environment, statistical multiplexing can be extremely useful because it allows for the varying difficulty of real program material. In a multiplex of several television programs, it is unlikely that all the programs will encounter difficult material simultaneously. When one program encounters a detailed scene or frequent cuts that are hard to compress, more data rate can be allocated at the allowable expense of the remaining programs that are handling easy material.

DIGITAL FADERS AND CONTROLS

In a digital mixer, the gain coefficients will originate in hand-operated faders, just as in analog. Analog faders may be retained and used to produce a varying voltage, which is converted to a digital code or gain coefficient in an ADC, but it is also possible to obtain coefficients directly in digital faders. Digital faders are a form of displacement transducer in which the mechanical position of the control is converted directly to a digital code. The positions of other controls, such as jog wheels on VTRs or editors, will also need to be digitized. Controls can be linear or rotary, and absolute or relative. In an absolute control, the position of the knob determines the output directly. In a relative control, the knob can be moved to increase or decrease the output, but its absolute position is meaningless.

image

FIGURE 3.27

An absolute linear fader uses a number of light beams, which are interrupted in various combinations according to the position of a grating. A Gray code shown in Figure 3.28 must be used to prevent false codes.

Figure 3.27 shows an absolute linear fader. A grating is moved with respect to several light beams, one for each bit of the coefficient required. The interruption of the beams by the grating determines which photocells are illuminated. It is not possible to use a pure binary pattern on the grating because this results in transient false codes due to mechanical tolerances. Figure 3.28 shows some examples of these false codes. For example, on moving the fader from 3 to 4, the MSB goes true slightly before the middle bit goes false. This results in a momentary value of 4 + 2 = 6 between 3 and 4. The solution is to use a code in which only 1 bit ever changes in going from one value to the next. One such code is the Gray code, which was devised to overcome timing hazards in relay logic but is now used extensively in position encoders. Gray code can be converted to binary in software or in a suitable PROM or gate array.
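The conversion between pure binary and Gray code is easily performed in software, as the sketch below shows for the standard reflected Gray code; this is the kind of translation the PROM or gate array mentioned above would carry out.

```python
def binary_to_gray(n: int) -> int:
    """Standard reflected Gray code: adjacent values differ in only 1 bit."""
    return n ^ (n >> 1)

def gray_to_binary(g: int) -> int:
    """Inverse conversion, as might be done in software or in a PROM."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

for value in range(8):
    print(value, format(binary_to_gray(value), "03b"))
# 0 000, 1 001, 2 011, 3 010, 4 110, 5 111, 6 101, 7 100
print(gray_to_binary(0b110))   # 4
```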

Figure 3.29 shows a rotary incremental encoder. This produces a sequence of pulses whose number is proportional to the angle through which it has been turned. The rotor carries a radial grating over its entire perimeter. This turns over a second fixed radial grating whose bars are not parallel to those of the first grating.

image

FIGURE 3.28

(a) Binary cannot be used for position encoders because mechanical tolerances cause false codes to be produced. (b) In Gray code, only 1 bit (arrowed) changes in between positions, so no false codes can be generated.

The resultant moiré fringes travel inward or outward depending on the direction of rotation. Two suitably positioned light beams falling on photocells will produce outputs in quadrature. The relative phase determines the direction, and the frequency is proportional to speed. The encoder outputs can be connected to a counter whose contents will increase or decrease according to the direction in which the rotor is turned. The counter provides the coefficient output.

image

image

FIGURE 3.29

The fixed and rotating gratings produce moiré fringes, which are detected by two light paths as quadrature sinusoids. The relative phase determines the direction, and the frequency is proportional to speed of rotation.

The word length of the gain coefficients requires some thought, as they determine the number of discrete gains available. If the coefficient word length is inadequate, the gain control becomes “steppy,” particularly toward the end of a fadeout. A compromise between performance and the expense of high-resolution faders is to insert a digital interpolator having a low-pass characteristic between the fader and the gain control stage. This will compute intermediate gains to higher resolution than the coarse fader scale so that the steps cannot be discerned.

THE GALOIS FIELD

Figure 3.30 shows a simple circuit consisting of three D-type latches, which are clocked simultaneously. They are connected in series to form a shift register. At Figure 3.30a a feedback connection has been taken from the output to the input and the result is a ring counter in which the bits contained will recirculate endlessly. At (b) one XOR gate is added so that the output is fed back to more than one stage. The result is known as a twisted-ring counter and it has some interesting properties. Whenever the circuit is clocked, the lefthand bit moves to the righthand latch, the centre bit moves to the lefthand latch, and the centre latch becomes the XOR of the two outer latches. The figure shows that whatever the starting condition of the 3 bits in the latches, the same state will always be reached again after seven clocks, except if zero is used. The states of the latches form an endless ring of nonsequential numbers called a Galois field, after the French mathematical prodigy Évariste Galois, who discovered them. The states of the circuit form a maximum-length sequence, because there are as many states as are permitted by the word length. As the states of the sequence have many of the characteristics of random numbers, yet are repeatable, the result can also be called a pseudo-random sequence (prs). As the all-zeros case is disallowed, the maximum length of a sequence generated by a register of m bits cannot exceed (2ᵐ – 1) states. The Galois field, however, includes the zero term. It is useful to explore the bizarre mathematics of Galois fields, which use modulo-2 arithmetic. Familiarity with such manipulations is helpful when studying error correction, particularly the Reed–Solomon codes used in recorders and treated in Chapter 8. They will also be found in processes that require pseudo-random numbers, such as digital dither, treated in Chapter 4, and randomized channel codes used in, for example, DVB and discussed in Chapter 10.

The circuit of Figure 3.30 can be considered as a counter, and the four points shown will then represent different powers of 2 from the MSB on the left to the LSB on the right. The feedback connection from the MSB to the other stages means that whenever the MSB becomes 1, two other powers are also forced to 1 so that the code of 1011 is generated.

image

FIGURE 3.30

The circuit shown is a twisted-ring counter, which has an unusual feedback arrangement. Clocking the counter causes it to pass through a series of nonsequential values. See text for details.

Each state of the circuit can be described by combinations of powers of x, such as

x² = 100,

x = 010,

x² + x = 110, etc.

The fact that 3 bits have the same state because they are connected together is represented by the modulo-2 equation

x³ + x + 1 = 0.

Let x = a, which is a primitive element.

Now

a³ + a + 1 = 0.    (3.1)

In modulo-2,

a + a = a² + a² = 0,

a = x = 010,

a² = x² = 100,

a³ = a + 1 = 011 from (3.1),

a⁴ = a × a³ = a(a + 1) = a² + a = 110,

a⁵ = a × a⁴ = a(a² + a) = a³ + a² = 111,

a⁶ = a³ × a³ = (a + 1)² = a² + a + a + 1 = a² + 1 = 101,

a⁷ = a × a⁶ = a(a² + 1) = a³ + a = a + 1 + a = 1 = 001.

In this way it can be seen that the complete set of elements of the Galois field can be expressed by successive powers of the primitive element. Note that the twisted-ring circuit of Figure 3.30 simply raises a to higher and higher powers as it is clocked. Thus the seemingly complex multibit changes caused by a single clock of the register become simple to calculate using the correct primitive and the appropriate power.
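The behaviour described above is easily verified with a few lines of code. The sketch below is a direct model of the three-latch circuit as described in the text, not of any particular hardware implementation; it steps the register and prints the successive powers of a.

```python
# A minimal model of the 3-bit twisted-ring counter of Figure 3.30.  The
# state is written (left, centre, right); one clock moves the left bit to
# the right latch, the centre bit to the left latch, and loads the centre
# latch with the XOR of the two outer bits.  Each clock multiplies the
# field element held in the register by the primitive element a.
def clock(state):
    left, centre, right = state
    return (centre, left ^ right, left)

state = (0, 0, 1)                                   # a^0 = 001
for power in range(1, 8):
    state = clock(state)
    print(f"a^{power} = {state[0]}{state[1]}{state[2]}")
# Prints 010, 100, 011, 110, 111, 101 and returns to 001 after seven clocks,
# matching the successive powers of a worked out above.
```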

The numbers produced by the twisted-ring counter are not random; they are completely predictable if the equation is known. However, the sequences produced are sufficiently similar to random numbers that in many cases they will be useful. They are thus referred to as pseudo-random sequences. The feedback connection is chosen such that the expression it implements will not factorize. Otherwise a maximum-length sequence could not be generated, because the circuit might sequence around one or other of the factors depending on the initial condition. A useful analogy is to compare the operation of a pair of meshed gears. If the numbers of teeth are relatively prime, many revolutions are necessary to make the same pair of teeth touch again. If the numbers of teeth share a common factor, far fewer turns are needed.

image

FIGURE 3.31

The PRS generator of DVB.

Figure 3.31 shows the pseudo-random sequence generator used in DVB. Its purpose is to modify the transmitted spectrum so that the amount of energy transmitted is as uniform as possible across the channel.

FILTERS

Figure 3.32 shows an optical system of finite resolution. If an object containing an infinitely sharp line is presented to this system, the image will be an intensity function known in optics as a point spread function. Such functions are almost invariably symmetrical in optics. There is no movement or change here; the phenomenon is purely spatial. A point spread function is a spatial impulse response. All images passing through the optical system are convolved with it. Figure 3.32b shows that the object may be scanned by an analog system to produce a waveform. The image may also be scanned in this way. These waveforms are now temporal. However, the second waveform may be obtained in another way, using an analog filter in series with the first scanned waveform that has an equivalent impulse response. This filter must have linear phase, i.e., its impulse response must be symmetrical.

Figure 3.32c shows that the object may also be sampled, in which case all samples but one will have a value of zero. The image may also be sampled, and owing to the point spread function, there will now be a number of nonzero sample values. However, the image samples may also be obtained by passing the input sample into a digital filter having the appropriate impulse response. Note that it is possible to obtain the same result as in Figure 3.32c by passing the scanned waveform of (b) into an ADC and storing the samples in a memory. Clearly there are a number of equivalent routes leading to the same result. One result of this is that optical systems and sampled systems can simulate one another. This gives us considerable freedom to perform processing in the most advantageous domain that gives the required result. There are many parallels between analog, digital, and optical filters, which this chapter treats as a common subject.

It should be clear from Figure 3.32 why video signal paths need to have linear phase. In general, analog circuitry and filters tend not to have linear phase because they must be causal, which means that the output can occur only after the input. Figure 3.33a shows a simple RC network and its impulse response. This is the familiar exponential decay due to the capacitor discharging through the resistor (in series with the source impedance, which is assumed here to be negligible). The figure also shows the response to a square wave at (b). With other waveforms the process is inevitably more complex.
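For readers who prefer a numerical illustration, the sketch below models the RC network of Figure 3.33 with arbitrary component values: the impulse response is a sampled decaying exponential, and convolving it with a square wave reproduces the familiar charge and discharge curves.

```python
import numpy as np

# Illustrative values only: 1 kOhm and 1 uF give a 1 ms time constant.
R, C = 1.0e3, 1.0e-6
dt = 1.0e-5                                        # 10 us time step
t = np.arange(0.0, 10.0e-3, dt)
h = (dt / (R * C)) * np.exp(-t / (R * C))          # sampled impulse response, unit area

square = np.where((t // 2.0e-3) % 2 == 0, 1.0, 0.0)    # 2 ms high, 2 ms low
output = np.convolve(square, h)[: len(t)]              # exponential edges, as in Figure 3.33b
```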

image

FIGURE 3.32

(a) In optical systems an infinitely sharp line is reproduced as a point spread function, which is the impulse response of the optical path. (b) Scanning either object or image produces an analog time-variant waveform. The scanned object waveform can be converted to the scanned image waveform with an electrical filter having an impulse response, which is an analog of the point spread function. (c) The object and image may also be sampled or the object samples can be converted to the image samples by a filter with an analogous discrete impulse response.

image

image

FIGURE 3.33

(a) The impulse response of a simple RC network is an exponential decay. This can be used to calculate the response to a square wave, as in (b).

Filtering is unavoidable. Sometimes a process has a filtering effect that is undesirable, for example, the limited frequency response of a video amplifier or loss of resolution in a lens, and we try to minimize it. On other occasions a filtering effect is specifically required. Analog or digital filters, and sometimes both, are required in DACs, in ADCs, in the data channels of digital recorders and transmission systems, and in DSP. Optical filters may also be necessary in imaging systems to convert between sampled and continuous images. Optical systems used in displays and in laser recorders also act as spatial filters.1

Figure 3.34 shows that impulse response testing tells a great deal about a filter. With a perfect filter, all frequencies should experience the same time delay. If some groups of frequencies experience a delay different from the others, there is a group-delay error. As an impulse has an infinite spectrum, a filter suffering from group-delay error will separate the different frequencies of an impulse along the time axis. A pure delay will cause a phase shift proportional to frequency, and a filter with this characteristic is said to be phase-linear. The impulse response of a phase-linear filter is symmetrical. If a filter suffers from group-delay error it cannot be phase-linear. It is almost impossible to make a perfectly phase-linear analog filter, and many filters have a group-delay equalization stage following them, which is often as complex as the filter itself. In the digital domain it is straightforward to make a phase-linear filter, and phase equalization becomes unnecessary.
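The connection between symmetry and phase linearity is easy to check numerically: the sketch below takes an arbitrary symmetrical impulse response and shows that its unwrapped phase response is an exact straight line, i.e., a pure delay of half the filter length.

```python
import numpy as np

# Any symmetrical impulse response will do; this one is arbitrary.
h = np.array([1.0, 3.0, 5.0, 3.0, 1.0])
H = np.fft.rfft(h, 256)
phase = np.unwrap(np.angle(H))
k = np.arange(len(H))
expected = -2.0 * (2.0 * np.pi * k / 256)          # a pure two-sample delay
print(np.allclose(phase, expected))                # True: all frequencies delayed equally
```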

image

FIGURE 3.34

If a filter is not phase-linear, different frequencies will emerge at different times if an impulse is input. This is undesirable in video circuitry.

Because of the sampled nature of the signal, whatever the response at low frequencies may be, all PCM channels (and sampled analog channels) act as low-pass filters because they cannot contain frequencies above the Nyquist limit of half the sampling frequency.

TRANSFORMS

Transforms are a useful subject because they can help to understand processes that cause undesirable filtering or to design filters. The information itself may be subject to a transform. Transforming converts the information into another analog. The information is still there, but expressed with respect to temporal or spatial frequency rather than time or space. Instead of binary numbers representing the magnitude of samples, there are binary numbers representing the magnitude of frequency coefficients. The close relationship of transforms to convergent technologies makes any description somewhat circular, as Figure 3.35 shows. The solution adopted in this chapter is to introduce a number of filtering-related topics and to return to the subject of transforms whenever a point can be illustrated.

Transforms are only a different representation of the same information. As a result, what happens in the frequency domain must always be consistent with what happens in the time or space domains. A filter may modify the frequency response of a system, and/or the phase response, but every combination of frequency and phase response has a corresponding impulse response in the time domain. Figure 3.36 shows the relationship between the domains. On the left is the frequency domain. Here an input signal having a given spectrum is input to a filter having a given frequency response. The output spectrum will be the product of the two functions. If the functions are expressed logarithmically in decibels, the product can be obtained by simple addition. On the right, the time-domain output waveform represents the convolution of the impulse response with the input waveform. However, if the frequency transform of the output waveform is taken, it must be the same as the result obtained from the frequency response and the input spectrum. This is a useful result because it means that when image sampling is considered, it will be possible to explain the process in both domains.
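This equivalence can be demonstrated in a couple of lines: convolving in the time domain and multiplying the transforms in the frequency domain give the same output to within numerical precision. The signal and impulse response below are arbitrary.

```python
import numpy as np

x = np.random.randn(64)                    # an arbitrary input waveform
h = np.array([0.25, 0.5, 0.25])            # an arbitrary impulse response

time_domain = np.convolve(x, h)            # convolution in the time domain
n = len(time_domain)
freq_domain = np.fft.ifft(np.fft.fft(x, n) * np.fft.fft(h, n)).real   # product of spectra

print(np.allclose(time_domain, freq_domain))        # True
```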

image

FIGURE 3.35

Transforms are extensively found in convergent systems. They may be used to explain the operation of a process, or a process may actually create a transform. Here the relationship between transforms and DVB is shown.

image

FIGURE 3.36

(a) If a signal having a given spectrum is passed into a filter, multiplying the two spectra will give the output spectrum. Equally transforming the filter frequency response will yield the impulse response of the filter. (b) If this is convolved with the time-domain waveform, the result will be the output waveform, whose transform is the output spectrum.

CONVOLUTION

When a waveform is input to a system, the output waveform will be the convolution of the input waveform and the impulse response of the system. Convolution can be followed by reference to the graphic example in Figure 3.37. Where the impulse response is asymmetrical, the decaying tail occurs after the input. As a result it is necessary to reverse the impulse response in time so that it is mirrored prior to sweeping it through the input waveform. The output voltage is proportional to the shaded area shown where the two impulses overlap. If the impulse response is symmetrical, as would be the case with a linear phase filter, or in an optical system, the mirroring process is superfluous.

image

FIGURE 3.37

In the convolution of two continuous signals (the impulse response with the input), the impulse must be time reversed or mirrored. This is necessary because the impulse will be moved from left to right, and mirroring gives the impulse the correct time-domain response when it is moved past a fixed point. As the impulse response slides continuously through the input waveform, the area where the two overlap determines the instantaneous output amplitude. This is shown for five different times by the crosses on the output waveform.

image

FIGURE 3.38

In discrete time convolution, the mirrored impulse response is stepped through the input one sample period at a time. At each step, the sum of the cross-products is used to form an output value. As the input in this example is a constant height pulse, the output is simply proportional to the sum of the coincident impulse response samples. This figure should be compared with Figure 3.37.

The same process can be performed in the sampled, or discrete, time domain as shown in Figure 3.38. The impulse and the input are now a set of discrete samples, which clearly must have the same sample spacing. The impulse response has value only where impulses coincide. Elsewhere it is zero. The impulse response is therefore stepped through the input one sample period at a time. At each step, the area is still proportional to the output, but as the time steps are of uniform width, the area is proportional to the impulse height and so the output is obtained by adding up the lengths of overlap. In mathematical terms, the output samples represent the convolution of the input and the impulse response by summing the coincident cross-products.
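The summing of coincident cross-products translates directly into code. The sketch below is a literal transcription of the process of Figure 3.38 rather than an efficient implementation.

```python
def convolve_discrete(x, h):
    """Step the impulse response through the input one sample at a time,
    summing the coincident cross-products to form each output value."""
    y = []
    for n in range(len(x) + len(h) - 1):
        total = 0.0
        for k, coeff in enumerate(h):
            if 0 <= n - k < len(x):
                total += coeff * x[n - k]
        y.append(total)
    return y

# With a constant-height input pulse the output is simply proportional to the
# sum of the overlapping impulse-response samples, as in Figure 3.38.
print(convolve_discrete([1, 1, 1, 1], [0.5, 1.0, 0.5]))
# [0.5, 1.5, 2.0, 2.0, 1.5, 0.5]
```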

FIR AND IIR FILTERS

Filters can be described in two main classes, as shown in Figure 3.39, according to the nature of the impulse response. Finite-impulse response (FIR) filters are always stable and, as their name suggests, respond to an impulse once, as they have only a forward path. In the temporal domain, the time for which the filter responds to an input is finite, fixed, and readily established. The same is therefore true about the distance over which an FIR filter responds in the spatial domain. FIR filters can be made perfectly phase-linear if required. Most filters used for sampling rate conversion and oversampling fall into this category.

Infinite-impulse response (IIR) filters respond to an impulse indefinitely and are not necessarily stable, as they have a return path from the output to the input. For this reason they are also called recursive filters. As the impulse response is not symmetrical, IIR filters are not phase-linear. Noise reduction units may employ recursive filters and will be treated in Chapter 5.

image

FIGURE 3.39

An FIR filter (a) responds only once to an input, whereas the output of an IIR filter (b) continues indefinitely, rather like a decaying echo.

FIR FILTERS

An FIR filter works by graphically constructing the impulse response for every input sample. It is first necessary to establish the correct impulse response. Figure 3.40a shows an example of a low-pass filter that cuts off at 1/4 of the sampling rate. The impulse response of a perfect low-pass filter is a sin x/x curve, where the time between the two central zero crossings is the reciprocal of the cutoff frequency. According to the mathematics, the waveform has always existed and carries on forever. The peak value of the output coincides with the input impulse. This means that the filter is not causal, because the output has changed before the input is known. Thus in all practical applications it is necessary to truncate the extreme ends of the impulse response, which causes an aperture effect, and to introduce a time delay in the filter equal to half the duration of the truncated impulse to make the filter causal. As an input impulse is shifted through the series of registers in Figure 3.40b, the impulse response is created, because at each point it is multiplied by a coefficient as in Figure 3.40c. These coefficients are simply the result of sampling and quantizing the desired impulse response. Clearly the sampling rate used to sample the impulse must be the same as the sampling rate for which the filter is being designed. In practice the coefficients are calculated, rather than attempting to sample an actual impulse response. The coefficient word length will be a compromise between cost and performance. Because the input sample shifts across the system registers to create the shape of the impulse response, the configuration is also known as a transversal filter. In operation with real sample streams, there will be several consecutive sample values in the filter registers at any time to convolve the input with the impulse response.

Simply truncating the impulse response causes an abrupt transition between input samples that matter and those that do not. Truncating the filter superimposes a rectangular shape on the time-domain impulse response. In the frequency domain the rectangular shape transforms to a sin x/x characteristic, which is superimposed on the desired frequency response as a ripple. One consequence of this is known as Gibbs' phenomenon, a tendency for the response to peak just before the cutoff frequency.2,3 As a result, the length of the impulse that must be considered will depend not only on the frequency response, but also on the amount of ripple that can be tolerated. If the relevant period of the impulse is measured in sample periods, the result will be the number of points or multiplications needed in the filter. Figure 3.41 compares the performance of filters with different numbers of points. Video filters may use as few as 8 points, whereas a high-quality digital audio FIR filter may need as many as 96 points.
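A quarter-rate low-pass of the kind shown in Figure 3.40 is easy to sketch numerically. The 21-point length below is an arbitrary choice made for illustration; the coefficients are simply the sampled, truncated sin x/x curve, and convolving them with the input is the transversal-filter operation described above.

```python
import numpy as np

N = 21                                     # number of points (odd, so there is a centre tap)
n = np.arange(N) - (N - 1) / 2             # time axis centred on the main lobe
coeffs = 0.5 * np.sinc(0.5 * n)            # ideal LPF at fs/4: h[n] = 2fc sinc(2fc n), fc = 0.25

def fir(x, h):
    """Transversal filter: convolve the input samples with the coefficients."""
    return np.convolve(x, h)

impulse = np.zeros(40)
impulse[0] = 1.0
print(fir(impulse, coeffs)[:N])            # the impulse response is the coefficient set itself
```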

image

image

FIGURE 3.40

(a) The impulse response of a low-pass filter (LPF) is a sin x/x curve, which stretches from −∞ to +∞ in time. The ends of the response must be neglected and a delay introduced to make the filter causal. (b) The structure of an FIR LPF. Input samples shift across the register and at each point are multiplied by different coefficients. (c) When a single unit sample shifts across the circuit of Figure 3.40b, the impulse response is created at the output as the impulse is multiplied by each coefficient in turn.

image

FIGURE 3.41

The truncation of the impulse in an FIR filter caused by the use of a finite number of points (N) results in ripple in the response. Shown here are three different numbers of points for the same impulse response. The filter is an LPF that rolls off at 0.4 of the fundamental interval (courtesy of Philips Technical Review).

Rather than simply truncate the impulse response in time, it is better to make a smooth transition from samples that do not count to those that do. This can be done by multiplying the coefficients in the filter by a window function that peaks in the centre of the impulse. Figure 3.42 shows some different window functions and their responses. The rectangular window is the case of truncation, and the response is shown at I. A linear reduction in weight from the centre of the window to the edges characterizes the Bartlett window (II), which trades ripple for an increase in transition-region width. At III is shown the Hanning window, which is essentially a raised cosine shape. Not shown is the similar Hamming window, which offers a slightly different trade-off between ripple and the width of the main lobe. The Blackman window introduces an extra cosine term, at half the period of the main cosine, into the Hanning window, reducing Gibbs' phenomenon and ripple level but increasing the width of the transition region. The Kaiser window is a family of windows based on the Bessel function, allowing various trade-offs between ripple ratio and main lobe width. Two of these are shown in IV and V. The drawback of the Kaiser windows is that they are complex to implement.
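NumPy provides the common window shapes directly (np.bartlett, np.hanning, np.hamming, np.blackman and np.kaiser), so a windowed coefficient set in the manner of Figure 3.43 is obtained simply by multiplying the truncated coefficients by the sampled window. The comparison of stop-band leakage below is only indicative; the filter is the illustrative quarter-rate design from the previous sketch.

```python
import numpy as np

N = 21
n = np.arange(N) - (N - 1) / 2
truncated = 0.5 * np.sinc(0.5 * n)                 # rectangular window (simple truncation)
windowed = truncated * np.hanning(N)               # raised-cosine (Hanning) window

for name, taps in (("rectangular", truncated), ("Hanning", windowed)):
    H = np.abs(np.fft.rfft(taps, 1024))
    H /= H[0]                                      # normalise the pass-band gain
    print(name, "worst stop-band leakage:", round(float(H[int(1024 * 0.35):].max()), 4))
```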

Filter coefficients can be optimized by computer simulation. One of the best-known techniques used is the Remez exchange algorithm, which converges on the optimum coefficients after a number of iterations.
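The Remez exchange algorithm is available ready-made in SciPy as scipy.signal.remez; the band edges, filter length and sampling rate below are arbitrary illustrative choices rather than a recommended design.

```python
from scipy.signal import remez

# A 31-point low-pass: pass band up to 0.2 and stop band from 0.3 of the
# sampling rate, with the band edges expressed as fractions of fs = 1.
coeffs = remez(31, [0.0, 0.2, 0.3, 0.5], [1.0, 0.0], fs=1.0)
```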

In the example of Figure 3.43, the low-pass filter of Figure 3.40 is shown with a Bartlett window. Acceptable ripple determines the number of significant sample periods embraced by the impulse.

image

FIGURE 3.42

The effects of window functions. At top, various window functions are shown in continuous form. Once the number of samples in the window is established, the continuous functions shown here are sampled at the appropriate spacing to obtain window coefficients. These are multiplied by the truncated impulse response coefficients to obtain the actual coefficients used by the filter. The amplitude responses (I–V) correspond to the window functions illustrated (responses courtesy of Philips Technical Review).

image

FIGURE 3.43

A truncated sin x/x impulse (top) is multiplied by a Bartlett window function (centre) to produce the actual coefficients used (bottom).

If the coefficients are not quantized finely enough, it will be as if they had been calculated inaccurately, and the performance of the filter will be less than expected. Figure 3.44 shows an example of quantizing coefficients. Conversely, raising the word length of the coefficients increases cost.
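The effect is easy to reproduce: rounding each coefficient to a fixed word length raises the stop-band floor. The design below (a Blackman-windowed sin x/x low-pass of 49 points) is only illustrative and is not the filter of Figure 3.44, so the exact figures will differ.

```python
import numpy as np

N = 49
n = np.arange(N) - (N - 1) / 2
h = 0.5 * np.sinc(0.5 * n) * np.blackman(N)        # illustrative 49-point LPF

bits = 12
q = 2.0 ** -(bits - 1)                             # coefficient quantizing step
h_q = np.round(h / q) * q                          # coefficients rounded to 12 bits

for name, taps in (("unquantized", h), ("12-bit", h_q)):
    H = np.abs(np.fft.rfft(taps, 4096))
    H /= H[0]                                      # normalise pass-band gain
    floor = 20 * np.log10(H[int(4096 * 0.35):].max())
    print(f"{name}: stop-band floor {floor:.1f} dB")   # the quantized floor is higher
```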

image

FIGURE 3.44

Frequency response of a 49-point transversal filter with infinite precision (solid line) shows ripple due to finite window size. Quantizing coefficients to 12 bits reduces attenuation in the stop band (responses courtesy of Philips Technical Review).

The FIR structure is inherently phase-linear because it is easy to make the impulse response absolutely symmetrical. The individual samples in a digital system do not know in isolation what frequency they represent, and they can pass through the filter only at a rate determined by the clock. Because of this inherent phase-linearity, an FIR filter can be designed for a specific impulse response, and the frequency response will follow.

The frequency response of the filter can be changed at will by changing the coefficients. A programmable filter need have only several sets of coefficients in a memory; the address supplied to the memory will select the response. The frequency response of a digital filter will also change if the clock rate is changed, so it is often less ambiguous to specify a frequency of interest in a digital filter in terms of a fraction of the fundamental interval rather than in absolute terms. The configuration shown in Figure 3.40 serves to illustrate the principle. The units used on the diagrams are sample periods and the response is proportional to these periods or spacings, and so it is not necessary to use actual figures.

Where the impulse response is symmetrical, it is often possible to reduce the number of multiplications, because the same product can be used twice, at equal distances before and after the centre of the window. This is known as folding the filter. A folded filter is shown in Figure 3.45.
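The saving can be seen in a short sketch: samples equidistant from the centre of the window share a coefficient, so they are added before the single multiplication. The function below computes one output sample from a block already held in the registers; it is a model of the idea rather than of any particular hardware.

```python
def folded_fir(block, h):
    """One output sample of a folded FIR filter; h must be symmetrical,
    i.e. h[k] == h[len(h) - 1 - k]."""
    N = len(h)
    acc = 0.0
    for k in range(N // 2):
        acc += h[k] * (block[k] + block[N - 1 - k])   # one multiply serves two taps
    if N % 2:
        acc += h[N // 2] * block[N // 2]              # centre coefficient used once
    return acc

# Seven-point example as in Figure 3.45: only four multiplications are needed.
print(folded_fir([1, 2, 3, 4, 3, 2, 1], [0.1, 0.2, 0.3, 0.4, 0.3, 0.2, 0.1]))
```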

SAMPLING-RATE CONVERSION

Sampling-rate conversion or interpolation is an important enabling technology on which a large number of practical digital video devices are based. In digital video, the sampling rate takes on many guises. When analog video is sampled in real time, the sampling rate is temporal, but where pixels form a static array, the sampling rate is a spatial frequency.

image

FIGURE 3.45

A seven-point folded filter for a symmetrical impulse response. In this case K1 and K7 will be identical, and so the input sample can be multiplied once and the product fed into the output shift system in two different places. The centre coefficient K4 appears once. In an even-numbered filter the centre coefficient would be used twice.

Some of the applications of interpolation are set out here:

1. Standards convertors need to change two of the sampling rates of the video they handle, namely the temporal frame rate and the vertical line spacing, which is in fact a spatial sampling frequency. Standards convertors working with composite digital signals will also need to change the sampling rate along the line because it will be a multiple of the appropriate subcarrier frequency.

2. Different sampling rates exist today for different purposes. Most component digital devices sample at 13.5 MHz, using the 4:2:2 format, but other variations are possible, such as 3:1:1. Composite machines sample at a multiple of the subcarrier frequency of their line standard. Rate conversion allows material to be exchanged freely between such formats. For example, the output of a 4:2:2 paint system at 13.5 MHz may be digitally converted to 4Fsc for use as input to a composite digital recorder.

3. To take advantage of oversampling convertors, an increase in sampling rate is necessary for DACs and a reduction in sampling rate is necessary for ADCs. In oversampling the factors by which the rates are changed are simpler than in other applications.

4. In effects machines, the size of the picture may be changed without the pixel spacing being changed. This is exactly the converse of the standards convertor, which leaves the picture size unchanged and changes the pixel spacing. Alternatively the picture may be shifted with respect to the sampling matrix by any required distance to subpixel accuracy. Similar processes are necessary in motion estimation for standards convertors and data reduction.

5. When a digital VTR is played back at other than the correct speed to achieve some effect or to correct pitch, the sampling rate of the reproduced audio signal changes in proportion. If the playback samples are to be fed to a digital mixing console that works at some standard frequency, audio sampling-rate conversion will be necessary. Whilst DVTRs universally use an audio sampling rate of 48 kHz, Compact Disc uses 44.1 kHz, and 32 kHz is common for broadcast use (e.g. DVB).

6. When digital audio is used in conjunction with film or video, difficulties arise because it is not always possible to synchronise the sampling rate with the frame rate. An example of this is where the digital audio recorder uses its internally generated sampling rate, but also records studio timecode. On playback, the timecode can be made the same as on other units, or the sampling rate can be locked, but not both. Sampling-rate conversion allows a recorder to play back an asynchronous recording locked to timecode.

There are three basic but related categories of rate conversion, as shown in Figure 3.46. The most straightforward (Figure 3.46a) changes the rate by an integer ratio, up or down. The timing of the system is thus simplified because all samples (input and output) are present on edges of the higher-rate sampling clock. Such a system is generally adopted for oversampling convertors; the exact sampling rate immediately adjacent to the analog domain is not critical and will be chosen to make the filters easier to implement.

image

FIGURE 3.46

Categories of rate conversion. (a) Integer-ratio conversion, in which the lower-rate samples are always coincident with those of the higher rate. Only a small number of phases is needed. (b) Fractional-ratio conversion, in which sample coincidence is periodic. A larger number of phases is required. The example here is conversion from 50.4 to 44.1 kHz (8/7). (c) Variable-ratio conversion, in which there is no fixed relationship, and a large number of phases are required.

Next in order of difficulty is the category shown at Figure 3.46b, in which the rate is changed by the ratio of two small integers. Samples in the input periodically time-align with the output. Such devices can be used for converting from 4 × Fsc to 3 × Fsc, in the vertical processing of standards convertors, or between the various rates of CCIR-601.

The most complex rate-conversion category is that in which there is no simple relationship between input and output sampling rates, and in fact they may vary. This situation, shown in Figure 3.46c, is known as variable-ratio conversion. The temporal or spatial relationship of input and output samples is arbitrary. This problem will be met in effects machines that zoom or rotate images.

The technique of integer-ratio conversion is used in conjunction with over-sampling convertors in digital video and audio and in motion estimation and compression systems in which subsampled or reduced resolution versions of an input image are required. These applications will be detailed in Chapter 5. Sampling-rate reduction by an integer factor is dealt with first.

Figure 3.47a shows the spectrum of a typical sampled system in which the sampling rate is a little more than twice the analog bandwidth. Attempts to reduce the sampling rate simply by omitting samples, a process known as decimation, will result in aliasing, as shown in Figure 3.47b. Intuitively it is obvious that omitting samples is the same as if the original sampling rate was lower. To prevent aliasing, it is necessary to incorporate low-pass filtering into the system, by which the cutoff frequency reflects the new, lower, sampling rate. An FIR-type low-pass filter could be installed, as described earlier in this chapter, immediately prior to the stage at which samples are omitted, but this would be wasteful, because for much of its time the FIR filter would be calculating sample values that are to be discarded. The more effective method is to combine the low-pass filter with the decimator so that the filter calculates only values to be retained in the output sample stream. Figure 3.47c shows how this is done. The filter makes one accumulation for every output sample, but that accumulation is the result of multiplying all relevant input samples in the filter window by an appropriate coefficient. The number of points in the filter is determined by the number of input samples in the period of the filter window, but the number of multiplications per second is obtained by multiplying that figure by the output rate. If the filter is not integrated with the decimator, the number of points has to be multiplied by the input rate. The larger the rate-reduction factor, the more advantageous the decimating filter ought to be, but this is not quite the case, as the greater the reduction in rate, the longer the filter window will need to be to accommodate the broader impulse response.
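The combined filter and decimator can be sketched directly from the description above: samples are clocked across the filter at the input rate, but an accumulation is made only for each retained output sample. The filter design and rate-reduction factor below are illustrative.

```python
import numpy as np

def decimate(x, h, factor):
    """Low-pass filter x with coefficients h, computing only every
    `factor`-th output sample (the ones that will be retained)."""
    y = []
    for n in range(0, len(x), factor):                # one accumulation per output
        acc = 0.0
        for k, coeff in enumerate(h):
            if 0 <= n - k < len(x):
                acc += coeff * x[n - k]
        y.append(acc)
    return np.array(y)

# Halving the sampling rate: the filter cuts off at a quarter of the input
# rate, i.e. half the new Nyquist limit.
taps = np.arange(31) - 15
h = 0.5 * np.sinc(0.5 * taps) * np.hanning(31)
x = np.random.randn(200)
print(len(decimate(x, h, 2)))                         # 100 output samples
```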

image

image

image

FIGURE 3.47

The spectrum of a typical digital sample stream in (a) will be subject to aliasing as in (b) if the baseband width is not reduced by an LPF. (c) An FIR low-pass filter prevents aliasing. Samples are clocked transversely across the filter at the input rate, but the filter computes only at the output sample rate. Clearly this will work only if the two are related by an integer factor.

When the sampling rate is to be increased by an integer factor, additional samples must be created at even spacing between the existing ones. There is no need for the bandwidth of the input samples to be reduced because, if the original sampling rate was adequate, a higher one must also be adequate.

image

FIGURE 3.48

In integer-ratio sampling, rate increase can be obtained in two stages. First, zero-value samples are inserted to increase the rate, and then filtering is used to give the extra samples real values. The filter necessary will be an LPF with a response that cuts off at the Nyquist frequency of the input samples.

Figure 3.48 shows that the process of sampling-rate increase can be thought of in two stages. First, the correct rate is achieved by inserting samples of zero value at the correct instant, and then the additional samples are given meaningful values by passing the sample stream through a low-pass filter that cuts off at the Nyquist frequency of the original sampling rate. This filter is known as an interpolator, and one of its tasks is to prevent images of the input spectrum from appearing in the extended baseband of the output spectrum.

How do interpolators work? Remember that, according to sampling theory, all sampled systems have finite bandwidth. An individual digital sample value is obtained by sampling the instantaneous voltage of the original analog waveform, and because it has zero duration, it must contain an infinite spectrum. However, such a sample can never be observed in that form because of the reconstruction process, which limits the spectrum of the impulse to the Nyquist limit. After reconstruction, one infinitely short digital sample ideally represents a sin x/x pulse whose central peak width is determined by the response of the reconstruction filter, and whose amplitude is proportional to the sample value. This implies that, in reality, one sample value has meaning over a considerable time span, rather than just at the sample instant. If this were not true, it would be impossible to build an interpolator.

image

FIGURE 3.49

A single sample results in a sin x/x waveform after filtering in the analog domain. At a new, higher, sampling rate, the same waveform after filtering will be obtained if the numerous samples of differing size shown here are used. It follows that the values of these new samples can be calculated from the input samples in the digital domain in an FIR filter.

As in rate reduction, performing the steps separately is inefficient. The bandwidth of the information is unchanged when the sampling rate is increased; therefore the original input samples will pass through the filter unchanged, and it is superfluous to compute them. The combination of the two processes into an interpolating filter minimizes the amount of computation.

As the purpose of the system is purely to increase the sampling rate, the filter must be as transparent as possible, and this implies that a linear-phase configuration is mandatory, suggesting the use of an FIR structure. Figure 3.49 shows that the theoretical impulse response of such a filter is a sin x/x curve that has zero value at the position of adjacent input samples. In practice this impulse cannot be implemented because it is infinite. The impulse response used will be truncated and windowed as described earlier. To simplify this discussion, assume that a sin x/x impulse is to be used. There is a strong parallel with the operation of a DAC in which the analog voltage is returned to the time-continuous state by summing the analog impulses due to each sample. In a digital interpolating filter, this process is duplicated.4

If the sampling rate is to be doubled, new samples must be interpolated exactly halfway between existing samples. The necessary impulse response is shown in Figure 3.50; it can be sampled at the output sample period and quantized to form coefficients. If a single input sample is multiplied by each of these coefficients in turn, the impulse response of that sample at the new sampling rate will be obtained. Note that every other coefficient is zero, which confirms that no computation is necessary on the existing samples; they are just transferred to the output. The intermediate sample is computed by adding together the impulse responses of every input sample in the window. The figure shows how this mechanism operates. If the sampling rate is to be increased by a factor of 4, three sample values must be interpolated between existing input samples. Figure 3.51 shows that it is necessary to sample only the impulse response at one-quarter the period of input samples to obtain three sets of coefficients, which will be used in turn. In hardware-implemented filters, the input sample, which is passed straight to the output, is transferred by using a fourth filter phase in which all coefficients are 0 except the central one, which is unity.
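The two-times interpolator of Figure 3.50 can be sketched as follows. Zero-value samples are inserted between the input samples and the result is convolved with a windowed sin x/x response that cuts off at the original Nyquist frequency; because every other coefficient is zero, the original samples emerge unchanged and only the intermediate values are really computed. The filter length is an arbitrary choice.

```python
import numpy as np

def interpolate_x2(x, points=16):
    upsampled = np.zeros(2 * len(x))
    upsampled[::2] = x                                 # zero-value samples in between
    n = np.arange(-2 * points, 2 * points + 1)
    h = np.sinc(n / 2.0) * np.hanning(4 * points + 1)  # windowed sin x/x, cutoff at the old Nyquist
    return np.convolve(upsampled, h)[2 * points: 2 * points + 2 * len(x)]

x = np.sin(2 * np.pi * 0.05 * np.arange(50))
y = interpolate_x2(x)
print(np.max(np.abs(y[::2] - x)))                      # ~0: existing samples pass through unchanged
```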

Fractional-ratio conversion allows interchange between different CCIR-601 rates, such as 4:2:2 and 4:2:0. Fractional ratios also occur in the vertical axis of standards convertors. Figure 3.46 showed that when the two sampling rates have a simple fractional relationship m/n, there is a periodicity in the relationship between samples in the two streams. It is possible to have a system clock running at the least-common multiple frequency, which will divide by different integers to give each sampling rate.5

The existence of a common clock frequency means that a fractional-ratio convertor could be made by arranging two integer-ratio convertors in series. This configuration is shown in Figure 3.52a. The input-sampling rate is multiplied by m in an interpolator, and the result is divided by n in a decimator. Although this system would work, it would be grossly inefficient, because only one in n of the interpolator's outputs would be used. A decimator followed by an interpolator would also offer the correct sampling rate at the output, but the intermediate sampling rate would be so low that the system bandwidth would be quite unacceptable.

As has been seen, a more efficient structure results from combining the processes. The result is exactly the same structure as an integer-ratio interpolator and requires an FIR filter. The impulse response of the filter is determined by the lower of the two sampling rates, and as before it prevents aliasing when the rate is being reduced and prevents images when the rate is being increased. The interpolator has sufficient coefficient phases to interpolate m output samples for every input sample, but not all these values are computed; only interpolations that coincide with an output sample are performed. It can be seen in Figure 3.52b that input samples shift across the transversal filter at the input sampling rate, but interpolations are performed only at the output sample rate. This is possible because a different filter phase will be used at each interpolation.
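A sketch of this combined approach is given below for a rate increase of 4/3, so that the filter cutoff is set by the input rate (for a reduction it would be set by the lower output rate, as described above). Only the output samples are computed, and the phase of the windowed sin x/x coefficients changes for each one; the filter length and test signal are illustrative.

```python
import numpy as np

def fractional_convert(x, m=4, n=3, points=10):
    """Convert the sampling rate of x upward by m/n using a different
    interpolation phase for each output sample."""
    out = []
    num_out = (len(x) - 2 * points) * m // n
    for j in range(num_out):
        pos = j * n / m + points                   # output position in input-sample periods
        k = np.arange(int(pos) - points, int(pos) + points + 1)
        h = np.sinc(pos - k) * np.hanning(len(k))  # windowed sin x/x at this phase
        out.append(np.dot(x[k], h))
    return np.array(out)

x = np.sin(2 * np.pi * 0.03 * np.arange(120))
y = fractional_convert(x)                          # roughly 4/3 as many samples, same waveform
```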

image

FIGURE 3.50

A 2 × oversampling interpolator. To compute an intermediate sample, the input samples are imagined to be sin x/x impulses, and the contributions from each at the point of interest can be calculated. In practice, rather more samples on either side need to be taken into account.

image

FIGURE 3.51

In 4 × oversampling, for each set of input samples, four phases of coefficients are necessary, each of which produces one of the oversampled values.

image

FIGURE 3.52

(a) Fractional-ratio conversion of 3/4 in this example is by increasing to 4 × the input rate prior to reducing by 3 ×. The inefficiency due to discarding previously computed values is clear. In (b), efficiency is raised because only the needed values will be computed. Note how the interpolation phase changes for each output. Fixed coefficients can no longer be used.

In the previous examples, the sample rate or spacing of the filter output had a constant relationship to the input, which meant that the two rates had to be phase-locked. This is an undesirable constraint in some applications, including image manipulators. In a variable-ratio interpolator, values will exist for the points at which input samples were made, but it is necessary to compute what the sample values would have been at absolutely any point between available samples. The general concept of the interpolator is the same as for the fractional-ratio convertor, except that an infinite number of filter phases is ideally necessary. Because a realizable filter will have a finite number of phases, it is necessary to study the degradation this causes. The desired continuous temporal or spatial axis of the interpolator is quantized by the phase spacing, and a sample value needed at a particular point will be replaced by a value for the nearest available filter phase. The number of phases in the filter therefore determines the accuracy of the interpolation. The effects of calculating a value for the wrong point are identical to those of sampling with clock jitter, in that an error occurs proportional to the slope of the signal. The result is program-modulated noise. The higher the noise specification, the greater the desired time accuracy and the greater the number of phases required. The number of phases is equal to the number of sets of coefficients available and should not be confused with the number of points in the filter, which is equal to the number of coefficients in a set (and the number of multiplications needed to calculate one output value).

The sampling jitter accuracy necessary for 8-bit working is measured in picoseconds. This implies that something like 32 filter phases will be required for adequate performance in an 8-bit sampling-rate convertor.

TRANSFORMS AND DUALITY

The duality of transforms provides an interesting insight into what is happening in common processes. Fourier analysis holds that any periodic waveform can be reproduced by adding together an arbitrary number of harmonically related sinusoids of various amplitudes and phases. Figure 3.53 shows how a square wave can be built up of harmonics. The spectrum can be drawn by plotting the amplitude of the harmonics against frequency. It will be seen that this gives a spectrum that is a decaying wave. It passes through 0 at all even multiples of the fundamental. The shape of the spectrum is a sin x/x curve. If a square wave has a sin x/x spectrum, it follows that a filter with a rectangular impulse response will have a sin x/x spectrum.
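The synthesis of Figure 3.53 takes only a few lines: summing odd harmonics with amplitudes falling as 1/n approaches a square wave, and the more harmonics are included, the steeper the edges become.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 1000, endpoint=False)
square = np.zeros_like(t)
for k in range(1, 16, 2):                          # odd harmonics 1 to 15, as in the figure
    square += (4.0 / np.pi) * np.sin(2.0 * np.pi * k * t) / k
# `square` now approximates one cycle of a unit square wave.
```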

A low-pass filter has a rectangular spectrum, and this has a sin x/x impulse response. These characteristics are known as a transform pair. In transform pairs, if one domain has one shape of the pair, the other domain will have the other shape. Figure 3.54 shows a number of transform pairs. At Figure 3.54a a square wave has a sin x/x spectrum and a sin x/x impulse has a square spectrum. In general the product of equivalent parameters on either side of a transform remains constant, so that if one increases, the other must fall. If (a) shows a filter with a wider bandwidth, having a narrow impulse response, then (b) shows a filter of narrower bandwidth that has a wide impulse response. This is duality in action. The limiting case of this behaviour is that in which one parameter becomes zero and the other goes to infinity. At (c) a time-domain pulse of infinitely short duration has a flat spectrum; thus a flat waveform, i.e., DC, has only the zero frequency in its spectrum. The impulse response of the optics of a laser disk (d) has a sin²x/x² intensity function, and this is responsible for the triangular falling frequency response of the pickup. The lens is a rectangular aperture, but as there is no such thing as negative light, a sin x/x impulse response is impossible. The squaring process is consistent with a positive-only impulse response. Interestingly, the transform of a Gaussian response is still Gaussian.

image

FIGURE 3.53

Fourier analysis of a square wave into fundamental and harmonics. A, amplitude; δ, phase of fundamental wave in degrees; 1, first harmonic (fundamental); 2, odd harmonics 3–15; 3, sum of harmonics 1–15; 4, ideal square wave.

Duality also holds for sampled systems. A sampling process is periodic in the time domain. This results in a spectrum that is periodic in the frequency domain. If the time between the samples is reduced, the bandwidth of the system rises. Figure 3.55a shows that a continuous time signal has a continuous spectrum, whereas in Figure 3.55b the frequency transform of a sampled signal is also discrete. In other words sampled signals can be analysed into only a finite number of frequencies. The more accurate the frequency analysis has to be, the more samples are needed in the block. Making the block longer reduces the ability to locate a transient in time. This is the Heisenberg inequality, which is the limiting case of duality, because when infinite accuracy is achieved in one domain, there is no accuracy at all in the other.

image

FIGURE 3.54

Transform pairs. (a) The dual of a rectangle is a sin x/x function. If one is time domain, the other is frequency domain. (b) Narrowing one domain widens the other. The limiting case of this is (c). (d) The transform of the sin x/x squared function is triangular.

image

FIGURE 3.55

(a) A continuous time signal has continuous spectrum. (b) A discrete time signal has discrete spectrum.

THE FOURIER TRANSFORM

If the amplitude and phase of each frequency component are known, linearly adding the resultant components in an inverse transform results in the original waveform. In digital systems the waveform is expressed as a number of discrete samples. As a result the Fourier transform analyses the signal into an equal number of discrete frequencies. This is known as a discrete Fourier transform, or DFT, in which the number of frequency coefficients is equal to the number of input samples. The FFT (fast Fourier transform) is no more than an efficient way of computing the DFT.6 As was seen in the previous section, practical systems must use windowing to create short-term transforms.

It will be evident from Figure 3.56 that the knowledge of the phase of the frequency component is vital, as changing the phase of any component will seriously alter the reconstructed waveform. Thus the DFT must accurately analyse the phase of the signal components.

There are a number of ways of expressing phase. Figure 3.57 shows a point that is rotating about a fixed axis at constant speed. Looked at from the side, the point oscillates up and down at constant frequency. The waveform of that motion is a sine wave, and that is what we would see if the rotating point were to translate along its axis whilst we continued to look from the side.

One way of defining the phase of a waveform is to specify the angle through which the point has rotated at time zero (T = 0). If a second point is made to revolve at 90° to the first, it would produce a cosine wave when translated. It is possible to produce a waveform having arbitrary phase by adding together the sine and the cosine wave in various proportions and polarities. For example, adding the sine and cosine waves in equal proportion results in a waveform leading the sine wave by 45°.

Figure 3.57 shows that the proportions necessary are respectively the sine and the cosine of the phase angle. Thus the two methods of describing phase can be readily interchanged.

image

FIGURE 3.56

Fourier analysis allows the synthesis of any waveform by the addition of discrete frequencies of appropriate amplitude and phase.

The discrete Fourier transform spectrum-analyses a string of samples by searching separately for each discrete target frequency. It does this by multiplying the input waveform by a sine wave, known as the basis function, having the target frequency and adding up, or integrating, the products. Figure 3.58a shows that multiplying by basis functions gives a nonzero integral when the input frequency is the same, whereas Figure 3.58b shows that with a different input frequency (in fact all other different frequencies) the integral is zero, showing that no component of the target frequency exists. Thus from a real waveform containing many frequencies, all frequencies except the target frequency are excluded. The magnitude of the integral is proportional to the amplitude of the target component.

image

FIGURE 3.57

The origin of sine and cosine waves is to take a particular viewpoint of a rotation. Any phase can be synthesized by adding proportions of sine and cosine waves.

Figure 3.58c shows that the target frequency will not be detected if it is phase-shifted 90°, as the product of two waveforms in quadrature integrates to zero. Thus the discrete Fourier transform must make a further search for the target frequency using a cosine basis function. It follows from the arguments above that the relative proportions of the sine and cosine integrals reveal the phase of the input component. Thus each discrete frequency in the spectrum must be the result of a pair of quadrature searches.
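All three cases of Figure 3.58 can be reproduced numerically: the sum of products is large when the target frequency is present and in phase, and essentially zero for a different frequency or for the target frequency in quadrature. The block length and frequencies are arbitrary.

```python
import numpy as np

N = 64
n = np.arange(N)
basis = np.sin(2 * np.pi * 4 * n / N)              # sine basis function: 4 cycles per block

present    = np.sin(2 * np.pi * 4 * n / N)         # input containing the target frequency
absent     = np.sin(2 * np.pi * 7 * n / N)         # a different frequency
quadrature = np.cos(2 * np.pi * 4 * n / N)         # target frequency shifted by 90 degrees

print(np.sum(present * basis))        # large (N/2 = 32), as in Figure 3.58a
print(np.sum(absent * basis))         # ~0, as in Figure 3.58b
print(np.sum(quadrature * basis))     # ~0: hence the second, cosine, search of Figure 3.58c
```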

Searching for one frequency at a time as above will result in a DFT, but only after considerable computation. However, a lot of the calculations are repeated many times over in different searches. The FFT gives the same result with less computation by logically gathering together all the places where the same calculation is needed and making the calculation once.

image

FIGURE 3.58

The input waveform is multiplied by the target frequency and the result is averaged or integrated. In (a) the target frequency is present and a large integral results. With another input frequency the integral is zero, as in (b). The correct frequency will also produce a zero integral, as shown in (c), if it is at 90° to the phase of the search frequency. This is overcome by making two searches in quadrature.

The amount of computation can be reduced by performing the sine and cosine component searches together. Another saving is obtained by noting that every 180° the sine and cosine have the same magnitude but are simply inverted in sign. Instead of performing four multiplications on two samples 180° apart and adding the pairs of products it is more economical to subtract the sample values and multiply twice, once by a sine value and once by a cosine value.

The first coefficient is the arithmetic mean, which is the sum of all the sample values in the block divided by the number of samples. Figure 3.59 shows how the search for the lowest frequency in a block is performed. Pairs of samples are subtracted as shown, and each difference is then multiplied by the sine and the cosine of the search frequency. The process shifts one sample period, and a new sample pair is subtracted and multiplied by new sine and cosine factors. This is repeated until all the sample pairs have been multiplied. The sine and cosine products are then added to give the value of the sine and cosine coefficients respectively.

image

FIGURE 3.59

An example of a filtering search. Pairs of samples are subtracted and multiplied by sampled sine and cosine waves. The products are added to give the sine and cosine components of the search frequency.

It is possible to combine the calculation of the DC component, which requires the sum of samples, and the calculation of the fundamental, which requires sample differences, by combining stages shown in Figure 3.60a, which take a pair of samples and add and subtract them. Such a stage is called a butterfly because of the shape of the schematic. Figure 3.60b shows how the first two components are calculated. The phase rotation boxes attribute the input to the sine or cosine component outputs according to the phase angle. As shown, the box labelled 90° attributes nothing to the sine output, but unity gain to the cosine output. The 45° box attributes the input equally to both components.

image

FIGURE 3.60

The basic element of an FFT is known as a butterfly (a) because of the shape of the signal paths in a sum and difference system. Butterflies are used to compute the first two coefficients as shown in (b).

image

FIGURE 3.60

(c) An actual calculation of a sine coefficient. This should be compared with the result shown in (d).

Figure 3.60c shows a numerical example. If a sine-wave input is considered where 0° coincides with the first sample, this will produce a zero sine coefficient and non-0 cosine coefficient, whereas (d) shows the same input waveform shifted by 90°. Note how the coefficients change over.

image

FIGURE 3.60

(d) With a quadrature input the frequency is not seen.

Figure 3.60e shows how the next frequency coefficient is computed. Note that exactly the same first-stage butterfly outputs are used, reducing the computation needed.

A similar process may be followed to obtain the sine and cosine coefficients of the remaining frequencies. The full FFT diagram for eight samples is shown in Figure 3.61a. The spectrum this calculates is shown in Figure 3.61b. Note that only half of the coefficients are useful in a real band-limited system because the remaining coefficients represent frequencies above one-half of the sampling rate.
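The same eight-point transform is available directly as np.fft.fft, whose real and (negated) imaginary parts are the cosine and sine coefficients discussed above. The test signal below, one cycle of a sine wave across the block, is arbitrary.

```python
import numpy as np

x = np.sin(2 * np.pi * np.arange(8) / 8)           # one cycle of a sine wave in the block
X = np.fft.fft(x)
for k in range(5):                                 # DC up to one-half the sampling rate
    print(k, "cosine:", round(X[k].real, 3), "sine:", round(-X[k].imag, 3))
# Only k = 1 shows a component: sine = 4 (half the block length), cosine = 0.
```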

In the short-term Fourier transform (STFT), the overlapping input sample blocks must be multiplied by window functions. The principle is the same as for the window functions used in FIR filters, described under FIR Filters. Figure 3.62 shows that multiplying the search frequency by the window has exactly the same result, except that this need be done only once and much computation is saved. Thus in the STFT the basis function is a windowed sine or cosine wave.

image

FIGURE 3.60

(e) The butterflies used for the first coefficients form the basis of the computation of the next coefficient.

The FFT is used extensively in such applications as phase correlation, in which the accuracy with which the phase of signal components can be analysed is essential. It also forms the foundation of the discrete cosine transform.

THE DISCRETE COSINE TRANSFORM (DCT)

The DCT is a special case of a discrete Fourier transform in which the sine components of the coefficients have been eliminated leaving a single number. This is actually quite easy. Figure 3.63a shows a block of input samples to a transform process. By repeating the samples in a time-reversed order and performing a discrete Fourier transform on the double-length sample set a DCT is obtained. The effect of mirroring the input waveform is to turn it into an even function whose sine coefficients are all 0. The result can be understood by considering the effect of individually transforming the input block and the reversed block.

image

FIGURE 3.61

(a) The full butterfly diagram for an FFT. The spectrum this computes is shown in (b).

image

FIGURE 3.62

Multiplication of a windowed block by a sine-wave basis function is the same as multiplying the raw data by a windowed basis function but requires less multiplication as the basis function is constant and can be precomputed.

image

FIGURE 3.63

The DCT is obtained by mirroring the input block, as shown in (a), prior to an FFT. The mirroring cancels out the sine components, as in (b), leaving only cosine coefficients.

Figure 3.63b shows that the phase of all the components of one block are in the sense opposite to those in the other. This means that when the components are added to give the transform of the double-length block all the sine components cancel out, leaving only the cosine coefficients, hence the name of the transform.7 In practice the sine component calculation is eliminated. Another advantage is that doubling the block length by mirroring doubles the frequency resolution, so that twice as many useful coefficients are produced. In fact a DCT produces as many useful coefficients as input samples.
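The cancellation can be checked numerically. One detail worth noting is that the doubled block is even about a point half a sample beyond the last input sample, so a half-sample phase term appears in the DFT; once it is allowed for, the transform is purely real, leaving only cosine coefficients. The block length and data below are arbitrary.

```python
import numpy as np

N = 8
x = np.random.randn(N)
mirrored = np.concatenate([x, x[::-1]])            # length 2N, even symmetry
Y = np.fft.fft(mirrored)
k = np.arange(2 * N)
Z = Y * np.exp(-1j * np.pi * k / (2 * N))          # remove the half-sample offset

print(np.max(np.abs(Z.imag)))                      # ~0: the sine components have cancelled
# Z.real[:N] are the N useful cosine (DCT) coefficients of the block.
```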

For image processing two-dimensional transforms are needed. In this case for every horizontal frequency, a search is made for all possible vertical frequencies. A two-dimensional DCT is shown in Figure 3.64. The DCT is separable in that the two-dimensional DCT can be obtained by computing in each dimension separately. Fast DCT algorithms are available.8

Figure 3.65 shows how a two-dimensional DCT is calculated by multiplying each pixel in the input block by terms that represent sampled cosine waves of various spatial frequencies. A given DCT coefficient is obtained when the result of multiplying every input pixel in the block is summed. Although most compression systems, including JPEG and MPEG, use square DCT blocks, this is not a necessity and rectangular DCT blocks are possible and are used in, for example, Digital Betacam, SX, and DVC.
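The separability is easily confirmed with SciPy's DCT routines: transforming the rows and then the columns of an 8 × 8 block with a one-dimensional DCT gives the same result as the two-dimensional transform. The block contents are arbitrary.

```python
import numpy as np
from scipy.fft import dct, dctn

block = np.random.randint(0, 256, (8, 8)).astype(float)

rows_then_cols = dct(dct(block, type=2, norm='ortho', axis=1),
                     type=2, norm='ortho', axis=0)
print(np.allclose(rows_then_cols, dctn(block, type=2, norm='ortho')))   # True
# rows_then_cols[0, 0] is the lowest (DC) term at the top left of Figure 3.64.
```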

image

FIGURE 3.64

The discrete cosine transform breaks up an image area into discrete frequencies in two dimensions. The lowest frequency can be seen here at the top-left corner. Horizontal frequency increases to the right and vertical frequency increases downward.

image

FIGURE 3.65

A two-dimensional DCT is calculated as shown here. Starting with an input pixel block one calculation is necessary to find a value for each coefficient. After 64 calculations using different basis functions the coefficient block is complete.

The DCT is primarily used in MPEG-2 because it converts the input waveform into a form in which redundancy can be easily detected and removed. More details of the DCT can be found in Chapter 6.

THE WAVELET TRANSFORM

The wavelet transform was not discovered by any one individual, but has evolved via a number of similar ideas and was given a strong mathematical foundation only relatively recently.9–12 The wavelet transform is similar to the Fourier transform in that it has basis functions of various frequencies that are multiplied by the input waveform to identify the frequencies it contains. However, the Fourier transform is based on periodic signals and endless basis functions and requires windowing. The wavelet transform is fundamentally windowed, as the basis functions employed are not endless sine waves, but are finite on the time axis; hence the name. Wavelet transforms do not use a fixed window, but instead the window period is inversely proportional to the frequency being analysed. As a result a useful combination of time and frequency resolutions is obtained. High frequencies, corresponding to transients in audio or edges in video, are transformed with short basis functions and therefore are accurately located. Low frequencies are transformed with long basis functions that have good frequency resolution.

image

FIGURE 3.66

Unlike discrete Fourier transforms, wavelet basis functions are scaled so that they contain the same number of cycles irrespective of frequency. As a result their frequency discrimination ability is a constant proportion of the centre frequency.

Figure 3.66 shows that a set of wavelets or basis functions can be obtained simply by scaling (stretching or shrinking) a single wavelet on the time axis. Each wavelet contains the same number of cycles such that as the frequency reduces, the wavelet gets longer. Thus the frequency discrimination of the wavelet transform is a constant fraction of the signal frequency. In a filter bank such a characteristic would be described as “constant Q.” Figure 3.67 shows that the division of the frequency domain by a wavelet transform is logarithmic, whereas in the Fourier transform the division is uniform. The logarithmic coverage is effectively dividing the frequency domain into octaves and as such parallels the frequency discrimination of human hearing.
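The scaling property of Figure 3.66 can be illustrated with a toy wavelet: each basis function spans the same number of cycles, so its length halves every time the centre frequency doubles. The windowed cosine used here is purely illustrative and is not a claim about any particular wavelet family.

```python
import numpy as np

def wavelet(centre_freq, cycles=6, fs=1000):
    """The same mother wavelet scaled so that it always spans `cycles` cycles."""
    duration = cycles / centre_freq                # lower frequency -> longer wavelet
    t = np.arange(-duration / 2, duration / 2, 1 / fs)
    return np.cos(2 * np.pi * centre_freq * t) * np.hanning(len(t))

for f in (25, 50, 100, 200):                       # centre frequencies an octave apart
    print(f, "Hz:", len(wavelet(f)), "samples")    # length halves as the frequency doubles
```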

As it is relatively recent, the wavelet transform has yet to be widely used although it shows great promise. It has successfully been used in audio and in commercially available nonlinear video editors and in other fields such as radiology and geology.

image

FIGURE 3.67

Wavelet transforms divide the frequency domain into octaves instead of the equal bands of the Fourier transform.

In video, wavelet compression does not display the “blocking” of DCT-based coders at high compression factors; instead compression error is spread over the spectrum and appears as white noise.13 It is naturally a multiresolution transform allowing scalable decoding.

References

1. Ray, S.F. Applied Photographic Optics, Chap. 17, Oxford: Focal Press (1988).

2. van den Enden, A.W.M., and Verhoeckx, N.A.M. Digital signal processing: theoretical background. Philips Tech. Rev., 42, 110–144 (1985).

3. McClellan, J.H., Parks, T.W., and Rabiner, L.R. A computer program for designing optimum FIR linear-phase digital filters. IEEE Trans. Audio and Electroacoustics, AU-21, 506–526 (1973).

4. Crochiere, R.E., and Rabiner, L.R. Interpolation and decimation of digital signals: a tutorial review. Proc. IEEE, 69, 300–331 (1981).

5. Rabiner, L.R. Digital techniques for changing the sampling rate of a signal. In B. Blesser, B. Locanthi, and T.G. Stockham Jr. (eds.), Digital Audio, pp. 79–89, New York: Audio Engineering Society (1982).

6. Kraniauskas, P. Transforms in Signals and Systems, Chap. 6, Wokingham: Addison Wesley (1992).

7. Ahmed, N., Natarajan, T., and Rao, K. Discrete cosine transform. IEEE Trans. Computers, C-23, 90–93 (1974).

8. De With, P.H.N. Data Compression Techniques for Digital Video Recording, Delft, The Netherlands: Technical University of Delft (1992) (Ph.D. thesis).

9. Goupillaud, P., Grossman, A., and Morlet, J. Cycle-octave and related transforms in seismic signal analysis. Geoexploration, 23, 85–102 (1984/5).

10. Daubechies, I. The wavelet transform, time-frequency localisation and signal analysis. IEEE Trans. Information Theory, 36, 961–1005 (1990).

11. Rioul, O., and Vetterli, M. Wavelets and signal processing. IEEE Signal Process. Mag., October, 14–38 (1991).

12. Strang, G., and Nguyen, T. Wavelets and Filter Banks, Wellesley, MA: Wellesley–Cambridge Press (1996).

13. Huffman, J. Wavelets and image compression. Presented at the 135th SMPTE Tech. Conf. (Los Angeles), Preprint No. 135198 (1993).
