6

Digital recording and transmission principles

Recording and transmission are quite different tasks, but they have a great deal in common and have always been regarded as different applications of the same art. Digital transmission consists of converting data into a waveform suitable for the path along which it is to be sent. Digital recording is basically the process of recording a digital transmission waveform on a suitable medium. Although the physics of the recording or transmission process is unaffected by the meaning attributed to signals, digital techniques are rather different from those used with analog signals, although often the same phenomenon shows up in a different guise. In this chapter the fundamentals of digital recording and transmission are introduced along with descriptions of the coding techniques used in practical applications. The parallel subject of error correction is dealt with in the next chapter.

6.1 Introduction to the channel

Data can be recorded on many different media and conveyed using many forms of transmission. The generic term for the path down which the information is sent is the channel. In a transmission application, the channel may be no more than a length of cable. In a recording application the channel will include the record head, the medium and the replay head. In analog systems, the characteristics of the channel affect the signal directly. It is a fundamental strength of digital audio that by pulse code modulating an audio waveform the quality can be made independent of the channel. The dynamic range required by the programme material no longer directly decides the dynamic range of the channel.

In digital circuitry there is a great deal of noise immunity because the signal has only two states, which are widely separated compared with the amplitude of noise. In both digital recording and transmission this is not always the case. In magnetic recording, noise immunity is a function of track width and reduction of the working SNR of a digital track allows the same information to be carried in a smaller area of the medium, improving economy of operation. In broadcasting, the noise immunity is a function of the transmitter power and reduction of working SNR allows lower power to be used with consequent economy. These reductions also increase the random error rate, but, as was seen in Chapter 1, an error-correction system may already be necessary in a practical system and it is simply made to work harder.

In real channels, the signal may originate with discrete states which change at discrete times, but the channel will treat it as an analog waveform and so it will not be received in the same form. Various loss mechanisms will reduce the amplitude of the signal. These attenuations will not be the same at all frequencies. Noise will be picked up in the channel as a result of stray electric fields or magnetic induction. As a result the voltage received at the end of the channel will have an infinitely varying state along with a degree of uncertainty due to the noise. Different frequencies can propagate at different speeds in the channel; this is the phenomenon of group delay. An alternative way of considering group delay is that there will be frequency-dependent phase shifts in the signal and these will result in uncertainty in the timing of pulses.

In digital circuitry, the signals are generally accompanied by a separate clock signal which reclocks the data to remove jitter as was shown in Chapter 1. In contrast, it is generally not feasible to provide a separate clock in recording and transmission applications. In the transmission case, a separate clock line would not only raise cost, but is impractical because at high frequency it is virtually impossible to ensure that the clock cable propagates signals at the same speed as the data cable except over short distances. In the recording case, provision of a separate clock track is impractical at high density because mechanical tolerances cause phase errors between the tracks. The result is the same: timing differences between parallel channels, known as skew.

The solution is to use a self-clocking waveform and the generation of this is a further essential function of the coding process. Clearly if data bits are simply clocked serially from a shift register in so-called direct recording or transmission this characteristic will not be obtained. If all the data bits are the same, for example all zeros, there is no clock when they are serialized.

It is not the channel which is digital; instead the term describes the way in which the received signals are interpreted. When the receiver makes discrete decisions from the input waveform it attempts to reject the uncertainties in voltage and time. The technique of channel coding is one where transmitted waveforms are restricted to those which still allow the receiver to make discrete decisions despite the degradations caused by the analog nature of the channel.

6.2 Types of transmission channel

Transmission can be by electrical conductors, radio or optical fibre. Although these appear to be completely different, they are in fact just different examples of electromagnetic energy travelling from one place to another. If the energy is made to vary in some way, information can be carried.

Even today electromagnetism is not fully understood, but sufficiently good models, based on experimental results, exist so that practical equipment can be made. It is not actually necessary to fully understand a process in order to harness it; it is only necessary to be able to reliably predict what will happen in given circumstances.

Electromagnetic energy propagates in a manner which is a function of frequency, and our partial understanding requires it to be considered as electrons, waves or photons so that we can predict its behaviour in given circumstances.

At DC and at the low frequencies used for power distribution, electromagnetic energy is called electricity and it is remarkably aimless stuff which needs to be transported completely inside conductors. It has to have a complete circuit to flow in, and the resistance to current flow is determined by the cross-sectional area of the conductor. The insulation around the conductor and the spacing between the conductors has no effect on the ability of the conductor to pass current. At DC an inductor appears to be a short circuit, and a capacitor appears to be an open circuit.

As frequency rises, resistance is exchanged for impedance. Inductors display increasing impedance with frequency, capacitors show falling impedance. Electromagnetic energy becomes increasingly desperate to leave the conductor. The first symptom is that the current flows only in the outside layer of the conductor effectively causing the resistance to rise. This is the skin effect and gives rise to such techniques as Litz wire which has as much surface area as possible per unit cross-section, and to silver-plated conductors in which the surface has lower resistivity than the interior.

As the energy is starting to leave the conductors, the characteristics of the space between them become important. This determines the impedance. A change of impedance causes reflections in the energy flow and some of it heads back towards the source. Constant impedance cables with fixed conductor spacing are necessary, and these must be suitably terminated to prevent reflections. The most important characteristic of the insulation is its thickness as this determines the spacing between the conductors.

As frequency rises still further, the energy travels less in the conductors and more in the insulation between them. The composition of the insulation now becomes important, and it begins to be called a dielectric. A poor dielectric like PVC absorbs high-frequency energy and attenuates the signal. So-called low-loss dielectrics such as PTFE are used, and one way of achieving low loss is to incorporate as much air in the dielectric as possible by making it in the form of a foam or extruding it with voids.

Further rise in frequency causes the energy to start behaving more like waves and less like electron movement. As the wavelength falls it becomes increasingly directional. The transmission line becomes a waveguide and microwaves are sufficiently directional that they can keep going without any conductor at all. Microwaves are simply low-frequency radiant heat, which is itself low-frequency light. All three are reflected well by electrical conductors, and can be refracted at the boundary between media having different propagation speeds. A waveguide is the microwave equivalent of an optical fibre.

This frequency-dependent behaviour is the most important factor in deciding how best to harness electromagnetic energy flow for information transmission. It is obvious that the higher the frequency, the greater the possible information rate, but in general, losses increase with frequency, and flat frequency response is elusive. The best that can be managed is that over a narrow band of frequencies, the response can be made reasonably constant with the help of equalization. Unfortunately raw data when serialized have an unconstrained spectrum. Runs of identical bits can produce frequencies much lower than the bit rate would suggest. One of the essential steps in a transmission system is to modify the spectrum of the data into something more suitable.

At moderate bit rates, say a few megabits per second, and with moderate cable lengths, say a few metres, the dominant effect will be the capacitance of the cable, due to the geometry of the space between the conductors and the nature of the dielectric between them. The capacitance behaves under these conditions as if it were a single capacitor connected across the signal. Figure 6.1 shows the equivalent circuit.

The effect of the series source resistance and the parallel capacitance is that signal edges or transitions are turned into exponential curves as the capacitance is effectively being charged and discharged through the source impedance. This effect can be observed on the AES/EBU interface with short cables. Although the position where the edges cross the centreline is displaced, the signal eventually reaches the same amplitude as it would at DC.
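
As a rough check on this behaviour, the lumped model predicts that a step of amplitude V0 rises exponentially towards its final value, so the time taken for an edge to reach the centreline can be estimated. The component values below are purely illustrative assumptions, not figures for any particular interface:

    V(t) = V_0\left(1 - e^{-t/RC}\right), \qquad t_{50} = RC \ln 2

With, say, a 110 ohm source and 1 nF of lumped cable capacitance, RC is 110 ns and the edge crosses the centreline after roughly 110 ns x 0.69, or about 76 ns.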

Figure 6.1    With a short cable, the capacitance between the conductors can be lumped as if it were a discrete component. The effect of the parallel capacitor is to slope off the edges of the signal.

Figure 6.2    A transmission line conveys energy packets which appear to alternate with respect to the dielectric. In (a) the driver launches a pulse which charges the dielectric at the beginning of the line. As it propagates the dielectric is charged further along as in (b). When the driver ends the pulse, the charged dielectric discharges into the line. A current loop is formed where the current in the return loop flows in the opposite direction to the current in the ‘hot’ wire.

As cable length increases, the capacitance can no longer be lumped as if it were a single unit; it has to be regarded as being distributed along the cable. With rising frequency, the cable inductance also becomes significant, and it too is distributed.

The cable is now a transmission line and pulses travel down it as current loops which roll along as shown in Figure 6.2. If the pulse is positive, as it is launched along the line, it will charge the dielectric locally as at (a). As the pulse moves along, it will continue to charge the local dielectric as at (b). When the driver finishes the pulse, the trailing edge of the pulse follows the leading edge along the line. The voltage of the dielectric charged by the leading edge of the pulse is now higher than the voltage on the line, and so the dielectric discharges into the line as at (c). The current flows forward as it is in fact the same current which is flowing into the dielectric at the leading edge. There is thus a loop of current rolling down the line flowing forward in the ‘hot’ wire and backwards in the return. The analogy with the tracks of a Caterpillar tractor is quite good. Individual plates in the track find themselves being lowered to the ground at the front and raised again at the back.

The constant to-ing and fro-ing of charge in the dielectric results in dielectric loss of signal energy. Dielectric loss increases with frequency and so a long transmission line acts as a filter. Thus the term ‘low-loss’ cable refers primarily to the kind of dielectric used.

Transmission lines which transport energy in this way have a characteristic impedance caused by the interplay of the inductance along the conductors with the parallel capacitance. One consequence of that transmission mode is that correct termination or matching is required between the line and both the driver and the receiver. When a line is correctly matched, the rolling energy rolls straight out of the line into the load and the maximum energy is available. If the impedance presented by the load is incorrect, there will be reflections from the mismatch. An open circuit will reflect all the energy back in the same polarity as the original, whereas a short circuit will reflect all the energy back in the opposite polarity. Thus impedances above or below the correct value will have a tendency towards reflections whose magnitude depends upon the degree of mismatch and whose polarity depends upon whether the load is too high or too low. In practice it is the need to avoid reflections which is the most important reason to terminate correctly.
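
The size and polarity of the reflection follow from the standard reflection coefficient of transmission-line theory; this is textbook material rather than anything specific to the audio application:

    \Gamma = \frac{Z_L - Z_0}{Z_L + Z_0}

A matched load (Z_L = Z_0) gives zero and no reflection; an open circuit gives +1, returning all the energy in the same polarity; a short circuit gives -1, returning it inverted, exactly as described above.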

Reflections at impedance mismatches have practical applications; electricity companies inject high-frequency pulses into faulty cables and the time taken until the reflection from the break or short returns can be used to locate the source of damage. The same technique can be used to find wiring breaks in large studio complexes.
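
A brief worked example, assuming a propagation velocity of about two-thirds the speed of light, which is typical of solid-dielectric cable: if the reflection returns 1 microsecond after the pulse is injected, the fault lies at

    d = \frac{vt}{2} = \frac{(2 \times 10^8\,\mathrm{m/s}) \times (10^{-6}\,\mathrm{s})}{2} = 100\,\mathrm{m}

The division by two accounts for the out-and-back journey of the pulse.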

A perfectly square pulse contains an infinite series of harmonics, but the higher ones suffer progressively more loss. A square pulse at the driver becomes less and less square with distance as Figure 6.3 shows. The harmonics are progressively lost until in the extreme case all that is left is the fundamental. A transmitted square wave is received as a sine wave. Fortunately data can still be recovered from the fundamental signal component.

Figure 6.3    A signal may be square at the transmitter, but losses increase with frequency and as the signal propagates, more of the harmonics are lost until only the fundamental remains. The amplitude of the fundamental then falls with further distance.

Once all the harmonics have been lost, further losses cause the amplitude of the fundamental to fall. The effect worsens with distance and it is necessary to ensure that data recovery is still possible from a signal of unpredictable level.

6.3 Types of recording medium

There is considerably more freedom of choice for digital media than was the case for analog signals. Once converted to the digital domain, audio is no more than data and can take advantage of the research expended in computer data recording.

Digital media do not need to have linear transfer functions, nor do they need to be noise-free or continuous. All they need to do is to allow the player to be able to distinguish the presence or absence of replay events, such as the generation of pulses, with reasonable (rather than perfect) reliability. In a magnetic medium, the event will be a flux change from one direction of magnetization to another. In an optical medium, the event must cause the pickup to perceive a change in the intensity of the light falling on the sensor. In CD, the apparent contrast is obtained by interference. In some disks it will be through selective absorption of light by dyes. In magneto-optical disks the recording itself is magnetic, but it is made and read using light.

6.4 Magnetism

Magnetism is vital to digital audio recording. Hard disks and tapes store magnetic patterns and media are driven by motors which themselves rely on magnetism.

A magnetic field can be created by passing a current through a solenoid, which is no more than a coil of wire. When the current ceases, the magnetism disappears. However, many materials, some quite common, display a permanent magnetic field with no apparent power source. Magnetism of this kind results from the spin of electrons within atoms. Atomic theory describes atoms as having nuclei around which electrons orbit, spinning as they go. Different orbits can hold a different number of electrons. The distribution of electrons determines whether the element is diamagnetic (non-magnetic) or paramagnetic (magnetic characteristics are possible). Diamagnetic materials have an even number of electrons in each orbit, and according to the Pauli exclusion principle half of them spin in each direction. The opposed spins cancel any resultant magnetic moment. Fortunately there are certain elements, the transition elements, which have an odd number of electrons in certain orbits. The magnetic moment due to electronic spin is not cancelled out in these paramagnetic materials.

Figure 6.4 shows that paramagnetic materials can be classified as antiferromagnetic, ferrimagnetic and ferromagnetic. In antiferromagnetic materials, alternate atoms are anti-parallel and so the magnetic moments are cancelled. In ferrimagnetic materials there is a certain amount of antiparallel cancellation, but a net magnetic moment remains. In ferromagnetic materials such as iron, cobalt or nickel, all the electron spins can be aligned and as a result the most powerful magnetic behaviour is obtained.

Figure 6.4    The classification of paramagnetic materials. The ferromagnetic materials exhibit the strongest magnetic behaviour.

It is not immediately clear how a material in which electron spins are parallel could ever exist in an unmagnetized state or how it could be partially magnetized by a relatively small external field. The theory of magnetic domains has been developed to explain what is observed in practice. Figure 6.5(a) shows a ferromagnetic bar which is demagnetized. It has no net magnetic moment because it is divided into domains or volumes which have equal and opposite moments. Ferromagnetic material divides into domains in order to reduce its magnetostatic energy. Figure 6.5(b) shows a domain wall which is around 0.1 micrometre thick. Within the wall the axis of spin gradually rotates from one state to another. An external field of quite small value is capable of disturbing the equilibrium of the domain wall by favouring one axis of spin over the other. The result is that the domain wall moves and one domain becomes larger at the expense of another. In this way the net magnetic moment of the bar is no longer zero as shown in (c).

For small distances, the domain wall motion is linear and reversible if the change in the applied field is reversed. However, larger movements are irreversible because heat is dissipated as the wall jumps to reduce its energy. Following such a domain wall jump, the material remains magnetized after the external field is removed and an opposing external field must be applied which must do further work to bring the domain wall back again. This is a process of hysteresis, where work must be done to move each way. Were it not for this non-linear mechanism, magnetic recording would be impossible. If magnetic materials were linear, tapes would return to the demagnetized state immediately after leaving the field of the head and this book would be a good deal thinner.

Figure 6.5    (a) A magnetic material can have a zero net moment if it is divided into domains as shown here. Domain walls (b) are areas in which the magnetic spin gradually changes from one domain to another. The stresses which result store energy. When some domains dominate, a net magnetic moment can exist as in (c).

Figure 6.6 shows a hysteresis loop which is obtained by plotting the magnetization M when the external field H is swept to and fro. On the macroscopic scale, the loop appears to be a smooth curve, whereas on a small scale it is in fact composed of a large number of small jumps. These were first discovered by Barkhausen. Starting from the unmagnetized state at the origin, as an external field is applied, the response is initially linear and the slope is given by the susceptibility. As the applied field is increased a point is reached where the magnetization ceases to increase. This is the saturation magnetization Ms. If the applied field is removed, the magnetization falls, not to zero, but to the remanent magnetization Mr. This remanence is the magnetic memory mechanism which makes recording and permanent magnets possible. The ratio of Mr to Ms is called the squareness ratio. In recording media squareness is beneficial as it increases the remanent magnetization.

If an increasing external field is applied in the opposite direction, the curve continues to the point where the magnetization is zero. The field required to achieve this is called the intrinsic coercive force mHc. A small increase in the reverse field reaches the point where, if the field were to be removed, the remanent magnetization would become zero. The field required to do this is the remanent coercive force, rHc.

Figure 6.6    A hysteresis loop which comes about because of the non-linear behaviour of magnetic materials. If this characteristic were absent, magnetic recording would not exist.

As the external field H is swept to and fro, the magnetization describes a major hysteresis loop. Domain wall transit causes heat to be dissipated on every cycle around the loop and the dissipation is proportional to the loop area. For a recording medium, a large loop is beneficial because the replay signal is a function of the remanence and high coercivity resists erasure. The same is true for a permanent magnet; heating is not an issue because the loop is not repeatedly traversed.

For a device such as a recording head, a small loop is beneficial. Figure 6.7(a) shows the large loop of a hard magnetic material used for recording media and for permanent magnets. Figure 6.7(b) shows the small loop of a soft magnetic material which is used for recording heads and transformers.

According to the Nyquist noise theorem, anything which dissipates energy when electrical power is supplied must generate a noise voltage when in thermal equilibrium. Thus magnetic recording heads have a noise mechanism which is due to their hysteretic behaviour. The smaller the loop, the less the hysteretic noise. In conventional heads, there are a large number of domains and many small domain wall jumps. In thin film heads there are fewer domains and the jumps must be larger. The noise this causes is known as Barkhausen noise, but as the same mechanism is responsible it is not possible to say at what point hysteresis noise should be called Barkhausen noise.

Figure 6.7    The recording medium requires a large loop area (a) whereas the head requires a small loop area (b) to cut losses.

6.5 Magnetic recording

Magnetic recording relies on the hysteresis of certain magnetic materials. After an applied magnetic field is removed, the material remains magnetized in the same direction. By definition the process is non-linear, and analog magnetic recorders have to use bias to linearize it. Digital recorders are not concerned with the non-linearity, and HF bias is unnecessary.

Figure 6.8    A digital record head is similar in principle to an analog head but uses much narrower tracks.

Figure 6.8 shows the construction of a typical digital record head, which is not dissimilar to an analog record head. A magnetic circuit carries a coil through which the record current passes and generates flux. A non-magnetic gap forces the flux to leave the magnetic circuit of the head and penetrate the medium. The current through the head must be set to suit the coercivity of the tape, and is arranged to almost saturate the track. The amplitude of the current is constant, and recording is performed by reversing the direction of the current with respect to time. As the track passes the head, this is converted to the reversal of the magnetic field left on the tape with respect to distance. The magnetic recording is therefore bipolar. Figure 6.9 shows that the recording is actually made just after the trailing pole of the record head where the flux strength from the gap is falling. As in analog recorders, the width of the gap is generally made quite large to ensure that the full thickness of the magnetic coating is recorded, although this cannot be done if the same head is intended to replay.

Figure 6.9    The recording is actually made near the trailing pole of the head where the head flux falls below the coercivity of the tape.

Figure 6.10 shows what happens when a conventional inductive head, i.e. one having a normal winding, is used to replay the bipolar track made by reversing the record current. The head output is proportional to the rate of change of flux and so only occurs at flux reversals. In other words, the replay head differentiates the flux on the track. The polarity of the resultant pulses alternates as the flux changes and changes back. A circuit is necessary which locates the peaks of the pulses and outputs a signal corresponding to the original record current waveform. There are two ways in which this can be done.

Figure 6.10    Basic digital recording. At (a) the write current in the head is reversed from time to time, leaving a binary magnetization pattern shown at (b). When replayed, the waveform at (c) results because an output is only produced when flux in the head changes. Changes are referred to as transitions.

The amplitude of the replay signal is of no consequence and often an AGC system is used to keep the replay signal constant in amplitude. What matters is the time at which the write current, and hence the flux stored on the medium, reverses. This can be determined by locating the peaks of the replay impulses, which can conveniently be done by differentiating the signal and looking for zero crossings. Figure 6.11 shows that this results in noise between the peaks. This problem is overcome by the gated peak detector, where only zero crossings from a pulse which exceeds the threshold will be counted. The AGC system allows the thresholds to be fixed. As an alternative, the record waveform can also be restored by integration, which opposes the differentiation of the head, as in Figure 6.12 [1].
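
The gated peak detector is easily sketched in software. The following Python fragment is a minimal illustration, assuming an idealized, evenly sampled replay waveform held in a NumPy array; the function name and the fixed threshold are illustrative choices rather than features of any real data separator:

    import numpy as np

    def gated_peak_detect(replay, threshold, dt):
        # Differentiate the replay signal; pulse peaks become zero crossings.
        d = np.diff(replay) / dt
        times = []
        for i in range(len(d) - 1):
            crossing = d[i] * d[i + 1] < 0          # derivative changes sign at a peak
            gated = abs(replay[i + 1]) > threshold  # only count crossings inside a pulse
            if crossing and gated:
                times.append((i + 1) * dt)          # time of a flux transition
        return times

Because the AGC holds the replay amplitude constant, a fixed gating threshold suffices, exactly as described above.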

Figure 6.11    Gated peak detection rejects noise by disabling the differentiated output between transitions.

Figure 6.12    Integration method for re-creating write-current waveform.

The head shown in Figure 6.8 has a frequency response shown in Figure 6.13. At DC there is no change of flux and no output. As a result, inductive heads are at a disadvantage at very low speeds. The output rises with frequency until the rise is halted by the onset of thickness loss. As the frequency rises, the recorded wavelength falls and flux from the shorter magnetic patterns cannot be picked up so far away. At some point, the wavelength becomes so short that flux from the back of the tape coating cannot reach the head and a decreasing thickness of tape contributes to the replay signal [2]. In digital recorders using short wavelengths to obtain high density, there is no point in using thick coatings. As wavelength further reduces, the familiar gap loss occurs, where the head gap is too big to resolve detail on the track. The construction of the head results in the same action as that of a two-point transversal filter, as the two poles of the head see the tape with a small delay interposed due to the finite gap. As expected, the head response is like a comb filter with the well-known nulls where flux cancellation takes place across the gap. Clearly the smaller the gap, the shorter the wavelength of the first null. This contradicts the requirement of the record head to have a large gap. In quality analog audio recorders, it is the norm to have different record and replay heads for this reason, and the same will be true in digital machines which have separate record and playback heads. Clearly where the same pair of heads are used for record and play, the head gap size will be determined by the playback requirement.
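
The gap-loss behaviour is captured by the standard idealized expression, in which g is the effective gap length and lambda the recorded wavelength:

    \mathrm{gap\ loss} = \frac{\sin(\pi g/\lambda)}{\pi g/\lambda}

The first null therefore occurs where the recorded wavelength equals the gap length, confirming that a smaller gap pushes the null to shorter wavelengths.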

Figure 6.13    The major mechanism defining magnetic channel bandwidth.

As can be seen, the frequency response is far from ideal, and steps must be taken to ensure that recorded data waveforms do not contain frequencies which suffer excessive losses.

A more recent development is the magneto-resistive (M-R) head. This is a head which measures the flux on the tape rather than using it to generate a signal directly. Flux measurement works down to DC and so offers advantages at low tape speeds. Unfortunately flux-measuring heads are not polarity conscious but sense the modulus of the flux and if used directly they respond to positive and negative flux equally, as shown in Figure 6.14. This is overcome by using a small extra winding in the head carrying a constant current. This creates a steady bias field which adds to the flux from the tape. The flux seen by the head is now unipolar and changes between two levels and a more useful output waveform results.

Recorders which have low head-to-medium speed, such as DCC (digital compact cassette) use M-R heads, whereas recorders with high speeds, such as DASH (digital audio stationary head), DAT (digital audio tape) and magnetic disk drives use inductive heads.

Heads designed for use with tape work in actual contact with the magnetic coating. The tape is tensioned to pull it against the head. There will be a wear mechanism and need for periodic cleaning.

In the hard disk, the rotational speed is high in order to reduce access time, and the drive must be capable of staying on line for extended periods. In this case the heads do not contact the disk surface, but are supported on a boundary layer of air. The presence of the air film causes spacing loss, which restricts the wavelengths at which the head can replay. This is the penalty of rapid access.

Figure 6.14    The sensing element in a magneto-resistive head is not sensitive to the polarity of the flux, only the magnitude. At (a) the track magnetization is shown and this causes a bidirectional flux variation in the head as at (b), resulting in the magnitude output at (c). However, if the flux in the head due to the track is biased by an additional field, it can be made unipolar as at (d) and the correct output waveform is obtained.

Digital audio recorders must operate at high density in order to offer a reasonable playing time. This implies that the shortest possible wavelengths will be used. Figure 6.15 shows that when two flux changes, or transitions, are recorded close together, they affect each other on replay. The amplitude of the composite signal is reduced, and the position of the peaks is pushed outwards. This is known as inter-symbol interference, or peak-shift distortion, and it occurs in all magnetic media.
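
Peak shift can be demonstrated numerically. The Python sketch below superposes two replay pulses of opposite polarity modelled as Lorentzians, a common idealization of inductive readout; the pulse width and spacing are arbitrary illustrative values:

    import numpy as np

    w, s = 1.0, 1.5                          # pulse half-width and recorded spacing
    t = np.linspace(-10, 10, 20001)
    pulse = lambda t0, sign: sign / (1.0 + ((t - t0) / w) ** 2)
    composite = pulse(-s / 2, +1) + pulse(+s / 2, -1)

    read_peak = t[np.argmax(composite)]      # replayed position of the positive peak
    print(read_peak, -s / 2)                 # the peak lies outside -s/2: pushed outwards

Reducing the spacing s moves the replayed peaks further from their recorded positions, which is exactly the inter-symbol interference described above.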

Figure 6.15    Readout pulses from two closely recorded transitions are summed in the head and the effect is that the peaks of the waveform are moved outwards. This is known as peak-shift distortion and equalization is necessary to reduce the effect.

The effect is primarily due to high-frequency loss and it can be reduced by equalization on replay, as is done in most tape recorders, or by pre-compensation on record, as is done in hard disks.

6.6 Azimuth recording and rotary heads

Figure 6.16(a) shows that in azimuth recording, the transitions are laid down at an angle to the track by using a head which is tilted. Machines using azimuth recording must always have an even number of heads, so that adjacent tracks can be recorded with opposite azimuth angle. The two track types are usually referred to as A and B. Figure 6.16(b) shows the effect of playing a track with the wrong type of head. The playback process suffers from an enormous azimuth error. The effect of azimuth error can be understood by imagining the tape track to be made from many identical parallel strips.

In the presence of azimuth error, the strips at one edge of the track are played back with a phase shift relative to strips at the other side. At some wavelengths, the phase shift will be 180°, and there will be no output; at other wavelengths, especially long wavelengths, some output will reappear. The effect is rather like that of a comb filter, and serves to attenuate crosstalk due to adjacent tracks so that no guard bands are required. Since no tape is wasted between the tracks, more efficient use is made of the tape. The term guard-band-less recording is often used instead of, or in addition to, the term azimuth recording. The failure of the azimuth effect at long wavelengths is a characteristic of azimuth recording, and it is necessary to ensure that the spectrum of the signal to be recorded has a small low-frequency content. The signal will need to pass through a rotary transformer to reach the heads, and cannot therefore contain a DC component.
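
The azimuth effect can be modelled with the same mathematics as gap loss, treating the track as the parallel strips described above. On the simplifying assumption of a uniform track of width w played with an azimuth error theta, the response is approximately

    \frac{\sin x}{x}, \qquad x = \frac{\pi w \tan\theta}{\lambda}

so the attenuation deepens as the wavelength shortens, and largely disappears at long wavelengths, which is the failure of the azimuth effect noted above.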

Figure 6.16    In azimuth recording (a), the head gap is tilted. If the track is played with the same head, playback is normal, but the response of the reverse azimuth head is attenuated (b).

In recorders such as DAT there is no separate erase process, and erasure is achieved by overwriting with a new waveform. Overwriting is only successful when there are no long wavelengths in the earlier recording, since these penetrate deeper into the tape, and the short wavelengths in a new recording will not be able to erase them. In this case the ratio between the shortest and longest wavelengths recorded on tape should be limited. Restricting the spectrum of the code to allow erasure by overwrite also eases the design of the rotary transformer.

6.7 Optical disks

Optical recorders have the advantage that light can be focused at a distance whereas magnetism cannot. This means that there need be no physical contact between the pickup and the medium and no wear mechanism. In the same way that the recorded wavelength of a magnetic recording is limited by the gap in the replay head, the density of optical recording is limited by the size of the light spot which can be focused on the medium. This is controlled by the wavelength of the light used and by the aperture of the lens. When the light spot is as small as these limits allow, it is said to be diffraction limited. The recorded details on the disk are minute, and could easily be obscured by dust particles. In practice the information layer needs to be protected by a thick transparent coating. Light enters the coating well out of focus over a large area so that it can pass around dust particles, and comes to a focus within the thickness of the coating. Although the number of bits per unit area is high in optical recorders the number of bits per unit volume is not as high as that of tape because of the thickness of the coating.

Figure 6.17 shows the principle of readout of the Compact Disc, which is a read-only disk manufactured by pressing. The track consists of raised bumps separated by flat areas. The entire surface of the disk is metallized, and the bumps are one quarter of a wavelength in height. The player spot is arranged so that half of its light falls on top of a bump, and half on the surrounding surface. Light returning from the flat surface has travelled half a wavelength further than light returning from the top of the bump, and so there is a phase reversal between the two components of the reflection. This causes destructive interference, and light cannot return to the pickup. It must reflect at angles which are outside the aperture of the lens and be lost. Conversely, when light falls on the flat surface between bumps, the majority of it is reflected back to the pickup. The pickup thus sees a disk apparently having alternately good or poor reflectivity. The sensor in the pickup responds to the incident intensity and so the replay signal is unipolar and varies between two levels in a manner similar to the output of an M-R head.
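
The quarter-wave condition is easily checked. Taking the commonly quoted CD figures of a 780 nm laser and a substrate refractive index of about 1.5 (assumed here for illustration), the required bump height is

    h = \frac{\lambda}{4n} = \frac{780\,\mathrm{nm}}{4 \times 1.5} \approx 130\,\mathrm{nm}

since the light passes over the bump twice, making the round-trip path difference half a wavelength within the substrate and giving the phase reversal described above.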

Figure 6.17    CD readout principle and dimensions. The presence of a bump causes destructive interference in the reflected light.

Some disks can be recorded once, but not subsequently erased or rerecorded. These are known as WORM (Write Once Read Many) disks. One type of WORM disk uses a thin metal layer which has holes punched in it on recording by heat from a laser. Others rely on the heat raising blisters in a thin metallic layer by decomposing the plastic material beneath. Yet another alternative is a layer of photo-chemical dye which darkens when struck by the high-powered recording beam. Whatever the recording principle, light from the pickup is reflected more or less, or absorbed more or less, so that the pickup senses a change in reflectivity. Certain WORM disks can be read by conventional CD players and are thus called recordable CDs, or CD-R, whereas others will only work in a particular type of drive.

All optical disks need mechanisms to keep the pickup following the track and sharply focused on it. These will be discussed in Chapter 12 and need not be treated here.

Figure 6.18    Frequency response of laser pickup. Maximum operating frequency is about half of cut-off frequency E.

The frequency response of an optical disk is shown in Figure 6.18. The response is best at DC and falls steadily to the optical cut-off frequency. Although the optics work down to DC, this cannot be used for the data recording. DC and low frequencies in the data would interfere with the focus and tracking servos and, as will be seen, difficulties arise when attempting to demodulate a unipolar signal. In practice the signal from the pickup is split by a filter. Low frequencies go to the servos, and higher frequencies go to the data circuitry. As a result the optical disk channel has the same inability to handle DC as does a magnetic recorder, and the same techniques are needed to overcome it.

6.8 Magneto-optical disks

When a magnetic material is heated above its Curie temperature, it becomes demagnetized, and on cooling will assume the magnetization of an applied field which would be too weak to influence it normally. This is the principle of magneto-optical recording used in the Sony MiniDisc. The heat is supplied by a finely focused laser and the field is supplied by a coil which is much larger.

Figure 6.19 shows that the medium is initially magnetized in one direction only. In order to record, the coil is energized with a current in the opposite direction. This is too weak to influence the medium in its normal state, but when it is heated by the recording laser beam the heated area will take on the magnetism from the coil when it cools. Thus a magnetic recording with very small dimensions can be made even though the magnetic circuit involved is quite large in comparison.

Figure 6.19    The thermomagneto-optical disk uses the heat from a laser to allow magnetic field to record on the disk.

Readout is obtained using the Kerr effect or the Faraday effect, which are phenomena whereby the plane of polarization of light can be rotated by a magnetic field. The angle of rotation is very small and needs a sensitive pickup. The pickup contains a polarizing filter before the sensor. Changes in polarization change the ability of the light to get through the polarizing filter and result in an intensity change which once more produces a unipolar output.

The magneto-optic recording can be erased by reversing the current in the coil and operating the laser continuously as it passes along the track. A new recording can then be made on the erased track.

A disadvantage of magneto-optical recording is that all materials having a Curie point low enough to be useful are highly corrodible by air and need to be kept under an effectively sealed protective layer.

The magneto-optical channel has the same frequency response as that shown in Figure 6.18.

6.9 Equalization

The characteristics of most channels are that signal loss occurs which increases with frequency. This has the effect of slowing down rise times and thereby sloping off edges. If a signal with sloping edges is sliced, the time at which the waveform crosses the slicing level will be changed, and this causes jitter. Figure 6.20 shows that slicing a sloping waveform in the presence of baseline wander causes more jitter.

Figure 6.20    A DC offset can cause timing errors.

On a long cable, high-frequency rolloff can cause sufficient jitter to move a transition into an adjacent bit period. This is called inter-symbol interference and the effect becomes worse in signals which have greater asymmetry, i.e. short pulses alternating with long ones. The effect can be reduced by the application of equalization, which is typically a high-frequency boost, and by choosing a channel code which has restricted asymmetry.

Figure 6.21    Peak-shift distortion is due to the finite width of replay pulses. The effect can be reduced by the pulse slimmer shown in (a) which is basically a transversal filter. The use of a linear operational amplifier emphasizes the analog nature of channels. Instead of replay pulse slimming, transitions can be written with a displacement equal and opposite to the anticipated peak shift as shown in (b).

Compensation for peak-shift distortion in recording requires equalization of the channel [3], and this can be done by a network after the replay head, termed an equalizer or pulse sharpener [4], as in Figure 6.21(a). This technique uses transversal filtering to oppose the inherent transversal effect of the head. As an alternative, pre-compensation in the record stage can be used as shown in (b). Transitions are written in such a way that the anticipated peak shift will move the readout peaks to the desired timing.

6.10 Data separation

The important step of information recovery at the receiver or replay circuit is known as data separation. The data separator is rather like an analog-to-digital convertor because the two processes of sampling and quantizing are both present. In the time domain, the sampling clock is derived from the clock content of the channel waveform. In the voltage domain, the process of slicing converts the analog waveform from the channel back into a binary representation. The slicer is thus a form of quantizer which has only one-bit resolution. The slicing process makes a discrete decision about the voltage of the incoming signal in order to reject noise. The sampler makes discrete decisions along the time axis in order to reject jitter. These two processes will be described in detail.

6.11 Slicing

The slicer is implemented with a comparator which has analog inputs but a binary output. In a cable receiver, the input waveform can be sliced directly. In an inductive magnetic replay system, the replay waveform is differentiated and must first pass through a peak detector (Figure 6.11) or an integrator (Figure 6.12). The signal voltage is compared with the midway voltage, known as the threshold, baseline or slicing level, by the comparator. If the signal voltage is above the threshold, the comparator outputs a high level; if below, a low level results.

Figure 6.22 shows some waveforms associated with a slicer. At (a) the transmitted waveform has an uneven duty cycle. The DC component, or average level, of the signal is received with high amplitude, but the pulse amplitude falls as the pulse gets shorter. Eventually the waveform cannot be sliced. At (b) the opposite duty cycle is shown. The signal level drifts to the opposite polarity and once more slicing is impossible. The phenomenon is called baseline wander and will be observed with any signal whose average voltage is not the same as the slicing level. At (c) it will be seen that if the transmitted waveform has a relatively constant average voltage, slicing remains possible up to high frequencies even in the presence of serious amplitude loss, because the received waveform remains symmetrical about the baseline.
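
Baseline wander is easy to reproduce in simulation. This Python sketch AC-couples a waveform through a crude one-pole high-pass filter standing in for the series capacitance of a real channel; the duty cycle and filter constant are illustrative only:

    import numpy as np

    def ac_couple(x, alpha=0.999):
        # One-pole high-pass: subtract a running estimate of the DC component.
        y, avg = np.empty_like(x), 0.0
        for i, v in enumerate(x):
            avg = alpha * avg + (1 - alpha) * v
            y[i] = v - avg
        return y

    biased = np.tile([1.0] * 90 + [-1.0] * 10, 20)   # 90% duty cycle: not DC-free
    sliced = ac_couple(biased) > 0.0                 # slice at a fixed zero threshold

During the long runs of ones the coupled waveform sags towards the slicing level, eroding the noise margin on one side until slicing becomes unreliable, whereas a DC-free pattern with equal ones and zeros passes through the same filter with its symmetry about the baseline intact.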

Figure 6.22    Slicing a signal which has suffered losses works well if the duty cycle is even. If the duty cycle is uneven, as in (a), timing errors will become worse until slicing fails. With the opposite duty cycle, the slicing fails in the opposite direction as in (b). If, however, the signal is DC free, correct slicing can continue even in the presence of serious losses, as (c) shows.

It is clearly not possible to simply serialize data in a shift register for so-called direct transmission, because successful slicing can only be obtained if the number of ones is equal to the number of zeros; there is little chance of this happening consistently with real data. Instead, a modulation code or channel code is necessary. This converts the data into a waveform which is DC-free or nearly so for the purpose of transmission.

The slicing threshold level is naturally zero in a bipolar system such as magnetic inductive replay or a cable. When the amplitude falls it does so symmetrically and slicing continues. The same is not true of M-R heads and optical pickups, which both respond to intensity and therefore produce a unipolar output. If the replay signal is sliced directly, the threshold cannot be zero, but must be some level approximately half the amplitude of the signal as shown in Figure 6.23(a). Unfortunately when the signal level falls it falls towards zero and not towards the slicing level. The threshold will no longer be appropriate for the signal as can be seen at (b). This can be overcome by using a DC-free coded waveform. If a series capacitor is connected to the unipolar signal from an optical pickup, the waveform is rendered bipolar because the capacitor blocks any DC component in the signal. The DC-free channel waveform passes through unaltered. If an amplitude loss is suffered, Figure 6.23(c) shows that the resultant bipolar signal now reduces in amplitude about the slicing level and slicing can continue.

Figure 6.23    (a) Slicing a unipolar signal requires a non-zero threshold. (b) If the signal amplitude changes, the threshold will then be incorrect. (c) If a DC-free code is used, a unipolar waveform can be converted to a bipolar waveform using a series capacitor. A zero threshold can be used and slicing continues with amplitude variations.

Figure 6.24    An adaptive slicer uses delay lines to produce a threshold from the waveform itself. Correct slicing will then be possible in the presence of baseline wander. Such a slicer can be used with codes which are not DC-free.

Whilst cables and optical recording channels need to be DC-free, some channel waveforms used in magnetic recording have a reduced DC component, but are not completely DC-free. As a result the received waveform will suffer from baseline wander. If this is moderate, an adaptive slicer which can move its threshold can be used. As Figure 6.24 shows, the adaptive slicer consists of a pair of delays. If the input and output signals are linearly added together with equal weighting, when a transition passes, the resultant waveform has a plateau which is at the half-amplitude level of the signal and can be used as a threshold voltage for the slicer. The coding of the DASH format is not DC-free and a slicer of this kind is employed.
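
A software sketch of the principle may help; it is written in Python with an arbitrary delay length and assumes the signal is held in a NumPy array. It is an illustration of the idea of Figure 6.24, not the DASH circuit itself:

    import numpy as np

    def adaptive_slice(signal, delay):
        # Three taps: the input, and the signal after one and two delays.
        early = signal[2 * delay:]            # undelayed tap
        mid = signal[delay:-delay]            # tap between the two delays
        late = signal[:-2 * delay]            # twice-delayed tap
        # Adding the outer taps with equal weighting produces a plateau at
        # the half-amplitude level as a transition passes; this plateau
        # serves as a threshold which tracks moderate baseline wander.
        threshold = 0.5 * (early + late)
        return mid > threshold

The derived threshold rides up and down with the baseline, so the middle tap is always sliced against a locally correct level.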

6.12 Jitter rejection

The binary waveform at the output of the slicer will be a replica of the transmitted waveform, except for the addition of jitter or time uncertainty in the position of the edges due to noise, baseline wander, intersymbol interference and imperfect equalization.

Figure 6.25    A certain amount of jitter can be rejected by changing the signal at multiples of the basic detent period Td.

Binary circuits reject noise by using discrete voltage levels which are spaced further apart than the uncertainty due to noise. In a similar manner, digital coding combats time uncertainty by making the time axis discrete using events, known as transitions, spaced apart at integer multiples of some basic time period, called a detent, which is larger than the typical time uncertainty. Figure 6.25 shows how this jitter-rejection mechanism works. All that matters is to identify the detent in which the transition occurred. Exactly where it occurred within the detent is of no consequence.
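
In code the rejection mechanism is trivial, which is rather the point. A short Python fragment with an assumed detent period; the numbers are purely illustrative:

    Td = 100e-9                               # detent period, assumed 100 ns
    received = [99e-9, 203e-9, 297e-9]        # jittered transition times
    detents = [round(t / Td) for t in received]
    print(detents)                            # [1, 2, 3]: jitter within +/- Td/2 is rejected

Each transition is simply assigned to the nearest detent; its exact position within the detent is discarded, and the jitter is discarded with it.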

As ideal transitions occur at multiples of a basic period, an oscilloscope, which is repeatedly triggered on a channel-coded signal carrying random data, will show an eye pattern if connected to the output of the equalizer. Study of the eye pattern reveals how well the coding used suits the channel. In the case of transmission, with a short cable, the losses will be small, and the eye opening will be virtually square except for some edge sloping due to cable capacitance. As cable length increases, the harmonics are lost and the remaining fundamental gives the eyes a diamond shape. The same eye pattern will be obtained with a recording channel where it is uneconomic to provide bandwidth much beyond the fundamental.

Noise closes the eyes in a vertical direction, and jitter closes the eyes in a horizontal direction, as in Figure 6.26. If the eyes remain sensibly open, data separation will be possible. Clearly more jitter can be tolerated if there is less noise, and vice versa. If the equalizer is adjustable, the optimum setting will be where the greatest eye opening is obtained.

In the centre of the eyes, the receiver must make binary decisions at the channel bit rate about the state of the signal, high or low, using the slicer output. As stated, the receiver is sampling the output of the slicer, and it needs to have a sampling clock in order to do that. In order to give the best rejection of noise and jitter, the clock edges which operate the sampler must be in the centre of the eyes.

Figure 6.26    A transmitted waveform which is generated according to the principle of Figure 6.25 will appear like this on an oscilloscope as successive parts of the waveform are superimposed on the tube. When the waveform is rounded off by losses, diamond-shaped eyes are left in the centre, spaced apart by the detent period.

As has been stated, a separate clock is not practicable in recording or transmission. A fixed-frequency clock at the receiver is of no use as even if it was sufficiently stable, it would not know what phase to run at.

The only way in which the sampling clock can be obtained is to use a phase-locked loop to regenerate it from the clock content of the self-clocking channel coded waveform. In phase-locked loops, the voltage-controlled oscillator is driven by a phase error measured between the output and some reference, such that the output eventually has the same frequency as the reference. If a divider is placed between the VCO and the phase comparator, as in Figure 6.27, the VCO frequency can be made to be a multiple of the reference. This also has the effect of making the loop more heavily damped. If a channel-coded waveform is used as a reference to a PLL, the loop will be able to make a phase comparison whenever a transition arrives and will run at the channel bit rate. When there are several detents between transitions, the loop will flywheel at the last known frequency and phase until it can rephase at a subsequent transition. Thus a continuous clock is recreated from the clock content of the channel waveform. In a recorder, if the speed of the medium should change, the PLL will change frequency to follow. Once the loop is locked, clock edges will be phased with the average phase of the jittering edges of the input waveform. If, for example, rising edges of the clock are phased to input transitions, then falling edges will be in the centre of the eyes. If these edges are used to clock the sampling process, the maximum jitter and noise can be rejected. The output of the slicer when sampled by the PLL edge at the centre of an eye is the value of a channel bit. Figure 6.28 shows the complete clocking system of a channel code from encoder to data separator. Clearly data cannot be separated if the PLL is not locked, but it cannot be locked until it has seen transitions for a reasonable period. In recorders, which have discontinuous recorded blocks to allow editing, the solution is to precede each data block with a pattern of transitions whose sole purpose is to provide a timing reference for synchronizing the phase-locked loop. This pattern is known as a preamble. In interfaces, the transmission can be continuous and there is no difficulty remaining in lock indefinitely. There will simply be a short delay on first applying the signal before the receiver locks to it.
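
A toy model of the flywheel-and-rephase behaviour may make this concrete. The Python sketch below is a first-order loop only: damping, frequency tracking and lock detection, all essential in real hardware, are omitted, and the gain value is arbitrary:

    def regenerate_clock(transitions, nominal_period, gain=0.1, n_edges=1000):
        # Free-run at the nominal period, nudging the phase towards each
        # received transition by a fraction of the measured error.
        edges, t = [], 0.0
        it = iter(transitions)
        nxt = next(it, None)
        for _ in range(n_edges):
            t += nominal_period                  # flywheel between transitions
            if nxt is not None and nxt <= t:
                t += gain * (nxt - t)            # rephase on a transition
                nxt = next(it, None)
            edges.append(t)
        return edges

Between transitions the oscillator simply continues at its current rate, exactly the flywheel action described above; each arriving transition pulls the phase a little way towards the average phase of the jittering input.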

Figure 6.27    A typical phase-locked loop where the VCO is forced to run at a multiple of the input frequency. If the input ceases, the output will continue for a time at the same frequency until it drifts.

Figure 6.28    The clocking system when channel coding is used. The encoder clock runs at the channel bit rate, and any transitions in the channel must coincide with encoder clock edges. The reason for doing this is that, at the data separator, the PLL can lock to the edges of the channel signal, which represent an intermittent clock, and turn it into a continuous clock. The jitter in the edges of the channel signal causes noise in the phase error of the PLL, but the damping acts as a filter and the PLL runs at the average phase of the channel bits, rejecting the jitter.

One potential problem area which is frequently overlooked is to ensure that the VCO in the receiving PLL is correctly centred. If it is not, it will be running with a static phase error and will not sample the received waveform at the centre of the eyes. The sampled bits will be more prone to noise and jitter errors. VCO centring can simply be checked by displaying the control voltage. This should not change significantly when the input is momentarily interrupted.

6.13 Channel coding

In summary, it is not practicable simply to serialize raw data in a shift register for the purpose of recording or for transmission except over relatively short distances. Practical systems require the use of a modulation scheme, known as a channel code, which expresses the data as self-clocking waveforms, so that jitter can be rejected, the received bits separated and the skew of separate clock lines avoided. The coded waveforms should further be DC-free or nearly so, to enable slicing in the presence of losses, and should have a narrower spectrum than the raw data, to make equalization possible.

Jitter causes uncertainty about the time at which a particular event occurred. The frequency response of the channel then places an overall limit on the spacing of events in the channel. Particular emphasis must be placed on the interplay of bandwidth, jitter and noise, which will be shown here to be the key to the design of a successful channel code.

Figure 6.29 shows that a channel coder is necessary prior to the record stage, and that a decoder, known as a data separator, is necessary after the replay stage. The output of the channel coder is generally a logic level signal which contains a ‘high’ state when a transition is to be generated. The waveform generator produces the transitions in a signal whose level and impedance is suitable for driving the medium or channel. The signal may be bipolar or unipolar as appropriate.

Some codes eliminate DC entirely, which is advantageous for optical media and for rotary head recording. Some codes can reduce the channel bandwidth needed by lowering the upper spectral limit. This permits higher linear density, usually at the expense of jitter rejection. Other codes narrow the spectrum by raising the lower limit. A code with a narrow spectrum has a number of advantages. The reduction in asymmetry will reduce peak shift and data separators can lock more readily because the range of frequencies in the code is smaller. In theory the narrower the spectrum, the less noise will be suffered, but this is only achieved if filtering is employed. Filters can easily cause phase errors which will nullify any gain.

Figure 6.29    The major components of a channel coding system. See text for details.

A convenient definition of a channel code (for there are certainly others) is: ‘A method of modulating real data such that they can be reliably received despite the shortcomings of a real channel, while making maximum economic use of the channel capacity.’

The basic time periods of a channel-coded waveform are called positions or detents, in which the transmitted voltage will be reversed or stay the same. The symbol used for the units of channel time is Td.

There are many ways of creating such a waveform, but the most convenient is to convert the raw data bits to a larger number of channel bits which are output from a shift register to the waveform generator at the detent rate. The coded waveform will then be high or low according to the state of a channel bit which describes the detent.

Channel coding is the art of converting real data into channel bits. It is important to appreciate that the convention most commonly used in coding is one in which a channel-bit one represents a voltage change, whereas a zero represents no change. This convention is used because it is possible to assemble sequential groups of channel bits together without worrying about whether the polarity of the end of the last group matches the beginning of the next. The polarity is unimportant in most codes and all that matters is the length of time between transitions. It should be stressed that channel bits are not recorded. They exist only in a circuit technique used to control the waveform generator. In many media, for example CD, the channel bit rate is beyond the frequency response of the channel and so it cannot be recorded.

One of the fundamental parameters of a channel code is the density ratio (DR). One definition of density ratio is that it is the worst-case ratio of the number of data bits recorded to the number of transitions in the channel. It can also be thought of as the ratio between the Nyquist rate of the data (one-half the bit rate) and the frequency response required in the channel. The storage density of data recorders has steadily increased due to improvements in medium and transducer technology, but modern storage densities are also a function of improvements in channel coding. Figure 6.30(a) shows how density ratio has improved as more sophisticated codes have been developed.


Figure 6.30    (a) Comparison of codes by density ratio; (b) comparison of codes by figure of merit. Note how 4/5, 2/3, 8/10 and RNRZ move up because of good jitter performance; HDM-3 moves down because of jitter sensitivity.

As jitter is such an important issue in digital recording and transmission, a parameter has been introduced to quantify the ability of a channel code to reject time instability. This parameter, the jitter margin, also known as the window margin or phase margin (Tw), is defined as the permitted range of time over which a transition can still be received correctly, divided by the data bit-cell period (T).

Since equalization is often difficult in practice, a code which has a large jitter margin will sometimes be used because it resists the effects of inter-symbol interference well. Such a code may achieve a better performance in practice than a code with a higher density ratio but poor jitter performance.

A more realistic comparison of code performance will be obtained by taking into account both density ratio and jitter margin. This is the purpose of the figure of merit (FoM), which is defined as DR × Tw. Figure 6.30(b) shows a number of codes compared by FoM.

6.14 Recording-oriented codes

Many channel codes are sufficiently versatile that they have been used in recording, electrical or optical cable transmission and radio transmission. Others are more specialized and are intended for only one of these categories. Channel coding has roots in computers, in telemetry and in Telex services, but has for some time been considered a single subject. These starting points will be considered here.

In magnetic recording, the first digital recordings were developed for early computers and used very simple techniques. Figure 6.31(a) shows that in Return to Zero (RZ) recording, the record current has a zero state between bits and flows in one direction to record a one and in the opposite direction to record a zero. Thus every bit contains two flux changes which replay as a pair of pulses, one positive and one negative. The signal is self-clocking because pulses always occur. The order in which they occur determines the state of the bit. RZ recording cannot erase by overwrite because there are times when no record current flows. Additionally the signal amplitude is only one half of what is possible. These problems were overcome in the Non-Return to Zero code shown in Figure 6.31(b). As the name suggests, the record current does not cease between bits, but flows at all times in one direction or the other dependent on the state of the bit to be recorded. This results in a replay pulse only when the data bits change from one state to another. As a result, if one pulse were missed, all subsequent bits would be inverted. This was avoided by adapting the coding such that the record current would change state or invert whenever a data one occurred, leading to the term Non-Return to Zero Invert or NRZI shown in Figure 6.31(c). In NRZI a replay pulse occurs whenever there is a data one. Clearly neither NRZ nor NRZI is self-clocking; both require a separate clock track. Skew between tracks can only be avoided by working at low density and so the system cannot be used for digital audio. However, virtually all the codes used for magnetic recording are based on the principle of reversing the record current to produce a transition.


Figure 6.31    Early magnetic recording codes. RZ shown at (a) had poor signal-to-noise ratio and poor overwrite capability. NRZ at (b) overcame these problems but suffered error propagation. NRZI at (c) was the final result where a transition represented a one. NRZI is not self-clocking.
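
The NRZI rule is simple enough to express in a few lines. The sketch below is an illustration only, not taken from any particular machine: it encodes data bits as channel levels and decodes by looking for level changes. The starting polarity is an arbitrary assumption, harmless because only the presence or absence of a transition carries meaning.

```python
def nrzi_encode(data_bits, level=1):
    """Reverse the record current on every data one; zeros leave it alone.
    Returns one channel level (+1 or -1) per bit cell."""
    out = []
    for bit in data_bits:
        if bit:
            level = -level        # a one is recorded as a transition
        out.append(level)
    return out

def nrzi_decode(levels, level=1):
    """A level change (replay pulse) means a one; no change means a zero."""
    out = []
    for current in levels:
        out.append(1 if current != level else 0)
        level = current
    return out

assert nrzi_decode(nrzi_encode([1, 0, 1, 1, 0])) == [1, 0, 1, 1, 0]
```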

6.15 Transmission-oriented codes

In cable transmission, also known as line signalling, and in telemetry, the starting point was often the speech bandwidth available in existing telephone lines and radio links. There was no DC response, just a range of frequencies available. Figure 6.32(a) shows that a pair of frequencies can be used, one for each state of a data bit. The result is frequency shift keying (FSK), which is the same as would be obtained from an analog frequency modulator fed with a two-level signal. This is exactly what happens when two-level pseudo-video from a PCM adaptor is fed to a VCR and is the technique used in units such as the PCM-F1 and the PCM-1630. PCM adaptors have also been used to carry digital audio over a video landline or microwave link. Clearly FSK is DC-free and self-clocking.

Instead of modulating the frequency of the signal, the phase can be modulated or shifted instead, leading to the generic term of phase shift keying or PSK. This method is highly suited to broadcast as it is easily applied to a radio frequency carrier. The simplest technique is selectively to invert the carrier phase according to the data bit as in Figure 6.32(b). There can be many cycles of carrier in each bit period. This technique is known as phase encoding (PE) and is used in GPS (Global Positioning System) broadcasts. The receiver in a PE system is a well-damped phase-locked loop which runs at the average phase of the transmission. Phase changes will then result in phase errors in the loop and so the phase error is the demodulated signal.


Figure 6.32    Various communications oriented codes are shown here: at (a) frequency shift keying (FSK), at (b) phase encoding and at (c) differential quadrature phase shift keying (DQPSK).

6.16 General-purpose codes

Despite the different origins of codes, there are many similarities between them. If the two frequencies in an FSK system are one octave apart, the limiting case in which the highest data rate is obtained is when there is one half-cycle of the lower frequency or a whole cycle of the higher frequency in one bit period. This gives rise to frequency modulation (FM). In the same way, the limiting case of phase encoding is where there is only one cycle of carrier per bit. In recording, this is what is meant by phase encoding. These approaches are contrasted in Figure 6.33.


Figure 6.33 FM and PE contrasted. In (a) are the FM waveform and the channel bits which may be used to describe transitions in it. The FM coder is shown in (b). The PE waveform is shown in (c). As PE is polarity conscious, the channel bits must describe the signal level rather than the transitions. The coder is shown in (d).

The FM code, also known as Manchester code or bi-phase mark code, shown in Figure 6.33(a) was the first practical self-clocking binary code and it is suitable for both transmission and recording. It is DC-free and very easy to encode and decode. It is the code specified for the AES/EBU digital audio interconnect standard which will be described in Chapter 8. In the field of recording it remains in use today only where density is not of prime importance, for example in SMPTE/EBU timecode for professional audio and video recorders and in floppy disks.

In FM there is always a transition at the bit-cell boundary which acts as a clock. For a data one, there is an additional transition at the bit-cell centre. Figure 6.33(a) shows that each data bit can be represented by two channel bits. For a data zero, they will be 10, and for a data one they will be 11. Since the first bit is always one, it conveys no information, and is responsible for the density ratio of only one-half. Since there can be two transitions for each data bit, the jitter margin can only be half a bit, and the resulting FoM is only 0.25. The high clock content of FM does, however, mean that data recovery is possible over a wide range of speeds; hence the use for timecode. The lowest frequency in FM is due to a stream of zeros and is equal to half the bit rate. The highest frequency is due to a stream of ones, and is equal to the bit rate. Thus the fundamentals of FM are within a band of one octave. Effective equalization is generally possible over such a band. FM is not polarity conscious and can be inverted without changing the data.

Figure 6.33(b) shows how an FM coder works. Data words are loaded into the input shift register which is clocked at the data bit rate. Each data bit is converted to two channel bits in the code book or look-up table. These channel bits are loaded into the output register. The output register is clocked twice as fast as the input register because there are twice as many channel bits as data bits. The ratio of the two clocks is called the code rate; in this case it is a rate one-half code. Ones in the serial channel bit output represent transitions whereas zeros represent no change. The channel bits are fed to the waveform generator which is a one-bit delay, clocked at the channel bit rate, and an exclusive-OR gate. This changes state when a channel bit one is input. The result is a coded FM waveform where there is always a transition at the beginning of the data bit period, and a second optional transition whose presence indicates a one.
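
As a minimal sketch of the coder just described, assuming nothing beyond the 10/11 codebook given above, the function below expands each data bit to two channel bits and then imitates the delay-and-XOR waveform generator by toggling the output level on every channel one.

```python
FM_CODEBOOK = {0: (1, 0), 1: (1, 1)}     # data bit -> two channel bits

def fm_encode(data_bits):
    """Rate one-half FM coder: channel bits at twice the data rate,
    then a level toggle wherever a channel bit is one."""
    channel_bits = [cb for bit in data_bits for cb in FM_CODEBOOK[bit]]
    level, waveform = 0, []
    for cb in channel_bits:
        level ^= cb                      # transition on each channel one
        waveform.append(level)
    return channel_bits, waveform

# A run of zeros gives the lowest frequency, half the bit rate:
print(fm_encode([0, 0, 0, 0])[1])        # [1, 1, 0, 0, 1, 1, 0, 0]
```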

In PE there is always a transition in the centre of the bit but Figure 6.33(c) shows that the transition between bits is dependent on the data values. Although its origins were in line coding, phase encoding can be used for optical and magnetic recording as it is DC-free and self-clocking. It has the same DR and Tw as FM, and the waveform can also be described using channel bits, but with a different notation. As PE is polarity sensitive, the channel bits determine the level of the encoded signal rather than causing a transition. Figure 6.33(d) shows that the allowable channel bit patterns are now 10 and 01.

In modified frequency modulation (MFM), also known as Miller code,5 the highly redundant clock content of FM was reduced by the use of a phase-locked loop in the receiver which could flywheel over missing clock transitions. This technique is implicit in all the more advanced codes. Figure 6.34(a) shows that the bit-cell centre transition on a data one was retained, but the bit-cell boundary transition is now only required between successive zeros. There are still two channel bits for every data bit, but adjacent channel bits will never both be one, doubling the minimum time between transitions and giving a DR of 1. Clearly the coding of the current bit is now influenced by the preceding bit. The maximum number of prior bits which affect the current bit is known as the constraint length Lc, measured in data-bit periods. For MFM Lc = T. Another way of considering the constraint length is that it assesses the number of data bits which may be corrupted if the receiver misplaces one transition. If Lc is long, all errors will be burst errors.


Figure 6.34    MFM or Miller code is generated as shown here. The minimum transition spacing is twice that of FM or PE. MFM is not always DC-free as shown at (b). This can be overcome by the modification of (c) which results in the Miller² code.

MFM doubled the density ratio compared to FM and PE without changing the jitter performance; thus the FoM also doubles, becoming 0.5. It was adopted for many rigid disks at the time of its development, and remains in use on double-density floppy disks. It is not, however, DC-free. Figure 6.34(b) shows how MFM can have DC content under certain conditions.
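
The MFM rule lends itself to a short sketch. The code below assumes the coder starts from a preceding zero; a real device would carry that state across blocks.

```python
def mfm_encode(data_bits, prev=0):
    """Two channel bits per data bit: the second marks a data one at the
    cell centre, the first inserts a boundary (clock) transition only
    between successive zeros.  Adjacent channel ones never occur."""
    channel_bits = []
    for bit in data_bits:
        clock = 1 if (prev == 0 and bit == 0) else 0
        channel_bits += [clock, bit]
        prev = bit
    return channel_bits

print(mfm_encode([1, 0, 0, 1]))          # [0, 1, 0, 0, 1, 0, 0, 1]
```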

6.17 Miller² code

The Miller² code is derived from MFM, and Figure 6.34(c) shows that the DC content is eliminated by a slight increase in complexity.6,7 Wherever an even number of ones occurs between zeros, the transition at the last one is omitted. This creates two additional, longer run lengths and increases the Tmax of the code. The decoder can detect these longer run lengths in order to re-insert the suppressed ones. The FoM of Miller² is 0.5 as for MFM. Miller² was used in early 3M stationary head digital audio recorders, in high rate instrumentation recorders and in the D-2 DVTR format.

6.18 Group codes

Further improvements in coding rely on converting patterns of real data to patterns of channel bits with more desirable characteristics using a conversion table known as a codebook. If a data symbol of m bits is considered, it can have 2^m different combinations. As it is intended to discard undesirable patterns to improve the code, it follows that the number of channel bits n must be greater than m. The number of patterns which can be discarded is:

2^n – 2^m

One name for the principle is group code recording (GCR), and an important parameter is the code rate, defined as:

R = m/n

It will be evident that the jitter margin Tw is numerically equal to the code rate, and so a code rate near to unity is desirable. The choice of patterns which are used in the codebook will be those which give the desired balance between clock content, bandwidth and DC content.

Figure 6.35 shows that the upper spectral limit can be made to be some fraction of the channel bit rate according to the minimum distance between ones in the channel bits. This is known as Tmin, also referred to as the minimum transition parameter M and in both cases is measured in data bits T. It can be obtained by multiplying the number of channel detent periods between transitions by the code rate. Unfortunately, codes are measured by the number of consecutive zeros in the channel bits, given the symbol d, which is always one less than the number of detent periods. In fact Tmin is numerically equal to the density ratio.

Tmin = DR = (d + 1) × m/n

It will be evident that choosing a low code rate could increase the density ratio, but it will impair the jitter margin. The figure of merit is:

FoM = DR × Tw = (d + 1) × m²/n²

since Tw = m/n

Figure 6.35 also shows that the lower spectral limit is influenced by the maximum distance between transitions Tmax. This is also obtained by multiplying the maximum number of detent periods between transitions by the code rate. Again, codes are measured by the maximum number of zeros between channel ones, k, and so:

Tmax = (k + 1) × m/n

and the maximum/minimum ratio P is:

P = Tmax/Tmin = (k + 1)/(d + 1)

The length of time between channel transitions is known as the run length. Another name for this class is the run-length-limited (RLL) codes.8 Since m data bits are considered as one symbol, the constraint length Lc will be increased in RLL codes to at least m. It is, however, possible for a code to have run-length limits without it being a group code.
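
The relationships above are easily checked numerically. The sketch below evaluates them for several of the codes discussed in this chapter; the d and k values for FM and MFM are inferred from the waveform descriptions given earlier, and the results agree with the figures quoted in the sections that follow.

```python
def rll_metrics(d, k, m, n):
    """Channel-code parameters as derived in the text."""
    rate = m / n                     # code rate R = m/n = jitter window Tw
    dr = (d + 1) * rate              # Tmin, numerically equal to DR
    return {"DR": round(dr, 2), "Tw": round(rate, 2),
            "FoM": round(dr * rate, 2),
            "Tmax": round((k + 1) * rate, 2),
            "P": round((k + 1) / (d + 1), 2)}

codes = {"FM": (0, 1, 1, 2), "MFM": (1, 3, 1, 2), "4/5 (MADI)": (0, 3, 4, 5),
         "2/3": (1, 7, 2, 3), "EFM (with merging bits)": (2, 10, 8, 17)}
for name, dkmn in codes.items():
    print(name, rll_metrics(*dkmn))  # e.g. EFM: DR 1.41, Tw 0.47, FoM 0.66
```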


Figure 6.35    A channel code can control its spectrum by placing limits on Tmin (M) and Tmax which define upper and lower frequencies. The ratio of Tmax/Tmin determines the asymmetry of waveform and predicts DC content and peak shift. Example shown is EFM.

In practice, the junction of two adjacent channel symbols may violate run-length limits, and it may be necessary to create a further codebook of symbol size 2n which converts violating code pairs to acceptable patterns. This is known as merging and follows the golden rule that the substitute 2n symbol must finish with a pattern which eliminates the possibility of a subsequent violation. These patterns must also differ from all other symbols.

Substitution may also be used to different degrees in the same nominal code in order to allow a choice of maximum run length, e.g. 3PM.9 The maximum number of symbols involved in a substitution is denoted by r.10,11 There are many RLL codes and the parameters d,k,m,n, and r are a way of comparing them.

Sometimes the code rate forms the name of the code, as in 2/3, 8/10 and EFM; at other times the code may be named after the d,k parameters, as in 2,7 code. Various examples of group codes will be given to illustrate the principles involved.

6.19 4/5 code of MADI

In the MADI (multi-channel audio interface) standard12, a four-fifths rate code is used where groups of four data bits are represented by groups of five channel bits.

Four bits have 16 combinations whereas five bits have 32 combinations. Clearly only 16 out of these 32 are necessary to convey all the possible data. Figure 6.36 shows that the 16 channel bit patterns chosen are those which have the least DC component combined with a high clock content. Adjacent ones are permitted in the channel bits, so there can be no violation of Tmin at the boundary of two symbols. Tmax is determined by the worst case run of zeros at a symbol boundary and as k = 3, Tmax is 16/5 = 3.2T. The code is thus described as 0,3,4,5,1 and Lc = 4T.


Figure 6.36    The codebook of the 4/5 code of MADI. Note that a one represents a transition in the channel.

The jitter resistance of a group code is equal to the code rate. For example, in 4/5 transitions cannot be closer than 0.8 of a data bit apart and so this represents the peak to peak jitter which can be rejected. The density ratio is also 0.8 so the FoM is 0.64; an improvement over FM.

A further advantage of group coding is that it is possible to have codes which have no data meaning. In MADI further channel bit patterns are used for packing and synchronizing. Packing is where dummy data are sent when the real data rate is low in order to keep the channel frequencies constant. This is necessary so that fixed equalization can be used. The packing pattern does not decode to data and so it can be easily discarded at the receiver. Further details of MADI can be found in Chapter 8.

6.20 2/3 code

Figure 6.37(a) shows the code book of an optimized code which illustrates one merging technique. This is a 1,7,2,3,2 code known as 2/3. It is designed to have a good jitter window in order to resist peak shift distortion in disk drives, but it also has a good density ratio.13 In 2/3 code, pairs of data bits create symbols of three channel bits. For bandwidth reduction, codes having adjacent ones are eliminated so that d = 1. This halves the upper spectral limit and the DR is improved accordingly:

DR = (d + 1) × m/n = (2 × 2)/3 = 1.33

In Figure 6.37(b) it will be seen that some group combinations cause violations. To avoid this, pairs of three-channel bit symbols are replaced with a new six-channel bit symbol. Lc is thus 4T, the same as for the 4/5 code. The jitter window is given by:

Tw = m/n = 2/3 = 0.67

and the FoM is:

FoM = DR × Tw = 1.33 × 0.67 = 0.89

Figure 6.37    2/3 code. In (a) two data bits (m) are expressed as three channel bits (n) without adjacent transitions (d = 1). Adjacent data pairs can break the encoding rule; violations are dealt with by the substitutions shown in (b).

This is an extremely good figure for an RLL code, and is some 10 per cent better than the FoM of 3PM14 and 2,7 and as a result 2/3 has been highly successful in Winchester disk drives.

6.21 EFM code in CD

This section is concerned solely with the channel coding of CD. A more comprehensive discussion of how the coding is designed to suit the specific characteristics of an optical disk is given in Chapter 12. Figure 6.38 shows the 8,14 code (EFM) used in the Compact Disc. Here eight-bit symbols are represented by 14-bit channel symbols.15 There are 256 combinations of eight data bits, whereas 14 bits have 16K combinations. Of these only 267 satisfy the criteria that the maximum run length shall not exceed 11 channel bits (k = 10) nor be less than three channel bits (d = 2). A section of the codebook is shown in the figure. In fact 258 of the 267 possible codes are used, because two unique patterns are used to synchronize the subcode blocks (see Chapter 12). It is not possible to prevent violations between adjacent symbols by substitution, and extra merging bits having no data meaning are placed between the symbols. Two merging bits would be adequate to prevent violations, but in practice three are used because a further task of the merging bits is to control the DC content of the waveform. The merging bits are selected by computing the digital sum value (DSV) of the waveform. The DSV is computed as shown in Figure 6.39(a). One is added to a count for every channel bit period where the waveform is in a high state, and one is subtracted for every channel bit period spent in a low state. Figure 6.39(b) shows that if two successive channel symbols have the same sense of DC offset, these can be made to cancel one another by placing an extra transition in the merging period. This has the effect of inverting the second pattern and reversing its DC content. The DC-free code can be high-pass filtered on replay and the lower-frequency signals are then used by the focus and tracking servos without noise due to the DC content of the audio data. Encoding EFM is complex, but was acceptable when CD was launched because only a few encoders are necessary in comparison with the number of players. Decoding is simpler as no DC content decisions are needed and a lookup table can be used. The codebook was computer optimized to permit the implementation of a programmable logic array (PLA) decoder with the minimum complexity.


Figure 6.38    EFM code: d = 2, k = 10. Eight data bits produce 14 channel bits plus three packing bits. Code rate is 8/17. DR = (3 × 8)/17 = 1.41.


Figure 6.39    (a) Digital sum value example calculated from EFM waveform. (b) Two successive 14T symbols without DC control (upper) give DSV of –16. Additional transition (*) results in DSV of +2, anticipating negative content of next symbol.
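
The DSV computation of Figure 6.39(a) amounts to a running count, as the sketch below shows. The starting level is an assumption; inverting it simply negates the result, which is why the extra merging transition of Figure 6.39(b), by inverting everything that follows, can reverse the DC content of the next symbol.

```python
def dsv(channel_bits, level=-1):
    """Digital sum value of the waveform the channel bits describe
    (one = transition): +1 per detent spent high, -1 per detent low."""
    total = 0
    for cb in channel_bits:
        if cb:
            level = -level           # a channel one reverses the waveform
        total += level
    return total

print(dsv([1, 0, 0, 1, 0, 0]))       # 0: equal time high and low, DC-free
```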

Owing to the inclusion of merging bits, the code rate becomes 8/17, and the density ratio becomes:

DR = (d + 1) × m/n = (3 × 8)/17 = 1.41

and the FoM is:

FoM = DR × Tw = 1.41 × 8/17 = 0.66

The code is thus a 2,10,8,17,r system where r has meaning only in the context of DC control.16 The constraints d and k can still be met with r = 1 because of the merging bits. The figure of merit is less useful for optical media because the straight-line frequency response does not produce peak shift and the rigid, non-contact medium has good speed stability. The density ratio and the freedom from DC are the most important factors.

6.22 The 8/10 group code of DAT

The essential feature of the channel code of DAT is that it must be able to work well in an azimuth recording system. There are many channel codes available, but few of them are suitable for azimuth recording because of the large amount of crosstalk. The crosstalk cancellation of azimuth recording fails at low frequencies, so a suitable channel code must not only be free of DC, but it must suppress low frequencies as well. A further issue is that erasure is by overwriting, and as the heads are optimized for short-wavelength working, best erasure will be when the ratio between the longest and shortest wavelengths in the recording is small.

In Figure 6.40, some examples from the 8/10 group code of DAT are shown.17 Clearly a channel waveform which spends as much time high as low has no net DC content, and so all ten-bit patterns which meet this criterion of zero disparity can be found. As was seen in section 6.21, the measure used for DC content is the digital sum value (DSV). For every bit the channel spends high, the DSV will increase by one; for every bit the channel spends low, the DSV will decrease by one. As adjacent channel ones are permitted, the window margin and DR will be 0.8, comparing favourably with the figure of 0.5 for MFM, giving an FoM of 0.64. Unfortunately there are not enough DC-free combinations in ten channel bits to provide the 256 patterns necessary to record eight data bits. A further constraint is that it is desirable to restrict the maximum run length to improve overwrite capability and reduce peak shift. In the 8/10 code of DAT, no more than three channel zeros are permitted between channel ones, which makes the longest wavelength only four times the shortest. There are only 153 ten-bit patterns which are within this maximum run length and which have a DSV of zero.


Figure 6.40    Some of the 8/10 codebook for non-zero DSV symbols (two entries) and zero DSV symbols (one entry).

The remaining 103 data combinations are recorded using channel patterns that have non-zero DSV. Two channel patterns are allocated to each of the 103 data patterns. One of these has a DSV of +2, the other has a DSV of –2. For simplicity, the only difference between them is that the first channel bit is inverted. The choice of which channel-bit pattern to use is based on the DSV due to the previous code.

For example, if several bytes have been recorded with some of the 153 DC-free patterns, the DSV of the code will be zero. The first data byte is then found which has no zero disparity pattern. If the +2 DSV pattern is used, the code at the end of the pattern will also become +2 DSV. When the next pattern of this kind is found, the code having the DSV of –2 will automatically be selected to return the channel DSV to zero. In this way the code is kept DC-free, but the maximum distance between transitions can be shortened. A code of this kind is known as a low-disparity code.

In order to reduce the complexity of encoding logic, it is usual in group codes to computer-optimize the relationship between data patterns and code patterns. This has been done for 8/10 so that the conversion can be performed in a programmed logic array. The Boolean expressions for calculating the channel bits from data can be seen in Figure 6.41(a). Only DC-free or DSV = +2 patterns are produced by the logic, since the DSV = –2 pattern can be obtained by reversing the first bit. The assessment of DSV is performed in an interesting manner. If in a pair of channel bits the second bit is one, the pair must be DC-free because each detent has a different value. If the five even channel bits in a ten-bit pattern are checked for parity and the result is one, the pattern could have a DSV of 0, ±4 or ±8. If the result is zero, the DSV could be ±2, ±6 or ±10. However, the codes used are known to be either zero or +2 DSV, so the state of the parity bit discriminates between them. Figure 6.41(b) shows the encoding circuit. The lower set of XOR gates calculates parity on the latest pattern to be recorded, and stores the DSV bit in the latch. The next data byte to be recorded is fed to the PLA, which outputs a ten-bit pattern. If this is a zero disparity code, it passes to the output unchanged. If it is a DSV = +2 code, this will be detected by the upper XOR gates. If the latch is set, this means that a previous pattern had been +2 DSV, and so the first bit of the channel pattern is inverted by the XOR gate in that line, and the latch will be cleared because the DSV of the code has been returned to zero.


Figure 6.41    In (a) the truth table of the symbol encoding prior to DSV control. In (b) this circuit controls code disparity by remembering non-zero DSV in the latch and selecting a subsequent symbol with opposite DSV.
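
A behavioural sketch of the disparity control of Figure 6.41(b) follows. The codebook argument is an assumption, since the actual table is defined by the DAT standard and is not reproduced here: each byte is taken to yield a ten-bit pattern and its DSV of 0 or +2, the –2 version being obtained by inverting the first channel bit as described above.

```python
def encode_8_10(data_bytes, codebook):
    """codebook: byte -> (ten_channel_bits, dsv), with dsv either 0 or +2.
    A latch remembers an outstanding +2 so that the next non-zero-DSV
    symbol is sent in its -2 form, returning the running DSV to zero."""
    out, pending = [], False          # 'pending' plays the role of the latch
    for byte in data_bytes:
        bits, disparity = codebook[byte]
        bits = list(bits)
        if disparity and pending:
            bits[0] ^= 1              # first-bit inversion gives the -2 form
            pending = False
        elif disparity:
            pending = True            # remember the +2 offset
        out.extend(bits)
    return out
```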

Decoding is simpler, because there is a direct relationship between ten-bit codes and eight-bit data.

6.23 Tracking signals

Many recorders use track-following systems to help keep the head(s) aligned with the narrow tracks used in digital media. These can operate by sensing low-frequency tones which are recorded along with the data. Whilst this can be done by linearly adding the tones to the coder output, this requires a linear record amplifier. An alternative is to use the DC content of group codes. A code is devised where, for each data pattern, several code patterns exist having a range of DC components. By choosing groups with a suitable sequence of DC offsets, a low frequency can be added to the coded signal. This can be filtered from the data waveform on replay.

6.24 Convolutional RLL codes

It has been mentioned that a code can be run-length limited without being a group code. An example of this is the HDM-1 code used in DASH format (digital audio stationary head – see Chapter 9) recorders. The coding is best described as convolutional, and is rather complex, as Figure 6.42 shows.18 The DR of 1.5 is achieved by treating the input sequence of 0,1 as a single symbol which has a transition recorded at the centre of the one. The code then depends upon whether the data continue with ones or revert to zeros. The shorter run lengths are used to describe sequential ones; the longer run lengths describe sequential zeros, up to a maximum run length of 4.5T, with a constraint length of 5.5T. In HDM-2, a derivative, the maximum run length is reduced to 4T with the penalty that Lc becomes 7.5T.

The 2/4M code used by the Mitsubishi ProDigi quarter-inch format recorders19 is also convolutional, and has an identical density ratio and window margin to HDM-1. Tmax is eight bits. Neither HDM-1 nor 2/4M is DC-free, but this is less important in stationary head recorders and an adaptive slicer as shown in section 6.11 can be used. The encoding of 2/4M is just as complex as that of HDM-1 and is shown in Figure 6.43.


Figure 6.42    HDM-1 code of the DASH format is encoded according to the above rules. Transitions will never be closer than 1.5 bits, nor further apart than 4.5 bits.

Two data bits form a group, and result in four channel bits where there are always two channel zeros between ones, to obtain a DR of 1.5. There are numerous exceptions required to the coding to prevent violation of the run-length limits and this requires a running sample of ten data bits to be examined. Thus the code is convolutional although it has many of the features of a substituting group code.

6.25 Graceful degradation

In all the channel codes described here all data bits are assumed to be equally important and if the characteristics of the channel degrade, there is an equal probability of corruption of any bit. In digital audio samples the bits are not equally important. Loss of a high-order bit causes greater degradation than loss of a low-order bit. For applications where the bandwidth of the channel is unpredictable, or where it may improve as technology matures, a different form of channel coding has been proposed20 where the probability of corruption of bits is not equal. The channel spectrum is divided in such a way that the least significant bits occupy the highest frequencies and the most significant bits occupy the lower frequencies. When the bandwidth of the channel is reduced, the eye pattern is degraded such that certain eyes are indeterminate, but others remain open, guaranteeing reception and clocking of high-order bits. In PCM audio the result would be sensibly the same waveform but with an increased noise level. Any error-correction techniques would need to consider the unequal probability of error, possibly by assembling codewords from bits of the same significance.


Figure 6.43    Coding rules for 2/4M code. In (a) a running sample is made of two data bits DD and earlier and later bits. In (b) the two data bits become the four channel bits shown except when the substitutions specified are made.

6.26 Randomizing

NRZ has a DR of 1 and a jitter window of 1 and so has a FoM of 1 which is better than the group codes. It does, however, suffer from an unconstrained spectrum and poor clock content. This can be overcome using randomizing. At the encoder, a pseudo-random sequence (see Chapter 3) is added modulo 2 to the serial data and the resulting ones generate transitions in the channel. This process drastically reduces Tmax and reduces DC content. Figure 6.44 shows that at the receiver the transitions are converted back to a serial bitstream to which the same pseudo-random sequence is again added modulo 2. As a result the random signal cancels itself out to leave only the serial data, provided that the two pseudo-random sequences are synchronized to bit accuracy.


Figure 6.44 When randomizing is used, the same pseudo-random sequence must be provided at both ends of the channel with bit synchronism.
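
A round-trip sketch follows, using a nine-bit linear feedback shift register as the pseudo-random source; the tap positions are an illustrative choice. Because modulo-2 addition is its own inverse, applying the identical synchronized sequence at both ends recovers the data exactly.

```python
def prs(length, state=0b111111111, taps=(9, 4)):
    """Pseudo-random sequence from a nine-bit LFSR preset to all ones.
    Tap positions here are illustrative, not from any standard."""
    out = []
    for _ in range(length):
        feedback = ((state >> (taps[0] - 1)) ^ (state >> (taps[1] - 1))) & 1
        out.append(state & 1)
        state = (state >> 1) | (feedback << 8)
    return out

def xor_sequence(bits, sequence):
    """Modulo-2 addition: the same operation randomizes and de-randomizes."""
    return [b ^ s for b, s in zip(bits, sequence)]

data = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]            # a long run: poor clocking
sent = xor_sequence(data, prs(len(data)))        # transmitted form
assert xor_sequence(sent, prs(len(data))) == data
```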

Randomizing with NRZI (RNRZI) is used in the D-1 DVTR. Randomizing can also be used in addition to any other channel coding or modulation scheme. It is employed in NICAM 728 and in DAB as will be seen in the next section.

6.27 Communications codes

Since the original FSK and PSK codes were developed, advances in circuit techniques have allowed more complex signalling techniques to be used. The common goal of all of these is to minimize the channel bandwidth needed for a given bit rate whilst offering immunity from multipath reception and interference. This is the equivalent of the DR in recording, but is measured in bits/s/Hz.


Figure 6.45    A DQPSK coder conveys two bits for each modulation period. See text for details.

In PSK it is possible to use more than two distinct phases. When four phases in quadrature are used, the result is quadrature phase shift keying or QPSK. Each period of the transmitted waveform can have one of four phases and therefore conveys the value of two data bits. In order to resist reflections in broadcasting, QPSK can be modified so that a knowledge of absolute phase is not needed at the receiver. Instead of encoding the signal phase, the data determine the magnitude of a phase shift. This is known as differential quadrature phase shift keying or DQPSK and is the modulation scheme used for NICAM 728 digital TV sound. A DQPSK coder is shown in Figure 6.45 and as before two bits are conveyed for each transmitted period. It will be seen that one bit pattern results in no phase change. If this pattern is sustained the entire transmitter power will be concentrated in the carrier. This can cause patterning on the associated TV pictures. The randomizing technique of section 6.26 is used to overcome the problem. The effect is to spread the signal energy uniformly throughout the allowable channel bandwidth so that it has less energy at a given frequency. This reduces patterning on the analog video signal in addition to making the signal more resistant to multipath reception which tends to remove notches from the spectrum.

A pseudo-random sequence generator as described in Chapter 3 is used to generate the randomizing sequence used in NICAM. A nine-bit device has a sequence length of 511, and is preset to a standard value of all ones at the beginning of each frame. The serialized data are XORed with the LSB of the Galois field, which randomizes the output which then goes to the modulator. The spectrum of the transmission is now determined by the spectrum of the pseudo-random sequence. This was shown in Chapter 3 to have a spiky sin x/x envelope. The frequencies beyond the first nulls are filtered out at the transmitter, leaving the characteristic ‘dead hedgehog’ shape seen on a spectrum analyser.

On reception, the de-randomizer must contain the identical ring counter which must also be set to the starting condition to bit accuracy. Its output is then added to the data stream from the demodulator. The randomizing will effectively then have been added twice to the data in modulo 2, and as a result is cancelled out leaving the original serial data.

Where an existing wide-band channel having a DC response and a good SNR is being used for digital signalling, an increase in data rate can be had using multi-level signalling or m-ary coding instead of binary. This is the basis of the sound-in-syncs technique used by broadcasters to convey PCM audio along baseband video routes by inserting data bursts in the analog video sync pulses. Figure 6.46 shows the four-level waveform of the UK DSIS (Dual Channel Sound in Syncs) system which is used to carry stereo audio to NICAM-equipped transmitters. Clearly the data separator must have a two-bit ADC which can resolve the four signal levels. The gain and offset of the signal must be precisely set so that the quantizing levels register precisely with the centres of the eyes.


Figure 6.46    DSIS information within the TV line sync pulse.


Figure 6.47    In 64-QUAM, two carriers are generated with a quadrature relationship. These are independently amplitude-modulated to eight discrete levels in four quadrant multipliers. Adding the signals produces a QUAM signal having 64 unique combinations of amplitude and phase. Decoding requires the waveform to be sampled in quadrature like a colour TV subcarrier.

Where the maximum data rate is needed for economic reasons, as in Digital Audio Broadcasting (DAB) or digital television broadcasts, multi-level signalling can be combined with PSK to obtain multi-level Quadrature Amplitude Modulation (QUAM). Figure 6.47 shows the example of 64-QUAM. Incoming six-bit data words are split into two three-bit words and each is used to amplitude modulate a pair of sinusoidal carriers which are generated in quadrature. The modulators are four-quadrant devices such that 2^3 = 8 amplitudes are available, four of which are in phase with the carrier and four antiphase. The two AM carriers are linearly added and the result is a signal which has 2^6 or 64 combinations of amplitude and phase. There is a great deal of similarity between QUAM and the colour subcarrier used in analog television in which the two colour difference signals are encoded into one amplitude and phase modulated waveform. On reception, the waveform is sampled twice per cycle in phase with the two original carriers and the result is a pair of eight-level signals.
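
The mapping can be sketched as follows; the assignment of bits to levels is hypothetical, since the text does not specify one. Each six-bit word selects one of 64 points in the amplitude/phase constellation.

```python
LEVELS = (-7, -5, -3, -1, 1, 3, 5, 7)    # eight amplitudes: four in phase
                                         # with the carrier, four antiphase

def quam64(word):
    """Split a six-bit word into two three-bit words which amplitude-
    modulate the quadrature carrier pair; the sum has 64 states."""
    i = LEVELS[(word >> 3) & 0b111]      # in-phase amplitude
    q = LEVELS[word & 0b111]             # quadrature amplitude
    return complex(i, q)                 # one constellation point

assert len({quam64(w) for w in range(64)}) == 64   # all states distinct
```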

The data are randomized by addition to a pseudo-random sequence before being fed to the modulator. The resultant spectrum once again has the sin x/x shape with nulls at multiples of the randomizer clock rate. As a result, a large number of carriers can be spaced at multiples of the randomizer clock frequency such that each carrier centre frequency coincides with the nulls of all the adjacent carriers. The result is referred to as COFDM or coded orthogonal frequency division multiplexing.21

6.28 Convolutional randomizing

The randomizing in NICAM is block based, since this matches the one millisecond block structure of the transmission. Where there is no obvious block structure, convolutional, or endless randomizing can be used. This is the approach used in the Scrambled Serial digital video interconnect which allows composite or component video of up to ten-bit wordlength to be sent serially along with digital audio channels.

In convolutional randomizing, the signal sent down the channel is the serial data waveform which has been convolved with the impulse response of a digital filter. On reception the signal is deconvolved to restore the original data. Figure 6.48(a) shows that the filter is an infinite impulse response (IIR) filter which has recursive paths from the output back to the input. As it is a one-bit filter its output cannot decay, and once excited, it runs indefinitely. The filter is followed by a transition generator which consists of a one-bit delay and an exclusive-OR gate. An input 1 results in an output transition on the next clock edge. An input 0 results in no transition.

A result of the infinite impulse response of the filter is that frequent transitions are generated in the channel which result in sufficient clock content for the phase-locked loop in the receiver.

Transitions are converted back to 1s by a differentiator in the receiver. This consists of a one-bit delay with an exclusive-OR gate comparing the input and the output. When a transition passes through the delay, the input and the output will be different and the gate outputs a 1 which enters the deconvolution circuit.

Figure 6.48(b) shows that in the deconvolution circuit a data bit is simply the exclusive-OR of a number of channel bits at a fixed spacing. The deconvolution is implemented with a shift register having the exclusive-OR gates connected in a reverse pattern to that in the encoder. The same effect as block randomizing is obtained, in that long runs are broken up and the DC content is reduced, but it has the advantage over block randomizing that no synchronizing is required to remove the randomizing, although it will still be necessary for deserialization. Clearly the system will take a few clock periods to produce valid data after commencement of transmission, but this is no problem on a permanent wired connection where the transmission is continuous.


Figure 6.48 (a) Modulo-2 addition with a pseudo-random code removes unconstrained runs in real data. Identical process must be provided on replay. (b) Convolutional randomizing encoder, at top, transmits exclusive OR of three bits at a fixed spacing in the data. One-bit delay, far right, produces channel transitions from data ones. Decoder, below, has opposing one-bit delay to return from transitions to data levels, followed by an opposing shift register which exactly reverses the coding process.
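
A bit-level sketch of Figure 6.48 is given below, assuming the x^9 + x^4 + 1 recursion used in the scrambled serial video interface; the filter state is taken as cleared at both ends, whereas in practice the decoder simply produces invalid output for the first few clocks until its register fills.

```python
def conv_randomize(data, taps=(4, 9)):
    """One-bit IIR filter (recursive feedback at the tap positions)
    followed by a transition generator: output ones become level changes."""
    hist = [0] * max(taps)            # y[n-1], y[n-2], ... assumed cleared
    level, out = 0, []
    for x in data:
        y = x
        for t in taps:
            y ^= hist[t - 1]          # the feedback paths of Figure 6.48(a)
        hist = [y] + hist[:-1]
        level ^= y                    # transition generator
        out.append(level)
    return out

def conv_derandomize(levels, taps=(4, 9)):
    """Differentiate to turn transitions back into ones, then deconvolve
    with the matching feed-forward shift register of Figure 6.48(b)."""
    hist, prev, out = [0] * max(taps), 0, []
    for lvl in levels:
        y = lvl ^ prev                # transition detector
        prev = lvl
        x = y
        for t in taps:
            x ^= hist[t - 1]          # feed-forward path undoes the encoder
        hist = [y] + hist[:-1]
        out.append(x)
    return out

data = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0]
assert conv_derandomize(conv_randomize(data)) == data
```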

6.29 Synchronizing

Once the PLL in the data separator has locked to the clock content of the transmission, a serial channel bitstream and a channel bit clock will emerge from the sampler. In a group code, it is essential to know where a group of channel bits begins in order to assemble groups for decoding to data bit groups. In a randomizing system it is equally vital to know at what point in the serial data stream the words or samples commence. In serial transmission and in recording, channel bit groups or randomized data words are sent one after the other, one bit at a time, with no spaces in between, so that although the designer knows that a data block contains, say, 128 bytes, the receiver simply finds 1024 bits in a row. If the exact position of the first bit is not known, then it is not possible to put all the bits in the right places in the right bytes; a process known as deserializing. The effect of sync slippage is devastating, because a one-bit disparity between the bit count and the bitstream will corrupt every symbol in the block.22

The synchronization of the data separator and the synchronization to the block format are two distinct problems, which are often solved by the same sync pattern. Deserializing requires a shift register which is fed with serial data and read out once per word. The sync detector is simply a set of logic gates which are arranged to recognize a specific pattern in the register. The sync pattern is either identical for every block or has a restricted number of versions and it will be recognized by the replay circuitry and used to reset the bit count through the block. Then by counting channel bits and dividing by the group size, groups can be deserialized and decoded to data groups. In a randomized system, the pseudo-random sequence generator is also reset. Then counting derandomized bits from the sync pattern and dividing by the wordlength enables the replay circuitry to deserialize the data words.
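
The recognizer-plus-counter arrangement can be sketched as below; the sync word here is an arbitrary stand-in, since real patterns are chosen per format as described in the remainder of this section.

```python
def deserialize(bitstream, sync=(1, 1, 1, 0, 0, 1, 0), wordlen=8):
    """Shift incoming bits past a pattern recognizer; once the sync
    pattern is seen, reset the bit count and cut words by counting."""
    window, words, word, count, found = [], [], 0, 0, False
    for bit in bitstream:
        if not found:
            window = (window + [bit])[-len(sync):]
            if window == list(sync):
                found = True             # bit count is reset here
            continue
        word, count = (word << 1) | bit, count + 1
        if count == wordlen:             # one complete word assembled
            words.append(word)
            word, count = 0, 0
    return words
```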

In digital audio the two’s complement coding scheme is universal and traditionally no codes have been reserved for synchronizing; they are all available for sample values. It would in any case be impossible to reserve all ones or all zeros as these are in the centre of the range in two’s complement. Even if a specific code were excluded from the recorded data so it could be used for synchronizing, this cannot ensure that the same pattern cannot be falsely created at the junction between two allowable data words. Figure 6.49 shows how false synchronizing can occur due to concatenation. It is thus not practical to use a bit pattern which is a data code value in a simple synchronizing recognizer. The problem is overcome in NICAM 728 by using the fact that sync patterns occur exactly once per millisecond or 728 bits. The sync pattern of NICAM 728 is just a bit pattern and no steps are taken to prevent it from appearing in the randomized data. If the pattern is seen by the recognizer, the recognizer is disabled for the rest of the frame and only enabled when the next sync pattern is expected. If the same pattern recurs every millisecond, a genuine sync condition exists. If it does not, there was a false sync and the recognizer will be enabled again. As a result it will take a few milliseconds before sync is achieved, but once achieved it should not be lost unless the transmission is interrupted. This is fine for the application and no-one objects to the short mute of the NICAM sound during a channel switch. The principle cannot, however, be used for recording, because channel interruptions are more frequent due to head switches and dropouts, and the loss of several blocks of data due to a single dropout is unacceptable.


Figure 6.49    Concatenation of two words can result in the accidental generation of a word which is reserved for synchronizing.

In run-length-limited codes this is not a problem. The sync pattern is no longer a data bit pattern but is a specific waveform. If the sync waveform contains run lengths which violate the normal coding limits, there is no way that these run lengths can occur in encoded data, nor any possibility that they will be interpreted as data. They can, however, be readily detected by the replay circuitry. The sync patterns of the AES/EBU interface are shown in Figure 6.50. It will be seen from Figure 6.33 that the maximum run length in FM coded data is one bit. The sync pattern begins with a run length of one and a half bits which is unique. There are three types of sync pattern in the AES/EBU interface, as will be seen in Chapter 8. These are distinguished by the position of a second pulse after the run-length violation. Note that the sync patterns are also DC-free like the FM code.


Figure 6.50    Sync patterns in various applications. In (a) the sync pattern of CD violates EFM coding rules, and is uniquely identifiable. In (b) the sync pattern of DASH stays within the run length of HDM-1. (c) The sync patterns of AES/EBU interconnect.

In a group code there are many more combinations of channel bits than there are combinations of data bits. Thus after all data bit patterns have been allocated group patterns, there are still many unused group patterns which cannot occur in the data. With care, group patterns can be found which cannot occur due to the concatenation of any pair of groups representing data. These are then unique and can be used for synchronizing.

In MADI, this approach is used as will be seen in Chapter 8. A similar approach is used in CD. Here the sync pattern does not violate a run length limit, but consists of two sequential maximum run lengths of 11 channel bit periods each as in Figure 6.50(a). This pattern cannot occur in the data because the data symbols are only 14 channel bits long and the packing bit generator can be programmed to exclude accidental sync pattern generation due to concatenation.

References

1. Deeley, E.M., Integrating and differentiating channels in digital tape recording. Radio Electron. Eng., 56, 169–173 (1986)
2. Mee, C.D., The Physics of Magnetic Recording, Amsterdam and New York: Elsevier–North Holland Publishing (1978)
3. Jacoby, G.V., Signal equalization in digital magnetic recording. IEEE Trans. Magn., MAG-11, 302–305 (1975)
4. Schneider, R.C., An improved pulse-slimming method for magnetic recording. IEEE Trans. Magn., MAG-11, 1240–1241 (1975)
5. Miller, A., US Patent No. 3,108,261 (1960)
6. Mallinson, J.C. and Miller, J.W., Optimum codes for digital magnetic recording. Radio and Electron. Eng., 47, 172–176 (1977)
7. Miller, J.W., DC-free encoding for data transmission system. US Patent 4,027,335 (1977)
8. Tang, D.T., Run-length-limited codes. IEEE International Symposium on Information Theory (1969)
9. Cohn, M. and Jacoby, G., Run-length reduction of 3PM code via lookahead technique. IEEE Trans. Magn., 18, 1253–1255 (1982)
10. Horiguchi, T. and Morita, K., On optimisation of modulation codes in digital recording. IEEE Trans. Magn., 12, 740–742 (1976)
11. Franaszek, P.A., Sequence state methods for run-length limited coding. IBM J. Res. Dev., 14, 376–383 (1970)
12. AES Recommended practice for Digital Audio Engineering – Serial Multichannel Audio Digital Interface (MADI). J. Audio Eng. Soc., 39, No.5, 371–377 (1991)
13. Jacoby, G.V. and Kost, R., Binary two-thirds-rate code with full word lookahead. IEEE Trans. Magn., 20, 709–714 (1984)
14. Jacoby, G.V., A new lookahead code for increased data density. IEEE Trans. Magn., 13, 1202–1204 (1977)
15. Ogawa, H. and Schouhamer Immink, K.A., EFM – the modulation method for the Compact Disc digital audio system. In Digital Audio, edited by B. Blesser, B. Locanthi and T.G. Stockham Jr, pp. 117–124, New York: Audio Engineering Society (1982)
16. Schouhamer Immink, K.A. and Gross, U., Optimisation of low frequency properties of eight-to-fourteen modulation. Radio Electron. Eng., 53, 63–66 (1983)
17. Fukuda, S., Kojima, Y., Shimpuku, Y. and Odaka, K., 8/10 modulation codes for digital magnetic recording. IEEE Trans. Magn., MAG-22, 1194–1196 (1986)
18. Doi, T.T., Channel codings for digital audio recordings. J. Audio Eng. Soc., 31, 224–238 (1983)
19. Anon., PD format for stationary head type 2-channel digital audio recorder. Mitsubishi (January 1986)
20. Schouhamer Immink, K.A., Graceful degradation of digital audio transmission systems. Presented at the 82nd Audio Engineering Society Convention (London, 1987), Preprint 2434(C-3)
21. Alard, M. and Lasalle, R., Principles of modulation and channel coding for digital broadcasting for mobile receivers. EBU Review, 224, 168–190 (1987)
22. Griffiths, F.A., A digital audio recording system. Presented at the 65th Audio Engineering Society Convention (London, 1980), Preprint 1580(C1)