2

Video

Video is one of the most widely used formats for creating moving images, by amateurs and professionals alike, and it’s surprising how strictly standardized it is, considering the numerous manufacturers in constant competition. Take, for example, the lowly VHS cassette. Despite the many different makes and models of VHS cassette recorders, and the multitude of different VHS cassette manufacturers, any standard VHS cassette can be used in just about any VHS recorder. It’s a similar story with playback—intercontinental issues aside (you can’t necessarily play a European VHS in a U.S. recorder, for reasons that will be covered later)—a VHS recorded on one machine will play as intended on any other machine.

There are many different video formats, though, and they are constantly undergoing revision. The current crop of HD, DV, and DVD formats will gradually replace VHS and Betacam SP formats, just as the previously popular M-II and U-Matic formats were replaced in mainstream video production. Of course, today’s formats will eventually give way to future formats, and in time, the use of video may give way entirely to completely digital systems.

2.1 A Brief History of Television

The way that modern video systems work is determined to a large degree by the systems that were created decades ago. Video technology stems from the invention of television, and color television is built upon the principles of black-and-white television. The fundamental building block of electronic television is the cathode ray tube (CRT). Put simply, a CRT fires a beam of electrons down a vacuum tube at a screen. The screen is coated with a phosphorescent material, which glows when struck by electrons. This basic setup allows a glowing dot to be produced on the screen. By using a combination of magnets and electrical current, both the intensity and position of the beam can be altered. And because the phosphors continue to glow for a short period after the beam has moved, the beam can rapidly sweep out patterns on the screen.

This concept was adapted to form complete images by sweeping the beam across several rows. During the trace of each row, the intensity of the beam can be varied to form a monochrome picture.

For this to work for television, the path of the beam had to be standardized, so that the picture area and timing of the beam were the same regardless of the particular television set used for display. In essence, the television signal received consisted of the intensity component only. This process was tested in the 1930s, resulting in the first television broadcasts in 1936.

Figure 2–1   A single beam can be used to trace intricate patterns on a phosphor screen

In the 1960s, color television became a reality. Importantly, the color television signals could still be seen on black-and-white televisions, a feat achieved by encoding the color components separately from the luminance. By then, the resolution of the image had improved greatly, and numerous technical innovations had resulted in television sets and broadcast systems that created and displayed images with increasing clarity, at reduced cost.

Around this point, national standards were defined. The NTSC standard was used for U.S. broadcasts, while the PAL standard was adopted by many other countries, as NTSC had problems reproducing certain colors accurately. The completion of these standards led to widespread manufacture of commercial televisions and the use of video cassette recorders (VCRs).

2.2 Video Image Creation

Video cameras generate images in much the same way as digital cameras do. A video camera is essentially an optical system built around a detector, typically a CCD (charge coupled device), which exploits the photoelectric effect to generate an electrical signal.1 This electrical signal is then recorded to magnetic tape for later playback. The basic idea is that the more light received by the detector, the stronger the electrical signal. Because an electrical signal can be measured continuously, the changes in light can be recorded over time. That’s nice if you want to keep track of the amount of light at a single, specific point in space (and in fact, this forms the basis of most light meters used in photography), but it’s not much good for recording recognizable images.

So rather than having a single light detector, video cameras have a grid (or “array”) of them. Provided there are enough elements in the grid, a detailed gray scale image can be formed. And, as before, the image constantly updates over time, resulting in the formation of a moving image. To record a color image, the light received must be separated into spectral components (typically red, green, and blue—RGB—which recombine to form white)—for example, by splitting the incoming beam into three beams and diverting each beam to a CCD array that is sensitive to a particular component.
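
To make the idea concrete, a color frame can be pictured as a grid of samples, with one red, one green, and one blue value per position. The short sketch below is a hypothetical illustration in Python using NumPy (not a description of any particular camera’s processing): it builds three component arrays, stacks them into a color frame, and averages them into a simple gray scale version.

    import numpy as np

    HEIGHT, WIDTH = 576, 720          # assumed grid size, roughly SD-sized

    # One array per color component, as if each came from its own CCD array
    red = np.random.rand(HEIGHT, WIDTH)
    green = np.random.rand(HEIGHT, WIDTH)
    blue = np.random.rand(HEIGHT, WIDTH)

    # Stacking the three components gives a full-color frame...
    color_frame = np.dstack([red, green, blue])

    # ...while averaging them gives a simple gray scale (monochrome) version
    gray_frame = (red + green + blue) / 3.0

    print(color_frame.shape)   # (576, 720, 3)
    print(gray_frame.shape)    # (576, 720)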

Figure 2–2   Video images consist of individual picture elements arranged in a grid

There are some limitations to this process. The main problem is the system bandwidth (i.e., the maximum amount of information the system can carry at a given point in time). Especially with color video of a reasonable resolution (i.e., level of spatial detail), a large quantity of signal information is generated, in many cases more than the format allows for (or the components of the system can handle). There are several ways to minimize the bandwidth requirements, which include limiting the maximum resolution and the color space, splitting the recording into fractions of time (frames and fields), and using compression, each of which is covered in this chapter.

Figure 2–3   Color video images are formed by separating red, green, and blue components of the image and recording each separately. (See also the Color Insert)

Video images aren’t created by video cameras exclusively. Many computer systems are able to directly output video signals (e.g., outputting exactly what is seen on a monitor), which can be recorded to tape as needed (this also applies to DVD players and video game consoles). In addition, other types of devices may generate video signals, such as pattern generators used to test and calibrate video equipment.2

2.3 Anatomy of a Video Image

A video image is carried as a single continuous electrical signal (with the exception of component video, which is covered later). This means that part of the signal carries image information and part of it carries special “sync pulses” to identify where each part of the image begins and ends (although the timing of the signal is also critical in ensuring this happens).3 Video images are constructed from the signal line by line, field by field, and frame by frame.

2.4 Video Standards

For various historical reasons, different geographic regions use different, incompatible video standards. The United States and Japan, for example, use the NTSC standard, while Australia and most of Europe use the PAL video standard. Other parts of the world use the less-common SECAM standard.

The choice of video standard defines many of the attributes of the video signal, such as the resolution and frame rate (see the Appendix for details on the different systems). Each system is incompatible with the others, which is why European video tapes won’t play on U.S. VCRs (although they can still be used for making new recordings), and vice versa. However, it is possible to get around this limitation by using “standards converters,” which are devices that alter video signals, changing them from one standard to another.

2.4.1 Definition

Similar in concept to the differences between video standards, a more recent development is video of differing definition. Originally, all video types could be categorized in what is now known as “standard definition” (or SD), with fixed frame rates, resolutions, and so on, as determined by the standard used. However, a new generation of high definition (or HD) video formats has arrived that deliver greater resolution than standard definition formats, are in many ways independent of the different video standards, and are typically defined by their resolution and frame rate.4

Other formats, such as the broadcast HDTV or the new HDV formats, offer improved quality over SD formats and fall under the high definition umbrella, although they may not be of the same level of quality as other high-end HD formats. For more details on different SD and HD standards, refer to the Appendix.

2.5 Resolution

In broad terms, resolution determines the level of detail in an image. In video signals, the resolution is the measure of the number of horizontal lines used to build a picture. The higher the number of lines, the higher the resolution, and the higher the level of detail that can be “resolved,” and hence, the greater the quality of the image.

In practical terms, the resolution of a video image is determined by the standard. For example, PAL videos have 625 lines of resolution, and NTSC 525. In practice, however, some of these lines are not used for image information, so the actual picture resolution is slightly lower.5 Each high definition format also has a specific number of lines associated with it. For instance, both the 1080p30 and 1080i50 formats have 1080 lines of picture (and unlike SD formats, all the lines are used for image information).

Figure 2–4   An image at HD resolution (top) and an equivalent SD resolution image. © 2005 Andrew Francis

2.6 Temporal Frequency

Like most moving picture media, video gives the illusion of motion by showing multiple images in rapid succession. Each image is known as a “frame,” and the frequency with which frames are displayed is known as the “frame rate,” usually measured in frames per second (or Hertz). As with resolution, the frame rate is determined by the video standard. PAL and SECAM systems use 25 frames per second, while NTSC runs at 29.97.6 Humans are not particularly sensitive to differences in frame rate above 20 frames per second (fps), so a higher frame rate does not necessarily impart a greater degree of visual quality to a sequence. The most important aspect of frame rate is that it should be consistent throughout a project to ensure accurate editing later on. It is possible to “retime” footage (i.e., change its frame rate) digitally, either to keep the apparent speed of motion constant between two formats or to allow for temporal effects, such as slow motion.
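
As a rough illustration of retiming between standards, the sketch below maps each frame position at a target rate onto the nearest frame at the source rate, keeping the apparent speed of motion constant. This is a simplified, assumed approach for illustration only; real conversion systems typically blend fields or synthesize intermediate frames to reduce judder.

    def retime_frame_indices(num_source_frames, source_fps, target_fps):
        """Map target-rate frame positions to the nearest source-rate frames,
        keeping the apparent speed of motion the same."""
        duration = num_source_frames / source_fps        # seconds of footage
        num_target_frames = round(duration * target_fps)
        indices = []
        for n in range(num_target_frames):
            t = n / target_fps                           # time of target frame
            src = min(int(round(t * source_fps)), num_source_frames - 1)
            indices.append(src)
        return indices

    # One second of 30fps footage resampled to 25fps reuses 25 of the 30 frames
    print(retime_frame_indices(30, 30, 25))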

Figure 2–5   A sequence captured at 30fps

Figure 2–6   The same sequence at 25fps. Although the 30fps sequence has more frames during the same time period, the two sequences are perceptually the same

2.6.1 Fields

To confuse issues further, video systems (with the exception of newer “progressive scan” formats) divide each image frame into two “fields.” Each field consists of half of the lines of the original image, with one field containing all the even-numbered lines, and the other field containing the odd-numbered lines. These are “interlaced” to form the complete frame.

Interlacing is done to limit the bandwidth during recording. Most video cameras record imagery using a “line transfer” method, whereby each line of a frame is sent from the CCD to the recording mechanism sequentially. The problem was that video cameras could not do this fast enough (because the bandwidth was insufficient), so a significant amount of time passed between transferring the first line of a frame and the last. The top part of the picture was therefore “older” than the lower part, which produced streaking and other visual artifacts, particularly in regions with a lot of motion.7
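
The bookkeeping of interlacing itself is simple, as the sketch below shows for a frame stored as an array of lines: one field takes the even-numbered lines, the other the odd-numbered lines, and weaving them back together restores the full frame. (This is an illustrative sketch only; in a real camera the two fields are exposed at slightly different moments in time.)

    import numpy as np

    frame = np.arange(12).reshape(6, 2)    # a tiny six-line "frame"

    # Split into two fields: even-numbered lines and odd-numbered lines
    field_even = frame[0::2]               # lines 0, 2, 4
    field_odd = frame[1::2]                # lines 1, 3, 5

    # Weave (interlace) the fields back into a complete frame
    woven = np.empty_like(frame)
    woven[0::2] = field_even
    woven[1::2] = field_odd

    assert np.array_equal(woven, frame)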

Figure 2–7   Two fields are combined to form an interlaced frame

Frames are therefore displayed a field at a time—so quickly (close to 60 fields per second for NTSC video) that individual fields are perceived as entire frames, which are in turn perceived as continuous motion. Most video cameras are designed so that each field is recorded independently, so that each is an accurate recording of that moment in time, rather than recording a complete frame and dividing it into two fields.

Interlaced HD formats denote the interlacing with an “i” preceding the field rate (as in 1080i59.94), whereas progressive formats use a “p” (as in 1080p29.97).8 There are also references to “progressive segmented frames” (or PsF), which are essentially progressive frames that have been split into two fields for encoding purposes only.

When recording video for display directly on television sets, the issue of fields is somewhat irrelevant. Provided that the correct video standard is adhered to, a video recording will play back as intended, without requiring any knowledge of the composition of the fields and frames of the images. However, digital intermediate environments are almost exclusively based around progressive image formats, which allows for easy conversion between various formats, layering of elements, and repositioning and resizing of footage, among other options. Use of interlaced imagery in a frame-based environment, be it digital, film, or progressive scan video, can result in visible artifacts being produced. For this reason, it is advisable to work with progressive scan (full-frame) formats or to utilize some de-interlacing methods (discussed in Chapter 9) on interlaced video footage to be used in a digital intermediate pipeline.

Figure 2–8   Interlaced formats record sequences a field at a time

Figure 2–9   Progressive formats record sequences a frame at a time

2.7 Color

We’ve established that, in order for a color image to be recorded, the light received by the camera must be split into separate components. Typically, these components are red, green, and blue light, which recombine in various proportions to form the full spectrum of color.

The majority of video formats do not record color information as separate red, green, and blue components, though. Instead, the image is stored as a luminance component (the level of brightness or darkness), with the color information stored separately. PAL systems use a YUV color system (Y being the luminance component, and U and V being chromaticity components), and NTSC systems use a YIQ system (Y is again the luminance component, and I and Q the chromaticity components).9 Black-and-white televisions are able to extract the Y component (and ignore the color components) to produce a decent black-and-white image. For color televisions, the three components may be recombined to form full color images. In addition, the organization of the three components allows for some specific advantages in transporting the video signals.
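
The luminance component is a weighted sum of the red, green, and blue values, and the two color components are scaled differences from that sum. A minimal sketch follows, using the luma weights commonly cited for standard definition video (ITU-R BT.601); the exact scaling factors differ between YUV and YIQ and between implementations.

    def rgb_to_yuv(r, g, b):
        """Convert normalized RGB (0.0-1.0) to a luminance value plus two
        color-difference values, using BT.601 luma weights."""
        y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance (luma) component
        u = 0.492 * (b - y)                      # blue color difference
        v = 0.877 * (r - y)                      # red color difference
        return y, u, v

    # Pure white carries no color-difference information at all
    print(rgb_to_yuv(1.0, 1.0, 1.0))   # (1.0, 0.0, 0.0)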

Figure 2–10   Color video images are made up of one luminance channel and two chromaticity channels. © 2005 Andrew Francis. (See also the Color Insert.)

2.7.1 Gamma

The Y component of a digital video signal (which is what most modern video cameras produce) is not a true measure of luminance (the absolute light level in a scene) but is actually a nonlinear, weighted average of the RGB values received by the camera, created by converting the received RGB values to Y’IQ (or Y’UV).10 The difference is subtle and requires the received light to be “gamma corrected” before being encoded.

A nonlinear response to light is one where twice the amount of light does not necessarily result in a corresponding signal that is twice as strong. Many different media exhibit this nonlinearity, such as film projection, and even the image displayed on a monitor. Digital images, on the other hand, encode luminance (or in this case, “luma”) values on a linear scale. This means that they will not display as intended if output directly to a monitor.

Gamma correction attempts to solve this problem by altering the luminance of the RGB values in a nonlinear way, which better matches the response of a monitor. The result is that the recorded image looks correct when viewed on a monitor. Different video standards (and therefore different video cameras) use different gamma values, which are used during recording to encode the image values.
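
A minimal sketch of the idea follows, assuming a simple power-law correction with a gamma value of 2.2, which is in the region used by many video systems; real cameras apply more elaborate transfer functions, often with a linear segment near black.

    GAMMA = 2.2   # assumed display gamma; actual values vary by standard

    def gamma_encode(linear_value, gamma=GAMMA):
        """Convert a linear light value (0.0-1.0) to a gamma-corrected value."""
        return linear_value ** (1.0 / gamma)

    def gamma_decode(encoded_value, gamma=GAMMA):
        """Reverse the correction to recover an approximately linear value."""
        return encoded_value ** gamma

    # Mid-gray in linear light is encoded well above its linear value, which
    # compensates for the monitor's nonlinear response when displayed
    print(round(gamma_encode(0.18), 3))   # roughly 0.459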

2.7.2 Video Compression

The human eye responds better to changes in luminance than it does to changes in chromaticity. This fact is used to reduce the amount of information that a video signal needs to carry. It means that for a typical YIQ signal, the Y component is far more important than the I and Q components and suggests that the Y component should carry more information than the other two. To limit the required bandwidth of the signal, the color information is made smaller (or “compressed”) by discarding some of it. This allows signals to be broadcast much more efficiently and greatly simplifies the electronics associated with decoding and displaying video signals.

Different video formats (specifically, different video tape systems, such as Betacam or VHS) use different levels of compression, which results in different levels of quality (but also different inherent costs). The level of compression is often described in terms of a “sampling ratio,” giving the relative sampling of each component. For example, a sampling ratio of 4:4:4 (either for Y:I:Q or Y:U:V) indicates no color compression, as each component is sampled equally. On the other hand, a ratio of 4:1:1 means that only a quarter of the color information is retained (although the luminance is intact).11 Such a reduction in color information may be imperceptible to the human eye in many cases; however, as discussed further in Chapter 8, even the imperceptible color information is used during image editing, in particular during color-grading processes.
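
As a rough sketch of what a sampling ratio means in practice, the example below keeps every luminance sample along a line of video but only every fourth color-difference sample, as in a 4:1:1 scheme. It is an illustration only, not any format’s actual encoder, which may filter the chroma more carefully before discarding samples.

    import numpy as np

    # One line of video: full-resolution luma plus two color-difference signals
    y = np.random.rand(720)
    u = np.random.rand(720)
    v = np.random.rand(720)

    # 4:1:1 sampling: keep all luma samples, but only every fourth chroma sample
    u_subsampled = u[::4]     # 180 samples instead of 720
    v_subsampled = v[::4]

    # On playback the missing chroma is reconstructed, here by simple repetition
    u_restored = np.repeat(u_subsampled, 4)
    v_restored = np.repeat(v_subsampled, 4)

    print(len(y), len(u_subsampled), len(u_restored))   # 720 180 720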

In addition, some video formats further reduce bandwidth requirements by discarding some of the spatial information, which means reducing the stored resolution. For example, a PAL VHS stores only around 275 lines (compared to the broadcast standard of 625), which are then “up-sampled” (i.e., duplicated or blended) to fill the display, again resulting in an inferior-quality image. Refer to the Appendix for the compression levels of the most common video formats.

2.7.3 Composite and Component

Video signals may be recorded (as well as transported) either as “component video” signals, such that each of the Y, I, and Q components is recorded and transported separately, or as “composite video” signals, where the components are encoded as a single signal. Component video signals are generally of higher quality because the components are kept separate (which prevents the different signals from contaminating each other and reduces the chance of noise and other distortions affecting the signal), but they are more expensive, requiring additional electronics to process each signal, as well as additional cables for transport.12

2.7.4 Precision

Another aspect of video color quality is the maximum precision of each point recorded. Compression discards color information to reduce the bandwidth of the signal, but even where no compression is used (or when looking at only the luminance part of the signal), there is a limit to the accuracy of the measurements or recordings. For the most part, distinguishing between light areas and dark areas of an image is easy, but regions where the difference is more subtle may cause problems.

Suppose you have an apple pie. You could be fairly confident of dividing it into two halves. You could further divide each of these into two more halves, and so on. But at some point, you could not confidently subdivide the pie without risking it falling apart (or some similar culinary faux pas). One possible solution is to start with a bigger pie.

In the same way, the full range of light in a video image may be divided into luminance values, like dividing a ruler into millimeters or inches. But there is a limit to the number of divisions that may be made, and that limit is due to the maximum precision of the video system. Higher levels of precision result in more accurate tonal rendition and therefore higher-quality images.

Precision is usually measured in bits per sample (in much the same way that digital images are, as covered in Chapter 4), so that 10 bits per sample (which is used by digital Betacam formats) represents greater precision than 8 bits per sample (as used by DVCAM formats).
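
The difference is easy to quantify, because each extra bit doubles the number of tonal divisions available. The following sketch simply tabulates the number of distinct levels and the smallest representable step for a few common bit depths; it is plain arithmetic, not a description of any particular format’s encoding.

    for bits in (8, 10, 12):
        levels = 2 ** bits                 # number of distinct tonal steps
        step = 1.0 / (levels - 1)          # smallest representable difference
        print(f"{bits}-bit: {levels} levels, step size {step:.5f}")

    # 8-bit: 256 levels, step size 0.00392
    # 10-bit: 1024 levels, step size 0.00098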

Note that color precision can only be accurately measured for digital video formats because the precision of analog formats depends on a large number of factors, such as the condition of the tape and VCR used.

2.7.5 Headroom

At what point can a paint stroke be considered white rather than gray? How bright must a shade of red be for it to be considered bright red? These are not necessarily deep philosophical questions, and the answer is simple: it depends. It depends upon the context of the scene, and it depends upon the will of the person creating the image.

The human eye is very versatile. After around 30 minutes of being immersed in a completely dark environment, the eye develops night vision, allowing it to see under extremely low levels of light. Objects that were previously invisible might suddenly appear very bright. And this is true of pretty much any situation. To determine the brightness of a particular point in a scene, you compare it to every other point in the scene. Put a gray cat in front of a black curtain, and it will appear much brighter than the same cat in front of a white curtain. We can look at a photograph of a scene and perceive the light and dark values much as we would experience them at the scene itself, because the person taking the photograph (or indeed, the camera itself) defined explicit boundaries as to the locations of the light and dark regions. The photographer sets the exposure so that bright areas are recorded as such (or conversely, so that dark areas appear dark).

Different media can capture different luminance ranges (or “dynamic ranges,” which are covered in Chapter 5); film, for example, can capture greater differences between light and dark regions of a scene than video can. Nevertheless, the basic concept remains: for every scene, a point is set where a specific level of illumination can be considered white, and a point where a lower level of illumination is considered black.

Video systems define these regions in terms of signal voltage. By definition, any point in the signal that falls below a certain voltage is considered black, and anything above another voltage is considered white. This is also true of most digital image formats, where a pixel value of 0 is considered black, and 255 is considered white. The key difference between video and digital formats, however (and in fact, one of digital imaging’s key failings), is that video signals that stray outside these boundaries are still recorded. They still display as black or white respectively, but there is a degree of “headroom,” from which the luminance can be rescued by reprocessing the signal. The amount of headroom depends on the video tape quality and the specifics of the video format. It should also be noted that the integrity of the video image in the headroom regions is not as secure as it is within the predefined luminance ranges and may be more susceptible to noise and other degradation.
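
As a purely illustrative digital analogy, 8-bit studio video nominally places black at code value 16 and white at 235, leaving room below and above for footroom and headroom. The sketch below contrasts simply clipping “illegal” bright values with rescaling the signal so that detail recorded in the headroom is brought back inside the legal range; the exact remapping used here is an arbitrary assumption, not a standard procedure.

    BLACK, WHITE = 16, 235    # nominal 8-bit studio video levels (Rec. 601)

    def clip_to_legal(value):
        """Display-style handling: anything outside the range is simply clipped."""
        return max(BLACK, min(WHITE, value))

    def rescue_headroom(values, peak=254):
        """Scale a signal down so that detail recorded above nominal white
        (in the headroom) falls back inside the legal range instead of clipping."""
        gain = (WHITE - BLACK) / (peak - BLACK)
        return [round(BLACK + (v - BLACK) * gain) for v in values]

    print([clip_to_legal(v) for v in (120, 240, 250)])   # [120, 235, 235]
    print(rescue_headroom([120, 240, 250]))              # [112, 222, 231]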

Figure 2–11   A video image can contain more information than is within the visible boundaries (shown by the dashed lines on the graph). The image on the left is the image as it would be displayed on a monitor, while the graph shows the “hidden” information (as seen on a waveform monitor). The full detail can be rescued even after the recording, as shown in the image on the right

2.8 Timecode

To make video editing easier, most video formats carry “timecode” information alongside the picture. Timecode is a numeric method for identifying each frame in a sequence. The most commonly used timecode format is the SMPTE/EBU timecode, which has eight digits denoting hours, minutes, seconds, and frames. For example, a timecode of 10:33:04:12 is 10 hours, 33 minutes, 4 seconds, and 12 frames.

During recording, suitably equipped video cameras inscribe a timecode on each frame of video, counting upward from a starting value, much like a clock. Timecode can be set to start at any arbitrary value, such as 00:00:00:00 or 07:21:55:04, but it will always count upward. Timecode can also be provided by external devices, which is especially useful for synchronizing to audio. For example, if you are shooting a music promotional video, the music track can be playing on the set, and the track time can be fed from the music player to the video camera. This enables the video elements to be easily assembled into the correct part of the song during editing. It is also possible to generate timecode based on the time of day (called simply “time of day” recording), so that footage recorded at 1:03 pm might have a timecode of 13:03:00:00 and so on, which is useful for live recordings. This means, however, that the recorded timecode will be discontinuous every time the recording is stopped. For example, if a recording stops at 15:06:07:04 and then starts 10 minutes later, the recorded timecode will jump from 15:06:07:04 to around 15:16:07:04, which can cause problems with some video systems later on. This is known as “free run” recording.
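
Because timecode is essentially a frame count expressed as hours, minutes, seconds, and frames, converting between the two forms is straightforward arithmetic. The sketch below assumes a 25fps (PAL) frame rate and non-drop-frame counting.

    FPS = 25   # assumed PAL frame rate, non-drop-frame counting

    def timecode_to_frames(tc, fps=FPS):
        hours, minutes, seconds, frames = (int(part) for part in tc.split(":"))
        return ((hours * 60 + minutes) * 60 + seconds) * fps + frames

    def frames_to_timecode(count, fps=FPS):
        frames = count % fps
        seconds = (count // fps) % 60
        minutes = (count // (fps * 60)) % 60
        hours = count // (fps * 3600)
        return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"

    total = timecode_to_frames("10:33:04:12")
    print(total)                        # 949612
    print(frames_to_timecode(total))    # 10:33:04:12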

Many video productions adopt the practice of using continuous timecode on all recordings. A day of shooting will start with video recording on a timecode of 01:00:00:00, which continues without breaks. Anytime a recording is stopped in the middle of a tape, the next recording picks up the timecode from where it left off (a process known as “record run” recording), so the tape is free of timecode “breaks.”13 When the tape is filled, a new one is started at a new timecode, such as 02:00:00:00 (assuming each tape lasts for approximately 60 minutes). This practice allows footage to be accurately identified during editing and playback. It is also considered good practice to allow an extra few seconds of recording at the start and end of each recording session to compensate for the “pre-roll” and “post-roll” mechanisms of many editing systems, which use this extra time to position the tape in the right place and accelerate the tape transport to the correct speed (or decelerate the tape upon completion of the transfer).

Timecode is usually recorded to tape using a combination of vertical interval timecode (VITC), which writes the information into the unused lines of the video signal, and longitudinal timecode (LTC), which is available on certain tape formats and writes the information to a physically separate part of the tape as an analog signal. The main difference between the two timecode types is that VITC can be set only during the recording process, whereas LTC can be modified without affecting the image. VITC tends to be accurately read when running a tape slowly (such as when “jogging” through a tape frame by frame to find a particular frame), while LTC works better at high speed (such as when rewinding a tape). Because both types of timecode can exist on a single tape, conflicting timecodes can cause editing problems. Should any such problem arise, it is recommended that the LTC be adjusted to match the VITC.14 In general, anytime dubs (i.e., copies) of tapes are made that contain timecode information, this information is usually carried over to the new tape, which can ease tracking and editing.

2.8.1 Drop-Frame & Non-Drop-Frame Timecode

Timecode runs slightly differently depending on the video standard. PAL systems run at 25 frames per second, so the seconds of the timecode are incremented every 25 frames. This means that the timecode of the frame that follows 00:00:00:24 will be 00:00:01:00. For NTSC video formats, the timecode runs at 30 frames per second, which means that the timecode of the frame after 00:00:00:24 will be 00:00:00:25 and so on, up to 00:00:00:29, at which point the next frame becomes 00:00:01:00.

There is a problem with this method though: NTSC video does not actually run at 30 frames per second. It runs at 29.97 frames per second. This is also true for the 29.97p and 59.94i HD video formats, although 30p and 60i formats run at exactly 30 frames per second (which in turn makes 30p and 60i formats incompatible with most NTSC systems). This tiny discrepancy (of 0.1%) may not seem significant, but it will accumulate, resulting in a difference (or “drift”) of about 3.6 seconds for every hour recorded. Clearly this drift creates an issue when using time-of-day timecode continuously, or when broadcasting—you may suddenly find you have an extra 3.6 seconds of material to broadcast every hour. (PAL systems do not have this issue.)

To solve this problem, NTSC drop-frame (DF) timecode may be used. To ensure that the timecode does not drift, the timecode “drops” (or discards) 18 frame numbers every 10 minutes. In doing this, no content is lost; the timecode just skips numbers. The way that DF timecode is usually implemented is that two frame numbers are dropped on the minute, every minute, except for every tenth minute. This means that the timecode will go from 00;00;59;29 to 00;01;00;02 (rather than 00:01:00:00), but also from 00;29;59;29 to 00;30;00;00.15 This method keeps the timecode synchronized with “real” time to within a fraction of a frame every 10 minutes.
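
The arithmetic behind this is mechanical enough to sketch in code. The function below converts a running frame count into a drop-frame label by adding back the skipped numbers; it is a simplified sketch of the commonly used calculation and assumes counting starts from a timecode of zero.

    def frames_to_df_timecode(frame_number):
        """Convert a 29.97fps frame count to SMPTE drop-frame timecode."""
        drop = 2                                      # numbers dropped per minute
        frames_per_10min = 10 * 60 * 30 - 9 * drop    # 17982 real frames
        frames_per_min = 60 * 30 - drop               # 1798 real frames

        tens, rem = divmod(frame_number, frames_per_10min)
        if rem > drop:
            frame_number += drop * 9 * tens + drop * ((rem - drop) // frames_per_min)
        else:
            frame_number += drop * 9 * tens

        frames = frame_number % 30
        seconds = (frame_number // 30) % 60
        minutes = (frame_number // 1800) % 60
        hours = frame_number // 108000
        return f"{hours:02d};{minutes:02d};{seconds:02d};{frames:02d}"

    print(frames_to_df_timecode(1799))   # 00;00;59;29
    print(frames_to_df_timecode(1800))   # 00;01;00;02 - labels ;00 and ;01 skipped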

Where such accuracy is not important, it may be desirable to bury your head in the sand and just pretend that this drift in timecode does not exist for the sake of simplicity. In this case, non-drop-frame (NDF) timecode can be used, which provides a more continuous, logical method of counting frames, such that 00:00:59:29 does in fact lead to 00:01:00:00 (but 01:00:00:00 worth of NDF timecode actually spans 1 hour and 3.6 seconds of real time). Regardless of the system used, consistency is essential. Mixing DF timecode with NDF timecode can cause problems when editing or assembling video footage.

Figure 2–12   A video sequence recorded with NDF timecode

Figure 2–13   The same sequence as shown in Figure 2–12, this time recorded with DF timecode

It should also be noted that 23.98p HD formats, designed to ease the transition of material shot at 24fps to NTSC formats, do not support the use of DF timecode and therefore cannot be used to make time-of-day recordings.

2.8.2 Burnt-in Timecode

With editing systems, the timecode information can be extracted from the video tape and displayed onscreen alongside the video image. In some situations, it is also desirable to display timecode onscreen even if a suitable editing system is not being used—for example, when a recording made onto a Betacam tape (which supports different types of timecode) is then dubbed across to a DVD for playback on a TV (which does not usually allow for timecode to be displayed alongside the picture). In this case, timecode may be “burnt-in” to the video image. With burnt-in timecode (BITC), each frame is visually printed with its corresponding timecode, and the timecode will survive copying to other formats (although it cannot later be removed).

2.9 Limitations of Video

Video is certainly one of the cheapest moving picture formats to use. At the low end, miniDV cameras are very inexpensive and are able to produce reasonable quality images. At the high end, HD cameras produce images that approach the quality of 35mm film, but cost slightly less when taking production factors into account.

One important issue is that the color rendition of video images is inferior to that of images produced by other formats. The dynamic range (a measure of the brightness and contrast abilities of a system, discussed in Chapter 5) of video images is much lower than the dynamic range of film images (although new cameras are emerging that have a nonlinear response to light, which is more like film), and color compression ultimately degrades the video image (although some cameras have a sampling ratio of 4:4:4 and do not discard color information). Even where color compression is not noticeable when viewing the footage, subsequent processes, such as color grading and duplication, can exacerbate even small artifacts.

All video formats suffer from “generational loss.” This means that every time a copy of any video footage is made, the copy will be of lower quality than the original, because during the dubbing process, additional errors and defects are introduced into the signal, particularly noise. The effect becomes more pronounced with each generation, meaning that a twentieth-generation video will be particularly degraded compared to the original.16 With a digital intermediate pipeline however—even one that simply digitizes the footage and outputs it straight from the digital version—all copies can be made digitally, and so generation loss becomes less of an issue. With such a pipeline, the video undergoes less degradation, and so the overall quality remains higher.

Finally, there is the question of speed. All video runs in real time, meaning it takes an hour to transport (i.e., play or copy) an hour of video footage. In some respects, this is faster than other formats, but in time, digital equivalents will perform such operations many times faster, meaning it might take only 10 minutes to copy an hour of digital footage across the world.

2.10 Digital Video

Digital video formats, such as DVCAM, DV SP, and the various HD formats (and to some extent, DVD-video) can be treated the same as analog formats for most purposes. The key difference has to do with the way that digital video images are stored. Unlike analog video recorders, which inscribe the electronic signal directly onto the magnetic video tape, digital video recorders first digitize the video signal (refer to Chapter 5 for more on the digitization process) and then record the images to tape as a series of digits. Doing so retains the quality of the images better than analog recordings because there is less chance of noise affecting the signal.

For the most part, however, digital video systems behave like analog ones. Output is typically made using analog cables (either component or composite cables) in real time, which means that noise or other errors can still be introduced and that each copy is still prone to generation loss (although at a much lower level).17 The color components of the images are usually handled in the same way as with analog counterparts, with chromaticity separated from the luminance components and possibly compressed.

Digital video formats also have the advantage of recording additional, nonimage information. For example, as well as recording continuous timecode, digital cameras can record time of day as a separate data stream. The separate time-of-day recording can be used during editing to quickly separate shots. The additional data that can be recorded onto tape can also include the camera settings, such as the aperture and shutter settings. In the future, it may be possible to encode even more information digitally, such as production notes and take numbers, providing easy access later on.

The line between video and native digital formats is getting increasingly thin. We already think of digital still cameras as producing entirely digital images. In addition, there is very little difference between digital still cameras and modern digital video cameras that can record directly to a digital storage device, such as a hard disk.

2.11 Summary

There are several different standards for video images, each varying in quality, and they tend to be incompatible. Even within a given standard, different video formats can affect the final quality of recorded footage. The reasons for this are largely historical, because video itself stems from the progress of television. In coming years, this trend may change, with new high definition video systems setting the standard for future television broadcasts.

Though video lacks the resolution and color range of photographic film, notable features such as the ability to track footage through timecode, and the huge advantage of being able to view footage as it is recorded (and with ease at any point later), make it a very useful format. Further, the available headroom, which allows extra color information to be squeezed into a recording, makes video preferable to digital images in some instances.

Video is a very hard medium to understand conceptually, but it is very easy to use in practice. You get a VCR, push in a video tape, and press Play. However, to integrate video into a digital intermediate pipeline, it is important to be aware of factors such as interlacing and compression, to ensure the maximum possible level of quality. The similarities with completely digital systems mean that video is highly suited to a digital intermediate work flow. Photographic film, on the other hand, is much easier to understand but has many more practical considerations that have to be addressed.

1 The photoelectric effect is one where light hitting certain materials produces an electrical current.

2 Also worth noting is that video signals can be generated in a sense by using television antennae or satellite receivers. Although this process could technically be considered a form of transportation rather than creation, it is a perfectly reasonable way to record a signal consisting of nothing but random noise.

3 These sync pulses, when taken cumulatively, create a “vertical interval” of the signal where no image information is present. The vertical interval can be used to carry additional information, such as the “teletext” or “closed captions” that can optionally be encoded during broadcast and displayed on the receiving television. The vertical interval may also be used to carry certain types of timecode for editing purposes.

4 Most HD formats match the frame rate of SD systems to ease conversion, but there is no such thing as an NTSC HD tape.

5 The horizontal resolution of a video image is determined by part of the signal, the “subcarrier” frequency, which also is determined by the video standard.

6 This somewhat awkward number leads to problems that will be covered in Chapter 7.

7 There was also a problem with televisions trying to display full frames, in that the top half of the picture started to fade from the screen while the lower half was being traced, creating a flickering effect.

8 It is actually more correct to always refer to the frame rate for both interlaced and progressive types, but the vast majority of manufacturers adopt the convention used here.

9 YIQ and YUV are similar in principle, although the associated mathematics for deriving each of the components from an RGB image is slightly different. However, it is fairly easy to convert between the two.

10 A similar process occurs when video images are displayed on an RGB monitor (such as a computer monitor).

11 The exact method of reducing the color components is a function of the camera, which may use sophisticated sampling techniques or may introduce color artifacts.

12 Generally speaking, component video signals have a much better signal-to-noise ratio than composite video equivalents.

13 Many cameras are able to perform this function as long as they do not run out of power.

14 It is important to note that some nonprofessional video systems do not support true timecode, but instead use a frame “counter.” While it can be used to get estimates of the duration of a particular sequence, the counter is reset with each tape and does not help to identify specific frames.

15 Note the use of semicolons (;) rather than colons (:) to denote the use of drop-frame timecode.

16 Ironically, this fact reduces the availability of high-quality pirated video material.

17 With some digital video systems, it is possible to use digital cables (such as firewire cables) to directly transport the digital data rather than the analog data, although this must still be performed in real time and is thus vulnerable to playback errors.
