Chapter 22 Digital Set-Top Terminals and Consumer Interfaces

22.1 Digital Set-Top Terminals and Interfaces

With the widespread availability of digital video transmission on cable systems, digital set-top terminals (STTs) have taken over the marketplace. Extensive activities under the auspices of CableLabs and the SCTE, coupled with initiatives from the FCC and inputs from the consumer electronics industry, have resulted in a comprehensive set of standards for digital video transmission and reception. Many of the standards are recognized by ANSI, the American National Standards Institute. Where ANSI standard numbers were available at the time of writing, we refer to the ANSI number and also to the former standard numbers assigned by the SCTE.

A decision by the FCC to separate the STT from program denial created the need for a replaceable module to contain descrambling functions. The replaceable module is called a POD (point-of-deployment) security module, also known as a "CableCARD." The idea was to allow consumers to purchase their own STTs and obtain a POD module from their cable company for reception of premium programs. In addition, TV manufacturers could choose to incorporate provision for POD modules in TV sets, eliminating the need for an STT altogether, even for premium programming. The POD module handles all vendor-specific in-band and out-of-band communications with the headend, supported by modulators and demodulators in the host device. Host is a generic term used to mean any device that accepts a POD module; this can include STTs, TVs, and any other device in which you might want to insert a POD module. The program input to the POD module is the demodulated 64- or 256-QAM datastream as it comes from the cable TV plant on the so-called forward application transport channel. (The program demodulator is located in the host.) The POD module descrambles the selected video stream from the incoming MPEG transport stream. It also handles out-of-band communications with the headend via the out-of-band (OOB) channel or, in an advanced host, the DOCSIS channel. In addition, the POD module can handle additional services, such as emergency alert.

Concerns of the entertainment industry have resulted in the need for secure signal interfaces between STTs and display equipment. The ability to record signals is controlled by the copyright owner of the program. In this chapter we outline the main issues involved in delivering digital TV to the subscriber. Of necessity, there is a tight coupling between this information and the information on digital transmission in Chapter 4 and on headend signal processing as shown in Chapters 8 and 9.

22.2 Relevant Standards

Several sets of standards contribute to the technologies described in this chapter. Most were generated under the auspices of CableLabs, in cooperation with the SCTE, and formalized into standards by the SCTE. The SCTE then submitted them to ANSI, which adopted them as U.S. standards. Additional standards have come from the Consumer Electronics Association (CEA), part of the Electronics Industries Alliance, and from European standards bodies.

Digital Video Subcommittee (DVS): A set of standards describing how TV signals are compressed and transmitted in cable television systems. This initiative by CableLabs has substantially completed the transition to ANSI standardization.

OpenCable: The OpenCable™ initiative was begun in 1997 with a goal of helping the cable industry deploy interactive services over cable. OpenCable provides a set of industry standards that help accomplish this goal via key objectives to define the next-generation digital consumer device, encourage supplier competition, and create a retail hardware platform.

OpenCable has two key components: a hardware specification and a software specification. The hardware specification allows a retail receiver to provide interoperability. The software specification of the OpenCable project, called the OpenCable Applications Platform (OCAP), is designed to solve the problem of proprietary operating system software by creating a common software platform upon which interactive services may be deployed.

VOD metadata: This is a newer initiative. It is a project to investigate the distribution of content asset packages (for example, movies) from multiple content providers over diverse networks to cable operators. Metadata is descriptive data associated with a content asset package. It may vary in depth from merely identifying the content package title, to information for populating an EPG, to providing a complete index of different scenes in a movie, to providing business rules detailing how the content package may be displayed, copied, or sold.

PacketCable: PacketCable is a CableLabs-led initiative aimed at developing interoperable interface specifications for delivering advanced, real-time multimedia services over two-way cable plant. Built on top of the cable modem infrastructure, PacketCable networks use Internet Protocol (IP) technology to enable a wide range of multimedia services, such as IP telephony, multimedia conferencing, interactive gaming, and general multimedia applications. Much of PacketCable is covered in Chapter 6. Another early and continuing initiative, DOCSIS, is covered in Chapter 5.

22.3 Cable TV Digital In-Home Processing

We could have described Figure 22.1 as a representative high-end digital STT, but it doesn’t necessarily represent a real device.1 Rather, it could be looked at as a superset of most real digital STTs or any device that accommodates POD modules. We’ll refer to this as a host, an all-encompassing term used to describe any device that accommodates a POD module. You may not find all of the features shown on every host.


Figure 22.1 Representative digital host.

Many STTs are built with the POD module functions internal to the STT, so a separate POD module is not necessary. However, the intent of the FCC rules is to promote the availability of retail STTs that will use a POD module for customization of scrambling. The idea is to allow the consumer to purchase an STT, TV, or whatever with the features he or she wants and to get only the POD module from the cable operator. Also, it is possible to build TV sets, VCRs, and other consumer devices with POD module sockets, to allow them to operate as standalone devices, but to use the POD module to descramble digital signals. Note that the POD module is required only for scrambled programming; programs sent in the clear don’t need a POD module to be installed.

22.3.1 Block Diagram Description

Video signals are tuned in the video tuner, which can be the same double conversion tuner used in analog STTs. Figure 21.9 illustrates a tuner in the blocks from the RF in through the second mixer M2. Dual conversion is used to achieve flat response over a very wide range of frequencies while not having to reject an in-band image. In Figure 22.1, the ovals with D or A in them indicate routes taken by digital or analog signals, respectively. An oval with an H indicates a high-definition signal path, which may also carry standard-definition signals.

Analog signals are routed to an NTSC demodulator of conventional design. A VBI decoder recovers VBI data to be inserted into the video signal in the multimedia processor. VBI data includes closed captioning text sent in the VBI of an analog signal, usually on line 21.

The multimedia processor accepts video in several forms as well as accepting VBI data from the VBI data decoder. Conceptually, the signals coming from the MPEG-2 decoder may or may not be in NTSC format (depending on the decoder used). Normally the multimedia processor will convert any format to RGB for insertion of graphics. The multimedia processor converts the signal to NTSC for conventional display and to whatever other formats are required, as described later. The graphics processor is the onscreen display of Figure 21.9, which adds VBI text data when called for and also displays text and graphics generated in the host. An example of host-generated graphics would be the grid of an electronic program guide.

The output from the multimedia processor may take a number of forms, depending on the capabilities intended for the host. We’ll discuss the interfaces in more detail later, but can name them briefly here.

DVI is a digital interface used to transmit non-compressed HDTV pictures to a display. HDCP describes the copy protection specified for use with DVI. The YPrPb display is an analog component interface that is popular in early HDTV displays. It couples luminance (Y) and two color difference signals (Pr and Pb) to the display device.

Baseband, or BB, video is conventional analog NTSC (or PAL or SECAM) baseband video, as described in Appendix B and in Chapter 2. Ch. 3/4 RF output is again the conventional RF interface used to couple signals to TVs that don't have any other interfaces available. Generally, the RF interface will be the lowest in quality of any of the interfaces, though in many cases you will be hard pressed to see the difference between the RF interface and the baseband video interface. S-VHS is an analog interface that is the same as baseband video except that the luminance and chroma are transferred on separate wires. This has some distinct advantages in allowing wider luminance bandwidth and in eliminating interference between the luminance and chroma information.

Baseband audio is conventional one- or two-channel analog audio for use by stereo systems. SPDIF is a digital audio interface originated by Sony and Philips (the Sony-Philips digital interface). It is a common digital interface for digital audio using the AC-3 format specified for digital transmission in North America.

Note that there is one digital interface that does not originate in the multimedia processor: the 1394 interface specified for standard-definition digital connections. DTCP is the copy protection system specified for use with 1394 display interfaces. The 1394 interface isn't shown originating from the multimedia processor because it transfers the MPEG-compressed video from the incoming video stream, plus information telling the display how to overlay any characters or graphics intended for display (the overlay information is provided by the multimedia processor). Thus, when using 1394, the display device, rather than the source device, forms the composite signal by overlaying graphics and/or characters on the picture; the multimedia processor does not have to assume this responsibility.

Returning now to a brief tour of Figure 22.1, an optional real-time MPEG encoder is shown. It is used if the host wants to be able to convert NTSC signals into digital signals for transmission to the display. This could be useful to allow consistent 1394 interfaces to the display. Another reason for including an MPEG encoder would be to accommodate a hard disk drive for storing video; the host would need such an encoder to accommodate analog signals. The storage function has come to be known as the personal video recorder (PVR). Note the content encryption at the input to the hard disk interface. This is to ensure that content captured on the hard disk can be accessed only by the user who was authorized to capture the video, and to limit its period of use if desired.

The POD module controls decryption of encrypted digital video signals. All of the program denial functions that allow the operator to sell premium services are embedded in the POD module, so a host without a POD module is useless for receiving anything except basic service.

Two options for out-of-band (OOB) communications are shown. The out-of-band channel is a QPSK tuner and demodulator plus a QPSK modulator and transmitter conforming to one of two OOB communications protocols recognized by the SCTE; the two are described later. The second OOB communications channel, anticipated for so-called advanced hosts, is a DOCSIS modem, as described in Chapter 5. The idea is to build in a DOCSIS modem, which handles host communications needs. The host could have a data interface to allow the subscriber to use this DOCSIS modem for all data communication needs.

Finally, the host CPU is the computer that controls all the functions of the host. These functions include user interfaces, such as the remote control and on-host controls, as well as any onscreen or on-host displays. The CPU will also house some resident applications that ship with the host. OCAP middleware running on the host CPU allows external applications to be written and loaded to the host. Other functions include external communications, the host role in network management, management of files, diagnostics, and whatever other functions the host manufacturer wants to include.

22.3.2 Cable Input

The cable input carries standard cable TV signals, as well as return signals exiting the host. Input signals range from 54 to about 864 MHz, possibly increasing in the future. (The low frequency of 54 MHz is a North American standard. Other localities often use higher minimum downstream frequencies, allowing a wider upstream bandwidth.) The controlling document is SCTE 40 2001 (formerly DVS 313).2 This document further references EIA-23 and EIA/CEA-542-B, specifications developed by the former NCTA/EIA Joint Engineering Committee. EIA-23 defined certain analog interfaces between cable systems and consumer electronics equipment. EIA/CEA-542-B defines channelization. The channel boundaries stated in this specification are reproduced as Appendix A. SCTE 40 2001 defines several types of signals that can be present on the cable plant:

NTSC: These are conventional analog TV channels on the cable.

FAT: This unfortunate acronym stands for forward application transport channels, which carry digital information via MPEG-2 transport streams. These are the digital video channels. Modulation is either 64- or 256-QAM.

NTSC and FAT channels may occupy the spectrum from 54 to 864 MHz and are channelized according to the table in Appendix A. (If history is any indication, the maximum frequency will increase in the future.) Signal level is measured at the end of the drop, except as noted. The maximum signal level is +15 dBmV for analog and FAT channels. For analog signals, the minimum signal level is 0 dBmV (governed by FCC rules) at the first consumer-owned device. Digital signals are carried at lower levels because they don't need as high a carrier-to-noise ratio (C/N) as do analog signals. 64-QAM modulated signals are carried nominally −10 dB with respect to the level of a hypothetical analog carrier on the same frequency. 256-QAM signals are carried nominally −5 dB from the hypothetical analog level. The specification allows a wider range of signal levels at the consumer device for digital signals: −15 dBmV to +15 dBmV for 64-QAM and −12 dBmV to +15 dBmV for 256-QAM.

C/N is an important quantity used to judge the quality of a signal delivered to subscribers. Today, C/N is sometimes replaced by the carrier-to-composite-noise ratio (CCN) or the carrier-to-noise-plus-interference ratio, C/(N+I). These terms recognize that nonlinear distortions and leakage, which introduced essentially CW interference in analog systems, introduce interference that has the characteristics of noise in digital systems. C/N for an analog NTSC signal should be better than 43 dB (controlled by FCC rules), though good engineering practice dictates higher numbers. For 64-QAM the C/(N+I) should be no less than 27 dB, and for 256-QAM it should be no less than 33 dB. Chapter 4 explains why. (The level and C/(N+I) limits are pulled together in the sketch following these definitions.)

FDC: Out-of-band forward data channels carry control information to STTs and other devices that need it. This is also known as the downstream out-of-band (OOB) channel for addressable STTs. The information transported includes STT channel and program information (collectively known as system information, SI) and STT queries. The main information transmitted is entitlement management messages (EMMs), used to tell the STT or POD module what services it is entitled to receive. FDCs are usually modulated with QPSK (4-QAM) and set to −8 dB with respect to a hypothetical analog picture carrier at that frequency. FDCs are carried between 70 and 130 MHz.

RDC: Out-of-band reverse data channels carry return information from the STT and other devices to the headend. These channels may be carried anywhere between 5 and 42 MHz, depending on the cable system. Though reverse channel output levels may be up to +60 dBmV coming out of a device, the level allowed to appear at the input to another device is limited to +42 dBmV.
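
To make the level and quality limits concrete, here is a small Python sketch that checks a digital (FAT) channel against the figures quoted above. The limit values are transcribed from the text; the function and table names are our own invention for illustration.

    # Check a digital (FAT) channel against the SCTE 40 figures quoted above.
    LIMITS = {
        # modulation: (min level dBmV, max level dBmV, min C/(N+I) dB)
        "64-QAM":  (-15, +15, 27),
        "256-QAM": (-12, +15, 33),
    }

    def check_fat_channel(modulation, level_dbmv, cni_db):
        """Return a list of problems found; an empty list means the channel passes."""
        lo, hi, min_cni = LIMITS[modulation]
        problems = []
        if not (lo <= level_dbmv <= hi):
            problems.append(f"level {level_dbmv} dBmV outside {lo} to {hi} dBmV")
        if cni_db < min_cni:
            problems.append(f"C/(N+I) {cni_db} dB below minimum {min_cni} dB")
        return problems

    # A 256-QAM channel carried -5 dB relative to a +10 dBmV analog carrier:
    print(check_fat_channel("256-QAM", level_dbmv=10 - 5, cni_db=34))  # []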

22.4 Out-of-Band Channels

The out-of-band FDC and RDC channels may take any of several forms. For advanced hosts, a DOCSIS modem may be included in the host to handle OOB communications. If a DOCSIS modem is included, the host may have an external interface for the DOCSIS modem to allow it to serve data needs for the entire home, not just for the host. The legacy OOB data channel shown in Figure 22.1 is based on one of two specifications: DVS-178 (SCTE 55-1 2002) or DVS-167 (SCTE 55-2 2002). Table 22.1 illustrates key features of the two standards, covering both the OOB FDC and the OOB RDC. SCTE 55-1 was originally developed by General Instrument; SCTE 55-2 was adapted by Scientific Atlanta from the DAVIC standard. In addition to two-way communications, it is important to provide for a one-way communications mode so that if the return channel fails for any reason, basic broadcast functionality, such as the EPG, may be maintained.

Table 22.1 Comparison of out-of-band standards

  SCTE 55-1 SCTE 55-2
Formerly DVS-178 (mode A) DVS-167 (mode B)
Based on MPEG, ATM DAVIC
Downstream    
Data rate 2.048 Mb/s 1.544 Mb/s (grade A, required)
    3.088 Mb/s (grade B, optional)
Nominal information rate 2.005 Mb/s 1.28 Mb/s
Frequency 70–130 MHz 70–130 MHz
Channel spacing 1.8 MHz 1 MHz (grade A)
    2 MHz (grade B)
Modulation D-QPSK, raised cosine α = 0.5 (filter at receiver) D-QPSK, root raised cosine α = 0.3 (filter split)
Error correction Randomization, R-S, interleaving R-S, interleaving, randomization
Received carrier power −10 to +5 dBmV −18 to +15 dBmV
Segmentation and reassembly Based on MPEG-2 ATM AAL5
Addressing Broadcast Broadcast
  Singlecast_unit Singlecast
  Singlecast_network  
  Multicast (40-, 16-, and 24-bit)  
Upstream    
Data rate 256 kb/s Grade A (optional): 256 kb/s
    Grade B (required): 1.544 Mb/s
    Grade C (optional): 3.088 Mb/s
Modulation and filtering D-QPSK, α = 0.5 D-QPSK, α = 0.3
Access scheme Polling, Aloha Slotted TDMA synchronized to a common clock, reserved and contention slots
Packetization ATM cell preceded by 28-bit unique word and 1-byte packet sequence, followed by 8 bytes R-S parity ATM cell preceded by 32-bit unique word, followed by 6 bytes R-S parity and 1 byte guard band
Frequency 8.096–40.160 MHz 8–26.5 MHz
Channel spacing 192 kHz Grade A: 200 kHz
    Grade B: 1 MHz
    Grade C: 2 MHz
RF output power +24 to +60 dBmV +25 to +53 dBmV
Received level at headend +3 ± 10 dBmV Four target settings, −5, +3, +11, +19 dBmV, all ±3 dB
Required C/N at receiver 20 dB @ PER < 10−7 after FEC 20 dB @ 10−6 packet loss rate after error correction
Acknowledgment protocol STT waits for acknowledgment from headend or enters backoff algorithm Positive acknowledgment sent for every successfully received ATM cell
Maximum upstream channels per downstream channel 6 8

22.4.1 Downstream OOB Channels

In the downstream direction, SCTE 55-1 operates at a wire rate (the actual data rate on the coax) of 2.048 Mb/s, with an information data rate of 2.005 Mb/s. SCTE 55-2 has two downstream grades, or modes, of which grade A is the only one implemented as of this writing. It features a downstream wire rate of 1.544 Mb/s with a data throughput rate of 1.28 Mb/s. Both standards use the downstream frequency range given in SCTE 40; in addition, SCTE 55-1 specifies a default frequency of 75.25 MHz.

Note a difference in the downstream filtering in the two standards. In Chapter 4 we showed that most digital transmission systems employ raised cosine bandpass filtering and that most split the filter evenly between the transmitter and the receiver. This leads to the description of a root raised cosine filter (the “root” terminology arises from the fact that the filter response at each end is the square root of the overall filter, so when the transmit and receive filters are cascaded, a raised cosine filter results). This is the technique used in SCTE 55-2, but SCTE 55-1 places all the filtering at the receiver. This results in a wider occupied spectrum on the cable but improves the C/N characteristics of the receiver.
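
Under these assumptions the occupied bandwidth of a shaped QPSK channel follows directly from the symbol rate: BW = (symbol rate) × (1 + α). A quick sketch of the arithmetic, using the wire rates from Table 22.1 (our own calculation; note that the on-cable spectrum of SCTE 55-1 is wider than this nominal figure because its shaping is done entirely at the receiver):

    # Nominal occupied bandwidth of a raised-cosine-shaped QPSK channel.
    # QPSK carries 2 bits per symbol; alpha is the excess-bandwidth factor.

    def qpsk_occupied_bw_mhz(wire_rate_mbps, alpha):
        symbol_rate = wire_rate_mbps / 2.0      # Msym/s
        return symbol_rate * (1.0 + alpha)      # MHz

    print(qpsk_occupied_bw_mhz(2.048, 0.5))  # SCTE 55-1: ~1.54 MHz in a 1.8-MHz slot
    print(qpsk_occupied_bw_mhz(1.544, 0.3))  # SCTE 55-2 grade A: ~1.0 MHz in a 1-MHz slot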

The numbers in Table 22.1 were taken from the SCTE 55-1 and 55-2 specifications. There is a difference between them and the downstream received signal level specification of SCTE 40, but the difference is not of practical concern if you follow the recommended practice of placing the data carrier −8 dB from a hypothetical analog picture carrier at the same frequency. Note also that the downstream addressing for both standards allows addressing either individual hosts or all hosts at once.

22.4.2 Upstream SCTE 55-1 Channel

In the upstream direction, SCTE 55-1 uses a single data rate of 256 kb/s. The difference in upstream access schemes between the two standards is of interest. SCTE 55-1 has a polling channel. A polling protocol may be thought of as a "speak only when spoken to" protocol: each host responds when polled and only when polled. This polling channel (which can be a separate return frequency) is used to allow the host to report its health and to report any pay-per-view programs that have been watched since the previous poll. Other non-time-critical data may also be transmitted on the polling channel. Polling takes place on an infrequent basis; each host may be polled perhaps once per day.

SCTE 55-1 also specifies one or more Aloha channels. These may be used by different applications running on the host. For example, a video-on-demand (VOD) system would like to use the host to request programs at any time and would also like to allow the viewer to command the program to pause and resume, rewind and fast-forward (so-called VCR functions). The Aloha Protocol was developed at the University of Hawaii by Norman Abramson and colleagues in the late 1960s as a way to communicate by radio between the various islands and interisland shipping interests; it is also the ancestor of the contention-based access method used in Ethernet networks. The radio channels used in Hawaii had a lot in common with the return channel of cable systems today. The Aloha Protocol allowed any station to transmit when it needed to transmit, rather than having to wait until it was "invited" to transmit, as in polling techniques. There was no guarantee that one station would be able to hear any other station that was transmitting at the same time. In the original application, this was because of the vagaries of radio paths. In cable plant, there is no way to route a signal from one host to another, so when one host is transmitting, no other hosts will know it.

In a pure Aloha system such as in SCTE 55-1, when a host has something to say, it transmits and then waits for an acknowledgment from the headend. If no one else is using the channel, the transmission gets through, the headend acknowledges it, and the host goes on about its business. If an acknowledgment is not received within a designated time, the host assumes its message didn’t get through because another transmission at least partially overlapped with its transmission. This means there is at least one other host trying to communicate. So the host waits (“backs off”) a random time and tries again. Chances are that the other host who was transmitting at the same time has backed off for a different random time and that on the next try both will get through and will be acknowledged.

Pure Aloha systems work well when there are a lot of potential “talkers,” each needing the channel a small percentage of the time but needing it quickly when it is needed. A VOD application is a good example. Normally, VOD systems allow the subscriber to have VCR-like control, in that they can pause and resume the program, rewind, or fast-forward. A lot of subscribers can be watching VOD and sharing a return channel to send those commands to the headend. Rarely does any one subscriber want to use any of those VCR-like controls; but when he does, he needs a response fast from the headend.* Polling wouldn’t work, because even fast polling of many hosts could take minutes if not hours. A VOD viewer would not wait that long to get a response to, say, a pause command. But since there are few commands issued by each subscriber, many can use the channel and not experience significant interference. A pure Aloha system can be utilized to about 18% before delays due to collisions become unacceptable.3
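
The 18% figure is the classical pure-Aloha limit: with Poisson-distributed offered load G (attempts per packet time), throughput is S = G·e^(−2G), which peaks at 1/(2e), about 18.4%. The slotted variant described in the next section peaks at 1/e, about 36.8%. A minimal sketch of both formulas:

    import math

    # Classical Aloha throughput versus offered load G (Poisson arrivals).
    def pure_aloha(G):
        return G * math.exp(-2 * G)     # peaks at G = 0.5: 1/(2e) ~ 0.184

    def slotted_aloha(G):
        return G * math.exp(-G)         # peaks at G = 1.0: 1/e ~ 0.368

    print(f"pure Aloha peak:    {pure_aloha(0.5):.3f}")
    print(f"slotted Aloha peak: {slotted_aloha(1.0):.3f}")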

Figure 22.2 illustrates a typical STT communications channel based on SCTE 55-1 (formerly DVS-178). Some cable TV systems are intended to be locally controlled in terms of STT authorization, and these systems use a control computer. Other systems using SCTE 55-1 are designed to be controlled from a central point via satellite. These systems don't use a control computer but, rather, have a master integrated receiver-transcoder (IRT, a digital satellite receiver) that supplies control information. Either the master IRT or the control computer supplies signals to an OOB modulator, which sends the downstream QPSK signal as described in Table 22.1. This signal is combined with all other downstream signals and supplied to the distribution network, commonly via a downstream optical transmitter.


Figure 22.2 SCTE 55-1 communications system.

In the upstream direction it is common (though not universal) to make provision for several upstream return path demodulators for each downstream modulator. Upstream signals, either polling or Aloha Protocol, come to the headend, usually on an upstream optical receiver. Since there may be multiple combinations of upstream signals for different services, as described in Chapter 9, there will be some combining and splitting loss in the signal path before the upstream signals reach the assigned return path demodulator. The demodulators used may be assigned to different groups of STTs (by the way the upstream signal management is done), or one receiver may be assigned as the polling receiver and the others assigned as Aloha Protocol receivers, each supporting a different service (such as VOD or game playing). The receivers may be assigned by virtue of being tuned to different upstream frequencies. For example, the polling receiver might be assigned to 8.096 MHz (the lowest channel) and the VOD receiver (Aloha Protocol) to 8.288 MHz, the next channel up.

Each receiver demodulates the QPSK upstream signals and supplies them to the headend management system and the network controller. The headend management system receives the information on the polling channel and handles interfaces with the billing system and other systems needing access to the information. The network controller provides the interface to third-party applications, such as a VOD system.

22.4.3 Upstream SCTE 55-2 Channel

The system used in SCTE 55-2 (formerly DVS-167) for return transmission is significantly different from the SCTE 55-1 system. It is called a time division multiple access (TDMA) system. SCTE 55-2 allows for three different upstream data rates, though only one is implemented as of this writing. Figure 22.3 shows the overall system. Systems using SCTE 55-2 were generally designed to operate with local host control. A digital network control system (DNCS) provides physical interface to applications needing access to the STTs. The control transmitter is the QPSK transmitter that generates the downstream signals shown in Table 22.1. As with SCTE 55-1 systems, several receivers may be used to improve the upstream throughput.


Figure 22.3 SCTE 55-2 communications system.

The upstream receivers demodulate the incoming signals and pass them to the control transmitter, which in turn passes signals to the DNCS. The DNCS manages some of the communications with hosts itself and relays other communications to and from other computers running various host tasks.

Upstream transmission is synchronized from a common clock at the headend. Downstream transmissions are divided into repeating signaling link extended superframes (SL-ESFs). Each SL-ESF consists of 24 frames, each carrying 193 bits (1 synchronizing bit and 192 data bits). Timing for the upstream begins with receipt of SL-ESF START from the headend. Upstream time slots are defined from this start of the downstream SL-ESF. Normally each host that needs to talk reserves one or more of the upstream time slots in which to talk. Time slots may be assigned for one-time use or may be reserved for a longer duration if an application needs it. An application may also reserve adjacent slots. For upstream transmission at 1.544 Mb/s, the most common speed, there are nine upstream time slots for each SL-ESF. Each time slot consists of a unique word (UW) that serves as a run-in to synchronize the receiver clock. Following this is a 53-byte ATM cell. This is followed by a Reed-Solomon error correction word and finally by a guard time, which ensures that small timing errors don't cause transmission overlap from different hosts.
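
The frame arithmetic works out neatly; a sketch of the timing (our own calculation from the figures above and the packetization row of Table 22.1):

    # SL-ESF and upstream slot timing for SCTE 55-2 at 1.544 Mb/s.
    FRAMES_PER_SLESF = 24
    BITS_PER_FRAME = 193                  # 1 synchronizing bit + 192 data bits
    RATE = 1.544e6                        # b/s

    slesf_s = FRAMES_PER_SLESF * BITS_PER_FRAME / RATE
    print(f"SL-ESF duration: {slesf_s * 1e3:.1f} ms")          # 3.0 ms

    # Nine upstream slots per SL-ESF:
    slot_bits = RATE * slesf_s / 9                             # ~514.7 bit times
    # 32-bit unique word + 53-byte ATM cell + 6 bytes R-S parity + 1 byte guard:
    used_bits = 32 + 53 * 8 + 6 * 8 + 8                        # 512 bits
    print(f"{used_bits} bits used of {slot_bits:.1f} per slot")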

Propagation time between the headend and the host is important, since it is necessary to time all transmissions to arrive at the headend at the proper time. The propagation delay difference between a host that is close to the headend and one that is distant is significant, so a ranging function must be applied to each host when it is first added to the system. The headend measures the time delay between the “expected” response time from the host and the actual response time, and it tells the host how much to advance its timing so that the signal arrives at the headend at the correct time. Other systems that depend on tight timing also have this ranging function.
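
A back-of-the-envelope sketch shows why ranging is unavoidable (the 25-km path difference and 0.87 velocity factor below are our own illustrative assumptions):

    # Round-trip delay difference between a near and a far host.
    C = 3.0e8                  # free-space speed of light, m/s
    VELOCITY_FACTOR = 0.87     # assumed propagation velocity in the plant
    PATH_DIFFERENCE_M = 25e3   # assumed near/far path difference

    round_trip_s = 2 * PATH_DIFFERENCE_M / (C * VELOCITY_FACTOR)
    print(f"round-trip difference: {round_trip_s * 1e6:.0f} us")      # ~192 us
    print(f"= {round_trip_s * 1.544e6:.0f} bit times at 1.544 Mb/s")  # ~296 bits

That is far more than the 1-byte guard band, so without ranging a distant host's transmission would land on its neighbors' slots.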

In order to allow a host to gain access to a time slot, there must be provision for a host to inform the headend that it needs a time slot. Also, when a new host is added to the system, it is necessary to allow it to inform the headend of its presence. In order to meet these needs, a return time slot is occasionally set aside for contention. This time slot works very similarly to the Aloha system described earlier, except that rather than the host's having the ability to begin "talking" at any arbitrary time, it must wait for a contention time slot, which is defined by the headend. This modification of the pure Aloha Protocol is called slotted Aloha. Any host needing a reserved time slot makes its request in the contention time slot; the headend acknowledges the request and assigns an available time slot for that host. Whereas a pure Aloha system becomes unusable at about 18% utilization, slotted Aloha can operate with a utilization of up to about 37%, because there are no partial time overlaps.
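
The 37% figure is easy to check with a quick Monte Carlo sketch (the 50 hosts and 2% per-slot transmit probability are our own illustrative parameters, chosen to give an offered load of about one attempt per slot):

    import random

    # A contention slot succeeds only when exactly one host transmits in it.
    def slotted_aloha_throughput(n_hosts, p_tx, n_slots=100_000):
        successes = 0
        for _ in range(n_slots):
            transmitters = sum(random.random() < p_tx for _ in range(n_hosts))
            if transmitters == 1:
                successes += 1
        return successes / n_slots

    print(f"{slotted_aloha_throughput(50, 0.02):.3f}")   # ~0.37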

The inset drawing at the lower right of Figure 22.3 illustrates the sequence of events that a host must go through in order to deliver an upstream message. Step 1 is to wait until a contention slot comes along. The host transmits its request for an upstream time slot to the control transmitter, which manages the assignment of time slots. In step 2, the control transmitter assigns a time slot to the host. In step 3, the host transmits its data. The upstream time slot is normally assigned in the second SL-ESF following the request. Some short messages may be sent in the contention slot rather than going through the process of getting an assigned time slot.

22.4.4 Advanced Hosts

As shown in Figure 22.1, a DOCSIS communications channel may be substituted for the SCTE 55 communications channel. This so-called DOCSIS set-top gateway, DSG, uses Ethernet tunneling to transmit data to the host, and so it doesn’t need the QPSK channel described earlier. See Chapter 5 for a discussion of DOCSIS modems. If this is done, it is also possible that the host will include an interface to computers, so the modem may be used both for host communications needs and for computer communications. Possible interfaces include Ethernet, USB, and IEEE 802.11, among others.

22.5 Output Interfaces and Copy Protection

A number of output interfaces are shown for the model host of Figure 22.1. We shall first cover the various analog interfaces that are popular on early-generation digital STTs; then we shall cover the digital interfaces.

22.5.1 Analog Interfaces

BB Video

This is the common baseband analog video interface described in Chapter 2 and Appendix B. Sound is not included. The significant frequencies extend from nearly 0 to 4.2 MHz for NTSC video and from nearly 0 to 5 MHz for PAL video. Composite video (including sync, as described in Appendix B) is transmitted at a level of 1 volt p-p, with a source and load impedance of 75 ohms. Normally, the common “RCA” connectors are used, though occasionally other connectors, such as BNC, are employed. Impedance matching is important if the connecting cable is long, to prevent ringing or ghosting. The quality of the interface is better than that of the Ch. 3/4 interface described next but not as good as the other analog interfaces described, because the chrominance information is combined with the luminance, and the two must be separated before they can be processed. The filters used to separate the two are imperfect; even if perfect filters were available, there would be some inevitable interference.

Ch. 3/4 RF

This is baseband video modulated onto an RF carrier, identical to signals transmitted on cable systems and over the air. The one exception is the lower sideband, which is normally removed by the vestigial sideband filter (described in Section 8.3.7). Vestigial sideband filtering is done to allow adjacent 6-MHz channels to be used. Often the vestigial sideband filter is omitted in consumer products. This has no effect on the quality of the signal received, but it does preclude placing another signal in the lower adjacent channel. Also due to cost, the sound carrier frequency may not be as accurate, and the picture/sound carrier amplitude ratio may vary. Again, these differences should not pose a problem in consumer applications.

Although the quality of the Ch. 3/4 video interface can be quite good, of all the interfaces available, this one would be considered the lowest in quality, since it starts with composite video and modulates it onto a carrier, requiring demodulation in the receiver. The extra modulation and demodulation steps cannot improve the quality of the signal.

In North America it is common to use channel 3, except in areas having an off-air channel 3, in which case channel 4 is used. Channel 2 is not commonly used because it is the second harmonic of the old citizen’s band radio service (CB), and harmonics of CB services have been known to impair reception.

The signal level coming out of the STT should be above 0 dBmV and less than approximately 15 dBmV. Sound is modulated onto a subcarrier at 4.5 MHz (NTSC), 5.5 MHz (PAL-B/G), or 6 MHz (PAL-I). F connectors are used almost universally in consumer equipment. It is important to ensure that the source and load impedances are 75 ohms and that 75-ohm cable is used, regardless of the length of the cable.

S-VHS

This standard originated in the mid 1980s as S(uper)-VHS, an improved version of VHS consumer videotape. The recording standard did not become popular for consumer use, though it has been used for semiprofessional purposes. Compared with normal VHS recording, there were some differences in the way the recording was made on the tape. The interface is now sometimes called S-video.

Luminance and chrominance (color) signals are transmitted to the display on different wires. The luminance information is transmitted as a baseband 1-V p-p signal including sync, just as in baseband video. The color subcarrier (3.58 MHz in NTSC and 4.43 MHz in PAL) is not superimposed on the luminance as it is in normal baseband video. Rather, it is transmitted on a different wire. The advantage is that no filtering is needed to separate the luminance and chrominance information. This permits wider bandwidth of the color signal, improving color resolution. Many TVs today have S-VHS inputs, and excellent quality can be achieved using this interface.

Besides allowing wider chrominance and luminance bandwidths, S-VHS eliminates cross-color effects, which arise when the chrominance and luminance signals interfere with each other. Cross-color effects may be seen, for example, when fine diagonal stripes, such as in a striped jacket, are on the screen. You may have seen spurious colors surrounding such images. Spurious color occurs when luminance signals with a lot of high-frequency content (as are generated by diagonal stripes) mix with the color spectrum, confusing the color demodulator. Since properly produced digital video is derived from a signal that has never existed in composite analog video format, use of the S-VHS connection eliminates cross-color effects.

S-VHS uses a four-pin miniature connector. As shown in Figure 22.4, the luminance (Y) and chrominance (C) each have their own ground. The luminance signal is identical in every respect with the baseband luminance signal, including sync. The chrominance signal is a modulated 3.58-MHz (NTSC) or 4.43-MHz (PAL) signal identical with the color subcarrier of the composite baseband signal. If you combine the Y and C signals, you will have a proper composite baseband signal. Again, the reason for the S-VHS connector is to keep the two signals separate so that they will not interact. A second improvement is that separating the chrominance from the luminance allows a wider chrominance bandwidth, resulting in sharper color edges.


Figure 22.4 S-VHS pin function.

Audio is not carried on the connector; it is normally carried as left and right channels on a pair of cables terminated in RCA jacks.

YPrPb

This is an analog component interface using three cables to transfer the luminance (Y) and red and blue difference signals (Pr and Pb). It is used for standard definition video and may optionally be provided for HDTV display interface. Sync is carried on the luminance channel. Analog copy protection may be included, as described later. Three separate coaxial cables are used, often bound into a single cable. For consumer applications, RCA connectors are used, whereas BNC connectors are frequently used in professional applications. As with other video signals, 75-ohm cable should be used, though non-impedance-controlled cable is sometimes used in consumer applications.

As shown in Chapter 2, a color television signal is composed of red, green, and blue components. For transmission, these components are converted to a wideband luminance (Y) signal and two narrower-band color difference signals. When the signals are transmitted in analog form, the color signals are called the Pr and Pb signals, for the red-minus-luminance and blue-minus-luminance components. The same components, when in digital form, are called YCrCb.

The components are defined by CCIR Recommendation 601 for modern systems as follows:


Y  = 0.299R + 0.587G + 0.114B
Pb = 0.564(B − Y)
Pr = 0.713(R − Y)


The physical interpretation is as follows. Assume the red, green, and blue (RGB) outputs from the camera range from 0 for black to 1 (the units can be volts or some arbitrary unit) for maximum of each primary color (the camera must be adjusted such that these maximums correspond to the “brightest” color to be reproduced). Then at each pixel (picture element, or the smallest resolvable element in the scene), R, G, and B will each have a value. The luminance (Y) signal is computed from the first line of the equation, which approximates the response of the human visual system to each color. In order to make up the two color difference signals, the difference between Y and the R and B channels is taken, with the scaling factors shown. Notice that the color difference signals can take on negative values, even though there are no real negative colors. The two color difference signals usually have one-half the bandwidth of the luminance signal (less in analog transmission systems).4

The foregoing equations are usually written in matrix form,


[ Y  ]   [  0.299   0.587   0.114 ] [ R ]
[ Pb ] = [ −0.169  −0.331   0.500 ] [ G ]
[ Pr ]   [  0.500  −0.419  −0.081 ] [ B ]
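
As a check on the matrix, here is a direct transcription into Python (our own illustration; R, G, and B are normalized to the 0-to-1 range described above):

    # Rec. 601 luma / color-difference conversion, transcribed from the matrix.
    def rgb_to_ypbpr(r, g, b):
        y  =  0.299 * r + 0.587 * g + 0.114 * b
        pb = -0.169 * r - 0.331 * g + 0.500 * b   # = 0.564 * (b - y)
        pr =  0.500 * r - 0.419 * g - 0.081 * b   # = 0.713 * (r - y)
        return y, pb, pr

    print(rgb_to_ypbpr(1, 1, 1))  # white: Y = 1, both color differences 0
    print(rgb_to_ypbpr(1, 0, 0))  # saturated red: Pr at its positive extreme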


Analog Copy Protection

An analog copy protection system was developed a few years ago to prevent consumers from recording premium programs on VCRs. It is used to prevent recording on analog video outputs from the host as well as on DVD players. The system takes advantage of the fact that VCRs use AGC on the video signal to set the peak-to-peak amplitude of the video signal before recording. Special signals are inserted in the vertical blanking interval that cause the AGC to set up to the incorrect level part of the time.

BB Audio

This interface is conventional analog transmission of baseband audio. In stereo systems, left and right channel information is transferred. For home theater use, 5.1-channel audio can be transmitted on six cables. The channels are left, center, and right front channels, left and right rear channels, and a low-frequency enhancement (LFE) channel. The LFE channel is limited to approximately 20–100 Hz and hence is considered "0.1 channel." Used to provide "feel" to movies in home theaters, it is an optional feature of digital TV. (There is a newer practice that uses one or two additional rear channels for 6.1- or 7.1-channel audio, but this is not officially supported in most interfaces as of this writing.)

Common interface levels are about 1 V rms at full modulation. The terminating impedance is usually 2 kilohms or higher, and the source impedance is much lower. Because the frequencies of audio are so low, there is no need to maintain an impedance match in consumer systems, and maintaining a much higher load impedance than the source impedance enables more than one load to be placed in parallel if desired. Compare this practice with professional equipment, where a load impedance of 600 ohms is used. The source impedance is often, but not always, either 150 or 600 ohms. Though the impedance of the cable is not critical, it is important to use a high-quality shielded cable to minimize hum pickup and crosstalk.

Consumer equipment almost always uses unbalanced transmission, whereas professional equipment almost always uses balanced transmission, drastically improving the immunity to hum and spurious signal pickup. The issues related to balanced and unbalanced transmission are covered in Section 8.9.2. If long cables are used, there is some possibility of high-frequency roll-off due to cable capacitance. This is normally not a serious problem in small home entertainment clusters, but if you find yourself running audio cables between rooms or connecting them with Y adapters, you should pay attention to the total cable capacitance. With a source impedance of 2 kilohms and a high load impedance, a total cable capacitance of some 800 pF could marginally impair frequency response (we are being conservative). Typical audio cables seem to exhibit capacitance of about 30 pF per foot, so as the total cable distance approaches 25 feet per channel, it could introduce some very minor response problems if the source has a 2-kilohm output impedance.
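
The arithmetic behind that caution is the first-order RC corner frequency, f(−3 dB) = 1/(2πRC), formed by the source impedance and the total cable capacitance. A sketch using the worst-case 2-kilohm source mentioned above:

    import math

    # -3 dB corner of the low-pass filter formed by source impedance and
    # total cable capacitance.
    def corner_hz(source_ohms, total_pf):
        return 1.0 / (2 * math.pi * source_ohms * total_pf * 1e-12)

    # 25 feet at ~30 pF/ft driven from a 2-kilohm source:
    print(f"{corner_hz(2_000, 25 * 30):,.0f} Hz")   # ~106 kHz

Even this worst case puts the corner far above the audio band, which is why the effect is only marginal.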

22.5.2 Digital Interfaces

DVI

This is the interface specified for high-definition digital displays. Uncompressed RGB (red-green-blue components, as opposed to luminance and color difference) digital video is transmitted, so the MPEG decoder is in the host. DVI has its roots in computer monitors. Manufacturers recognized that with high-resolution monitors, the analog interfaces that had been used for many generations of product were limiting performance. Furthermore, with LCD displays, the signal to be displayed was converted to analog in the computer and then sent to the display, where it was converted back to digital. DVI was created by the Digital Display Working Group5 (DDWG) to provide a digital connection to the display while accommodating both analog and digital monitors. The standard specifies a single plug and connector that encompass both the new digital and legacy analog interfaces, as well as a digital-only plug and connector. However, for television applications, the analog channel is not used. DVI handles pixel rates in excess of 160 megapixels per second and thus supports UXGA and HDTV with a single set of links. Higher resolutions can be supported with a dual set of links (not used for HDTV applications).6, 7

The interface uses a physical transmission protocol known as transition minimized differential signaling (TMDS). In concept it is similar to the 8B/10B encoding used with gigabit Ethernet and described in Chapter 19. Two additional bits are added to each 8-bit word, in this case to minimize the number of transitions while removing any DC component of the signal.
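
A sketch of the first, transition-minimizing stage of TMDS (our own illustration of the published algorithm; the real encoder follows this with a second stage that conditionally inverts the 9-bit result, appending a tenth bit, to remove DC):

    # TMDS stage 1: run the byte through an XOR or XNOR chain, whichever
    # yields fewer transitions, and record the choice in a ninth bit.
    def tmds_stage1(byte):
        bits = [(byte >> i) & 1 for i in range(8)]    # LSB first
        ones = sum(bits)
        use_xnor = ones > 4 or (ones == 4 and bits[0] == 0)
        q = [bits[0]]
        for b in bits[1:]:
            q.append((q[-1] ^ b) ^ 1 if use_xnor else q[-1] ^ b)
        q.append(0 if use_xnor else 1)                # 9th bit: which chain was used
        return q

    print(tmds_stage1(0b11111111))  # all-ones byte: XNOR chain, no transitions in output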

Figure 22.5 summarizes information concerning the physical interface used in DVI. Figure 22.5(a) shows the transmission of signals between a host such as an STT on the left and a display on the right. Pixel and control data are transferred to a TMDS transceiver and across a data channel (a DVI cable) to a TMDS transceiver on the display, which converts the signals back to pixel and control data for the display. Two complete digital links are included in the DVI specification, called link 0 and link 1. Each link consists of three data channels, for blue, green, and red data. A common clock line is used for all channels. The reason for two links is to double the effective data transfer speed. Only link 0 is needed for HDTV applications. Data channel 0 carries blue information, data channel 1 carries green, and data channel 2 carries red information. Horizontal and vertical sync information is carried in the blue channel. (Note that this is a different practice from that normally used in analog component video, where the green channel carries sync. Green is the logical choice in analog practice because the picture is made up predominantly of green. In digital, there is no particular reason for the sync to appear on one channel over any other.)


Figure 22.5 The DVI. (a) Communications channels. (b) DVI plug. (c) Equivalent electrical circuit. (d) Waveform.

DVI can operate at pixel clock rates from 25 MHz to 165 MHz, depending on how much data must be transferred. At the same frame rate, up to 5.3 times as much information must be transferred for HDTV as for SDTV, depending on the display format being used.

Figure 22.5(b) illustrates the connector used for DVI. Though not specified for HDTV use, there is an option for conventional analog signals in accordance with the VESA specification. If it is supplied, an additional four connections are used at the bottom of the connector. The digital signals are transferred on the 24 pins shown.

Table 22.2 shows the connections in the DVI connector. The two italicized rows in the center are used for link 1 and are not used for HDTV applications. Display and clock data are transmitted as balanced signals. Each data channel is individually shielded, with the shield ground being shared between link 0 and link 1. The DDC clock and data are used to identify the type of display, controlled by the VESA data display channel specification. Negotiations carried over this link allow the host and the display to agree on operating parameters. The hot plug detect on pin 16 places a ground on the host pin 16 when a display is plugged in. It allows a user to plug in a display after the host is powered up. The ground generates an action by the host to determine the type and capabilities of the monitor that has just been connected.

Table 22.2 DVI connector pin functions


Figure 22.5(c) illustrates the method used for clock and data signal transmission between the host and the display. This is a balanced transmission technique using terminating resistors at the receiver and switched current sources at the transmitter. Note the similarities between this method of data transmission and balanced audio transmission as shown in Section 8.9.2. Balanced transmission is known for reliable transmission while generating low radiated emissions. Data actuates the switch in the transmitter, which alternately connects the data+ and data− lines to the current sink, a circuit that pulls 10 mA to ground at all times.

When the switch is open for the data+ line, it is closed for the data− line, so current is pulled through R1, creating a nominal half-volt drop across R1. Since no current is being pulled through R2, the data+ line is at the supply voltage, nominally 3.3 volts. When the switch changes states with data, the opposite occurs. Figure 22.5(d) illustrates the waveform on data+ and data−. The voltage varies between +3.3 V and +2.8 V, with the two data lines at opposite polarity at all times. Thus, the total nominal swing between the two data lines is 1 V peak-to-peak. A differential amplifier in the receiver converts the differential signal to a single-ended signal for use within the receiver.

Since all signals are balanced with respect to ground, and assuming the two wires carrying each signal in the cable are twisted, very little signal can radiate. There are rather wide tolerances on the voltages and currents shown, so you may measure a significantly different voltage swing in a real system. There is no absolute maximum specified length of the cable; indeed, it would depend on the clocking speed. For test purposes, the cable is to be 5 feet long. Nominal length to connect an STT to a monitor might be 6–12 feet; there is some thought that, at least at lower speeds, the cable could work to about 35 feet. In contrast to the IEEE 1394 interconnect, DVI is not intended for bus applications. It is not possible under the existing standard to transfer individual picture components and have the display put them together, so it is necessary for the multimedia processor to assemble the complete picture before sending it to the monitor.

Copy protection for DVI is described in Section 22.6.4.

1394a

This interface is specified for standard-definition digital displays and for recording. It is a serial bus designed, among other things, for interconnecting a cluster of entertainment devices, such as an STT, a display, a camera, a multichannel audio system, and whatever else the user wishes to connect. It is designed to transmit control information as well as program audio and video, and it can transmit various elements of the final picture separately, relying on the display to put them together. Whereas DVI is a point-to-point link intended to connect a video source to a video display, the IEEE 1394 link includes additional control features and is intended to be connected in bus form, interconnecting multiple devices. MPEG compressed video is transmitted on IEEE 1394a links, as opposed to the uncompressed video transmitted on DVI links.

IEEE 1394 is also known commercially as FireWire™ or i.LINK™. The Consumer Electronics Association (CEA) has specified compatibility standards for using IEEE 1394 in video interfaces, EIA-775-A, DTV 1394 Interface Specification.8 Equipment conforming to this standard bears the logo for DTV Link™. Products bearing this logo also conform to the profiles of EIA/CEA-849-A, Application Profiles for EIA-775-A Compliant DTVs. The receiver or display has the capability of decoding an MPEG-2 datastream, so the information passed over an IEEE 1394 link is not decompressed in the host. Furthermore, the host can pass a bit-mapped graphic overlay to the display, and the display will overlay it on the picture once it decompresses it. Thus, it is not necessary for the 1394-delivered picture to pass through the multimedia processor of Figure 22.1.9 On the other hand, the multimedia processor does need to provide the graphics information to the 1394 interface so that it can be passed to the display.

Figure 22.6 illustrates several aspects of the IEEE 1394 bus as used in entertainment applications. A number of devices, called nodes, are connected together. The data rate is up to 400 Mb/s today, with higher data rates being developed. (Actual wire data rates are 98.304 Mb/s, 196.608 Mb/s, and 393.216 Mb/s. Conventionally the data rates are rounded to the nearest hundred Mb/s.) Termination of each connection is essential at these rates, so each device must regenerate the signal to pass it on to the next device, even if it is not otherwise involved in a particular transaction. A maximum of 63 nodes may be connected together, as shown in Figure 22.6(a). If more nodes are desired, a bus bridge may be used to connect them. The maximum length of each cable segment is loosely specified to be 4.5 m (nearly 15 ft), though this is not necessarily considered to be a limit. Advanced versions of the IEEE 1394 specification allow longer-distance operation on twisted-pair and fiber-optic cable.


Figure 22.6 IEEE 1394 connection system. (a) Connecting multiple devices. (b) Incorrect connection. (c) Use of A and B twisted pair.

Figure 22.6(b) illustrates an illegal connection of nodes. Here the three devices are interconnected by a loop, so there is more than one path for a data signal to take. This connection will cause the IEEE 1394 interconnection to fail. As long as a loop connection is not made, the system can automatically determine what devices are connected and can determine the capabilities of all other devices in the network. Connections may be daisy-chained and they may be branched, but they may not be looped.

As illustrated in Figure 22.6(c), the IEEE 1394 interface consists of two circuits that are crossed in the cable. Each circuit consists of a twisted-pair controlled-impedance cable, giving rise to the nomenclature of TPA and TPB. In the cable, TPA on one end is connected to TPB on the other, and vice versa. Both pairs are used for data communication in both directions in a half-duplex communications protocol. (Half-duplex means that communication is carried in both directions but at different times.) Among other information, data is transmitted on TPB and received on TPA, and a strobe is transmitted on TPA and received on TPB. The transmission system uses nonreturn-to-zero (NRZ) data, in which each bit is represented by only one logic level. To aid in clocking, the strobe line makes a transition every time two consecutive bits are the same. This ensures one transition for every bit, which aids in clock recovery at the receiver.
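
A sketch of this data-strobe encoding (our own illustration of the rule just described):

    # Data-strobe encoding: the strobe toggles whenever two consecutive data
    # bits are equal, so exactly one of the two lines changes per bit period.
    def encode_data_strobe(bits):
        strobe, out = 0, []
        for i, b in enumerate(bits):
            if i > 0 and b == bits[i - 1]:
                strobe ^= 1              # no data transition, so strobe transitions
            out.append((b, strobe))
        return out

    pairs = encode_data_strobe([1, 0, 0, 1, 1, 1, 0])
    print(pairs)
    # Clock recovery: data XOR strobe toggles once per bit period.
    print([d ^ s for d, s in pairs])     # [1, 0, 1, 0, 1, 0, 1]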

Figure 22.7 illustrates the IEEE 1394 cable in more detail. The original 1394 specification, adopted in 1995, called out a six-conductor connector, which accommodated the two twisted pairs (TPA and TPB) as well as a power source (8–40 Vdc at 1.5 A maximum). When 1394a was adopted in 2000, a four-pin connector was introduced, with the power conductors dropped. This allowed a more compact connector, with each device responsible for its own power. Only the four-pin connector is specified in OpenCable.


Figure 22.7 IEEE 1394 interconnections. (a) Six-wire to four-wire cable. (b) Four-wire to four-wire cable.

Figure 22.7(a) illustrates a six-pin to four-pin cable. The shell is used for a shield connection, also called VG on the four-pin connector. This is the only ground connection between the two ends of the cable. Although balanced transmission is used, it is still necessary to maintain an approximate common ground between the two ends to prevent amplifier saturation and possibly static buildup problems. As shown in the pin function chart, TPA and TPA* (the opposite side of the balanced pair) on pins 5 and 6 of the six-pin connector are crossed to pins 1 and 2, the TPB and TPB* pins on the four-pin connector. Figure 22.7(b) illustrates the more common four-pin to four-pin connector. Note that each twisted pair is shielded, with the shields connected to the shell, VG, at each end of the cable. TPA/A* and TPB/B* are always crossed in the cable.

Figure 22.8 illustrates the electrical connections on the two ends of each twisted pair. TPA is illustrated on the left, and the mated TPB pair is on the right. Follow first the transmission from the strobe transmitter (on the left) to the strobe receiver (on the right). The differential impedance of the twisted pair is 110 ohms, or 55 ohms from either side to ground. The source impedance is set by resistors R1 and R2, the series combination of which is placed differentially across TPA and TPA*. The junction of the two is pulled to a bias voltage, TPBias’. The strobe transmitter is a pair of switched current sources similar to those illustrated in the transmitter of Figure 22.5(c) for DVI (the current source in IEEE 1394 is closer to 2 mA than to 10 mA). By using switched current sources, the source impedance is set by R1 and R2 and is constant regardless of the state of the strobe transmitter.

image

Figure 22.8 IEEE 1394 electrical circuits.

The signal is transmitted from TPA/TPA* on the left to TPB/TPB* on the right over the IEEE 1394 cable. Resistors R8 and R9 form the differential termination at the receive end. The junction of the two is pulled toward ground by R10 and C2. The strobe receiver recovers the logic-level strobe signal. It is a differential receiver, which puts out a logic 1 or a logic 0 depending on which of its two inputs is higher in voltage. The differential voltage amplitude on the signal and strobe lines is between 118 and 265 mV, depending on the speed of transmission.

Since the IEEE 1394 bus can operate at different speeds, it is necessary for two nodes to agree on the speed at which they will transmit and receive. This is done by adjusting the amount of current in the two current sinks IS on the TPB side. If a node can operate only at 100 Mb/s, then it does not have the current sinks, and the value of IS is 0. If the node wants to propose (or confirm) operation at 200 Mb/s, then it sets IS = 3.5 mA. For 400-Mb/s operation, IS = 10 mA. The current for these two current sinks comes from TPBias' on the TPA side through R1 and R2. This current changes the common mode voltage on the twisted pair, which is detected by the two speed receivers in the lower left corner of the diagram. The common mode voltage is nominally 1.665 V for 100-Mb/s transmission and 1.438 and 1.03 V, respectively, for 200- and 400-Mb/s transmission.
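A compact lookup capturing the speed-signaling values just given might look like this (an illustrative sketch only; the tolerance is our assumption, not a figure from the standard):

    # Speed negotiation on IEEE 1394: the proposed rate sets the sink
    # current IS, which in turn sets the nominal common-mode voltage
    # seen by the speed receivers (values from the text above).
    SPEED_SIGNALING = {
        100: {"is_ma": 0.0,  "vcm_v": 1.665},  # base rate: no current sinks
        200: {"is_ma": 3.5,  "vcm_v": 1.438},
        400: {"is_ma": 10.0, "vcm_v": 1.03},
    }

    def detect_speed(vcm, tolerance=0.1):
        # Return the proposed speed (Mb/s) whose nominal common-mode
        # voltage is closest to the measured value, within tolerance.
        best = min(SPEED_SIGNALING,
                   key=lambda s: abs(SPEED_SIGNALING[s]["vcm_v"] - vcm))
        if abs(SPEED_SIGNALING[best]["vcm_v"] - vcm) <= tolerance:
            return best
        return None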

The arbitration receivers detect the value of differential voltage on the pair during arbitration, the process by which nodes compete for the bus. Arbitration ensures that only one node at a time is transmitting. Upon completion of arbitration, the winning node is able to transmit a packet or initiate a short bus reset. Arbitration signaling is a protocol for the exchange of bidirectional, unclocked signals between nodes during arbitration. The differential voltage may be forced using a combination of the states of the strobe and data transmitters. Two arbitration receivers are needed to distinguish all the possible states of the connection.

The nodes must have some way to know when a new node is added; this is provided by the ICD current source in the upper left. It provides a certain amount of common mode current on the bus when the TPBias’ is disabled. The current creates a voltage drop across R10, which is detected by the bias detect comparator in the lower right. This notifies the existing node that a new node has been connected to it, initiating the discovery process.10, 11

Two classes of data communication between nodes are supported: asynchronous and isochronous. Asynchronous transmissions occur as needed, not synchronized to any event and not at any particular interval. They are always acknowledged. Isochronous communications occur at regular intervals. An example of asynchronous traffic might be a control message telling a node to adjust its volume or to change a channel. An example of isochronous traffic would be a packet of video or audio information being delivered from an STT to a monitor. These packets must be delivered promptly and at a regular interval if good performance is to be maintained. IEEE 1394 provides for isochronous cycles 8,000 times a second (one cycle every 125 µs). There is often little point in acknowledging an isochronous packet, because by the time a missed packet could be retransmitted, it is too late to use it. Isochronous cycles take priority over asynchronous traffic, but the protocol guarantees that a minimum portion of the bus time is reserved for asynchronous data transfers. An ongoing isochronous communication between nodes is referred to as a channel. Once a channel has been established, the requesting node is guaranteed to have the requested amount of bus time for that channel every isochronous cycle. Only one node may send data on a particular channel, but several nodes may receive the data. A single node may have multiple channels allocated, and additional channels may be added as long as isochronous capacity is available.12

Digital Audio—S/P DIF

Some people actually do try to make a word (spe-dif) out of this acronym. It stands for the Sony/Philips digital interface, a serial digital audio interface that can be used to carry either uncompressed stereo or compressed (that is, non-PCM data) Dolby Digital (AC-3) with up to 5.1 discrete channels of surround sound. (AC-3 audio compression is described in Chapter 3.) The basic specification covering the S/P DIF connection is IEC 60958. It is a 75-ohm coaxial interface using RCA connections. Only one-way communication is supported.

The transmitter supplies 0.5 ± 20% V p-p into 75 ohms. The receiver must operate with signal levels as low as 0.2 V p-p. The signal is self-clocking at a data rate that is 64 times the sampling rate (the rate at which audio data is sampled). As explained in Chapter 6, in order to avoid aliasing, the audio data must be sampled at more than two times the maximum frequency of the signal. If you want to carry audio frequencies up to 20 kHz, you have to sample at greater than 40 ks/s (thousand samples per second). A common sampling rate used in CD audio is 44.1 ks/s, whereas for certain other high-quality applications, such as DVD and digital broadcast, the audio is commonly sampled at 48 ks/s, resulting in a data rate of 3.072 Mb/s on the S/P DIF interface. A sampling rate of 32 ks/s is also used for some applications. The sampling rate may be varied by up to 12.5% for varispeed applications. A self-clocking coding scheme called biphase mark is used in which there is a transition in level at the beginning of each bit cell. If the bit transmitted is a 1, there is an additional transition in the middle of the cell. If the bit is a 0, there is no midcell transition.*
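The biphase mark rule is easy to express in code. The following sketch (illustrative Python; each bit cell is represented as two half-cell line levels) shows why the code is self-clocking: every cell boundary carries a transition regardless of the data:

    def biphase_mark_encode(bits, level=0):
        # Biphase mark: every cell starts with a transition; a 1 gets an
        # extra mid-cell transition, a 0 does not. Two output entries per
        # bit represent the two halves of the cell on the line.
        out = []
        for b in bits:
            level ^= 1            # mandatory transition at the cell boundary
            out.append(level)
            if b:
                level ^= 1        # extra mid-cell transition encodes a 1
            out.append(level)
        return out

A receiver can therefore recover the bit clock from the guaranteed boundary transitions and decode each bit by checking whether the two halves of the cell differ (a 1) or match (a 0).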

When transmitting uncompressed two-channel audio, each frame consists of two subframes, one for left channel data and one for right channel data. Up to 24 audio bits are supported per subframe, with each subframe a total of 32 bits long. (Conventional CD audio data is limited to 16-bit word length and is sampled at 44.1 ks/s.) The additional bits are used for synchronization, auxiliary data, audio sample validity, a user bit, audio channel status, and subframe parity. Two subframes make up one frame, which thus includes one data sample of each of the left and right channels.

A two-channel Dolby Digital (AC-3) decoder will usually be available in the host in order to permit the recovery of digitally tiered audio programming. In some cases, programmers may offer discrete multichannel programming (that is, 5.1), and the host audio decoder must be capable of presenting this program either in stereo via the line level outputs or as a monophonic down-mix via the RF remodulator while the host is tuned to digital programming. This task is accomplished by down-mixing the multichannel audio program using a set of rules carried within the Dolby Digital bitstream using metadata. However, in many if not all cases, the host may default to placing Dolby Digital (AC-3) data on the digital audio interface (the data format is specified in IEC 61937) for use in home theater systems. The receiving device will then convert the Dolby Digital (AC-3) encoded signal to discrete 5.1 channel audio. IEC 61937 defines the format for carrying nonlinear PCM-encoded audio bitstreams over the IEC 60958 interface. To summarize this format, the Dolby Digital (AC-3) data is formed into a sequence of data bursts, where each data burst is made up of a 64-bit burst preamble (Pa, Pb, Pc, and Pd) followed by the burst payload itself (that is, the encoded audio data). The burst preamble consists of a sync word (Pa and Pb), information about the burst payload and a bitstream number (Pc), and a burst length code (Pd). The complete data burst is then broken up into 16-bit chunks and placed into the audio payload area of the IEC 60958 subframe, in time slots 12 to 27. Both subframes are used, so each frame carries 32 bits of burst data. This allows IEC 60958 to convey nonlinear PCM-encoded bitstreams such as Dolby Digital (AC-3) to home theater systems. For more complete information, refer to IEC 61937.13
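A rough sketch of how a burst is chunked for the interface follows (illustrative Python; the actual preamble values and framing details are defined in IEC 61937, so they are passed in here as parameters rather than assumed):

    def pack_data_burst(pa, pb, pc, pd, payload):
        # IEC 61937: a data burst is a 64-bit preamble (four 16-bit words
        # Pa, Pb, Pc, Pd) followed by the encoded audio payload, carried
        # 16 bits at a time in the audio-sample area (time slots 12-27)
        # of successive IEC 60958 subframes.
        words = [pa, pb, pc, pd]
        for i in range(0, len(payload), 2):
            chunk = payload[i:i + 2].ljust(2, b"\x00")  # pad a trailing odd byte
            words.append(int.from_bytes(chunk, "big"))
        return words  # one 16-bit word per subframe slot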

22.5.3 Basic Concepts of Copy Protection

Before we discuss specific copy protection measures, we need to establish several basic concepts that different copy protection methods have in common: encryption, public-private key pairs, and certificates. Hang in through this discussion, because after we cover this material, we’ll show how these concepts are used in real host applications.

Encryption

Encryption (called scrambling in the context of cable TV program denial) is the process of rendering a packet of data unintelligible but capable of being returned to its intelligible, or unencrypted, original form by some process. Encryption uses a key or key pair known by the sender and the receiver. The keys may be identical (a symmetrical key system) or different (asymmetrical keys). A key in this context is a binary sequence used to encrypt another binary sequence. That is, I take so-called plaintext, which could, for example, be a packet of data representing a portion of a picture I am sending from a host to a display and/or a recording device. I need to encrypt the packet so that it cannot be read by another device, which might usurp the information for a purpose not intended by the owner of the content. In essence, I combine the plaintext with a key: some piece of data that renders the plaintext unable to be read (that is, encrypted). The intended receiver, maybe a video display, knows how to decrypt the packet, returning it to the original plaintext that can be displayed. That is, the intended receiver needs a decryption key, a piece of information used to recover the original information.

Figure 22.9 illustrates a basic encryption system. Suppose Bob wants to send a message to Alice. (Let “Bob” be the host and “Alice” be the display. For some unknown reason, the people who study encryption, cryptographers, traditionally refer to communication between Bob and Alice. We don’t know what happened to Carol and Ted.) Bob has a piece of plaintext, maybe a packet of video display data, that he is ready to send to Alice, the display. But he must send it over an insecure channel, the cable between the two. This cable is insecure because it is possible for the subscriber to connect it to some pirate device, maybe to copy the program to a DVD that can be duplicated and given to people who have not paid for the program. Or the subscriber might do something else with the program that is not authorized by the content owner.

image

Figure 22.9 Basic data encryption.

In order to allow communication over this insecure channel, the packet is encrypted before being sent to the receiver. A simple form of symmetric encryption is the use of an exclusive OR logic gate. The exclusive OR, XOR, operation is used to add two binary numbers; here it is used to encrypt data. Bit by bit, the data in the packet is presented to the XOR gate. An XOR gate puts out the input bit if the encryption key is a logic 0, and it puts out the opposite bit if the encryption key is 1. Table 22.3 shows the logical operation. When the signal reaches the receiver (Alice), the same encryption bit sequence is used to recover the original information. In order for this to happen, the sender and the receiver have to share a secret, the encryption key. Exchanging, or agreeing on, the key is a major issue in cryptography. (Note that in some of the protocols discussed later, the sender and the receiver don’t have the same key, but rather they have related keys.) Somehow Bob and Alice must agree on a secret key or a pair of keys, but the only channel they have for communicating is an insecure one.

Table 22.3 XOR logic

Plaintext Key Out
0 0 0
0 1 1
1 0 1
1 1 0
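A short sketch of the XOR cipher of Table 22.3, applied byte by byte rather than bit by bit (illustrative Python; a real system derives the keystream far more carefully, as discussed later):

    def xor_cipher(data, keystream):
        # XOR is its own inverse: applying the same keystream twice
        # returns the original plaintext (see Table 22.3).
        return bytes(d ^ k for d, k in zip(data, keystream))

    plain = b"video packet"
    key = bytes([0x5A] * len(plain))   # toy repeating key; not secure
    cipher = xor_cipher(plain, key)
    assert xor_cipher(cipher, key) == plain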

Public-Private Keys

Exclusive OR encryption as shown in Table 22.3 is representative of simple symmetric key systems. There are also asymmetric key systems, some of which rely on an exponentiation operation rather than an exclusive OR operation. They tend to be much more computationally complex, but they yield some nice advantages, such as simplifying the key exchange problem.14 Asymmetric keys are used in public-private key pairs.

Figure 22.10 illustrates a basic public-private key system. Suppose Bob needs to send that video packet to Alice. First Alice selects (or in some cases has stored at manufacture) a random number that only she knows. This random number is her private key. From that she can compute a public key (step 2) by applying certain prescribed mathematical operations to that private key. One method of computing a public key from a private key is given later in the description of Diffie-Hellman key exchange. The operations are such that the calculation she does is straightforward, but the inverse calculation is extremely difficult. We show a real example later in the chapter. Alice gives Bob her public key, which Bob uses to encrypt the message. Alice can decrypt it using her private key, which was not sent over the channel. It is not proved here that you can derive a public-private key pair that works this way, but since the mathematics is extremely complex, we must ask you to accept this as fact and refer you to texts that expand on the subject.15, 16

image

Figure 22.10 Public-private key exchange.

An eavesdropper who learns Alice’s public key still cannot decrypt the message because the keys being used are asymmetrical: Alice’s public key encrypted the message, but it cannot decrypt it. Only Alice knows her private key, the only one that can decrypt the message. So the message is relatively safe being passed on the same insecure data channel used to send the encrypted data.

Either the private key can be used for encryption and the public key for decryption or the other way around: the public key can be used for encryption and the private for decryption. The same key cannot be used for both encryption and decryption.

Digital Signatures and Certificates

A problem remains. How does Bob know for sure that it is Alice who is sending her public key and not an interloper? For example, how does Bob, the host, know he is not sending video to a pirate box that invented a public-private key pair? After all, the method of calculating the public key from the private key is public knowledge. Maybe “Alice” is a pirate box that will decrypt the picture and record a DVD. This possibility is avoided by the use of digital certificates. There are many, but a common one is the X.509 certificate, promulgated by the International Telecommunications Union (ITU).

The first important concept in certification is not technical: There must be a trusted party to issue certificates. That party, called a certificate authority (CA), is a third party that both parties to the data exchange have decided to trust. There are companies set up to be this CA, and there are organizations that act as their own CA for specific purposes. If Bob and Alice wish to communicate, their manufacturers must get them certificates from a CA. Bob cannot send pictures to Alice unless he knows for sure that “she” is a legitimate display that will not usurp the video for a nefarious purpose. Alice cannot get a certificate until the CA has determined, using whatever steps it deems appropriate, that Alice is legitimate and will not violate the usage conditions established by the content provider.

So Alice’s manufacturer must go to the appropriate CA and receive a digital certificate before he can build Alice. The digital certificate is then stored in Alice upon manufacture. Each and every copy of Alice, that is, each and every display that is built, will have its own certificate individually prepared by the CA. Alice’s manufacturer sends data for the certificate to the CA, which encrypts portions with its private key and returns the finished certificate to Alice’s manufacturer. Fortunately, once the credibility of Alice’s manufacturer has been manually established, the certificates may be constructed automatically.

Figure 22.11 shows the current version, 3, of the X.509 certificate, the information stored in Alice, provided by the CA. Even Alice’s manufacturer cannot construct the certificate, because it involves the CA’s private key. The certificate must be individually prepared by the CA based on information supplied by Alice’s manufacturer. The certificate is a public document. It does not include enough information for anyone to decrypt data. Rather, it is used only for the purpose of identifying the intended receiver of information and also to convey Alice’s public key.

image

Figure 22.11 X.509 certificate.

The version section of the certificate identifies the format of the certificate (version 3 is current as this is written). The serial number is a number unique within the CA by which it can identify the certificate. The algorithm identifier conveys the encryption algorithm used by the CA (often the RSA algorithm, named for its inventors, Rivest, Shamir, and Adleman, is used). There are some public parameters used in specifying the public key from the private key, and these are included with the algorithm identifier. (In early versions of the standard, this field is called the signature, but this is a misnomer.)

The next section of the certificate is the name of the CA. The period of validity specifies the beginning and ending dates that the certificate is good. Outside of these dates, the certificate is invalid and may not be used. Subject is the name of the user. The next section bears the subject’s public key. It names the algorithm being used (often RSA) and any parameters associated with the public key. Finally, this section provides the public key that will be used by the sender to encrypt data.

The issuer unique ID, the subject unique ID, and the extension were recently added and may or may not be used. These are fields that may be used by the issuer or user as desired. The last section, the signature, is very important. The signature is the subject’s (Alice in our example) public key plus other certificate data, encrypted with the CA’s private key. This is the reason that the CA must provide individual certificates for each and every device made by a manufacturer. No one, and we mean no one, except the CA must know the CA’s private key. If this private key is ever compromised, then all certificates issued by the CA are rendered invalid.17

During an initial discovery process, Alice sends her certificate to Bob. Bob must know the CA’s public key. He can know it by virtue of its having been programmed into him during his manufacture or by obtaining it later. Bob decrypts Alice’s public key and other information from the signature, using the CA’s public key. He compares the decrypted public key and other information with the same data sent as plaintext elsewhere in the certificate. If the two sets of information match, then he must be talking to a legitimate Alice, because a pirate would not be able to get a certificate from the CA. Bob now has Alice’s public key, so he can send data to her, encrypted with her public key.

The certificate is public knowledge. Can an interloper steal Alice’s certificate? Yes. It wouldn’t do any good, because it does not contain her private key, but the interloper could steal it. As an additional verification that there is a legitimate “Alice” out there, Bob can issue a challenge. He sends Alice a random number called a nonce, which she encrypts with her private key and sends back to Bob. Bob decrypts it with Alice’s public key, which he received as part of her certificate. If the decrypted number matches the number Bob originally sent, he knows that the real Alice is out there and not someone who happens to have her certificate.
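The challenge-response can be sketched with any asymmetric signature scheme. Here is one possible rendering using RSA signatures from the Python cryptography package (an illustration of the idea, not the exact protocol any particular standard mandates):

    import os
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding

    alice_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    alice_public = alice_private.public_key()  # Bob gets this from her certificate

    nonce = os.urandom(16)                     # Bob's random challenge
    response = alice_private.sign(nonce, padding.PKCS1v15(), hashes.SHA256())

    # Bob verifies; verify() raises InvalidSignature if "Alice" is an impostor
    alice_public.verify(response, nonce, padding.PKCS1v15(), hashes.SHA256())

Only the holder of Alice’s private key can produce a response that verifies against the public key in her certificate, which is exactly what the nonce challenge tests.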

Can Bob be absolutely sure he is talking only to Alice and not also to an eavesdropper? No, but it doesn’t matter. There could be a so-called man in the middle. For example, a man-in-the-middle attack could happen if the subscriber has connected a legitimate host (Bob) to an illegitimate device designed to capture the video on its way to Alice. The illegitimate device might then be connected to the real Alice, the legitimate display. The man in the middle passes data between Bob and Alice, but he eavesdrops, or hears all that is going on. When Alice sends her certificate to Bob, the man in the middle passes it through to Bob, who recognizes it as Alice’s legitimate certificate. But it will do the man in the middle no good to have the certificate. He still doesn’t have Alice’s private key, which he would need in order to decrypt what Bob sends to Alice. Because Bob got Alice’s public key in the certificate, he knows that he will be encrypting data for a legitimate display, regardless of who else is out there.

So Bob knows that the real Alice is out there because no one except the real Alice would have her certificate and be able to pass the challenge he sent. He knows the certificate is real because he was able to match information contained therein using the CA’s public key to decrypt part of the certificate and to match it to the same information sent unencrypted. Only the real CA could have signed the certificate. And both sides have placed their trust in the CA. Bob can safely send data to Alice, knowing that the real Alice is receiving it. If anyone else is out there, the data will be useless to him.

In some instances the CA is designated by the originator of the standard utilizing the certificate. For example, for DVI, HDCP is the encryption used. It specifies that Digital Content Protection, LLC, an activity set up by the Digital Display Working Group that defined the specification, is the CA.18 Anyone desiring to make a device that either sends video to or receives video from a device conforming to the DVI specification must apply to them for licensing. For the IEEE 1394 interface, licensing of the DTCP copy protection algorithm and certification is provided by the Digital Transmission Licensing Administrator, DTLA.19

In the present application, a certain CA is usually specified. However, in the more general case it is possible to have two devices that were certified by different CAs. Certification may still take place through the use of a chain of certificates.

Certificate Revocation

From time to time it is possible that some device possessing a valid certificate may need to have its certificate revoked. This might happen if the CA’s private key becomes known or, perhaps more likely, if it is discovered that the device has been modified into a pirate device. Many systems have a mechanism for informing devices that they should not honor a certain certificate. One method of disseminating the information is that, as soon as the breach of security is recognized, future content will have embedded in it the identity of the certificate that must be revoked. As that content is disseminated to different devices (hosts, legitimate digital VCRs, etc.), each remembers the revoked certificate and passes that information to devices to which it is, or becomes, attached. In this way, sooner or later most devices that may connect to the device with the revoked certificate will learn of the revocation.20 In some cases, discussed later, the host must check with a database to confirm the validity of a certificate.

Key Length, Security, and Computational Complexity

One attack on an encryption key is called an exhaustive search. Encryption keys are simply long strings of 1s and 0s. In an exhaustive search, a would-be pirate simply tries all possible key combinations (sets of 1s and 0s) until the output makes sense. Often this can be automated, because incorrect keys yield results that can be distinguished, by computer, from logical results. The longer the key, the more difficult is an exhaustive search. If a key has a length (number of binary characters) of n, then there are 2^n combinations that must be tried. A short key might have only 24 bits, so there are a maximum of 2^24 = 16.78 million combinations that might work. If a computer can try 1 million keys per second, then the maximum time required to crack the encryption is 16.78 million divided by 1 million, or about 17 seconds. On the other hand, a 56-bit key could take 2300 years to crack at the same rate. Keys are used that have up to 256 bits, which could take 3.7 × 10^63 years to break at the stated rate of trying keys. Average times, though, would only be half of these numbers.
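The arithmetic behind these figures is straightforward, as this snippet shows (assuming the stated rate of one million key trials per second):

    RATE = 1e6                            # keys tried per second (as stated above)
    SECONDS_PER_YEAR = 365.25 * 24 * 3600

    for n in (24, 56, 256):
        worst_s = 2**n / RATE             # worst case: every key must be tried
        print(f"{n}-bit key: {worst_s:.3g} s "
              f"= {worst_s / SECONDS_PER_YEAR:.3g} years (average: half that)")

Running this reproduces the numbers in the text: about 17 seconds for 24 bits, roughly 2300 years for 56 bits, and on the order of 3.7 × 10^63 years for 256 bits.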

Sometimes churning keys are used to improve security. This technique simply means that the key is changed fairly often. Each time a key is changed, the pirate would have to begin again to break it. So a pirate may capture a scrambled program, but to unscramble it he would have to break a series of keys that might be changed every few seconds to every few minutes.

Symmetric Algorithms

Many asymmetric algorithms, such as the RSA algorithm used in a number of systems, involve taking a block of plaintext (treated as a binary number) to the power of the key, using some modulus. The computational power required as the key length increases becomes prohibitive for encrypting and decrypting video streams. Computational power may be traded off for time; but when you are transferring digital video, whether compressed or not, you don’t have a lot of time to process the signal. Thus, many times a good asymmetric algorithm, such as RSA, will be used for certificate verification and/or key exchange, which are only done periodically. Then a symmetric algorithm will be used for actual data encryption. In the DTLA nomenclature, this symmetric key is called the content channel key. The key may be long, but the way it is used reduces the computational complexity to something manageable at the requisite speed.

Often the RSA algorithm is used for certificate validation. Then a symmetric key is agreed upon for the actual information scrambling and descrambling. The symmetric key is used in some manner to generate the actual encryption string, which in some encryption techniques is exclusive ORed with the plaintext to create the encrypted data, or ciphertext.

Figure 22.12 illustrates a simple four-bit shift register used to generate ciphertext.21 Real shift registers are a lot longer, and multiple shift registers may be used. The shift register is clocked at the same rate as the plaintext data is clocked. Each time the shift register is clocked, the bits in each position shift one place to the right. (The clock is a signal that synchronizes operations dealing with delivery of the video.) The shift register is a series of flip-flops. On a clock cycle, the bit (1 or 0) in b4 is shifted to b3, the bit in b3 is shifted to b2, and so on. The bit in b1 is supplied to XOR2, which exclusive ORs it with the plaintext bit presented on the same clock cycle.

image

Figure 22.12 Linear feedback shift register used to generate ciphertext.

This type of shift register is called a linear feedback shift register (LFSR). The linear feedback is supplied by XOR1. It XORs the output of two bits, in this case b4 and b1, feeding the result back as the input to the shift register, which is loaded into b4 on the following clock cycle. The bit sequence (a pseudorandom stream) obtained from a properly designed n-bit LFSR is 2^n − 1 states (clock cycles) long before it repeats. It can be initialized with the secret symmetric key, which loads the starting state into the LFSR. An eavesdropper will not be able to duplicate the LFSR output without knowing the key used to initialize it.
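The four-bit LFSR of Figure 22.12 can be sketched directly (illustrative Python; bit 0 of the integer state plays the role of b1 and bit 3 the role of b4):

    def lfsr_keystream(key):
        # Figure 22.12: feedback = b4 XOR b1, loaded into b4; b1 is the
        # keystream output. A 4-bit register seeded with a nonzero key
        # cycles through 2^4 - 1 = 15 states before repeating.
        state = key & 0xF
        while True:
            out = state & 1                   # b1, the keystream bit
            fb = ((state >> 3) ^ state) & 1   # b4 XOR b1 (XOR1 in the figure)
            state = (state >> 1) | (fb << 3)  # shift right, feed back into b4
            yield out

    gen = lfsr_keystream(0b1001)  # the secret key initializes the register
    cipher_bits = [p ^ next(gen) for p in [1, 0, 1, 1, 0]]  # XOR2 in the figure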

Many other methods of generating the encrypting code are known. A different LFSR may be used for each bit of a parallel encryption system. The LFSRs may be initialized with the same key or different portions of the same key. Nonlinear generators may be used, in which AND functions combine the outputs of various stages to form the output. The outputs of several LFSRs may be combined to form the encryption signal.

Diffie-Hellman Key Exchange Protocol

Though by no means the only useful key exchange protocol, the Diffie-Hellman (D-H) protocol is used in quite a few systems. D-H permits both parties to an exchange (Bob and Alice) to supply part of the secret key. It is done in such a way that the entire key is not sent across the unsecured channel. Thus, a snooper is not going to gain the information needed to steal the secret key. The protocol proceeds as follows.

1. Alice chooses a random large integer x and sends Bob X = g^x mod n, where g and n are D-H parameters that must be passed between the two parties. For example, suppose D-H is used to agree on a key in an X.509 certificate. In the “subject’s public key” portion of Figure 22.11, g and n would be passed as parameters, and D-H would be identified as the algorithm. X is the public key. The parameter n must be large and must meet certain criteria, but g may be a single-digit number.

These parameters are not secret, and an eavesdropper will not learn anything useful if he discovers them. In some closed systems, g and n may be preprogrammed during manufacturing. The parameter that must be kept secret is x, and, as we shall see, Alice never divulges it in any form. The mod-n operation returns the remainder after division by n, so the result can never exceed n − 1. To calculate X = g^x mod n, take the remainder from the division g^x/n.

2. Bob chooses a random large integer y and sends Alice Y = g^y mod n.

3. Alice computes k = Y^x mod n. Recall that Y was sent to her by Bob and that x is a secret parameter that only Alice knows.

4. Bob computes k′ = X^y mod n. He got X from Alice and knows y. Now, from the foregoing (and not writing the modulus), k′ = X^y = (g^x)^y = g^xy. Also, k = Y^x = (g^y)^x = g^xy, so k and k′ are equal. Thus, Alice and Bob have the same secret key, and at no time did all the information required to know the secret key pass over the insecure link.

This shared value k is the symmetrical key applied to the LFSR in the foregoing. It is possible to derive a D-H key among three or more parties by expanding on the preceding process.22

In order for an eavesdropper to learn the secret key, he would have to derive x and y from X and Y, the two public keys that are passed over the link. Computing x and y is not feasible. And without them, knowing k is not possible. One might attempt an exhaustive search, but it would take many lifetimes if x and y are large enough.
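The whole exchange fits in a few lines (illustrative Python; the modulus used here is a small 32-bit prime chosen only so the example runs, whereas a real n is vastly larger):

    import secrets

    n = 4294967291          # toy prime modulus; real D-H uses a far larger n
    g = 5                   # public generator parameter

    x = secrets.randbelow(n - 3) + 2   # Alice's secret
    y = secrets.randbelow(n - 3) + 2   # Bob's secret

    X = pow(g, x, n)        # Alice sends X = g^x mod n
    Y = pow(g, y, n)        # Bob sends Y = g^y mod n

    k_alice = pow(Y, x, n)  # = g^(xy) mod n
    k_bob = pow(X, y, n)    # = g^(xy) mod n
    assert k_alice == k_bob  # both sides now hold the same secret key

Note that only X and Y cross the insecure channel; x and y never leave their owners, which is the whole point of the protocol.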

DES Encryption Algorithm

There is yet one more encryption algorithm we need to discuss, because it is used in handling digital video. This is the data encryption standard, or DES (usually pronounced “dez”). DES is based on work done at IBM in the early 1970s. The National Institute of Standards and Technology (NIST, formerly NBS, the National Bureau of Standards) established it as a national standard in 1976. The American National Standards Institute approved DES as a private-sector standard in 1981 (ANSI X3.92). It has been reaffirmed as a U.S. standard several times since then, though it is now no longer approved for federal information applications, having been replaced by AES in 2001. Just because DES is no longer the approved algorithm does not mean that it is broken, just that it has been replaced for federal information applications. Special-purpose DES chips are available.

DES is a 56-bit-key symmetrical algorithm, meaning the same key is used for encryption and decryption. Though complex, it uses only simple operations, such as exclusive OR, permutation (changing the order of bits), and shifting of bits. There are four modes of operation: electronic code book (ECB), cipher block chaining (CBC), output feedback (OFB), and cipher feedback (CFB). We shall describe the complete algorithm, though only a portion of it is used in some scrambling systems.

Figure 22.13 illustrates the operations performed on the data. DES is a block cipher; it operates on a block of 64 bits of data, sends that data on, and starts working on a new block of 64 bits. If a block has fewer than 64 bits, additional bits are added (bit stuffing) to bring the total to 64 bits. An object of DES is to make sure that every bit in the block of ciphertext depends on every bit of the input and every bit of the key. The algorithm is divided into 16 nearly identical rounds, each involving all the bits of the data. Only one key is used, but, as we shall see, the bits in the key are changed for each round.

image

Figure 22.13 Operation of DES.

As shown at the top of Figure 22.13, DES starts by performing an initial permutation on the block of plaintext. This simply means that the order of the bits is changed in a prescribed manner. The 58th bit becomes the first bit, the 50th bit becomes the second bit, the 42nd bit becomes the third bit, and so on (the referenced text for this section shows all of the permutations involved). The initial permutation has nothing to do with the security of the algorithm. As best anyone can tell today, its primary purpose was to make it easier to load plaintext into a DES chip in byte-sized pieces. Next, the data is divided into a 32-bit left half and a 32-bit right half. Each half will be operated on separately in each round; at the conclusion of each round, the left and right halves are swapped. The left half is referred to as Ln and the right half as Rn, where n is the round number.

For each round, the right half is operated on by a function f, which we shall define shortly. Function f involves expanding the data from 32 to 48 bits by duplicating certain bits, XORing the expanded data with 48 bits derived from the key, and performing certain other operations. The result is then XORed with the left half, and that result is stored as the right half for the next round. The unmodified right half is stored as the left half for the next round.

After 16 rounds, an inverse of the initial permutation is performed; the resulting 64-bit block is the ciphertext that is transmitted. A triple DES is also defined, in which this algorithm is performed three times using two or three different keys.
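The round structure just described is a classic Feistel network, sketched below in skeleton form (illustrative Python: f here stands in for DES’s expansion, S-box, and P-box operations, and the initial and inverse permutations are omitted). The same machinery decrypts when the round keys are applied in reverse order:

    def feistel_encrypt(block, round_keys, f):
        # split the 64-bit block into 32-bit halves
        left, right = block >> 32, block & 0xFFFFFFFF
        for k in round_keys:                  # 16 rounds in DES
            left, right = right, left ^ f(right, k)
        return (left << 32) | right

    def feistel_decrypt(block, round_keys, f):
        left, right = block >> 32, block & 0xFFFFFFFF
        for k in reversed(round_keys):        # same steps, keys reversed
            left, right = right ^ f(left, k), left
        return (left << 32) | right

    f = lambda half, k: (half * 0x9E3779B9 ^ k) & 0xFFFFFFFF  # toy round function
    keys = list(range(1, 17))
    c = feistel_encrypt(0x0123456789ABCDEF, keys, f)
    assert feistel_decrypt(c, keys, f) == 0x0123456789ABCDEF

A pleasant property of the Feistel construction, visible in the sketch, is that f never has to be inverted: decryption reuses the same round function with the key schedule reversed.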

Figure 22.14 shows what goes on during each round of DES. On the right we show what happens to the key during each round, and on the left we show what happens to the data during that round. The input key is the key from the previous round or, for the first round, is the agreed-upon key. The actual key is 56 bits long, but it is often expressed as a 64-bit number. The difference is the inclusion of 8 parity bits, each derived from 7 bits of the key. The key is divided into halves, and each half is shifted by one or two bits depending on the round. The shifting is done circularly, meaning that bits are shifted in a shift register, with the bit shifted out of one end being shifted into the other end of the shift register.

image

Figure 22.14 Operations performed in one round of DES.

Next a compression permutation is performed. Permutation means that the order of bits is changed according to a prescribed ordering. Compression means that certain bits are removed from the key to bring the number of bits remaining down to 48. Again, the bits removed are carefully prescribed. The 48 bits taken from this operation make up the key used in the round. The shifted halves of the key before the compression permutation become the 56-bit key for the next round.

The data encryption performed in this round is shown on the left. The right half block of data (Ri−1) becomes the left half for the following round (Li). Also, the right half is subjected to a set of operations that define the function f. What goes on in function f is shown inside the dashed lines. The first operation in f is an expansion permutation, in which certain bits are repeated to bring the 32 bits up to 48 bits (the same length as the key for the round), and the order of bits is changed. The output of the expansion permutation is XORed with the key.

The next operation is the so-called S-box substitution. A substitution is an early form of a cipher. A simple substitution might be to build a table of substitute letters, as shown in Table 22.4. In this simple example, A becomes Z, B becomes P, and so on through the alphabet. A person intercepting such an encrypted message would not be able to make sense out of it unless he had the decoding table. This is the principle behind the S-box substitution in DES. Recall that the input to the S-box substitution is 48 bits, but only 32 of those bits carry the data. The other bits are discarded simultaneously with the substitution. For each six input bits, four output bits are produced according to a prescribed set of eight substitution tables following the same principle as Table 22.4 (the inputs and outputs are sets of bits rather than letters).

Table 22.4 Substitution cipher

In Out
A Z
B P
C J
D A

The output of the S-box substitution is the input to the P-box permutation. This is a simple permutation of the order of the bits, according to yet another prescribed ordering. The output of the P-box permutation is the output of the function f shown in Figure 22.13. It is XORed with the left half bits and placed in the right half block for the next round of the permutation.

The set of actions depicted in Figure 22.14 is repeated through the 16th round, after which the inverse of the initial permutation is performed, as shown in Figure 22.13. The output of the inverse initial permutation is the 64 bits of ciphertext, which are ready to be transmitted, and a new block of 64 bits enters the encryption as the process starts over.

In order to decrypt the data, the same steps are repeated, with the only difference being that the same key is used but the order of processing the key is reversed. Note that we have not proved DES works, nor have we proved the security of DES. For those wishing to study the subject and having several years of free time on their hands, many books have been written on the subject.23

22.5.4 Digital Video Copy Protection

Having established the basics of encryption, we can now describe the several copy protection methods used in host devices. Both DVI and the IEEE 1394 interface have associated copy protection mechanisms, and a third protects content coming out of a POD module. (Yet another encryption is applied to the scrambled data transmitted over the cable plant, to be decrypted on the POD module.)

DVI and HDCP

This interface is intended to operate between a host and a display only. Recording the signal is not an appropriate use of the interface. The interface, as described earlier, transfers decoded red, green, and blue pixel information, each serially on three balanced lines. HDCP, the copy protection mechanism, is governed by the specification high-bandwidth digital copy protection system, currently revision 1.0, 17 February 2000.24 Implementation of the specification requires a license from Digital Content Protection, LLC. The system allows for use of repeaters, that is, devices that take in the signal and also put it out. A repeater might be a digital video distribution amplifier, or it might be a monitor that supplies an output to another monitor. It is possible to have seven levels of repeaters and as many as 128 devices in total. When repeaters are used, the source (host in our terminology) is made aware of all downstream devices before transmissions commence.

There are three elements of the content protection system. The first is authentication, in which the host verifies that all devices connected are legitimate, licensed devices. Shared secrets are established during authentication and form the encryption key for program transmission, the second element. Finally, renewability allows the host to identify a compromised device and to desist from sending data to it.

Authentication, the first element of the content protection system, also proceeds in three steps. It is done slightly differently from what we showed earlier. Each licensed device is given an array of 40 secret keys, each 56 bits long. These come from Digital Content Protection, LLC. The device also receives a key selection vector, KSV. All are programmed when the device is manufactured. Figure 22.15 shows the secret keys and the KSV of the opposite party. The KSV is functionally equivalent to a public key, and the 40 secret keys are just that: secret keys. Authentication is initiated by the host sending an initiation message to the display. This message includes the host’s KSV and a 64-bit pseudorandom number, An. The receiver responds with a message including its KSV and a repeater bit if the device is a repeater. The host checks the KSV to make sure it has not been revoked and that it contains exactly 20 1s and 20 0s.

image

Figure 22.15 HDCP keys.

A shared secret is derived as follows, in the first step of authentication. From its secret keys, the host selects the ones corresponding to 1s in the receiver’s KSV. All selected secret keys are added together, modulo 2^56, producing a 56-bit key. Call this result km in the host and km′ in the receiver. The secret keys and the KSVs are chosen such that the host and the receiver will compute km = km′, so each end will have the same secret key. If either device has an invalid set of secret keys or a bad KSV, then the two keys will not be the same.
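The summation step can be sketched as follows (illustrative Python; the real key sets come from Digital Content Protection, LLC and are constructed so that the two devices’ sums agree):

    def compute_km(secret_keys, peer_ksv):
        # secret_keys: this device's 40 secret 56-bit keys
        # peer_ksv: the other device's 40-bit key selection vector
        MASK56 = (1 << 56) - 1
        km = 0
        for i in range(40):
            if (peer_ksv >> i) & 1:                   # a 1 bit selects key i
                km = (km + secret_keys[i]) & MASK56   # addition modulo 2^56
        return km

The requirement that every KSV contain exactly 20 1s guarantees that each side sums exactly 20 of its secret keys.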

At this point, both devices have several pieces of information in common. They know km, they know the random number generated at the host, An, and they know the repeater bit. These values are used to generate three values, Ks, M0, and R0. These are generated by a block cipher algorithm defined in the specification. Ks is a 56-bit session key. M0 is a 64-bit secret value used in the second part of the authentication protocol. R0 is a 16-bit response value that the video receiver returns to the host as an indication of the success of the authentication exchange. It must be returned within 100 ms. If the R0 returned from the receiver matches that generated by the host, then authentication was successful.

In the second step in the authentication process, the host (transmitter) and any repeaters gather information regarding all the devices that are attached. The host confirms that none of the devices attached are on the revocation list. This step is omitted if there are no repeaters (for instance, there is only one host sending pictures to one display). The step can take up to about five seconds to complete. Finally, the third step occurs in the vertical blanking interval (VBI) about every two seconds while pictures are being transferred. New cipher initialization values are calculated to establish a new 56-bit key for encrypting video content. The integrity of the link is also confirmed.

The second element of the HDCP protocol is encryption of the red, green, and blue data using a pseudorandom sequence generated from secrets exchanged in the authentication process. The sequence is reset frequently and is exclusive ORed with the data, as shown earlier.

The third and final element of the HDCP protocol is renewability, the process of revocation of devices that have been compromised. The devices are identified by their 40-bit KSV, as described in Figure 22.15. System renewability messages (SRMs) containing one or more KSV revocation lists, called vector revocation lists (VRLs), are distributed with programs as described above. They are checked by the transmitter (host) when they are available. The SRM carries a digital signature, as described earlier, so that the host can determine that the data did indeed come from Digital Content Protection, LLC.

IEEE 1394 and DTCP

The digital transmission content protection (DTCP) standard for secure transmission over IEEE 1394 physical links was developed by five companies — Intel, Matsushita (Panasonic), Toshiba, Sony, and Hitachi — and is thus sometimes called the “5C” content protection scheme. 1394 is intended as a broader-use standard than is DVI, being intended not only for transmission to a display or displays, but also for transfer to recording devices, computers, and whatever other devices you wish. DVI transfers uncompressed (hence very wideband) red, green, and blue data for immediate display; IEEE 1394 transmits compressed (therefore lower bandwidth) data.

DTCP, the associated verification and encryption standard, includes the same elements as does HDCP, with one more. It includes authentication and key exchange (AKE), transmission, referred to as content encryption, and renewability. It also includes a new element, copy control information.

DTCP includes instructions to the receiving device as to how it should treat the data. This is called copy control information (CCI), which is embedded in the content. The actual data in the CCI is known as the encryption mode indicator (EMI), and it may take on any of four states: copy free (unlimited copying permitted), copy never (the opposite extreme), copy once (to allow, for example, time shifting a TV program), and copy no more. When copy-once material is played back from a recording, its EMI is changed to copy no more. These last two values of the EMI are intended for devices with limited computational ability; the associated limited, or restricted, authentication involves fewer computations than does full authentication.
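The four EMI states and the copy-once transition can be summarized in a few lines (illustrative Python; the names and values are ours, not DTCP’s wire encoding):

    from enum import Enum

    class EMI(Enum):
        COPY_FREE = 0     # unlimited copying permitted
        COPY_NEVER = 1    # no copying permitted
        COPY_ONCE = 2     # one generation of copies allowed (e.g., time shifting)
        COPY_NO_MORE = 3  # the permitted generation has been used

    def emi_on_playback_of_recording(emi):
        # a recording made from copy-once content is output as copy-no-more
        return EMI.COPY_NO_MORE if emi is EMI.COPY_ONCE else emi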

Obviously, the same concerns apply here as to other systems. Every device licensed to receive material protected by DTCP must obey these rules. Thus, the same need exists for a trusted authority to certify each device. DTCP is administered by the Digital Transmission Licensing Administrator (DTLA), established by the five companies. The DTLA acts as the certificate authority (CA) for conformant devices. Besides the CCI, DTCP includes authentication and key exchange, content encryption, and renewability.

Two levels of authentication and key exchange are specified: Full authentication using public key cryptography as described earlier is used for all types of content. For copy once and copy no more content, a more limited, restricted authentication using common key cryptography is permitted.

For full authentication, the method is similar to, but uses different formats than, that described in Section 22.5.3.25 A certificate is embedded in each device and is verified as described earlier. Both the source device (supplying the content) and the sink device (receiving the content) check the other device’s certificate. A form of Diffie-Hellman key exchange is used for derivation of a secret key used to encrypt program material. Restricted authentication may use a short form of authentication, where the sink device proves that it shares a secret with the source device.26

After authentication, the source device (the host) sends an exchange key, encrypted with the authentication key, to the sink device. This exchange key is sent as a seed, a random number generated by the source, from which both devices can compute the decryption key. The key is changed periodically, every 30–120 seconds.

In order to provide for system renewability, system renewability messages (SRMs) are sent to each IEEE 1394-compliant device. These may be sent as part of a program stream from a headend, or they may be incorporated into other distribution media, such as DVDs. The SRMs carry identification of any device that has had its certificate revoked. As material containing the identification propagates throughout the nation, devices eventually learn of any certificate revocations.

POD Module Output

The POD module shown in Figure 22.1 is responsible for descrambling the output of the transmission from the headend. POD, an acronym for point of deployment, is the term used for renewable security modules. The POD module is built into a standard PCMCIA (Personal Computer Memory Card International Association) card, the familiar card used for modems, Ethernet connections, and many other added functions in portable computers. It uses a 68-pin connector. The POD module output is also copy protected. We can easily explain the first stage of the copy protection algorithm used at this interface: it is the X.509 certificate and encryption algorithm described in Section 22.5.3. The certificate exchange is, however, just the first phase of having the POD module and the host authenticate each other.

In phase 2 of the authentication process, the host and the POD module identify themselves to the headend. The headend holds the DTLA (see preceding subsection) certification revocation list (CRL), showing the identification of all hosts and POD modules that have had their certification revoked. Note the contrast with the general case we showed earlier for revocation of certificates. Previously we showed the CRL being contained with the content. But POD modules and hosts identify themselves back to the headend, which then checks its CRL and will refuse to authorize descrambling if either device is on the CRL.

Identification to the headend is done differently depending on whether or not a two-way communications path is available. If a communications path is unavailable, then the host displays a message on the monitor, telling the subscriber to call a number and to give the identification numbers of the POD module and the host. The identification numbers are numerical, so it is possible to automate the process of relaying the information to the headend. Also, when the subscriber purchases his or her equipment, it is possible for the seller to perform identification. If two-way communications are available, the POD module can relay the information back to the headend conditional access system without subscriber action.

The third phase of authentication consists of computation of a long-term authentication key by both the POD module and the host. This authentication key is stored in nonvolatile memory (that is, the content remains even when the device is not powered). It is used to confirm authentication after power has been removed and is reapplied. Upon a subsequent power-up, the POD module and the host confirm that they have the same authentication key; if so, no further action is required before they can begin exchanging programming information.

The complete process of authentication is referred to as a binding between the POD module and the host. A POD module may bind to only one host at a time, but the POD module may be moved from one host to another, with the authentication process starting all over when that happens; that is, the POD module will restart the binding procedure. In two-way systems, the process is unseen by the subscriber. But in a one-way system, the subscriber will have to telephone the POD module and host identification numbers to the service provider. Regardless of whether authentication succeeds, programs sent in the clear (not scrambled) may be processed.27

22.6 Using Two Conditional Access Systems

Operators consider it desirable to be able to use more than one STT system in order to gain possible price and feature advantages. However, each manufacturer would like to use his own conditional access system, to which he ascribes unique benefits. A simplistic approach is to duplicate programming using two scrambling systems. However, this approach doubles bandwidth requirements, an obvious impracticality. In 1997 the so-called harmony agreement was negotiated in the industry, whereby one encrypted datastream could be used along with more than one conditional access system.

Besides addressing scrambling, the harmony agreement includes an agreement on the use of open standards for transmitting digital video. These include MPEG-2 video, Dolby Digital audio, MPEG-2 transport, ATSC system information, and ITU-J83B modulation (64- and 256-QAM).28

Figure 22.16 illustrates the principle of this harmony agreement. The agreement starts in the headend with clear video (called plaintext). This plaintext is encrypted in the common service encryptor, which scrambles the signal using an agreed-upon algorithm, the DES encryption standard described earlier in this chapter. A common (shared) control word, or encryption key, is generated in the control word generator and supplied to the common service encryptor and also to the conditional access system for each vendor whose STTs are being used. The service encryptor is the device that scrambles the program data. It is called “common” because the same scrambling is applied for use in both systems.

image

Figure 22.16 Principle of the harmony agreement.

The control word is also supplied to each vendor’s conditional access system, which then distributes that control word as it sees fit. The control word may be encrypted for distribution according to the individual manufacturer’s proprietary facility. Control words are embedded in so-called entitlement control messages (ECMs), which are transmitted as part of the MPEG transport stream.29

As this book goes to press, another means of sharing conditional access has been demonstrated but not yet commercialized. This method extracts critical data from the MPEG-2 datastream, data that is required to reconstruct the video. The critical data is encrypted using any number of encryption methods desired by different vendors. The remaining portion of the MPEG signal is not encrypted. The decoder decrypts the critical data using its own conditional access system and combines it with the majority of the video data, which was not encrypted.

Other means of encrypting the signal are available and are practiced in Europe and other localities.

22.7 Optional Personal Video Recorder

An optional personal video recorder (PVR) function is shown in Figure 22.1. PVR capability allows a subscriber to record and play back material from a hard drive. He or she can use it for time shifting programs and also to generate an “instant replay” of a scene. If the manufacturer wants to allow the PVR to be used with analog signals, the manufacturer must include the optional real-time MPEG encoder shown in Figure 22.1.

A concern, if the subscriber records a program in a PVR, is the possibility of exchanging the program with someone else, either by transmitting the program to the other person or by physically removing the hard drive from the host and putting it in another host or in a computer. For this reason, the content recorded on the hard drive must also be encrypted, unless it is marked for unlimited recording. As of this writing, an encryption standard has not been specified for this interface; the host manufacturer chooses one.

22.8 Middleware

A modern STT or host is as much software as hardware. In the beginning, STTs were limited to the manufacturer’s software. Several proprietary sets of so-called middleware were developed; then standards began to emerge for certain levels of software. In particular, it is important to allow downloaded programs that will run on any STT to which they are downloaded so that an operator or subscriber may select from any of several STTs on the market. This can be achieved by a standardized middleware solution. The CableLabs/SCTE-standardized middleware is known as OpenCable applications platform (OCAP). As of this writing, two OCAP specifications exist, the OCAP 1 and 2 profiles.30 OCAP is based on an earlier European initiative called Multimedia Home Platform, MHP.31 There are a lot of similarities between OCAP, MHP, and DASE, the ATSC’s DTV Applications Software Environment.32 As of this writing, there are efforts to harmonize DASE, used in North American digital off-air broadcast, and OCAP, used in cable TV.

OCAP 1 closely follows MHP 1.0.2, but there are differences, some of which we will mention in this description.

22.8.1 Layers of Software

You may think of at least three layers of software running on the microprocessor(s) in a host. The lowest layer is the operating system, just as a desktop computer has an operating system. It is specific to the hardware being used and is typically purchased from a third-party vendor by the host manufacturer for the processor(s) chosen. (This is consistent with the way OCAP is explained in the standards, but note that in some cases there is a layer of software below the operating system, called the hardware abstraction layer (HAL). The HAL allows the operating system to be platform independent.) The highest layer of software is the application software, which may, for example, provide an EPG in the form that the operator wishes to display it. Many other applications software packages are possible, from communications to games to messaging to VOD and others. Usually, application software is downloaded to a particular host based on what the user purchases and what the operator wants to provide. Application software is usually downloaded from a central server and may change from time to time.

The problem is that the operator needs to be able to download one software package for one application and expect it to run the same way on all hosts and STTs in the system. This must hold true even if the system contains different STTs and hosts made by many different manufacturers at different times, using different processors and operating systems, and having different capabilities. If a capability is supported by the hardware in a box, then the same software should enable it in all boxes. Making this happen is the role of middleware. The host manufacturer is responsible for providing middleware for his or her products that conforms to the appropriate OCAP specification, 1.0 or 2.0, as specified by the cable operator. It is possible to use a host that supports 1.0 but not 2.0. Each specification details the services to be supported and how relevant data is communicated to the host.

22.8.2 OCAP

OCAP 1 is an application interface that includes the application program interfaces (APIs) necessary to implement certain specified applications. It is intended to define a minimum software capability for OCAP-certified hosts sold at retail. Software providers may assume the existence of this level of capability when developing host applications. An API is a set of routines, protocols, and tools for building software applications. A good API makes it easier to develop a program by providing all the building blocks; an application programmer puts the blocks together.33 Thus, OCAP provides the programmer with all the tools needed to put together a certain application, which may then be downloaded to all STTs and hosts, without regard to who made the device. The performance will meet some minimum standard of consistency on all devices.

OCAP 1 is based on a Java virtual machine model, a popular standard for the so-called execution engine. That is, you write Java programs to run on the host. Related terms you will encounter include Hypertext Markup Language (HTML), the authoring language used to create Web pages, and Hypertext Transfer Protocol (HTTP), which defines how messages are formatted and transmitted to Web browsers and what actions Web servers and browsers should take in response to various commands. HTML is not a part of OCAP 1 (it is part of OCAP 2), but it is possible to use Java to write a Web display application if desired. Java can leverage the graphics capabilities supported by the hardware.
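
Concretely, a downloaded OCAP or MHP application takes the form of a Java TV Xlet, a lifecycle interface analogous to the applet. The skeleton below is a minimal sketch; the class name and comments are ours, while the javax.tv.xlet interface comes from the Java TV specification on which OCAP builds.

    import javax.tv.xlet.Xlet;
    import javax.tv.xlet.XletContext;
    import javax.tv.xlet.XletStateChangeException;

    // Skeleton of a downloadable application. The middleware, not the
    // application, decides when each lifecycle method is called.
    public class HelloEpgXlet implements Xlet {
        private XletContext context;

        public void initXlet(XletContext ctx) throws XletStateChangeException {
            context = ctx;   // acquire resources, but draw nothing yet
        }

        public void startXlet() throws XletStateChangeException {
            // Application becomes active: build the UI, begin drawing.
        }

        public void pauseXlet() {
            // Release scarce resources; the application may be resumed later.
        }

        public void destroyXlet(boolean unconditional)
                throws XletStateChangeException {
            // Final cleanup before the middleware unloads the application.
        }
    }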

All applications in MHP are bound. That is, they are tied to the channel to which the host is tuned. If the viewer tunes off the channel, then the application stops unless it is also associated with the newly tuned channel. On the other hand, OCAP recognizes both bound and unbound applications. An unbound application continues to run even when the channel is changed.
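
One way to picture the distinction is the middleware’s bookkeeping at a channel change, sketched below with invented class names; nothing here is taken from the OCAP specification itself.

    import java.util.List;

    // Hypothetical bookkeeping for bound vs. unbound applications: on a
    // channel change, bound applications not associated with the newly
    // tuned channel are stopped; unbound applications keep running.
    public class AppLifecycleManager {
        public record App(String name, boolean bound, List<Integer> channels) {
            boolean survivesTuneTo(int newChannel) {
                return !bound || channels.contains(newChannel);
            }
        }

        public static void onChannelChange(List<App> running, int newChannel) {
            for (App app : running) {
                if (!app.survivesTuneTo(newChannel)) {
                    System.out.println("stopping bound app: " + app.name());
                }
            }
        }
    }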

Besides bound and unbound applications, and unlike MHP, OCAP allows native applications. Any function that can be programmed in Java (meaning just about any function) may be programmed and downloaded to an OCAP host. However, unlike bound and unbound applications, native applications are not guaranteed to run on any host. Native applications are written for a particular host.

Because it will not always be the case that the cable operator can control what applications are downloaded to a host, OCAP includes a monitor application, which decides what other applications are allowed to run on the host. This way, the operator has some control over what the subscriber does with his or her host and what the host might do to the network. An application can be downloaded on the FAT channel, the high-bandwidth channel on which video is delivered. But the FAT channel is often passed through the headend without the operator making any changes, opening the system to other applications forwarded by the program supplier.

Figure 22.17 illustrates the relationship between OCAP 1 and other software and hardware in the host.34 At the lowest layer is the OCAP host device hardware, what we traditionally think of as the host or STT. This is the collection of hardware on which the software runs. The lowest layer of software is the operating system/middleware layer. This includes a lot of traditional operating system functions, such as scheduling different activities, handling interrupts (such as when the subscriber presses a key), software to handle different interfaces, and memory management.

Figure 22.17 OCAP 1 software architecture.

The second layer of the software is the execution engine, whose functionality and upward-bound interfaces are specified in OCAP 1. It provides prescribed interfaces with the application software above, through OCAP APIs. The upward-bound interfaces must be device independent. That is, any host, implemented with any software operating system, must respond in the same way when an application issues some command or supplies some piece of data.

Above the execution engine are the various applications, which might come from the host vendor but which might also come from other vendors. Common applications include EPG and VOD, but many other applications are possible. Some examples are listed later. The monitor application is included in OCAP 1. It controls what other applications may reside on the host, and it controls resource management between the applications and other functions that involve one or more applications. This is typically operator specific, because each operator will want to make his or her own decision about what runs on the hosts in the system.

The cable network interface is defined generally in SCTE 40, formerly DVS/313, which specifies the levels and characteristics of the input and return signals at the left of Figure 22.1.

Native applications are any applications that run on a specific host and are not conformant to OCAP. Such applications may be written when the execution speed of an OCAP application is not fast enough (some games fall into this category). If an operator wants to enable such applications on several different hosts, he or she may have to write the application several times, once for each different host (assuming they are not compatible). In contrast, an OCAP application need be written only once, and it will run on all OCAP 1-compliant platforms.

OCAP 1 Representative Applications

A number of representative applications that OCAP 1-compliant systems are required to support are presented here. Typically these applications are written on top of OCAP 1, using its services. The list is not exhaustive; other applications can also be supported. Nor is there a requirement that all of these applications be present on a particular host. An OCAP 1-compliant box must be capable of having application software downloaded to support these services, but the applications exist on a box only when client software is downloaded to it, and a given host might not have the hardware required to support every application listed.

Electronic program guide: Electronic program guide (EPG) software must be capable of being written using OCAP APIs, but OCAP does not specify a particular EPG; this is up to the software chosen by the operator. EPGs typically offer a grid showing programs available on the system at the present time and for some length of time into the future. The user can move through the grid, with the current location highlighted. If he or she presses a “SELECT” or “OK” key while on a particular program listing, the host tunes to that program. If the program is not on yet, the EPG may set the host to tune to that channel when the program comes on, or it may present a reminder to the viewer that the program is starting. The EPG may also be used for purchasing pay services such as pay-per-view (PPV) programs, which are presented at preset times and billed as an extra service. It can also be used for purchasing video on demand (VOD), a similar service except that the subscriber can order it to start when he or she wants it and can exercise VCR-like control (stop, play, fast-forward, rewind).
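
To make the grid concrete, the sketch below models an EPG’s core data structure and the SELECT action in plain Java. All class and method names are hypothetical; OCAP specifies only the APIs from which such a guide could be built, not the guide itself.

    import java.time.Instant;
    import java.util.List;

    // Hypothetical core of an EPG grid: listings per channel, plus the
    // action taken when the viewer presses SELECT on a listing.
    public class ProgramGuide {
        public record Listing(int channel, Instant start, Instant end, String title) {}

        private final List<Listing> listings;
        private final Tuner tuner;   // assumed wrapper around host tuning

        public ProgramGuide(List<Listing> listings, Tuner tuner) {
            this.listings = listings;
            this.tuner = tuner;
        }

        // SELECT on a listing: tune now if the program is on; otherwise
        // schedule a reminder for when it starts.
        public void select(Listing l, Instant now) {
            if (!now.isBefore(l.start()) && now.isBefore(l.end())) {
                tuner.tuneTo(l.channel());
            } else if (now.isBefore(l.start())) {
                tuner.remindAt(l.start(), l.channel());
            }
        }

        public interface Tuner {
            void tuneTo(int channel);
            void remindAt(Instant when, int channel);
        }
    }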

Watching TV: Though this might not be thought of as a software function, there are certain aspects of watching TV that might be controlled by applications written on middleware, and OCAP must support them. These applications support such functions as selecting standard- or high-definition viewing and, if in high definition, the format (aspect ratio, scan format, and resolution). The subscriber would want to control whether audio is delivered as mono, stereo, or 5.1-channel surround (based of course on what formats are transmitted). He or she might also want to select an alternate language or to select closed captioning information to be overlaid on the picture. Picture-in-picture, time shifting, emergency alert, and content advisory (“V-chip”) are other services related to watching TV. Software to implement these services must be supported by OCAP.

Pay-per-view: Pay-per-view (PPV) is a traditional way of ordering pay programming. The subscriber contacts the system operator in advance to arrange for purchase of a particular event. The operator then downloads authorization to the host. At the time of the event, the EPG may provide a reminder to the subscriber. Related services for delivery of pay programming include Impulse Pay-Per-View (IPPV), video on demand (VOD), and near video on demand (NVOD). These are described in Section 21.4. Additional services may be provided using the EPG capability of the host.

Email: Email clients are supported by OCAP. This would be an example of an unbound application, because it is not tied to any one broadcast channel. The operator is free to download an email client of his or her choice to an OCAP-compliant host.

Chat/conferencing: A chat or conferencing application would most likely be another unbound application, available no matter to what channel the subscriber was tuned. It might appear as a floating box over the video, in which the viewer and others could send instant messages to one or more other participants to discuss a show or anything else people can think of to chat about.

IP telephony: OCAP 1 specifically lists IP telephony support, though the host may not have the interfaces required for telephone service. IP telephony is covered in Chapter 6.

Games: OCAP 1 provides certain APIs required for game playing. These include windowed video and playing sound effects and animation. OCAP does not provide support for multiplayer networked games, but this capability can be provided in native application software. Some games may require faster response than provided under OCAP 1, in which case the game client may include native-code interfaces to support certain hardware.

Music/radio: OCAP provides for decoding audio without decoding associated video, to permit transmission of digital audio broadcasts. Audio in any other format could be supported by downloading a software decoder. An audio decoding application would most likely be bound, because when the subscriber tunes away from the channel, the audio would stop.

E-commerce: These applications are usually broken into shopping and banking applications. Shopping benefits from high-quality full-screen graphics and flashy transitions and other effects to allow a merchant to present wares in the most pleasing way. Shopping also involves exchange of credit card information, so security is of importance. OCAP does not support a secure connection, leaving this to the application software. The operator must address privacy concerns. Banking services don’t require the same high-quality video, but they do require extremely good security and user identification. Again, the application software must provide these services.

Minimum Functionality

OCAP also specifies certain minimal functionality that any host must support. Any OCAP host must support background, video, and graphics resolution of at least 480 × 640 pixels, which is roughly equivalent to the monochrome resolution of an NTSC signal with a 4:3 aspect ratio (4/3 × 480 = 640). This resolution must be supported for both 4:3 and 16:9 aspect ratios.

The minimum host must support certain decoder format conversions (DFCs). That is, when the signal arrives in one display format, the host must support conversion to another format supported by the display device. For example, if a picture arrives in a wide-screen (16:9) format and the display cannot handle wide screen, the host must be able to convert it to a conventional 4:3 format. There are several ways this can be done: a central portion of the picture may be sent to the display, or a letterbox format (black bands above and below the picture) may be sent. A number of specific processing options are required, as specified in the MHP documentation.35 In addition, and unlike MHP, an OCAP 1-compliant host must support a JMF (Java Media Framework) player for presenting subtitling or closed captioning information.
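
The arithmetic behind the two most common conversions is simple enough to show. The sketch below is an illustration only — OCAP and MHP define which conversions must be supported, not this code — computing the letterbox bands and the center cut for a 16:9 source on a 4:3 display.

    // Decoder format conversion arithmetic for a 16:9 source shown on a
    // 4:3 display, e.g., a 640 x 480 pixel grid.
    public class FormatConversion {
        // Letterbox: fit the full picture width; black bands fill the rest.
        // For 640 x 480: image height = 640 * 9/16 = 360, so 60-line bands
        // appear above and below the picture.
        public static int letterboxBandHeight(int dispW, int dispH) {
            int imageH = dispW * 9 / 16;
            return (dispH - imageH) / 2;
        }

        // Center cut: fit the full picture height; crop the sides equally.
        // For 640 x 480: source width at full height = 480 * 16/9 = 853,
        // so about 106 columns are discarded from each side.
        public static int centerCutCropPerSide(int dispW, int dispH) {
            int sourceW = dispH * 16 / 9;
            return (sourceW - dispW) / 2;
        }
    }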

There is a rather lengthy list of remote control commands that are to be accepted. Support for both keyboards and remote control transmitters is recommended. A certain minimum amount of memory and the recommended architecture of that memory are included in the OCAP 1 specification. STTs and other hosts usually are running even when the user perceives power to be off. This is a standby mode, and OCAP recommends that the middleware recognize the difference between “standby” and “on” modes, because this is important to some applications.

22.8.3 OCAP 2

OCAP 2 is a second, related specification that builds on OCAP 1. Besides the functionality built into OCAP 1, OCAP 2 adds content formats based on Web technologies, such as XML, DOM, and ECMAScript. It is possible to provide complete functionality in an OCAP 1 device, but OCAP 2 opens additional options for providing that functionality.

Extensible Markup Language (XML) was developed by the World Wide Web Consortium,36 an organization that develops technologies to expand the potential of the Web. XML allows designers to create their own customized tags, enabling the definition, transmission, validation, and interpretation of data between applications and between organizations.

The document object model (DOM) is the specification for how objects in a Web page are represented. The DOM defines what attributes are associated with each object and how the objects and attributes can be manipulated.
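
As an illustration of the DOM’s object-and-attribute model, the fragment below parses a small piece of XML using Java’s standard javax.xml.parsers API and manipulates an attribute. The XML vocabulary shown (a “program” element with “channel” and “title” attributes) is invented for the example.

    import java.io.StringReader;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.xml.sax.InputSource;

    // Parsing XML into a DOM tree and manipulating an attribute.
    public class DomExample {
        public static void main(String[] args) throws Exception {
            String xml = "<guide><program channel=\"12\" title=\"News\"/></guide>";

            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // Every element is an object; its attributes are read and
            // written through methods the DOM defines.
            Element program = (Element) doc.getElementsByTagName("program").item(0);
            System.out.println(program.getAttribute("title"));  // prints "News"
            program.setAttribute("channel", "13");              // manipulate the tree
        }
    }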

ECMAScript, standardized by the European Computer Manufacturers Association (ECMA) as ECMA-262, is a standardized scripting language of which JavaScript and JScript are the best-known implementations; it supplies the procedural logic that operates on XML content through the DOM.37

22.9 Summary

In this chapter we have presented a view of modern techniques used for digital video reception in the home. As this is written, many of the concepts and standards are new and have not been completely implemented in commercial equipment. The standard now is the so-called thin-client STT, which has limited functionality within itself and depends on the headend for many functions. Thick-client STTs, which include more processing power, have also been used; the standards we have summarized can cover both. These standards are intended to support a wide range of functionality, to provide the cable operator with a rich choice of models through which to sell advanced services to subscribers. Of great importance is satisfying video content owners that their content will not be pirated. The standards anticipate the availability of devices besides STTs; for example, it is possible to subsume security needs into TV sets by using POD modules provided by the operator. Time will tell how the market for advanced services progresses.

In the next chapter, we discuss interfaces between consumer and cable TV equipment.

Endnotes

* There are some instances involving networked PVRs in which a number of people will want to simultaneously access the VCR-like controls, stressing Aloha.

* Biphase mark encoding is similar to, but not to be confused with, Manchester encoding, another self-clocking mechanism in which a transition always occurs in the center of the bit. The first half of the bit time contains the bit being transmitted; the second half of the bit time contains the complement of the bit being transmitted. Manchester encoding is described in Chapter 4.

1. Cable Television Laboratories, OpenCable Host Device Core Functional Requirements, OC-SP-HOST-CFR-I11-021126, November 26, 2002. As of this writing, the standard is available at http://www.opencable.com.

2. www.scte.org.

3. Charles E. Spurgeon, Ethernet: The Definitive Guide. Sebastopol, CA: O’Reilly, 2000.

4. Charles Poynton, HDTV Video Short Course, tutorial presented at the 2002 IEEE Consumer Electronics Society’s International Conference on Consumer Electronics.

5. http://www.ddwg.org.

6. http://www.webopedia.com.

7. Another good reference is Luke Chang and Joe Goodart, Digital Visual Interface, May 2000. Available at www.dell.com — search for DVI.

8. Available from http://global.ihs.com.

9. Mark Eyer, EIA/CEA 775-A: The IEEE 1394-Based Digital Interface for DTV Receivers, IEEE Consumer Electronics Society Newsletter, August 2002, No. 3.

10. IEEE 1394–1995, IEEE Standard for a High Performance Serial Bus, available from www.ieee.org.

11. IEEE 1394a-2000, IEEE Standard for a High Performance Serial Bus, Amendment to IEEE 1394–1995, available from www.ieee.org.

12. Adam Kunzman and Alan Wetzel, 1394 High-Performance Serial Bus: The Digital Interface for ATV, IEEE Transactions on Consumer Electronics, Vol. 41, No. 3, August 1995.

13. Information for this section was provided by Jeffrey Riedmiller of Dolby Laboratories.

14. One asymmetric algorithm, RSA, is described in Justin Junkus, DigiPoints, Vol. 2. Exton, PA: Society of Cable Telecommunications Engineers, 2000, p. 169.

15. Bruce Schneier, Applied Cryptography. New York: Wiley, 1996.

16. Larry L. Peterson and Bruce S. Davie, Computer Networks. San Francisco: Morgan Kaufmann, 1996.

17. Stephen Thomas, SSL and TLS Essentials. New York: Wiley, 2000, Chap. 2.

18. http://www.digital-cp.com.

19. http://www.dtcp.com.

20. Bill Pearson, Digital Transmission Content Protection (slides), Intel Corp., 1999. Available at http://www.dtcp.com. Look for the DTCP tutorial.

21. Bruce Schneier, op. cit., section 16.2.

22. Bruce Schneier, op. cit., section 22.1.

23. Bruce Schneier, op. cit., Chap. 12.

24. http://www.digital-cp.com.

25. The cryptographic schemes, primitives, and encoding methods are described in IEEE P1363.

26. DTLA, Digital Transmission Content Protection White Paper. Available at http://www.dtcp.com.

27. SCTE, ANSI/SCTE 41 2001, POD Copy Protection System. Available at www.scte.org.

28. Justin Junkus, DigiPoints, Vol. 2. Exton, PA: Society of Cable Telecommunications Engineers, 2000, p. 166.

29. Michael Adams and Tony Wasilewski, Multiple Conditional Access Systems, Communications Technology, November 1997.

30. See http://www.opencable.com. At some point these specifications may be transferred to the SCTE Web site.

31. See http://www.mhp.org.

32. See http://www.atsc.org.

33. http://www.webopedia.com.

34. OpenCable™ Application Platform Specification OC-SP-OCAP1.0-I04-021028.

35. ETSI TS 101 812 V1.2.1 (2002-06).

36. http://www.w3.org.

37. http://www.webopedia.com.
