Bit rate
The next step after sampling is quantisation, in which each sample value is rounded to the nearest quantum level. The number of quanta, or levels, is determined by the bit depth, that is, the number of bits in the code. For example, if eight bits are used, there will be 2^8 = 256 quantum levels. The bit rate can be calculated as:
Bit rate = Number of samples per second × Number of bits per sample
Number of samples per second = Number of samples per picture x Number of pictures per second
Number of samples per PAL picture = 720×576 = 414,720
Given a picture rate of 25,
Number of samples per second = 720×576×25 = 10,368,000
Therefore the bit rate generated by the luminance component using an 8-bit code is:
720×576×25×8 = 82,944,000 = 82.944 Mbps
The bit rate for the chrominance components depends on the sampling structure used. For a 4:2:2 sampling structure with only horizontal sub-sampling:
Number of samples = 360×576 = 207,360 per picture
Bit rate for each chrominance component is:
360×576×Picture rate×Number of bits = 360×576×25×8
= 41.472 Mbps, which is half the luminance bit rate.
Total chrominance bit rate is therefore:
41.472×2 = 82.944 Mbps
Giving a total bit rate of:
82.944 + 82.944 = 165.888 Mbps, or approximately 166 Mbps
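For a 4:2:0 structure, in which each chrominance component carries a quarter of the luminance samples, the total comes down to 124.416 Mbps. A 4:1:1 structure also carries a quarter of the luminance samples per chrominance component, so its total is again 124.416 Mbps.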
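The arithmetic above can be reproduced with a short Python sketch. The function and constant names are hypothetical, chosen only for illustration; the figures are the PAL SD values used in the text (720×576 luminance samples, 25 pictures per second, 8 bits per sample).

```python
# Hypothetical names; figures are the PAL SD values used in the text.
WIDTH, HEIGHT = 720, 576     # luminance samples per line, lines per picture
PICTURE_RATE = 25            # pictures per second
BITS_PER_SAMPLE = 8          # 8-bit code, i.e. 2**8 = 256 quantum levels

# Each chrominance component carries 1/divisor of the luminance samples.
CHROMA_DIVISOR = {
    "4:2:2": 2,   # horizontal sub-sampling by 2
    "4:2:0": 4,   # horizontal and vertical sub-sampling by 2
    "4:1:1": 4,   # horizontal sub-sampling by 4
}


def luminance_bit_rate():
    """Bit rate of the luminance (Y) component in bits per second."""
    return WIDTH * HEIGHT * PICTURE_RATE * BITS_PER_SAMPLE


def total_bit_rate(structure):
    """Luminance plus two chrominance components, in bits per second."""
    luma = luminance_bit_rate()
    chroma_each = luma // CHROMA_DIVISOR[structure]
    return luma + 2 * chroma_each


for structure in CHROMA_DIVISOR:
    print(structure, total_bit_rate(structure) / 1e6, "Mbps")
```

The 4:2:2 case gives 165.888 Mbps, and both 4:2:0 and 4:1:1 give 124.416 Mbps, matching the figures worked out by hand.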
The actual bandwidth requirement depends upon the type of modulation used. For pulse-code modulation, it is half the bit rate, which comes to about 62 MHz for the 124.416 Mbps stream (and about 83 MHz for the 4:2:2 stream). Even 62 MHz is very high for SD television, so compression techniques are required to reduce the bit rate and hence the bandwidth requirement.
Compression
There are two main data compression standards—JPEG encoding and MPEG encoding. JPEG is associated with still image compression and MPEG with digital videos. MPEG-2 is used for SD and MPEG-4 for HDTV.
A digital television programme consists of three components: video, audio and service data. The original video and audio information is in analogue form and has to be sampled and quantised before being fed into the appropriate coders. The service data, which contains additional information such as teletext and network-specific information including the electronic programme guide (EPG), is generated in digital form and requires no encoding.
MPEG-2 encoding. Video MPEG encoding consists of data preparation, compression and quantisation. The purpose of video data preparation is to ensure that the raw coded samples of the picture frame are organised in a way that is suitable for data compression.
MPEG uses both temporal (time) and spatial (space) compression. Video is a sequence of still images, so the same compression technique as for JPEG can be applied to each frame of a video clip. This is called spatial, or intra-frame, compression. Also, successive images of a video clip differ only slightly, so it is possible to remove the redundant part and send only the difference between them. This technique is called temporal, or inter-frame, compression.
Video preparation involves regrouping the samples of CR, CB and Y into 8×8 blocks to be used in spatial redundancy removal. These blocks are then rearranged into 16×16 macroblocks for use in temporal redundancy removal. The macroblocks are then grouped into slices, which are the basic units for data compression.
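The regrouping step can be illustrated with a minimal Python sketch that splits a luminance plane into 8×8 blocks and 16×16 macroblocks. The function name and the test pattern are hypothetical, and the sketch handles the luminance samples only; in a real MPEG-2 coder a macroblock also carries the corresponding chrominance blocks.

```python
def split_into_blocks(plane, size):
    """Split a 2-D list of samples into size x size blocks, row by row.

    `plane` is a list of rows; its width and height are assumed to be
    exact multiples of `size` (true for 720x576 with 8 or 16).
    """
    height, width = len(plane), len(plane[0])
    blocks = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            block = [row[left:left + size] for row in plane[top:top + size]]
            blocks.append(block)
    return blocks


# Synthetic 720x576 luminance plane (an arbitrary test pattern, not real video).
luma = [[(x + y) % 256 for x in range(720)] for y in range(576)]

blocks = split_into_blocks(luma, 8)         # 90 x 72 blocks
macroblocks = split_into_blocks(luma, 16)   # 45 x 36 macroblocks
print(len(blocks), len(macroblocks))
```

For a 720×576 picture this yields 6,480 luminance blocks and 1,620 macroblocks, since each 16×16 macroblock covers four 8×8 blocks.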
Temporal compression
The most commonly used method works by comparing each frame in the video with the previous one. If the frame contains areas where nothing has moved, the system simply issues a short command that copies that part of the previous frame, bit-for-bit, into the next one. If sections of the frame move in a simple manner, the compressor emits a slightly longer command that directs the decompressor to shift, rotate, lighten or darken the copy.
Inter-frame compression works well for programmes that are played back by the viewer but can cause problems if the video sequence needs to be edited.
As mentioned earlier, only the difference between consecutive frames is transmitted. The remaining data is redundant and does not get transmitted. For example, if a news reader is reading the news, only the part of the frame that contains the lip movement gets transmitted as the difference-frame. Things like the microphone, paperweight or channel logo will be the same for all the frames and hence need not be transmitted.
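The idea of sending a short "copy" command for unchanged areas and fresh data only for changed ones can be sketched in Python as follows. This is a simplified illustration with hypothetical function names and command tuples, not the MPEG-2 algorithm itself, which additionally uses motion vectors and codes the remaining difference rather than raw samples.

```python
def encode_difference(previous, current, block=8):
    """Describe `current` relative to `previous`, one command per block.

    Unchanged blocks produce a short ("copy", top, left) command;
    changed blocks produce ("data", top, left, samples).
    """
    commands = []
    for top in range(0, len(current), block):
        for left in range(0, len(current[0]), block):
            new = [row[left:left + block] for row in current[top:top + block]]
            old = [row[left:left + block] for row in previous[top:top + block]]
            if new == old:
                commands.append(("copy", top, left))
            else:
                commands.append(("data", top, left, new))
    return commands


def decode_difference(previous, commands):
    """Rebuild the current frame from the previous frame and the commands."""
    frame = [row[:] for row in previous]   # "copy" blocks are already in place
    for command in commands:
        if command[0] == "data":
            _, top, left, new = command
            for dy, row in enumerate(new):
                frame[top + dy][left:left + len(row)] = row
    return frame


# Two 16x16 frames that differ in a single sample (synthetic data).
previous_frame = [[0] * 16 for _ in range(16)]
current_frame = [row[:] for row in previous_frame]
current_frame[3][4] = 200

commands = encode_difference(previous_frame, current_frame)
rebuilt = decode_difference(previous_frame, commands)
print([c[0] for c in commands], rebuilt == current_frame)
```

Here only one of the four 8×8 blocks is sent as data; the other three are reproduced from the previous frame, which is the same saving the news-reader example describes.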