Transport of HDTV over IP networks

Aims and Objectives

The aims and objectives of our work were three-fold:

These aims were achieved by building a prototype system using standard protocols - RTP over UDP/IP - running over commodity networks (Internet2 and the DARPA SuperNet, with OC-48 or higher backbone capacity and gigabit Ethernet in the local area) and using commodity PCs running Linux as the hosts. HDTV video capture equipment was used, providing the highest quality video commercially available.

HDTV and Networked Multimedia

What is HDTV?

High definition television (HDTV) is the next generation digital TV standard. It provides high resolution and greater colour depth than standard television formats, uses a widescreen 16:9 aspect ratio, and is a purely digital format. The aspect ratio makes the image appear more "movie-like" and the greater resolution and colour fidelity add considerable realism to the image.

There is a range of picture formats with different resolutions and frame rates. The most commonly used are SMPTE-296M, which provides a 1280x720 pixel progressive scan image at 60 frames per second, and SMPTE-274M, which provides 1920x1080 pixel images, typically interlaced at 30 frames per second.
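As a rough illustration of why these formats are demanding, the data rate of the active picture can be estimated from the resolution, frame rate and bits per pixel. The sketch below assumes 4:2:2 chroma subsampling with 10-bit samples (20 bits per pixel on average). That sampling is an assumption made for illustration, not a figure taken from this page, and the function name is hypothetical.

```python
def active_video_rate_bps(width, height, frames_per_second, bits_per_pixel=20):
    """Return the data rate of the active picture area in bits per second.

    bits_per_pixel=20 assumes 4:2:2 sampling with 10-bit samples
    (an illustrative assumption, not a value stated in the text).
    """
    return width * height * frames_per_second * bits_per_pixel


rate_720p60 = active_video_rate_bps(1280, 720, 60)    # SMPTE-296M example
rate_1080i30 = active_video_rate_bps(1920, 1080, 30)  # SMPTE-274M example

print(f"1280x720 at 60 fps:  {rate_720p60 / 1e9:.3f} Gbps")
print(f"1920x1080 at 30 fps: {rate_1080i30 / 1e9:.3f} Gbps")
```

These estimates cover only the active picture; blanking intervals and any ancillary data are excluded.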

Local area transport of uncompressed HDTV is via coaxial cable or optical fibre, using the SMPTE-292M standard. This is the universal medium of interchange between various types of HDTV equipment (e.g. cameras, encoders, VTRs and editing systems), and provides a serial digital connection at a rate of 1.485 Gbps. It is widely used in studios and production houses, allowing HDTV content to be delivered uncompressed through the various stages of production and so avoiding the artifacts that are an inevitable result of multiple compression cycles. If wide area transport of uncompressed video is desired, the SMPTE-292M bit-stream is typically run over dedicated fibre connections, but a more economical alternative is desirable. We consider the use of IP networks for this purpose.

Real-time Multimedia over IP Networks

Standards for real-time video delivery over IP are relatively mature, with the dominant protocol being the Real-time Transport Protocol, RTP. This provides media framing, identifies the payload type and source, and allows for timing recovery and loss detection. RTP is typically run over UDP/IP, inheriting the performance and limitations of IP. Receivers use the information in the RTP headers to detect packet loss and to reconstruct media timing.
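To make the role of the RTP header concrete, the following is a minimal sketch of packing and parsing the 12-byte fixed RTP header (version 2, no padding, extension or CSRC list) and of counting losses from sequence number gaps. It is not the code used in our system; the function names and the payload type value used in the usage note are hypothetical, and the loss counter assumes packets arrive in order without duplicates.

```python
import struct

RTP_VERSION = 2
# Fixed header: V/P/X/CC byte, M/PT byte, sequence number, timestamp, SSRC.
RTP_HEADER = struct.Struct("!BBHII")


def pack_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Build a 12-byte RTP header with no padding, extension or CSRCs."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return RTP_HEADER.pack(byte0, byte1, sequence & 0xFFFF,
                           timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def unpack_rtp_header(packet):
    """Parse the fixed RTP header from the start of a packet."""
    byte0, byte1, sequence, timestamp, ssrc = RTP_HEADER.unpack_from(packet)
    if byte0 >> 6 != RTP_VERSION:
        raise ValueError("not an RTP version 2 packet")
    return {"marker": bool(byte1 >> 7), "payload_type": byte1 & 0x7F,
            "sequence": sequence, "timestamp": timestamp, "ssrc": ssrc}


def count_lost(sequences):
    """Count missing packets in an in-order run of 16-bit sequence numbers.

    Assumes no reordering or duplication; the mask handles wrap-around.
    """
    lost = 0
    for previous, current in zip(sequences, sequences[1:]):
        lost += (current - previous - 1) & 0xFFFF
    return lost
```

For example, `unpack_rtp_header(pack_rtp_header(96, 7, 90000, 0x1234ABCD))` recovers the same field values, where 96 is simply an arbitrary value from the dynamic payload type range.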

IP networks provide a best-effort packet delivery service, meaning that there is no guarantee that the network will not discard, duplicate, delay or mis-order packets. Applications and transport protocols built on IP must adapt to these issues, abstracting the network behaviour to give a usable service, and RTP applications have developed sophisticated strategies for dealing with timing jitter and packet loss.
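One building block of such strategies is an estimate of timing jitter. The sketch below follows the running interarrival jitter estimate defined in the RTP specification (smoothed with a gain of 1/16). The function name is hypothetical, and arrival times are assumed to be expressed in the same units as the RTP timestamps.

```python
def update_jitter(jitter, previous_transit, arrival, rtp_timestamp):
    """Update a running interarrival jitter estimate for one packet.

    arrival and rtp_timestamp must use the same clock units.
    Returns the new jitter estimate and this packet's transit time,
    which the caller passes back in as previous_transit next time.
    """
    transit = arrival - rtp_timestamp
    if previous_transit is None:
        # First packet: nothing to compare against yet.
        return jitter, transit
    difference = abs(transit - previous_transit)
    # Exponential smoothing with gain 1/16.
    return jitter + (difference - jitter) / 16.0, transit


jitter, transit = 0.0, None
# (arrival time, RTP timestamp) pairs; the third packet is delayed by 160 units.
for arrival, timestamp in [(1000, 0), (2500, 1500), (4160, 3000)]:
    jitter, transit = update_jitter(jitter, transit, arrival, timestamp)
print(jitter)
```

A receiver could use an estimate like this to size its playout buffer, although how to do so is a design choice outside the scope of this sketch.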

However, RTP-based systems are generally poor at congestion control, that is, at adapting their behaviour to fit the available network capacity. It is therefore necessary either to develop congestion control for RTP or to run such applications only on networks provisioned with sufficient capacity for their needs. Uncompressed HDTV makes this requirement demanding: the SMPTE-292M source signal alone runs at 1.485 Gbps, so the network must be provisioned for a correspondingly high data rate.

A system to transport HDTV over IP networks will use RTP as its transport, which means that an RTP payload format needs to be developed for HDTV content.

System Architecture

A system for transport of HDTV over IP will accept a SMPTE-292M digital video signal and encapsulate it within RTP for transmission over IP. At the receiver, the SMPTE-292M signal can be regenerated, or the video can be displayed directly. There are a number of ways in which this can be done, depending on the aim of the transport. If the intent is to link existing equipment, the correct approach may be circuit emulation, where the SMPTE-292M signal is mapped onto IP irrespective of its contents. The alternative is a native packetization, where an RTP payload format is defined to transport the video directly, with SMPTE-292M used only locally.

Circuit emulation provides transparent delivery of the HDTV bit-stream, suitable for input into other devices. It supports any format that SMPTE-292M supports, without having to be adapted to the details of that format. The main disadvantage is that the packetization is media-unaware and cannot be optimised for the video format. This makes circuit emulation somewhat intolerant of packet loss.

Native packetization looks at the contents of the SMPTE-292M stream, acting on the video data within it. Hence, native formats may need to be defined for every possible video resolution, although those formats can be better optimised for the video they carry. It also exposes the content to manipulation by end systems, rather than hiding it within another layer of framing.
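The idea of a media-aware packetization can be sketched as follows. Each payload carries part of one scan line together with a small header giving the line number, byte offset and length, so that a lost packet damages only one known region of the picture. The header layout, the 1400-byte payload limit and the function names are hypothetical choices made for illustration; this is not the payload format used in our system.

```python
import struct

# Hypothetical per-payload header: line number, byte offset, byte length.
LINE_HEADER = struct.Struct("!HHH")


def packetize_frame(lines, max_payload=1400):
    """Yield payloads, each holding part of one scan line plus its header."""
    for line_number, line in enumerate(lines):
        for offset in range(0, len(line), max_payload):
            chunk = line[offset:offset + max_payload]
            yield LINE_HEADER.pack(line_number, offset, len(chunk)) + chunk


def reassemble_frame(payloads, line_count, line_bytes):
    """Rebuild a frame; regions whose packets never arrived stay zero-filled."""
    frame = [bytearray(line_bytes) for _ in range(line_count)]
    for payload in payloads:
        line_number, offset, length = LINE_HEADER.unpack_from(payload)
        data = payload[LINE_HEADER.size:LINE_HEADER.size + length]
        frame[line_number][offset:offset + length] = data
    return [bytes(line) for line in frame]


# Four short test "scan lines", each filled with a distinct byte value.
lines = [bytes([value]) * 3000 for value in range(1, 5)]
payloads = list(packetize_frame(lines))
restored = reassemble_frame(payloads, line_count=4, line_bytes=3000)
damaged = reassemble_frame(payloads[1:], line_count=4, line_bytes=3000)
```

In this sketch, dropping the first payload leaves only the first 1400 bytes of the first line blank in `damaged`; every other region is recovered, and a receiver could conceal the gap, for instance by reusing data from the previous frame.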

We chose to use a native packetization, since one of our aims is to display and manipulate HDTV content on commodity workstations; we do not necessarily need to regenerate the SMPTE-292M output.

Implementation

In the design and implementation of our HDTV system, our priority was to use commercial, off-the-shelf, components rather than custom hardware. Accordingly, the core of our system is a high performance PC, with gigabit Ethernet and an HDTV capture card.

The PCs we use are Dell PowerEdge 2500 servers, with dual 1.2 GHz Pentium III processors, running Linux 2.4. The key feature of these systems is that they have two separate 64-bit, 66 MHz PCI interfaces, allowing us to avoid contention between the video capture and network cards. The network interface we use is a 3Com 3c985 gigabit Ethernet card.
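A back-of-the-envelope calculation suggests why separate PCI interfaces matter. The sketch below computes the theoretical peak transfer rate of a 64-bit, 66 MHz PCI interface and compares it with the 1.485 Gbps SMPTE-292M rate, taken here as a rough upper bound on what the capture card might move. Sustained PCI throughput is lower than the theoretical peak in practice, and the function name is hypothetical.

```python
def pci_peak_bps(width_bits, clock_hz):
    """Theoretical peak PCI transfer rate: one full-width transfer per clock."""
    return width_bits * clock_hz


bus_peak = pci_peak_bps(64, 66_000_000)
smpte_292m_rate = 1_485_000_000  # bits per second, from the SMPTE-292M rate

# Fraction of the theoretical peak consumed by one stream at the 292M rate.
single_stream_share = smpte_292m_rate / bus_peak
# Capture and network traffic sharing one interface would roughly double it.
shared_bus_share = 2 * single_stream_share

print(f"Peak: {bus_peak / 1e9:.3f} Gbps")
print(f"One stream: {single_stream_share:.0%} of peak")
print(f"Capture plus network on one interface: {shared_bus_share:.0%} of peak")
```

Under these assumptions a single shared interface would be heavily loaded even at its theoretical peak, which is consistent with placing the capture and network cards on separate interfaces.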

HDstationPro card

For HDTV capture and playout, we use the DVS HDstationPro card, which provides SMPTE-292M input and output. This card can operate in several modes: captioning, capture and playback. We used it to capture HDTV into main memory, and to regenerate a SMPTE-292M bit-stream at the receiver. Our system can also display HDTV on the workstation monitor, using the X Window System with the Xvideo extension. The video capture card supports a range of video formats, but our system uses only SMPTE-296M (1280x720 pixels, progressive scan, 60 frames per second) at this time.

We used an updated version of the RTP library from the UCL Robust-Audio Tool to provide the core network functions of our system. This is a complete RTP implementation, supporting IPv4, IPv6 and multicast.

Sender and receiver components are implemented as two separate programs, because the system's requirements do not allow a single machine to both transmit and receive. Both are relatively simple, with little in the way of loss or jitter tolerance at present (these were not found to be necessary in the controlled environment of our initial tests, but will be added in future as we make the system more robust and adaptive).

Software Download

Our software is available for download, for those who have suitable facilities.

Publications