Riding Moore’s Law to Scale Live OTT Compression Solutions
White Paper
Abstract
Compression solutions for encoding have traditionally been commercially available using FPGAs, later transitioning to ASICs or a mix thereof in order to maximize performance while minimizing cost and power utilization. This methodology has worked well for the traditional linear broadcast environment, in which there was usually one encoded stream per video input. For over-the-top (OTT) distribution, Adaptive Bitrate (ABR) is utilized: the encoder needs to produce multiple streams for each video input, enabling the end user to dynamically select the best stream for their network connection and device profile. This simple change obsoletes the legacy hardware encoding model of one device per output stream, which doesn’t scale well, as each video service delivered could have five or more profiles. This paper describes a software architecture running on server platforms and utilizing purpose-built hardware acceleration. By riding the commoditization of server hardware and its ever-increasing processing capabilities, this architecture is able to provide the critical advantage of scale and channel density for OTT networks.
Introduction
Video distribution has come a long way since the days of analog distribution. With the advances of MPEG-2, H.264 and now High-Efficiency Video Coding (HEVC) compression, it has become possible to fit more services at higher quality into traditional distribution media such as satellite, terrestrial and cable. At the same time, there have been significant improvements in the Internet bandwidth available to the end user, with the average Internet speed in the U.S. reaching 8.7 Mbps by the middle of 2013 [1]. The number of services available to consumers has ballooned since the 1990s; today there is the potential for thousands of live services to be available at any time to a connected home.
The combination of compression shrinking the bandwidth needed per service and the rise in Internet bandwidth available to the end user is leading to significant growth in streaming media, enabling providers to reach their customers easily with content. Software compression solutions have historically been the core of streaming media delivery networks, while hardware compression solutions have been the core of linear delivery networks. Both software and hardware compression solutions have their benefits and drawbacks. As linear and streaming services converge, it makes sense to look at architectures that leverage the benefits of each solution and reduce the potential drawbacks.
Evolution of Compression Codecs
Over the last two decades, there has been widespread adoption of video compression for the distribution of video and audio to consumers. MPEG-2 truly kicked off the migration from analog-only services to digital television and dramatically increased the number of services available to consumers. A switch from analog transmission over satellite to DVB-S using MPEG-2 compression for standard definition (SD) services allowed providers to transmit eight services in place of a single analog service. With the introduction of H.264 video encoding, the distribution density was further doubled, as H.264 required only half of the bitrate of MPEG-2 compression for the same resolution. While a doubling in density for SD services was achieved, there was also a transition to high-definition television (HDTV), which required three to four times more bandwidth than MPEG-2 SD video. By switching to H.264, distributors were able to reduce the bitrate increase to 1.5 to 2 times that of MPEG-2 SD. Telco service providers took advantage of H.264 compression to enter the home video distribution market over their existing asymmetric digital subscriber line (ADSL) infrastructure.
Delivering the Moment imaginecommunications.com
Entering the third decade of video compression distribution, a new codec has been developed: HEVC. As with H.264, HEVC doubles the density of services being distributed for the same resolution. But this new codec also brings with it the promise of delivering even higher resolution video with support for Ultra High Definition Television (UHDTV), which for UHDTV-1 at 2160p60 is four times the resolution and bandwidth of a 1080p60 HDTV signal. While there is the promise of UHDTV services, the key advantage of HEVC will be for streaming services: the ability to provide a 720p or 1080p video at half the bitrate dramatically reduces the cost of distribution for over-the-top (OTT) providers, while improving the experience for end users.
As new codecs reduce the overall bandwidth for services, there is a cost in terms of the complexity of encoding the services with the new video compression algorithm. If we start with MPEG-2 NTSC video, there is an approximately tenfold increase in complexity in using H.264 for the same video resolution. There is a further tenfold increase when HEVC is used for compression. If we look at resolution changes, there is an approximately sixfold increase for encoding 1080i60 over NTSC, and approximately an additional sixfold increase for UHDTV-1 at 2160p60. Figure 1 shows how the various codecs compare in complexity versus MPEG-2 SD, with HEVC UHDTV-1 having an increased complexity of 3,600 times (10 × 10 × 6 × 6).
Figure 1 – Complexity of codecs over time
Linear Services
Linear services are typically defined as live services distributed throughout the distribution chain using MPEG-2 transport streams (TS). They are usually distributed continuously as a dedicated channel. For example, the services a consumer would watch over the air on a digital terrestrial television (DTT) system or through a cable, direct-to-home (DTH) or Internet protocol television (IPTV) provider are typically linear services.
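The multipliers behind Figure 1 compound, which is easy to verify. The following Python sketch is only an illustration of that arithmetic: the dictionary and function names are our own, and the factors are the approximate values quoted in the codec comparison (tenfold per codec generation, sixfold per resolution step), not measured results.

```python
# Approximate complexity multipliers quoted in the text, relative to MPEG-2 SD.
# These are rough planning figures, not benchmark measurements.
CODEC_FACTOR = {"MPEG-2": 1, "H.264": 10, "HEVC": 10 * 10}
RESOLUTION_FACTOR = {"NTSC SD": 1, "1080i60": 6, "2160p60": 6 * 6}


def relative_complexity(codec: str, resolution: str) -> int:
    """Encoding complexity relative to MPEG-2 NTSC SD (hypothetical helper)."""
    return CODEC_FACTOR[codec] * RESOLUTION_FACTOR[resolution]


if __name__ == "__main__":
    # Print the full codec-by-resolution grid that Figure 1 summarizes.
    for codec in CODEC_FACTOR:
        for resolution in RESOLUTION_FACTOR:
            factor = relative_complexity(codec, resolution)
            print(f"{codec:7s} {resolution:8s} {factor:5d}x")
```

Under these assumptions, `relative_complexity("HEVC", "2160p60")` evaluates to 3600, matching the figure quoted for HEVC UHDTV-1.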
Video on demand (VOD) and other on-demand services are not usually considered linear services, as they are not served out in a linear fashion.
With the move from analog to digital, there was a dramatic increase in the number of channels available to the consumer, from tens of channels in the early 1990s to hundreds of channels today. In 2008, the average U.S. home received 118 channels [2]. Providers often utilize a dedicated encoder per service, but with the increase in the number of services, denser modular encoding solutions have become more popular. They use less rack space, they save on power and they can provide built-in redundancy to ease operational complexity for the providers. There has also been a move toward using transcoding solutions, which take the services received from content providers and transcode them to the formats and bitrates required for the provider's end customers. The advantage of transcoders is that they tend to have greater density than encoders, as they have no baseband input processing requirements.
With the availability of new video resolutions, there tends to be a significant amount of simulcast for the majority of services. Today, some providers create a down-converted SD version of their HDTV services to support legacy customers. This simulcast effect is pushing providers toward denser solutions, as they are dealing not only with an increase in services, but also with multiple versions of each service.
Streaming Services
While linear services are considered dedicated services over MPEG-2 TS, streaming services are defined as programs destined for the end user over IP. Streaming services used to be offered in multiple formats and codecs, and the user would select the codec and format that worked with the combination of their PC and Internet connection.
This method of selection was mostly trial and error from the user’s perspective, resulting in a significant number of buffering messages and a fairly poor user experience. Today, streaming services have improved dramatically, and the user no longer needs to figure out the best stream for their connection. Depending on the provider, there can still be platform issues on the receiving side based on the DRM or packaging selection. Overall, these changes have enhanced the end-user experience.
With Adaptive Bitrate (ABR), a provider creates multiple profiles for their service, ranging from the lowest tier of service to the highest. Some services have anywhere from four to eight or more profiles. The profiles vary depending on the provider, but can go from Quarter Common Intermediate Format (QCIF) all the way up to 1080p. The client device tunes to the service and selects a default profile; using its own algorithm, it then switches to the best-available quality based on the available bandwidth and the bitrates of the profiles. It can move from profile to profile nearly seamlessly, settling on the best-available profile for the end user.
With the additional profiles created using ABR to enhance the user experience, there is a greater demand on the encoding solution, which must create four to eight times more encoded streams than traditional linear services. While the significant majority of the streams are lower resolution and require less processing power to encode, a traditional linear service encoding model would still utilize one encoder per profile. There is an added requirement of synchronization between the encoders to allow seamless switching between profiles. In such a solution, a service with eight output profiles equates to approximately four HDTV encoders, depending on the selection of profiles.
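The client-side switching described above can be sketched in a few lines of Python. This is a minimal illustration rather than any particular player's algorithm: the profile ladder, the bitrates, and the `headroom` safety margin are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    bitrate_kbps: int


# Hypothetical ABR ladder; real providers choose their own profiles.
LADDER = [
    Profile("234p", 400),
    Profile("360p", 800),
    Profile("540p", 1800),
    Profile("720p", 3500),
    Profile("1080p", 6000),
]


def select_profile(ladder, measured_kbps, headroom=0.8):
    """Pick the highest-bitrate profile that fits within a fraction
    (headroom, an assumed safety margin) of the measured bandwidth.
    Falls back to the lowest profile when nothing fits."""
    budget = measured_kbps * headroom
    ordered = sorted(ladder, key=lambda p: p.bitrate_kbps)
    best = ordered[0]
    for profile in ordered:
        if profile.bitrate_kbps <= budget:
            best = profile
    return best


if __name__ == "__main__":
    for bandwidth in (500, 3000, 8700):
        chosen = select_profile(LADDER, bandwidth)
        print(f"{bandwidth} kbps available: play {chosen.name}")
```

A real client re-runs a selection like this as its bandwidth estimate changes, which is why every profile must be encoded in alignment with the others so that a switch can happen at a segment boundary.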
Hardware Compression
Real-time encoders have typically used dedicated hardware to take in analog or SDI video and create an MPEG-2 or H.264 compressed video stream in an MPEG-2 TS. These encoders have historically been based on Application Specific Integrated Circuits (ASICs), using dedicated silicon for the encoding and Field Programmable Gate Arrays (FPGAs) to support additional tasks such as TS processing or video analysis. These are then controlled from an embedded processor. The disadvantage of this type of design is that the ASICs are inflexible, and many key functions cannot be reprogrammed. The advantage of these designs is that they tend to be lower cost for the performance they provide and require less power, both of which are key considerations in the selection process for any encoding solution.
An alternative solution for hardware encoding is an all-FPGA encoder. The majority of the first real-time H.264 encoders were FPGA-based; this is also true for HEVC encoders. FPGA solutions provide the flexibility of being reprogrammable and have a faster time to market. The key drawback of most FPGA solutions to date is that they have a higher cost and draw more power than a comparable ASIC solution.
Hardware solutions work very well for single linear feeds, and they can often be adapted for streaming services if the design of the ASIC or FPGA allows encoding capacity to be allocated by the number of encoded blocks rather than per video. This allows multiple profiles to be encoded within the capacity of the encoding hardware. Most designs work for four to eight profiles, depending on the resolution of each profile. This design also overcomes any concerns with synchronization between encoders, as all profiles are managed within the same hardware architecture.
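Budgeting by encoded blocks rather than by video can be illustrated with a short Python sketch. It assumes 16×16 macroblocks, as used by MPEG-2 and H.264; the profile ladder and the capacity figure (twice the load of one 1080p30 stream) are hypothetical values for illustration, not the specification of any real device.

```python
import math

MACROBLOCK = 16  # macroblock edge in pixels for MPEG-2 and H.264


def macroblocks_per_second(width, height, fps):
    """Macroblock rate for one profile; partial blocks at the frame
    edge still cost a full block, hence the ceiling."""
    per_frame = math.ceil(width / MACROBLOCK) * math.ceil(height / MACROBLOCK)
    return per_frame * fps


def ladder_load(profiles):
    """Total macroblock rate for a list of (width, height, fps) profiles."""
    return sum(macroblocks_per_second(w, h, fps) for w, h, fps in profiles)


# Hypothetical device capacity: two 1080p30 streams' worth of blocks.
CAPACITY = 2 * macroblocks_per_second(1920, 1080, 30)

# Hypothetical four-profile ladder.
LADDER = [(1920, 1080, 30), (1280, 720, 30), (960, 540, 30), (640, 360, 30)]

if __name__ == "__main__":
    load = ladder_load(LADDER)
    print(f"ladder load {load} of {CAPACITY} blocks/s, fits: {load <= CAPACITY}")
    bigger = LADDER + [(1920, 1080, 30)]
    load = ladder_load(bigger)
    print(f"with a second 1080p30: {load} blocks/s, fits: {load <= CAPACITY}")
```

In this sketch the four-profile ladder fits within the assumed budget, while adding a second 1080p30 profile exceeds it, which is the situation in which a second device becomes necessary.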
If there is a need for multiple high-resolution profiles, or for more than eight profiles, it might be necessary to span beyond a single encoder design, which adds complexity in maintaining synchronization across encoders to support ABR alignment.
Software Compression
Software compression solutions are typically run on generic server platforms and have historically been utilized almost exclusively for streaming or on-demand services. To encode a service, a server would have a video capture card and the software encoding for the various output formats required. Such a solution provides a significant amount of flexibility, as any component of the software architecture can be upgraded and modified based on the application needs. To increase the number of services, providers purchase the commodity server hardware required to run those services and load the appropriate software. A key advantage here is that the commodity hardware is typically the same as that being used for a provider’s IT infrastructure. This provides flexibility in launching new services as well as pricing breaks, because the market for server platforms is far larger than the market for dedicated encoding hardware.
Software compression solutions have not typically been used for linear services because of performance and reliability concerns with a server-based solution. Performance for linear services has lagged compared with hardware solutions. The first real-time software MPEG-2 HDTV encoders became available in 2003 [3], almost a decade after the release of the MPEG-2 specification. Because of this, streaming services have focused on lower video resolutions, using codecs that are optimized for high compression and low encoding complexity.
Software solutions for real-time HDTV encoding of H.264 first became available around 2010, seven years after the first solutions were available for MPEG-2. Moore’s Law observes that the transistor count on integrated circuits (ICs) doubles approximately every two years, which can be roughly correlated to a doubling of computing power every two years. If we start at 2003 for a real-time MPEG-2 HD encoder, and we accept that H.264 HDTV encoding takes 10 times as much computational power, then we would expect real-time encoders to appear within 3.3 generations (log2 of 10 is approximately 3.3 doublings), which translates to roughly 6.6 years. This predicts that servers should have had the computational capacity for real-time software encoding of H.264 HDTV by 2009, which lines up well with the actual availability of software encoders.
Figure 2 – Predicted software encoded stream density
In Figure 2, we apply Moore’s Law to predict that servers without additional hardware acceleration should have the capacity to encode up to five HDTV H.264 videos in real time today. There are commercial software solutions that meet these capabilities on commodity server hardware.
Hybrid Software and Hardware Compression
It has been shown that hardware compression solutions provide high-performance real-time encoding at the beginning of the lifecycle for most codecs. Software compression solutions provide greater flexibility in compression offerings, but only meet performance milestones later in the lifecycle of a codec. A way to have the best of both worlds is a software architecture married with purpose-built hardware. Such a solution provides the flexibility of software for generalized tasks such as TS processing and table manipulation, while offloading the computational complexity of video encoding to attached hardware. Mixing software and hardware for compression can provide significant density improvements, while retaining the flexibility of software upgrades and easier development of new features.
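The Moore’s Law estimate used for Figure 2 can be reproduced with a short calculation. The Python sketch below is an illustration under the paper's stated assumptions (a two-year doubling period, a tenfold complexity step from MPEG-2 to H.264, and 2003 as the starting point); the function names are our own.

```python
import math

DOUBLING_PERIOD_YEARS = 2.0  # Moore's Law approximation used in the text


def years_to_gain(factor, period=DOUBLING_PERIOD_YEARS):
    """Years needed for compute capacity to grow by the given factor."""
    return math.log2(factor) * period


def capacity_multiple(years_elapsed, period=DOUBLING_PERIOD_YEARS):
    """Growth in compute capacity after a number of years."""
    return 2 ** (years_elapsed / period)


MPEG2_HD_YEAR = 2003          # first real-time software MPEG-2 HD encoders [3]
H264_COMPLEXITY_FACTOR = 10   # H.264 vs MPEG-2 at the same resolution

# Year in which one real-time H.264 HD software encode becomes feasible.
h264_hd_year = MPEG2_HD_YEAR + years_to_gain(H264_COMPLEXITY_FACTOR)

# Predicted number of simultaneous H.264 HD encodes per server in 2014.
streams_2014 = capacity_multiple(2014 - h264_hd_year)

if __name__ == "__main__":
    print(f"doublings needed: {math.log2(H264_COMPLEXITY_FACTOR):.2f}")
    print(f"predicted first H.264 HD software encode: {h264_hd_year:.1f}")
    print(f"predicted H.264 HD encodes per server in 2014: {streams_2014:.1f}")
```

Under these assumptions the calculation lands in late 2009 for the first real-time H.264 HD software encode and between four and five simultaneous encodes per server by 2014, in line with the figure of up to five quoted above.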
The addition of GPU-assisted encoding can provide a two-and-a-half to three times improvement in encoding density for H.264 encoding, depending on the efficiency of the software [4]. This is a considerable jump in performance and allows for significant flexibility in how the codec is used for compression. Starting from the roughly five HD encodes achievable in software alone, this would bring the total real-time encodes in a server to approximately 12 to 15 HD encodes, or three to four ABR profile sets of eight.
While GPUs provide additional capacity for encoding, they are still designed for general-purpose computation, albeit with a specific instruction set that can be useful for assisting H.264 and HEVC compression. An alternate proposal is to utilize ASICs dedicated to compression on PCIe boards in a server. The advantage of an ASIC solution is that it has purpose-built compression hardware, with low power utilization and high performance for encoding only. In this model, software performs all of the TS processing, rate control and ancillary features such as audio transcoding, while the ASICs handle the heavy lifting of the video encoding. This provides a flexible, high-performance and cost-effective solution.
There are currently commercially available ASICs that provide a density of four HDTV encodes [5] at a power level that allows 10 of them to fit on a PCIe board. It is therefore feasible to achieve 40 HDTV encodes in a server using a single PCIe slot, which produces 10 sets of eight ABR profiles. Using four of these PCIe boards with the same base CPU expands the density to 160 HDTV encodes, or 40 sets of eight ABR profiles. Such an architecture scales very quickly, as the software processing load is minimized by offloading all of the difficult encoding tasks to the PCIe boards.
Both a GPU-based and an ASIC-based architecture will increase the cost and power utilization on a per-device level.
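The density figures for the GPU and ASIC options follow from simple arithmetic, sketched below in Python. All inputs are the approximate values quoted in this paper (five software-only HD encodes per server, a 2.5 to 3 times GPU speedup, four HD encodes per ASIC, ten ASICs per board, and roughly four HD encodes per set of eight ABR profiles); the variable and function names are illustrative.

```python
HD_ENCODES_PER_ABR_SET = 4     # eight ABR profiles cost about four HD encodes
CPU_ONLY_HD_ENCODES = 5        # software-only estimate from Figure 2
GPU_SPEEDUP_RANGE = (2.5, 3.0) # GPU-assisted improvement quoted from [4]
HD_ENCODES_PER_ASIC = 4        # density quoted from [5]
ASICS_PER_BOARD = 10
BOARDS_PER_SERVER = 4


def abr_sets(hd_encodes):
    """Number of eight-profile ABR sets a given HD encode capacity supports."""
    return hd_encodes / HD_ENCODES_PER_ABR_SET


# GPU-assisted server: scale the software-only figure by the speedup range.
gpu_hd_encodes = tuple(CPU_ONLY_HD_ENCODES * s for s in GPU_SPEEDUP_RANGE)

# ASIC-assisted server: one board, then a fully populated server.
board_hd_encodes = HD_ENCODES_PER_ASIC * ASICS_PER_BOARD
server_hd_encodes = board_hd_encodes * BOARDS_PER_SERVER

if __name__ == "__main__":
    low, high = gpu_hd_encodes
    print(f"GPU server: {low} to {high} HD encodes, "
          f"{abr_sets(low):.1f} to {abr_sets(high):.1f} ABR sets")
    print(f"One ASIC board: {board_hd_encodes} HD encodes, "
          f"{abr_sets(board_hd_encodes):.0f} ABR sets")
    print(f"Four ASIC boards: {server_hd_encodes} HD encodes, "
          f"{abr_sets(server_hd_encodes):.0f} ABR sets")
```

The sketch treats each set of eight ABR profiles as a fixed cost of four HD encodes; in practice that cost depends on the resolutions chosen for the ladder.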
While both architectures reduce cost and power on a per-service basis, the ASIC solution, with almost 10 times the density, offers considerably greater savings.
Summary
Streaming media viewership has undergone tremendous growth over the last few years. With this growth, video quality and quality of service have improved, due in large part to improving Internet access and advances in streaming technologies such as ABR. These improvements have led to ever-increasing encoding complexity compared with traditional linear services. The increase in encoding complexity will only accelerate with the addition of HEVC and the further need to simulcast different resolutions and codecs to handle client compatibility. While software encoding was key to the launch of streaming media, providers need to look at architectures that scale with the additional complexity requirements of their customers. Hybrid software and hardware architectures are currently the best option for providing the lower cost and the greater flexibility and scalability needed to adapt to this changing market landscape.
Imagine Communications is a global leader in processing and compression solutions for media, broadcast, service provider, government, and enterprise markets, offering an expansive portfolio of encoding and transcoding products. The company’s SelenioNext™ platform provides up to 10 times the density and 10 times less power consumption than competitive solutions, with the ability to replace an entire headend of video processing in a single platform. SelenioNext is an all-in-one TV Everywhere solution that enables service providers to ingest precompressed services and transcode, package, encrypt and stream multiscreen, multi-device video. Integrating multiple functions within a commercial-off-the-shelf (COTS) server platform, it provides a highly dense, scalable and operationally efficient package designed to meet the growing demand for live programming to mobile and connected screens.
Available in 1U and 2U appliances or in a 10U blade system, SelenioNext easily fits into optimal form factors for all online video applications, from providing a handful of IPTV streams to thousands of multiscreen transcodes. Utilizing advanced Adaptive Bitrate (ABR) technology, the system is unmatched in its support of up to 320 HD ABR or 320 SD ABR profiles per 2U server and its ability to scale up to any number of profiles per video program. SelenioNext also features an onboard broadcast management system for superior control and visibility into network resource optimization.
References
[1] Akamai, “The State of the Internet”, Volume 6, Number 2, 2nd Quarter 2013 Report
[2] The Nielsen Company, “Average U.S. Home Now Receives A Record 118.6 TV Channels, According To Nielsen”, June 2008, http://www.nielsen.com/us/en/pressroom/2008/average_u_s__home.html
[3] AMD, “Moonlight Launches the World’s First software based 720p MPEG-2 Real Time Encoding Solution for AMD64 Technology-Based Systems”, November 2003, http://www.amd.com/us/pressreleases/Pages/Press_Release_79092.aspx
[4] Main Concept, “NVIDIA speed results”, June 2010, http://www.mainconcept.com/fileadmin/user_upload/download/product_sheets/CUDA-Sheets_06-2010.pdf
[5] ViXS Systems, http://www.vixs.com/indexee.php/products/features/xcode-pro-200
Originally published in the 2014 NAB Broadcast Engineering Conference proceedings.
+1.866.4.Imagine © 2014 Imagine Communications Proprietary and Confidential WP_SCALELIVEOTT_0914