posted by Roger Finger on Tue 4th Mar 2003 18:51 UTC
Digital media applications are unique in that they can generally consume all the performance they can get. Unlike other tasks that execute in a few seconds, the rendering of stills, audio and video can take several minutes or even hours. Applications in the digital media space can translate increases in performance to increases in end-user productivity, and it is therefore beneficial for them to take advantage of the latest platform technologies.

Overview

The Pentium® 4 Processor with Hyper-Threading Technology delivers performance and architectural features that dramatically reduce the overall processing time and improve the responsiveness of the system. Processors equipped with Hyper-Threading Technology have multiple logical CPUs per physical package. The state information necessary to support each logical processor is replicated, while the underlying physical processor resources are shared and/or partitioned. Multiple threads running in parallel can achieve higher processor utilization and increased throughput.
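Because the operating system presents each logical processor as a separate CPU, an application can size its worker pool from the logical CPU count. The following is a minimal sketch, not anything taken from Intel's tools; the function name `worker_count` and its `reserve` parameter are hypothetical.

```python
import os


def worker_count(reserve=0):
    """Suggest a worker-thread count based on the logical CPU count.

    os.cpu_count() reports logical CPUs, so one physical package with
    Hyper-Threading enabled is counted as two. `reserve` is a hypothetical
    knob for leaving logical CPUs free for other programs.
    """
    logical = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, logical - reserve)


if __name__ == "__main__":
    print("logical CPUs:", os.cpu_count(), "suggested workers:", worker_count())
```

The result never drops below one worker, so the same code behaves sensibly on a system without Hyper-Threading.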


In Section 1, video production serves as an example application workflow to show how Hyper-Threading Technology benefits digital media production. Each of the four major steps of video production is examined in detail.


In Section 2, the multi-tasking characteristics of a system with HT technology are considered.  When multiple applications are running on a system, HT technology helps reduce stalls and task switching delays caused by the interaction of two or more independent programs.


Section 3 discusses some software and system level design considerations for optimizing multi-threaded applications in a multitasking environment. In this section we'll look at how application developers can use the Intel compilers to develop optimized code and then use the VTune® Performance Analyzer to identify hotspots and tune the code further.



Section 1: Video Production Case Study


Video production is a complex multi-step process that often involves using multiple programs to achieve the desired output.  There are four major steps to this process:


  • Acquire:   Capture movies and pictures, capture audio
  • Build/Edit: Edit, mix, preview, store your project
  • Render: Apply compression and format the file
  • Output: Store the end result on hard drive, or burn to disk


  1. Acquire

Digital video cameras connect to the PC using IEEE 1394 (FireWire), USB, or an analog connection. They transmit at a fixed rate of 25 or 30 frames per second (depending on format: PAL or NTSC), so the capture step can never go faster than the actual play time of the video. Five minutes of video takes five minutes to capture.


The data rates are high (about 4 MB/s) and the PC has to keep up with the source, or else dropped frames will result. Dropped frames degrade the quality of the video, so most software packages warn not to do anything else on your system while capture is under way.
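The fixed frame rate translates directly into a frame count and a per-frame time budget that the PC must meet to avoid drops. A small worked sketch, using the nominal 25 and 30 frames per second quoted above (the function names are illustrative):

```python
# Nominal frame rates as quoted in the text.
FRAME_RATES = {"PAL": 25, "NTSC": 30}


def frames_to_capture(duration_s, video_format):
    """Number of frames the source delivers over duration_s seconds."""
    return duration_s * FRAME_RATES[video_format]


def frame_budget_ms(video_format):
    """Time available to accept and store one frame before the next arrives."""
    return 1000.0 / FRAME_RATES[video_format]


if __name__ == "__main__":
    for fmt in FRAME_RATES:
        print(fmt, frames_to_capture(5 * 60, fmt), "frames in five minutes,",
              round(frame_budget_ms(fmt), 1), "ms per frame")
```

Five minutes of PAL video is 7,500 frames with a 40 ms budget each; NTSC at the nominal 30 frames per second is 9,000 frames with roughly 33 ms each.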



On systems without Hyper-Threading and on slower systems, that's still good advice. But with HT technology, the multi-tasking capability of the system is enhanced: a background task is less likely to get pre-empted by other programs, so the user can continue using the PC for other activities. Video capture is not a CPU intensive activity - it typically consumes less than 15% of a 3.06 GHz Pentium® 4 processor (Figure 1). Why not allow the end user to use that time for something else?


Figure 1: DV Capture from IEEE 1394


Some capture applications simultaneously encode the incoming Digital Video stream into Windows Media or MPEG formats.  The advantages are that smaller files are created, and the media is in the desired output format early in the process.  The disadvantage is that some quality will be lost early in the production cycle. 


Figure 2 shows DV capture from IEEE 1394, with encoding to MPEG2 for the output. Capture time is still the rate-limiting step, but the CPU is kept very busy with the encoding task. Completion times with and without HT technology were approximately equal, but with HT technology there is more CPU capability available for other tasks to use. This translates into faster UI responsiveness, even under heavy multi-tasking loads.


Figure 2: DV Capture from IEEE 1394 with MPEG 2 Encoding


It takes a lot of multi-tasking activity during video capture to cause frame drops on Hyper Threaded systems.  The only caveat is to watch out for disk conflicts.  The I/O rates during DV capture create a continuous demand on the hard disk of about 2-3%.  The data rates are not that high, but streaming must be maintained.  If these disk updates cannot happen in real-time, then frame dropping can result.  Application developers can avoid some of these problems by locking I/O resources during critical real-time operations - but should do so with the full understanding that other applications may stall as a result.
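The relationship between disk stalls and dropped frames can be illustrated with a small deterministic model. This is a sketch under stated assumptions, not the behavior of any particular capture application: frames arrive at a fixed interval, a single writer stores them in order, and a frame is dropped when it arrives to find the buffer full. The function name, parameters, and the stall values in the example are all hypothetical.

```python
from collections import deque


def simulate_capture(frame_interval_ms, write_times_ms, buffer_frames):
    """Count dropped frames in a simple capture-to-disk model.

    Frame i arrives at i * frame_interval_ms and would take
    write_times_ms[i] to write. Frames are written in arrival order by one
    writer. A frame occupies a buffer slot from arrival until its write
    finishes; an arriving frame is dropped if all slots are occupied.
    """
    in_buffer = deque()   # finish times of frames still holding a slot
    last_finish = 0.0     # time at which the writer becomes idle
    dropped = 0
    for i, write_ms in enumerate(write_times_ms):
        now = i * frame_interval_ms
        # Release slots whose writes completed before this frame arrived.
        while in_buffer and in_buffer[0] <= now:
            in_buffer.popleft()
        if len(in_buffer) >= buffer_frames:
            dropped += 1
            continue
        start = max(now, last_finish)
        last_finish = start + write_ms
        in_buffer.append(last_finish)
    return dropped


if __name__ == "__main__":
    # 20 PAL frames (40 ms apart); the second write stalls for 500 ms,
    # standing in for another program monopolising the disk.
    writes = [10] * 20
    writes[1] = 500
    print("small buffer:", simulate_capture(40, writes, buffer_frames=4))
    print("large buffer:", simulate_capture(40, writes, buffer_frames=16))
```

In this model a single long stall costs frames when the buffer is small, while a larger buffer absorbs the same stall. That is why sustained streaming matters more than the average data rate.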


  2. Build/Edit

During the editing phase of production, audio, video, and stills are mixed together from various sources.  During video preview, decoders for MP3, MPEG2, AVI, and other formats will be running.  Individually, these decoders do not demand high CPU utilization.  Playback is very smooth - until you throw in additional audio tracks, transitions and special effects where multiple codecs and filters must run simultaneously.  The more complex transitions can be very CPU intensive and usually involve decoding two or more media streams at the same time.  In figure 3, the peaks that are seen every 10 seconds are video transitions.


Figure 3: Video Preview with Transitions

  3. Render

Rendering involves taking the edit decision list and creating a video file on the hard disk. Rendering is very CPU intensive - it can use all the performance you can throw at it and scales well with faster processors. Audio and video encoders run simultaneously during rendering, so this step is well suited to threading and parallelism.
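The structure of running the audio and video encoders side by side can be sketched with two worker threads. The `encode_video` and `encode_audio` functions below are hypothetical placeholders, not real codecs. In CPython, pure-Python CPU-bound work does not actually run in parallel across threads, so this shows only the shape of the design; a real encoder would do its work in native code.

```python
from concurrent.futures import ThreadPoolExecutor


def encode_video(frames):
    """Placeholder video encoder: tags each frame instead of compressing it."""
    return [f"v:{frame}" for frame in frames]


def encode_audio(samples):
    """Placeholder audio encoder: tags each sample block."""
    return [f"a:{block}" for block in samples]


def render(frames, samples):
    """Run both encoders concurrently and return (video, audio) results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        video_job = pool.submit(encode_video, frames)
        audio_job = pool.submit(encode_audio, samples)
        # result() blocks until each encoder has finished.
        return video_job.result(), audio_job.result()


if __name__ == "__main__":
    print(render(["frame0", "frame1"], ["block0"]))
```

Two independent jobs map naturally onto the two logical processors of a Hyper-Threaded package.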


Figure 4: Video Encoding to MPEG2


With the speed of a 3 GHz Pentium® 4 processor and HT technology, it is now possible to encode full resolution NTSC video faster than real-time!  In Figure 4 the source was a 180 second DV video.  Without HT technology, the video was encoded in 136 seconds.  With HT technology enabled, the time to encode MPEG2 decreased to 111 seconds.  For a one-hour video project, the encode time was about 37 minutes.
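The figures in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the function names are illustrative.

```python
SOURCE_S = 180        # length of the DV source clip, in seconds
ENCODE_NO_HT_S = 136  # encode time without HT technology
ENCODE_HT_S = 111     # encode time with HT technology enabled


def realtime_factor(source_s, encode_s):
    """How many seconds of video are encoded per second of wall-clock time."""
    return source_s / encode_s


def projected_encode_minutes(project_s, source_s, encode_s):
    """Scale a measured encode time linearly to a longer project."""
    return project_s * encode_s / source_s / 60.0


if __name__ == "__main__":
    print("real-time factor with HT:",
          round(realtime_factor(SOURCE_S, ENCODE_HT_S), 2))
    print("speedup from HT:", round(ENCODE_NO_HT_S / ENCODE_HT_S, 2))
    print("one-hour project, minutes:",
          round(projected_encode_minutes(3600, SOURCE_S, ENCODE_HT_S), 1))
```

With HT technology enabled, the encoder processes about 1.62 seconds of video per second, roughly 1.23 times the non-HT throughput, and linear scaling gives 37 minutes for a one-hour project.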


One surprising difference between Hyper-Threaded and non-Hyper-Threaded systems is the responsiveness of the user interface. New tasks launch right away and the cursor is rarely an hourglass. The encoding task is spread across both logical processors, and there is plenty of headroom for other applications to run. Without Hyper-Threading Technology, Figure 4 shows the CPU 70% consumed by the video encode task - leaving limited resources for other programs.



  4. Output to media

After video encoding is complete, the next part of the process is to create a disk image and write it to CD or DVD so that it can be distributed and played back on a DVD player connected to a television. This step actually has two major phases, each with many sub-steps that utilize different parts of the system.


In the first phase the video and audio files are converted to the proper format. Depending on the playback target, the format may be MPEG2 (for consumer DVD players), MPEG4 (for posting on the web), or VCD (a lower resolution format for writable CDs). In the following example, the output media is assumed to be high quality, DVD-compatible 720x480 video at 30 frames per second. Figure 5 shows the two phases of the Output cycle for writing a DVD.


Figure 5: Output to DVD


Phase 1 involves transcoding or re-encoding the audio and video streams into a compatible format. This is a CPU intensive process, and hard disk activity is also high as files get read in, modified, and then written back out to disk. Without Hyper-Threading Technology, the CPU is 100% consumed and the encoding phase takes about three times longer.


Phase 2 consists of file operations to prepare the image for burning and then write it to the media. It is not CPU intensive since the rate-limiting step is the CD or DVD burner. Multitasking during optical disk writing can be risky. On older systems, an event known as a “Buffer Under-run” can occur if the CPU is not able to produce data fast enough to keep the disk writer sufficiently stocked.


This problem has been largely overcome: newer drives have larger input buffers (typically 2 MB), and protection mechanisms such as “BURN-Proof” technology ensure the disk gets written properly. Most DVD writers do have buffer under-run protection, but the drives are slower (typically 2x) and DVD writing can take up to an hour. Your system is still available, but you should avoid operations involving heavy disk activity.
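A rough calculation shows why a drive buffer covers only short interruptions and why a full DVD takes so long to write. The sketch below rests on assumptions not stated in the article: a nominal 1x DVD rate of about 1.385 MB/s and a single-layer capacity of 4.7 GB. Buffer size and drive speed are passed in as parameters so other values can be tried.

```python
DVD_1X_BYTES_PER_S = 1_385_000      # assumption: nominal 1x DVD transfer rate
DVD_CAPACITY_BYTES = 4_700_000_000  # assumption: single-layer disc capacity


def stall_tolerance_s(buffer_bytes, speed_x):
    """Seconds a full drive buffer lasts if the host stops sending data."""
    return buffer_bytes / (speed_x * DVD_1X_BYTES_PER_S)


def burn_minutes(data_bytes, speed_x):
    """Minutes needed to write data_bytes at a constant speed multiple."""
    return data_bytes / (speed_x * DVD_1X_BYTES_PER_S) / 60.0


if __name__ == "__main__":
    two_mb = 2 * 1024 * 1024
    print("2 MB buffer at 2x lasts (s):", round(stall_tolerance_s(two_mb, 2), 2))
    print("full disc at 2x (min):", round(burn_minutes(DVD_CAPACITY_BYTES, 2), 1))
    print("full disc at 1x (min):", round(burn_minutes(DVD_CAPACITY_BYTES, 1), 1))
```

Under these assumptions a 2 MB buffer at 2x covers well under a second of stalled input, which is why heavy disk activity during a burn remains worth avoiding.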

