In this article, we presented a set of compilation techniques developed in the Intel high-performance compiler for OpenMP pragma- and directive-guided parallelization. We studied two multimedia applications, SVM and AVSR, to demonstrate that the multithreaded code generated and optimized by the Intel compiler, together with the well-tuned Intel OpenMP runtime library, is very efficient. The performance improvements achieved on the three system configurations (SP+HT, DP, and DP+HT) are good for both applications. The performance results and workload characteristics of SVM and AVSR support our three main observations: (a) the multithreaded code generated by the Intel compiler yields a good performance gain when parallelization is guided by OpenMP pragmas; (b) exploiting thread-level parallelism (TLP) causes inter-thread interference in the caches and places greater demands on the memory system, but Hyper-Threading technology hides the additional latency, so the impact on whole-program performance is very small and is outweighed by the overall performance gain on Hyper-Threading-enabled Intel platforms; (c) Hyper-Threading technology is effective at exploiting both task and data parallelism through functional and data decomposition in multimedia applications.
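To make the two decomposition styles in observation (c) concrete, the following minimal sketch shows how each might be expressed with OpenMP pragmas in C. It is an illustration only and not code from SVM or AVSR: the function names `dot_product`, `sum_of_squares`, and `analyze_streams`, and the audio/video "energy" computation, are hypothetical stand-ins. The pragmas used (`parallel for` with a `reduction` clause, and `parallel sections`) are standard OpenMP constructs.

```c
#include <assert.h>

/* Data decomposition: the iterations of a single loop are divided among
   threads, and the per-thread partial sums are combined by the reduction
   clause. (Hypothetical kernel, not taken from SVM or AVSR.) */
double dot_product(const double *x, const double *y, int n)
{
    double sum = 0.0;
    int i;

    assert(n >= 0);
    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}

/* Serial helper run by each independent task below. */
static double sum_of_squares(const double *v, int n)
{
    double sum = 0.0;
    int i;

    for (i = 0; i < n; i++) {
        sum += v[i] * v[i];
    }
    return sum;
}

/* Functional decomposition: two independent tasks, each placed in its own
   section so that different threads can execute them concurrently.
   The audio/video split is an assumed example of independent work. */
void analyze_streams(const double *audio, int n_audio,
                     const double *video, int n_video,
                     double *audio_energy, double *video_energy)
{
    #pragma omp parallel sections
    {
        #pragma omp section
        *audio_energy = sum_of_squares(audio, n_audio);

        #pragma omp section
        *video_energy = sum_of_squares(video, n_video);
    }
}
```

If the code is compiled without OpenMP support, the pragmas are ignored and it runs serially with the same results, which makes it easy to compare the serial and threaded builds.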
The authors thank all members of the Intel compiler team for their contributions to developing the Intel high-performance compiler. In particular, we thank Paul Grey, Hideki Saito, and Dale Schouten for their contributions to the PAROPT projects; Kund J. Kirkegaard for IPO support; Zia Ansari and Kevin B. Smith for PCG support; Max Domeika and Diana King for C++ FE support; and Bhanu Shankar and Michael Ross for Fortran FE support. Special thanks go to the library team at KSL for developing the OpenMP runtime library. We would also like to thank Steven Ge and Rainer Lienhart for developing the speech recognition workloads.
Xinmin Tian (Xinmin.Tian@intel.com) works on compiler parallelization and optimization. He manages the OpenMP Parallelization group. He holds B.Sc., M.Sc., and Ph.D. degrees in Computer Science from Tsinghua University. He was a postdoctoral researcher in the School of Computer Science at McGill University, Montreal. Before joining Intel Corp., he worked on a parallelizing compiler, code generation, and performance optimization at IBM.
Milind Girkar (Milind.Girkar@intel.com) received a B.Tech. degree from the Indian Institute of Technology, Mumbai, an M.Sc. degree from Vanderbilt University, and a Ph.D. degree from the University of Illinois at Urbana-Champaign, all in Computer Science. Currently, he manages the IA-32 Compiler Development group. Before joining Intel Corp., he worked on an optimizing compiler for the UltraSPARC platform at Sun Microsystems.
Yen-Kuang Chen (Yen-Kuang.Chen@intel.com) is a researcher in Microprocessor Research Labs, Intel Corporation. His research interests include computer architectures for emerging audio-visual applications, innovative technologies for intelligent human-computer interfaces, and multimedia signal processing. He received his B.Sc. degree from National Taiwan University and his Ph.D. degree from Princeton University, both in Electrical Engineering. He is on the editorial board of the Journal of VLSI Signal Processing Systems.
Aart Bik (Aart.Bik@intel.com) received his M.Sc. degree in Computer Science from Utrecht University, The Netherlands, in 1992 and his Ph.D. degree from Leiden University, The Netherlands, in 1996. In 1997, he was a postdoctoral researcher at Indiana University, Bloomington, Indiana, where he conducted research in high-performance compilers for Java*. In 1998, he joined Intel Corporation where he is currently working in the vectorization and parallelization group.
Ernesto Su (Ernesto.Su@intel.com) received a B.Sc. degree from Columbia University, and M.Sc. and Ph.D. degrees from the University of Illinois at Urbana-Champaign, all in Electrical Engineering. He joined Intel Corp. in 1997 and is currently working in the OpenMP Parallelization group. His research interests include compiler performance optimizations, parallelizing compilers, and computer architectures.
(Performance results were measured using specific computer systems and reflect the approximate performance of Intel products. Any difference in system hardware or software design or configuration may affect actual performance.)