Vectorizing for Wider Vector Units in a HW/SW Co-designed Environment

R. Kumar, A. Martinez, A. Gonzalez

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


SIMD accelerators provide an energy-efficient way of improving the computational power of modern microprocessors. Owing to their hardware simplicity, these accelerators have grown in width from the 64-bit vectors of Intel's MMX to the 512-bit vector units of Intel's Xeon Phi. Although SIMD accelerators are simple in terms of hardware design, code generation for them has always been a challenge. This paper explores the scalability of SIMD accelerators from the code generation point of view. We explore the potential problems in vectorization at higher vector lengths. Furthermore, we propose Variable Length Vectorization and Selective Writing in a HW/SW co-designed environment to address these problems. We evaluate our proposals using a set of SPECFP2006 and Physics bench applications. For a 512-bit vector length, our experimental results show that, relative to the scalar baseline code, an average of 33% of dynamic instructions are eliminated for SPECFP2006 and 40% for Physics bench, with average speedups of 15% and 10%, respectively.
Original language: English
Title of host publication: High Performance Computing and Communications 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on
Number of pages: 8
Publication status: Published - 1 Nov 2013


  • hardware-software codesign
  • microprocessor chips
  • parallel processing
  • program compilers
  • storage management
  • HW-SW codesigned environment
  • Intel MMX
  • Intel Xeon Phi
  • Physics bench applications
  • SIMD accelerators
  • SPECFP2006
  • code generation
  • dynamic instruction elimination
  • hardware design
  • hardware simplicity
  • memory operations
  • microprocessors
  • scalar baseline code
  • selective writing
  • variable length vectorization
  • vector length
  • vectorizing
  • wider vector units
  • word length 512 bit
  • Hardware
  • Microprocessors
  • Optimization
  • Proposals
  • Registers
  • Vectors
  • Writing
  • Dynamic optimization
  • HW/SW Co-designed processor
  • Speculation
  • Vectorization

