Distributed modulo scheduling

M.M. Fernandes, J. Llosa, N. Topham

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


Wide-issue ILP machines can be built using the VLIW approach, as many of the hardware complexities found in superscalar processors can be transferred to the compiler. However, the scalability of VLIW architectures is still constrained by the size and number of ports of the register file required by a large number of functional units. Organizations composed of clusters of a few functional units and small private register files have been proposed to deal with this problem, an approach that depends heavily on scheduling and partitioning strategies. The paper presents DMS, an algorithm that integrates modulo scheduling and code partitioning in a single procedure. Experimental results have shown that the algorithm is effective for configurations of up to 8 clusters, or even more when targeting vectorizable loops.
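To make the register-file constraint mentioned in the abstract concrete, the following is a rough sketch of how port counts might grow with the number of functional units. It is not taken from the paper: the figures of two read ports and one write port per functional unit, the `link_ports` allowance for inter-cluster communication, and both function names are illustrative assumptions.

```python
def central_rf_ports(num_fus, reads_per_fu=2, writes_per_fu=1):
    """Ports on one shared register file that serves every functional unit.

    Assumption: each functional unit needs its own read and write ports.
    """
    return num_fus * (reads_per_fu + writes_per_fu)


def clustered_rf_ports(num_fus, num_clusters, reads_per_fu=2,
                       writes_per_fu=1, link_ports=2):
    """Ports on each private register file when FUs are split evenly.

    link_ports is a hypothetical allowance for inter-cluster communication;
    the real figure depends on the interconnect of the target machine.
    """
    if num_clusters <= 0 or num_fus % num_clusters != 0:
        raise ValueError("num_fus must divide evenly into num_clusters")
    fus_per_cluster = num_fus // num_clusters
    return fus_per_cluster * (reads_per_fu + writes_per_fu) + link_ports


if __name__ == "__main__":
    total_fus = 24
    print("central file ports:", central_rf_ports(total_fus))
    for clusters in (2, 4, 8):
        print(clusters, "clusters, ports per private file:",
              clustered_rf_ports(total_fus, clusters))
```

Under these assumptions the central file's port count grows linearly with the total number of functional units, while each private file's port count depends only on the units in its own cluster. The cost moves to the compiler, which must decide where each operation and value lives.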
Original language: English
Title of host publication: High-Performance Computer Architecture, 1999. Proceedings. Fifth International Symposium On
Number of pages: 5
Publication status: Published - 1 Jan 1999


  • distributed algorithms
  • instruction sets
  • parallel architectures
  • parallel programming
  • processor scheduling
  • program compilers
  • DMS
  • VLIW approach
  • VLIW architectures
  • code partitioning
  • compiler
  • distributed modulo scheduling
  • functional units
  • hardware complexities
  • partitioning strategies
  • private register files
  • register file
  • superscalar processors
  • vectorizable loops
  • wide-issue ILP machines
  • Clocks
  • Computer architecture
  • Computer science
  • Data analysis
  • Microprocessors
  • Pipeline processing
  • Processor scheduling
  • Radio frequency
  • Registers
  • VLIW

