An architecture for microprocessor-based systems is provided. The architecture includes a SIMD processing unit and associated systems and methods for optimizing the performance of such a system. In one embodiment, systems and methods for performing systolic array-based block matching are provided. In another embodiment, a parameterizable clip instruction is provided. In an additional embodiment, two pairs of deblocking instructions for use with the H.264 and VC1 codecs are provided. In a further embodiment, an instruction and datapath for accelerating sub-pixel interpolation are provided. In another embodiment, systems and methods for selectively decoupling processor extension logic are provided. In yet another embodiment, systems and methods for recording instruction sequences in a microprocessor having dynamically decoupleable extension logic are provided. In still another embodiment, systems and methods for synchronizing multiple processing engines of a SIMD engine are provided.
Publication status: Published - 2007