Energy-Exposed Instruction Set Architectures

1 Introduction

Energy consumption is emerging as a key factor limiting computational performance in both mobile and connected systems. While there have been significant advances in low-power circuit design and low-power CAD, and some work in low-power microarchitectures, to date there has been little work at the instruction set architecture (ISA) level of design for low-power computing.

Modern ISAs such as RISC or VLIW are based on extensive research into the effects of instruction set design on performance, and they provide a purely performance-oriented hardware-software interface. These instruction sets avoid features that might prevent a high-performance implementation. They also avoid providing alternative ways to perform the same task unless doing so significantly increases performance.

Implementations of these ISAs perform many power-consuming microarchitectural operations during the execution of each user-level instruction, and these operations dominate the total power dissipation. For example, when executing an integer addition instruction on a simple RISC processor, only 1/50 of the total power consumption is due to the adder circuit itself. The rest is dissipated by cache tag and data accesses, TLBs, register files, pipeline registers, and pipeline control logic. Modern machine pipelines have been refined to the point where most of this additional microarchitectural work is performed in a pipelined or parallel manner that does not affect the throughput or user-visible latency of a "simple" add instruction. Since the performance effects of these constituent micro-operations can be hidden, there is no incentive to expose them in a purely performance-oriented hardware-software interface, and so their power consumption is hidden from software as well.

[...] -Energy) at MIT, where we are developing new energy-exposed hardware-software interfaces that give software fine-grained control over energy consumption.
The key idea is to reward compile-time analysis with run-time energy savings. Our projects provide software with alternative methods of performing a task, possibly with the same performance, but where more knowledge at compile time can be used to turn off unnecessary portions of the machine's microarchitecture. We will initially focus on integer applications with complex control flow, as we believe this type of code will become the energy bottleneck in future embedded systems. Other highly regular and/or parallel computations can be easily mapped onto energy-efficient computing structures such as vector/SIMD units [1], reconfigurable units [5], or application-specific coprocessors [3].

The following section provides examples of the techniques we are currently exploring. All simulation numbers we report are for a MIPS R3000-like processor running gcc-compiled SPECint95 binaries. Section 3 discusses the technology we are developing for more sophisticated processor power accounting, and Section 4 concludes.

2 Examples of Energy-Exposed ISA Techniques

Untagged loads and stores: In some cases, static analysis can determine that a memory access will always hit in the cache. For these cases, we provide versions of the load and store instructions that do not check cache tags.
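The effect of skipping the tag check can be illustrated with a small simulation. This is a minimal sketch and not the simulator used for the reported numbers: the cache geometry (`LINE_BYTES`, `NUM_LINES`), the class `DirectMappedCache`, and the helper `sum_line` are all hypothetical names chosen for illustration. It also assumes one particular case that static analysis could plausibly prove safe, namely that later words of an aligned cache line are accessed immediately after a checked load to the same line. The count of tag-array reads serves only as a rough proxy for tag-check energy.

```python
from dataclasses import dataclass, field

LINE_BYTES = 16   # assumed cache line size (hypothetical)
NUM_LINES = 64    # assumed number of lines, direct-mapped (hypothetical)


@dataclass
class DirectMappedCache:
    tags: list = field(default_factory=lambda: [None] * NUM_LINES)
    tag_checks: int = 0   # tag-array reads, a rough proxy for tag energy
    misses: int = 0

    def _index_and_tag(self, addr):
        line = addr // LINE_BYTES
        return line % NUM_LINES, line // NUM_LINES

    def load(self, addr):
        """Ordinary load: reads and compares the tag, refills on a miss."""
        index, tag = self._index_and_tag(addr)
        self.tag_checks += 1
        if self.tags[index] != tag:
            self.misses += 1
            self.tags[index] = tag

    def load_unchecked(self, addr):
        """Load that skips the tag check; legal only if the line is resident."""
        index, tag = self._index_and_tag(addr)
        # Simulation-only sanity check of the compile-time guarantee.
        # Real hardware would not perform this comparison.
        assert self.tags[index] == tag, "compile-time guarantee violated"


def sum_line(cache, base, words=4, unchecked=False):
    """Touch `words` consecutive 4-byte words starting at a line-aligned base."""
    cache.load(base)  # first access is checked and brings the line in
    for i in range(1, words):
        addr = base + 4 * i
        if unchecked:
            cache.load_unchecked(addr)
        else:
            cache.load(addr)


baseline = DirectMappedCache()
exposed = DirectMappedCache()
for base in range(0, 1024, LINE_BYTES):
    sum_line(baseline, base)
    sum_line(exposed, base, unchecked=True)

print(baseline.tag_checks, exposed.tag_checks)  # prints "256 64"
```

Both runs incur the same 64 misses, so performance in this toy model is unchanged; only the number of tag-array reads differs. How much energy this saves on a real machine depends on the cache organization and is not something this sketch can estimate.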