Jeff Bier’s Impulse Response―Using More Transistors to Save Energy

Submitted by Jeff Bier on Thu, 11/17/2011 - 05:42

I’m frequently amazed by people who can take whatever random material happens to be in good supply and make something useful out of it. Consider this clever gentleman, for example, who made a solar water heater from beer bottles. These days, thanks to continuing advances in chip fabrication, one thing that’s in abundant supply is transistors. Over the past few years, quite a few chips with transistor counts over one billion have gone into production.

Generally speaking, more transistors mean higher power consumption. The power consumption of an individual transistor tends to go down as transistor sizes shrink, but not necessarily by a large enough factor to offset the effects of higher transistor density and higher operating frequencies. And lots of power consumption in a small space can be a problem, limiting performance and requiring cooling fans and heat sinks. But in an increasing number of cases, clever engineers are actually finding ways to use more transistors to reduce power consumption.
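To make that arithmetic concrete, here's a rough back-of-the-envelope sketch in C. It leans only on the standard dynamic-power approximation (power scales with transistor count, switching activity, capacitance, voltage squared, and clock frequency); every specific number is invented for illustration and doesn't describe any real process node.

/* Back-of-the-envelope illustration of why shrinking transistors
 * doesn't automatically shrink chip power.  All numbers are
 * hypothetical; the only real relationship used is the standard
 * dynamic-power approximation  P ~ N * alpha * C * V^2 * f.      */
#include <stdio.h>

static double chip_power(double n_transistors, double alpha,
                         double cap_per_transistor, double vdd,
                         double freq_hz)
{
    return n_transistors * alpha * cap_per_transistor * vdd * vdd * freq_hz;
}

int main(void)
{
    /* "Old" process node: fewer, larger, slower transistors.        */
    double p_old = chip_power(500e6, 0.1, 1.0e-15, 1.1, 1.0e9);

    /* "New" node: per-transistor capacitance and voltage both drop,
     * but transistor count and clock frequency rise.                */
    double p_new = chip_power(1.5e9, 0.1, 0.6e-15, 1.0, 1.5e9);

    printf("old node: %.1f W\n", p_old);
    printf("new node: %.1f W\n", p_new);
    return 0;
}

With these made-up figures, the newer node comes out at roughly twice the power of the older one, even though each individual transistor switches less capacitance at a lower voltage.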

One increasingly popular way to convert more transistors into lower energy consumption is through architecture specialization—that is, creating processing blocks that are purpose-built to handle certain tasks. By virtue of their specialization, these blocks can execute their specific tasks with lower energy consumption (often much lower) than a general-purpose architecture would.

For example, while modern GPU chips offer enormous computing power, and provide programmability to enable their use for many tasks beyond 3-D graphics, GPU manufacturers generally don’t use the GPU itself for video decoding tasks, instead integrating a specialized processing engine that is used exclusively for video decoding. Similarly, modern DSP processors offload straightforward processing work for common tasks like audio sample rate conversion and forward error correction decoding to dedicated co-processors. And mobile application processors like TI’s OMAP and Qualcomm’s Snapdragon, which are tasked with delivering enormous compute performance with very low power consumption, rely on a constellation of coprocessors including DSPs, GPUs, and video compression engines to offload their main CPUs.

Another interesting approach is to design chip elements so that they run in a high-performance, high-power mode only when needed, and the rest of the time in a low-performance, low-power mode. This technique can be applied at various levels of granularity. For example, in some Altera FPGAs, individual logic blocks are configured by the logic synthesis tool to use high-performance or low-power mode as needed.
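To give a flavor of how that per-block decision might be made, here's a simplified sketch in C. It is not Altera's actual tool flow; the data structure and threshold are invented. But the underlying idea is real: blocks that meet timing in low-power mode stay there, and only blocks that would miss timing get the faster, leakier configuration.

/* Sketch of the kind of decision a synthesis tool might make when a
 * device offers per-block high-performance and low-power modes.
 * The structures and thresholds here are invented for illustration,
 * not taken from any vendor's tools.                                */
#include <stdio.h>

enum block_mode { LOW_POWER, HIGH_PERFORMANCE };

struct logic_block {
    const char *name;
    double slack_ns;   /* timing slack if the block runs in low-power mode */
};

/* Blocks with positive slack in low-power mode can stay there;
 * blocks that would violate timing get the faster, hungrier mode.   */
static enum block_mode choose_mode(const struct logic_block *b)
{
    return (b->slack_ns >= 0.0) ? LOW_POWER : HIGH_PERFORMANCE;
}

int main(void)
{
    struct logic_block blocks[] = {
        { "fir_filter_datapath", -0.3 },  /* on the critical path */
        { "status_registers",     2.1 },  /* plenty of slack      */
        { "uart_controller",      4.7 },
    };

    for (int i = 0; i < 3; i++) {
        enum block_mode m = choose_mode(&blocks[i]);
        printf("%-22s -> %s\n", blocks[i].name,
               m == LOW_POWER ? "low-power" : "high-performance");
    }
    return 0;
}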

At a much larger granularity, as we reported in InsideDSP last month, NVIDIA’s Tegra 3 "quad-core" application processor actually incorporates a fifth CPU core. This lower-performance, lower-power core can be used (and the higher-power cores can be powered down) when a tablet or smartphone is lightly loaded, such as when the user is talking on the phone or listening to music. And ARM’s recent embrace of this approach, which it calls "big.LITTLE," will undoubtedly speed its adoption as a mainstream technique in CPU-intensive mobile chips.
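Here's a toy sketch, again in C, of the kind of policy such a chip might use to decide when the companion core is enough. The thresholds, hysteresis, and load trace are all made up, and real governors are considerably more sophisticated, but the shape of the decision is the same: stay on the small core until load stays high, and drop back down once the burst is over.

/* Toy illustration of the companion-core idea behind Tegra 3 and
 * big.LITTLE: run on a small, low-power core when demand is light,
 * and wake the fast cores only when load climbs.  The thresholds,
 * hysteresis, and load trace are invented for illustration; this is
 * not NVIDIA's or ARM's actual switching logic.                     */
#include <stdio.h>

enum cluster { LITTLE_CORE, BIG_CORES };

/* Simple hysteresis: switch up above 70% load, back down below 30%,
 * so the system doesn't ping-pong between clusters.                 */
static enum cluster next_cluster(enum cluster current, double load)
{
    if (current == LITTLE_CORE && load > 0.70)
        return BIG_CORES;
    if (current == BIG_CORES && load < 0.30)
        return LITTLE_CORE;
    return current;
}

int main(void)
{
    /* Hypothetical load samples: music playback, a brief web-page
     * render burst, then back to near idle.                         */
    double load[] = { 0.10, 0.15, 0.85, 0.90, 0.60, 0.20, 0.05 };
    enum cluster c = LITTLE_CORE;

    for (int i = 0; i < 7; i++) {
        c = next_cluster(c, load[i]);
        printf("load %.2f -> %s\n", load[i],
               c == LITTLE_CORE ? "companion core" : "main cores");
    }
    return 0;
}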

So, in an interesting turnabout, the energy efficiency challenge posed by cramming more transistors into a tiny chip can often be solved by—yes—cramming even more transistors into a chip. Of course, there’s no free lunch: while these added transistors can reduce overall chip power, they obviously add to the size of the chip, and therefore to cost. And they also add complexity—complexity for chip designers, to be sure, and also, in many cases, complexity for software developers. For chip companies (and, in an increasing number of cases, chip users), managing those complexities is a critical challenge.

What do you think?  Post a comment or send me your feedback at http://www.BDTI.com/Contact. I’d love to hear from you!
