Jeff Bier’s Impulse Response—Multiprocessor Migraines?

Submitted by Jeff Bier on Mon, 04/12/2004 - 16:00

Once upon a time, most signal processing applications were powered by single-processor chips. Today, though, there is an increasing trend towards using complex, heterogeneous multiprocessor chips. One such chip is Texas Instruments’ recently announced OMAP 2, which contains a microprocessor core, a DSP core, and multiple application-specific coprocessors.

The primary motivation for this shift is not processing speed; the fastest uniprocessor DSPs and general-purpose microprocessors are fast enough for many performance-hungry signal-processing applications. Instead, multi-processor chips are gaining popularity because these chips deliver not only strong computational speed but also prices suitable for high-volume applications and (at least in some cases) energy efficiency suitable for battery-powered systems.

The obvious down-side of heterogeneous multi-processor chips is that they are more complex. Instead of a single processor architecture and a single tool chain, users have to contend with learning multiple architectures and tool chains. Many chip vendors attempt to ease this process by providing software and middleware so that the user doesn’t have to develop all of the code from scratch. Nevertheless, users will likely need to develop, optimize, and debug some software for each processor on the chip.

Perhaps even more daunting is the task of partitioning an application across different processors. Unlike “channelized” applications, which often use multiple instances of the same processor to execute the same software in parallel, applications that use heterogeneous multi-core chips run different portions of the application on different processors. So which portion goes on which processor?

Ideally, you’d like to run each block of code on the processor that’s most efficient for that block. But it may not be obvious which processor is most efficient until you’ve actually implemented and optimized each block on each processor, which is not a viable approach. In addition, you’ll have to consider the overhead of inter-processor communication, which may mean that certain blocks are best mapped to a less-efficient processor to avoid excessive shuttling of data.
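To make that trade-off concrete, here is a minimal sketch of a toy cost model. Everything in it is an assumption made up for illustration: the block names, the per-block cost estimates, the fixed transfer cost, and the idea that the application is a simple linear pipeline running on one microprocessor core ("cpu") and one DSP core ("dsp"). None of the numbers are measurements from a real chip. The sketch compares a naive mapping, which puts each block on its individually fastest processor, against a brute-force search that also charges for every hand-off between processors.

```python
from itertools import product

# Hypothetical per-block cost estimates in arbitrary units (not measured data).
BLOCK_COST = {
    "parse": {"cpu": 2, "dsp": 8},
    "fir":   {"cpu": 20, "dsp": 3},
    "gain":  {"cpu": 1, "dsp": 2},
    "fft":   {"cpu": 25, "dsp": 4},
    "pack":  {"cpu": 2, "dsp": 8},
}
PIPELINE = ["parse", "fir", "gain", "fft", "pack"]  # blocks run in this order
PROCESSORS = ("cpu", "dsp")
TRANSFER_COST = 4  # hypothetical cost of one hand-off between processors


def total_cost(mapping, transfer_cost=TRANSFER_COST):
    """Compute cost plus communication cost for one block-to-processor mapping."""
    compute = sum(BLOCK_COST[block][proc] for block, proc in zip(PIPELINE, mapping))
    # A hand-off occurs wherever two adjacent blocks sit on different processors.
    hops = sum(1 for a, b in zip(mapping, mapping[1:]) if a != b)
    return compute + hops * transfer_cost


def greedy_mapping():
    """Put each block on its individually cheapest processor, ignoring transfers."""
    return tuple(min(BLOCK_COST[block], key=BLOCK_COST[block].get) for block in PIPELINE)


def best_mapping(transfer_cost=TRANSFER_COST):
    """Exhaustively search all mappings; only practical for a handful of blocks."""
    candidates = product(PROCESSORS, repeat=len(PIPELINE))
    return min(candidates, key=lambda m: total_cost(m, transfer_cost))


if __name__ == "__main__":
    for label, mapping in (("greedy", greedy_mapping()), ("best", best_mapping())):
        print(label, dict(zip(PIPELINE, mapping)), "cost:", total_cost(mapping))
```

With these made-up numbers, the naive mapping bounces the small "gain" block back to the microprocessor and pays for two extra hand-offs, while the search keeps it on the DSP even though the DSP is nominally slower for that block. In practice the hard part is what the sketch takes for granted: obtaining trustworthy cost estimates without implementing every block on every processor.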

When heterogeneous multi-processor chips provide significantly better performance, energy efficiency, or cost than single-processor alternatives, system developers will put up with some added complexity. But it is up to the chip vendors to provide the needed infrastructure to help make that complexity manageable. 
