Tensilica ConnX BBE Combines SIMD, VLIW for Baseband Performance

Submitted by BDTI on Wed, 07/22/2009 - 18:00

Last month Tensilica unveiled the first member of its new “ConnX” family of licensable DSP cores, the ConnX Baseband Engine (BBE), which combines VLIW with SIMD to support a wide range of parallel operations. As part of the announcement, Tensilica has also rebranded two of its existing products: the Diamond 545CK core and Vectra DSP engine are now known as the ConnX 545CK and ConnX Vectra, respectively. Tensilica says it has a lead ConnX BBE customer that taped out a chip in June; the core will be available for license in September. Speed, area, and power consumption have not yet been disclosed. 

The BBE core is a specific configuration of the Xtensa LX base processor. Tensilica supports up to eight parallel BBE cores, yielding a scalable multicore solution that targets baseband applications ranging from handsets to base stations.

The ConnX BBE core supports three-way VLIW instructions that can specify up to two load/stores plus two arithmetic instructions, which can be MAC or ALU operations. The BBE supports 8-way SIMD MAC and ALU instructions and can, in a few cases, execute sixteen parallel 18x18-bit MACs. The core does not, however, support any 16-way SIMD ALU operations.

Combining VLIW with SIMD provides more flexibility (and therefore typically better performance) than pure SIMD machines, which require all parallel operations to be identical. Tensilica isn’t the only licensable core vendor to use VLIW plus SIMD for baseband; the recently announced high-performance CEVA-XC core family includes one, two, or four 256-bit (16-way) SIMD engines, each of which supports 3-way VLIW.
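To make the distinction concrete, here is a toy software model of the two ideas. It is only an illustrative sketch: the helper names `simd` and `vliw_bundle` are our own inventions, the lane count is arbitrary, and nothing here reflects the actual instruction set of the Tensilica or CEVA cores.

```python
import operator

def simd(op, a, b):
    """Pure SIMD: ONE operation is applied to every lane in lockstep."""
    return [op(x, y) for x, y in zip(a, b)]

def vliw_bundle(slots):
    """VLIW: several independent operations issue together.

    Each slot is an (op, a, b) tuple and may use a different operation,
    and each slot is itself a SIMD operation across its lanes.
    """
    return [simd(op, a, b) for op, a, b in slots]

a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

# Pure SIMD: all four lanes multiply.
products = simd(operator.mul, a, b)

# VLIW plus SIMD: a vector multiply and a vector add issue side by side.
bundle = vliw_bundle([
    (operator.mul, a, b),
    (operator.add, a, b),
])
```

In the pure SIMD case, a kernel that needs a multiply and an add on different data would take two steps; in the combined model both fit in one bundle, which is the flexibility described above.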

One key difference between the CEVA and Tensilica solutions is that the CEVA-XC is a single scalable core, while the ConnX BBE is a scalable multi-core solution. In Tensilica’s case, the designer adds more BBE cores to increase performance, while CEVA customers choose a different core variant. At the high end of the two families, Tensilica offers higher parallelism (e.g., 128 parallel MACs vs. 64 on the high-end CEVA-XC641)—but CEVA claims that its single-core approach is simpler and avoids the challenges of multi-core programming, particularly that of partitioning the application across processors.
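The parallelism figures follow from simple multiplication, sketched below. The Tensilica numbers (eight cores, sixteen MACs per core in the best case) come from the vendor's description. For CEVA, we are assuming that the 64-MAC figure corresponds to four 16-way SIMD engines each contributing sixteen MACs; that breakdown is our inference, not a vendor statement. The function name is hypothetical.

```python
def peak_parallel_macs(units: int, macs_per_unit: int) -> int:
    """Peak MACs per cycle when every unit runs its widest MAC operation."""
    return units * macs_per_unit

# ConnX BBE: up to eight cores, each with up to sixteen parallel MACs.
bbe_peak = peak_parallel_macs(units=8, macs_per_unit=16)

# High-end CEVA-XC: four SIMD engines, assumed sixteen MACs each.
ceva_peak = peak_parallel_macs(units=4, macs_per_unit=16)

print(f"ConnX BBE peak: {bbe_peak} MACs/cycle")
print(f"CEVA-XC peak:   {ceva_peak} MACs/cycle")
```

These are peak counts only. Since the BBE reaches sixteen MACs in just a few cases, sustained throughput on real code will be lower, and the multi-core figure also assumes the application partitions cleanly across all eight cores.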

In contrast to the CEVA and Tensilica solutions, NXP’s recently announced licensable baseband core, the CoolFlux BSP, does not support VLIW at all and implements much narrower SIMD capabilities: the core can execute a maximum of four parallel operations. Unlike the Tensilica and CEVA cores, the NXP core is designed exclusively for handset applications, and thus trades off fire-breathing performance for a very small footprint and low power.

There’s another notable entry in this class of cores, ARM’s “Ardbeg” vector processor (which appears to be an R&D prototype, not a product—information on the core is somewhat spotty and ARM did not respond to our request for additional details). Ardbeg is designed to be used as a modem coprocessor alongside a Cortex-R4 core. Each Ardbeg coprocessor can support up to 32-way 16-bit SIMD operations with limited two-way VLIW capabilities.

As you’d expect from a customizable processor company, Tensilica has equipped the BBE core with a large number of custom instructions (200) to accelerate baseband processing, including extensive support for operations on complex data. The core can execute, for example, four complex FIR filter taps per cycle. The ConnX BBE does not, however, include specialized support for Viterbi/turbo decoding; Tensilica is working on another solution for that. According to Tensilica, the core’s compiler supports SIMD vectorization of ANSI C code, and allows direct access to the core’s features via embedded functions.
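The four-taps-per-cycle figure lines up with the sixteen-MAC peak: one complex multiply-accumulate takes four real multiplies, so four complex taps need sixteen. The sketch below is a plain reference model of a complex FIR filter written with real arithmetic so the multiplies can be counted. It says nothing about how the BBE actually schedules the work, and the function name and test data are made up for illustration.

```python
def complex_fir(samples, taps):
    """Direct-form complex FIR on (re, im) pairs.

    Returns the fully overlapped outputs and the total number of
    real multiplies performed.
    """
    outputs = []
    real_multiplies = 0
    for n in range(len(taps) - 1, len(samples)):
        acc_re, acc_im = 0, 0
        for k, (h_re, h_im) in enumerate(taps):
            x_re, x_im = samples[n - k]
            # (x_re + j*x_im) * (h_re + j*h_im): four real multiplies
            acc_re += x_re * h_re - x_im * h_im
            acc_im += x_re * h_im + x_im * h_re
            real_multiplies += 4
        outputs.append((acc_re, acc_im))
    return outputs, real_multiplies

# Hypothetical data: five complex samples, four complex taps.
samples = [(1, 2), (3, -1), (0, 4), (2, 2), (-1, 1)]
taps = [(1, 0), (0, 1), (2, -1), (1, 1)]

outputs, real_multiplies = complex_fir(samples, taps)
multiplies_per_output = real_multiplies // len(outputs)
print(f"Real multiplies per output with {len(taps)} taps: {multiplies_per_output}")
```

With four taps, each output sample costs sixteen real multiplies, which is the amount of work a sixteen-MAC datapath could in principle retire in one cycle.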

Given the lack of clock speed data, it’s difficult to estimate the performance of the ConnX BBE core. BDTI has, however, benchmarked the earlier Diamond 545CK core (now the ConnX 545CK), which supports three-way VLIW and eight-way SIMD.  At 245 MHz the 545CK core received the highest BDTIsimMark2000™ score (4070) of any licensable core BDTI has benchmarked to date.

Tensilica is one of several vendors heading in the same direction, with licensable, programmable cores targeting 3.5G and 4G baseband. This trend can be attributed to several factors. The market for baseband processors is growing, as volumes increase and picocell and femtocell base stations become more popular. This entices more processor vendors (both core and chip vendors) to go after a piece of the pie. In parallel, base station vendors who have been using their own homegrown processing engines may be finding that the cost of developing and maintaining a processor is becoming painful. Such companies could buy a chip from a company like TI, but this solution may not give them sufficient opportunity for differentiation or meet their performance requirements.  Licensing a scalable, programmable core from a well-established company such as Tensilica or CEVA may provide just the right combination of customization and performance.
