BDTI Releases Benchmark Results for Massively Parallel picoChip PC102


BDTI has released the first independent benchmark results comparing the performance of picoChip’s massively parallel PC102 chip to that of high-performance DSP processors and FPGAs.

picoChip is a fabless semiconductor company that sells multi-core chips for wireless infrastructure applications, such as WiMax base stations. The PC102 is based on picoChip’s multiple-instruction, multiple-data (MIMD) architecture and contains 308 heterogeneous processor cores and 14 co-processors, all of which run at 160 MHz.

BDTI evaluated the PC102’s performance using the BDTI Communications Benchmark (OFDM)™, an application-oriented benchmark based on an orthogonal frequency division multiplexing (OFDM) receiver, as shown in the block diagram below. It is representative of the baseband processing found in many current and emerging wired and wireless communications applications.

Figure 1. Block diagram for the BDTI Communications Benchmark (OFDM)™.

 

BDTI has used the BDTI Communications Benchmark (OFDM) to evaluate a range of processing engines that target communications applications, including traditional, high-performance DSP processors and high-performance, DSP-oriented FPGAs. 

For this benchmark, BDTI reports two sets of results: low-cost results, which are optimized to provide the lowest cost per channel; and high-capacity results, which are optimized to accommodate the maximum number of channels per chip. A chip vendor may use two different chips to generate these two results. 

High-capacity benchmark results for the PC102 and other selected chips are shown in Table 1; low-cost results are shown in Table 2. Because picoChip has so far benchmarked only one of its chips, its high-capacity and low-cost results are the same.

Additional BDTI Communications Benchmark (OFDM) results are available at /Resources/BenchmarkResults/OFDM.

 

 

                 

 

| Chip | Clock speed | Chip cost (Qty 1K) | Maximum channels supported | Cost per channel |
|---|---|---|---|---|
| picoChip PC102 | 160 MHz | $95.00 | 14 | $6.79 |
| TI TMS320C6455 (without using Viterbi co-processor) | 1 GHz | $292.67 | 1.09 | $268.50 |
| TI TMS320C6455 (estimated results for chip using Viterbi co-processor) | 1 GHz | $292.67 | <=1.8 | >=$162.60 |
| Xilinx Virtex-4 FX140 | -11 speed grade | $1,286.00 | 432 | $2.97 |

Table 1. BDTI Certified high-capacity results for the BDTI Communications Benchmark (OFDM). Data for chips from Xilinx, Altera, and Texas Instruments excerpted from "FPGAs for DSP, Second Edition" © 2006. Results © 2006-2007 BDTI.

 
| Chip | Clock speed | Chip cost (Qty 1K) | Maximum channels supported | Cost per channel |
|---|---|---|---|---|
| picoChip PC102 | 160 MHz | $95.00 | 14 | $6.79 |
| Altera Stratix II 2S15 | Slowest speed grade | $55.00 | 20 | $2.75 |
| TI TMS320C6410 | 400 MHz | $14.95 | 0.31 | $48.23 |
| Xilinx Virtex-4 SX25 | Slowest speed grade | $89.00 | 64 | $1.39 |

Table 2. BDTI Certified low-cost results for the BDTI Communications Benchmark (OFDM). Data for chips from Xilinx, Altera, and Texas Instruments excerpted from "FPGAs for DSP, Second Edition" © 2006. Results © 2006-2007 BDTI.
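The cost-per-channel column in both tables appears to be the chip cost divided by the maximum number of channels supported. The short Python sketch below recomputes it from the published figures. The function and dictionary names are illustrative and not part of any BDTI tooling, and the formula is inferred from the table values rather than stated by BDTI. Recomputed values can differ from a published figure by a cent (the FX140 is one such case), which may reflect truncation rather than rounding.

```python
# Chip cost (USD, quantity 1K) and maximum channels supported,
# copied from Tables 1 and 2. Names here are illustrative only.
RESULTS = {
    "picoChip PC102": (95.00, 14),
    "TI TMS320C6455 (no Viterbi co-processor)": (292.67, 1.09),
    "Xilinx Virtex-4 FX140": (1286.00, 432),
    "Altera Stratix II 2S15": (55.00, 20),
    "TI TMS320C6410": (14.95, 0.31),
    "Xilinx Virtex-4 SX25": (89.00, 64),
}


def cost_per_channel(chip_cost, max_channels):
    """Return the chip cost divided by the channels the chip supports."""
    if max_channels <= 0:
        raise ValueError("max_channels must be positive")
    return chip_cost / max_channels


# Print the recomputed cost per channel for every chip in the tables.
for chip, (cost, channels) in RESULTS.items():
    print(f"{chip}: ${cost_per_channel(cost, channels):.2f} per channel")
```

A fractional channel count such as 0.31 still yields a cost per channel, but it means the chip cannot sustain even one full channel on its own.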

As shown in Table 1, the 160 MHz PC102 is able to handle 14 channels of BDTI’s OFDM benchmark. The PC102 high-capacity results fall between those of the high-performance FPGA and the high-performance DSP processor in terms of the number of channels supported and the associated cost per channel. The Texas Instruments high-performance ’C6455 can handle only one channel of the benchmark, which means that the system designer would need to use multiple ’C64x devices to implement a multi-channel application, or (more likely) a combination of ’C64x devices plus an FPGA. The Xilinx FX140, on the other hand, can handle many more channels than the PC102. The FX140 is much more expensive than the PC102 but has a lower cost per channel. (As mentioned earlier, the high-capacity results are optimized for maximum channels rather than minimum cost per channel.)
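To make the multi-chip point concrete, here is a rough Python sketch of how many chips a multi-channel system would need. It rests on an assumption that is not part of the benchmark: each channel must run entirely on one chip, so only the whole-number part of a chip's channel capacity is usable. The function name and the 14-channel target are hypothetical, and the totals cover chip cost only, not boards, power, or interconnect.

```python
import math


def chips_needed(target_channels, max_channels_per_chip):
    """Chips required when every channel must fit entirely on one chip.

    Assumption (not from the benchmark): channels cannot be split
    across chips, so fractional capacity is rounded down.
    """
    whole_channels = math.floor(max_channels_per_chip)
    if whole_channels < 1:
        raise ValueError("chip cannot host a complete channel on its own")
    return math.ceil(target_channels / whole_channels)


target = 14  # hypothetical system size, chosen to match one PC102

c6455_chips = chips_needed(target, 1.09)  # capacity from Table 1
pc102_chips = chips_needed(target, 14)    # capacity from Table 1

# Chip-only cost at the quantity-1K prices listed in Table 1.
print(c6455_chips, round(c6455_chips * 292.67, 2))
print(pc102_chips, round(pc102_chips * 95.00, 2))
```

Under this assumption, a 14-channel design needs 14 of the DSP processors against a single PC102, which is the gap the paragraph above describes.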

The low-cost results shown in Table 2 indicate that the PC102 again falls between the DSP processor and the FPGAs in terms of the number of channels supported and cost per channel.

Interestingly, the PC102 benchmark implementation does not simply replicate a single channel implementation across the chip. Instead, the benchmark implementers used several different channel implementations to maximize chip resource utilization and fit as many channels as possible onto the chip. Some implementations use the hardware co-processors, others are coded exclusively in software, and others use a mixture of the two. This approach is very effective (none of the processors are idle), but it requires the programmer to create and test multiple implementations of the same functionality. That probably wouldn't be the case with a 'C64x or FPGA-based implementation.

As we've written about in the past, picoChip is one of the few massively parallel chip vendors that are shipping chips to customers in volume. The PC102's benchmark results indicate that it fills a gap between high-performance DSP processors and high-performance FPGAs, which may make it an attractive alternative to these more traditional approaches.