Altera Hones DSP Capabilities with Stratix III FPGAs

Submitted by BDTI on Wed, 12/13/2006 - 20:00

In November, Altera announced the Stratix III family, its next generation of field-programmable gate arrays (FPGAs). The new devices will be fabricated in a 65 nm process and feature a number of significant architectural changes. To reduce power consumption, Altera has introduced “Programmable Power Technology,” which allows blocks of logic that don’t need to run at maximum speed to run in a slower, low-power mode. The sizes of the hard-wired memory blocks have changed relative to those in the predecessor Stratix II chips, and support has been added for “distributed RAM,” which allows logic cells to be configured as small blocks of dual-ported RAM. The hard-wired DSP blocks have also been enhanced, with support for 12x12-bit multiplies and new rounding and saturation modes.

The new Stratix III devices, which target the growth markets of broadcast video, surveillance, medical imaging, and military, reflect Altera’s continued focus on high-performance DSP applications. This focus is not surprising given the growing success of FPGAs in DSP applications in recent years. A recent bulletin from market research firm Forward Concepts estimates that Altera and Xilinx, the two largest FPGA vendors, each had DSP FPGA revenues in excess of $200 million in 2005, selling more non-cell-phone DSP silicon than Freescale and Agere. BDTI’s recent report, FPGAs for DSP, makes the advantages of FPGAs for high-performance DSP applications clear. In the study, the BDTI Communications Benchmark (OFDM)™ was implemented on high-performance FPGAs from Altera and Xilinx (Stratix II and Virtex-4), as well as on a Texas Instruments C64x DSP. While the FPGAs were more expensive than the DSP, many more OFDM channels could be implemented on each FPGA, resulting in a significantly lower cost per channel for the FPGAs. The key disadvantage of FPGAs highlighted in BDTI’s study is the much higher development effort required to implement the benchmark on an FPGA than on a DSP.

Among the new features in Stratix III, the enhancements to the DSP block, shown in Figure 1, are of particular interest. One notable change is that the new DSP blocks are divided into two “half DSP blocks.” Each half DSP block is essentially an updated version of the full DSP block found in Stratix II. Support has been added for 12x12-bit multiplication to accommodate video standards that use 10-bit precision, such as HD-SDI and SD-SDI. A half block can now be configured to support six 12x12-bit multiplies, four 18x18-bit multiplies, eight 9x9-bit multiplies, or one 36x36-bit multiply. Hard-wired logic has also been added to support popular rounding and saturation modes. Another significant change is the addition of an adder following the three-adder/subtractor tree, which enables cascading of half DSP blocks without using intervening reconfigurable logic. This allows for more efficient implementations of filter structures that rely on cascaded DSP blocks, such as systolic FIR filters.
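To make the cascading and rounding/saturation ideas concrete, here is a minimal behavioral sketch in Python. It is not Altera’s implementation and is not cycle-accurate: the function names, the round-half-up mode, and the 18-bit default output width are assumptions chosen for illustration. Each loop iteration stands in for one multiply-add stage that receives the running sum from the previous stage, much as cascaded half DSP blocks pass a partial sum along a chain.

```python
def round_half_up(value, shift):
    """Drop `shift` low-order bits, rounding to nearest (ties toward +infinity)."""
    if shift == 0:
        return value
    return (value + (1 << (shift - 1))) >> shift


def saturate(value, bits):
    """Clamp `value` to the signed two's-complement range of `bits` bits."""
    hi = (1 << (bits - 1)) - 1
    lo = -(1 << (bits - 1))
    return max(lo, min(hi, value))


def cascaded_fir(samples, coeffs, shift=0, out_bits=18):
    """Functional model of an FIR filter built from chained multiply-add stages.

    `shift` and `out_bits` are hypothetical parameters for this sketch: the
    full-precision sum is rounded by `shift` bits, then saturated to `out_bits`.
    """
    history = [0] * len(coeffs)  # delay line, newest sample first
    out = []
    for x in samples:
        history = [x] + history[:-1]
        acc = 0  # chain input of the first stage
        for c, h in zip(coeffs, history):
            # One stage: multiply, then add the sum handed over by the
            # previous stage (the cascade path).
            acc = acc + c * h
        out.append(saturate(round_half_up(acc, shift), out_bits))
    return out
```

For example, `cascaded_fir([1, 0, 0, 0], [1, 2, 3])` returns the impulse response `[1, 2, 3, 0]`, and a sum that exceeds the 18-bit signed range is clamped to 131071 rather than wrapping around.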

 

Figure 1.  Simplified DSP Block Architecture


The design of the DSP block is an interesting point of comparison between Stratix III and its main competitor, Xilinx’s Virtex-5 family, announced in May. The added ability in Stratix III to cascade DSP blocks without external logic eliminates one notable difference between previous blocks from Altera and Xilinx (Xilinx has supported cascading of DSP blocks since Virtex-4). In previous-generation Stratix devices, summing the outputs of DSP blocks required logic implemented in the surrounding fabric, which reduced the operating frequency. Altera’s new rounding and saturation unit will also make the Altera and Xilinx DSP blocks more similar functionally; in the Xilinx blocks, rounding and saturation are implemented using the hard-wired ALU in the datapath. In general, Altera, with four multipliers and four adder/subtractors in each DSP block, continues to take a more coarse-grained approach geared towards functions involving sums of products. Xilinx, with a single multiplier followed by a simple ALU in each DSP block, continues to take a more fine-grained approach. It is difficult to generalize about which block will be more efficient, as both offer considerable DSP capabilities and the utilization of those capabilities will be highly dependent on the application.
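The difference in granularity can be illustrated with a toy Python model. This is a sketch under loud assumptions, not a description of either vendor’s silicon: the function names are hypothetical, and the block-count estimate ignores bit widths, pipelining, and routing entirely. It only shows that a four-multiplier adder tree and a chain of single multiply-add units compute the same sum of products, while mapping onto hardware differently.

```python
import math


def sum_of_products_tree(a, b):
    """Coarse-grained style: four multipliers feeding a two-level adder tree."""
    assert len(a) == len(b) == 4
    p = [x * y for x, y in zip(a, b)]
    return (p[0] + p[1]) + (p[2] + p[3])


def sum_of_products_chain(a, b):
    """Fine-grained style: single multiply-add units chained one after another."""
    acc = 0
    for x, y in zip(a, b):
        acc = acc + x * y  # each unit adds its product to the incoming sum
    return acc


def blocks_needed(num_products, multipliers_per_block):
    """Toy estimate (assumption): blocks required if only multipliers are the limit."""
    return math.ceil(num_products / multipliers_per_block)
```

Under this simplified model, a five-term sum of products needs `blocks_needed(5, 4)` = 2 four-multiplier blocks, leaving three multipliers idle, or `blocks_needed(5, 1)` = 5 single-multiplier blocks with none idle. A four-term sum fits exactly one coarse-grained block. This is one way to see why utilization depends so heavily on the application.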

The architectural enhancements in the new Stratix III devices, along with the increased logic density enabled by migration to a 65 nm process, will give a significant boost to the already impressive DSP capabilities of the Stratix line of FPGAs. This will likely spur Altera’s continued growth in high-performance DSP applications. However, the large development effort required to map applications to FPGAs and the relative lack of DSP application experts with FPGA design expertise will remain barriers as FPGAs move into new markets.
