BDTI’s DSP Insider Archives
This month:
Stretch Announces Software-Configurable Processor Architecture

Last month Stretch Inc. announced the availability of an unusual new processor chip family, the S5000. This family pairs a 300 MHz Tensilica Xtensa RISC processor core with a 100 MHz reconfigurable compute fabric. The reconfigurable fabric, referred to as the Instruction Set Extension Fabric (ISEF), allows a set of custom instructions to be added to the RISC processor’s instruction set at run-time via software. The ISEF is logically separated into two sections, an organization that allows one section to be reconfigured with a new set of instructions while the other section is operating normally. (Reconfiguring requires 80-100 microseconds.)

The ISEF targets the compute-intensive portions of signal-processing applications where it should provide significant performance gains. According to Stretch, the ISEF can implement instructions that collapse tens or hundreds of compute operations into just one instruction. For example, the ISEF can implement the sum of absolute differences function found in H.263 motion estimation. ISEF instructions execute at a 100 MHz rate (one third the instruction rate of the Xtensa core) and may have latencies of 10 or more Xtensa clock cycles. Using its full capacity, the ISEF can theoretically support an instruction that requires 400 16-bit ALUs and 64 16-bit multipliers operating in parallel, although it is unlikely that full utilization of the ISEF will be achieved in typical applications. The ISEF can support a large number of custom instructions simultaneously, but Stretch believes that typical applications will require only a handful of instructions to achieve significant performance gains.

Many previous attempts to commercialize reconfigurable processors have failed primarily due to the considerable effort required to implement, optimize, and verify applications on such architectures. Stretch aims to minimize this hurdle through automatic configuration of the ISEF based on C/C++ application source code. According to Stretch, custom instructions are automatically incorporated into the processor through compiler analysis of compute-intensive portions of C/C++ application software that have been flagged by the user. This automated implementation of custom instructions promises to dramatically reduce application development time compared with ASICs and FPGA-based solutions. Further simplifying matters, Stretch has not invented a complicated new programming model: ISEF instructions are issued by the Xtensa core just like built-in RISC instructions and they access their own dedicated register file. Stretch is also able to leverage the relatively mature compiler and development infrastructure of the Tensilica architecture.
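
To make the idea concrete, here is a minimal sketch of the kind of kernel mentioned above: a sum of absolute differences over one block of pixels, as used in motion estimation. It is ordinary portable C, not Stretch's tool flow. The function name `sad_16x16`, the 16x16 block size, and the `stride` parameter are assumptions chosen for illustration, and the way a user would flag such code for the compiler is not shown because the announcement does not describe it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative only: plain C with no Stretch-specific intrinsics or pragmas. */
enum { BLOCK_SIZE = 16 };  /* assumed block width and height in pixels */

/* Sum of absolute differences between a current block and a reference block.
   `stride` is the number of bytes between the starts of consecutive rows. */
uint32_t sad_16x16(const uint8_t *cur, const uint8_t *ref, size_t stride)
{
    assert(cur != NULL && ref != NULL);
    uint32_t sum = 0;
    for (int row = 0; row < BLOCK_SIZE; ++row) {
        for (int col = 0; col < BLOCK_SIZE; ++col) {
            /* Widen to int so the subtraction cannot wrap around. */
            int diff = (int)cur[col] - (int)ref[col];
            sum += (uint32_t)abs(diff);
        }
        cur += stride;
        ref += stride;
    }
    return sum;
}
```

Each pass through the inner loop performs a subtract, an absolute value, and an accumulate. A loop like this is the sort of pattern that a custom instruction could, in principle, collapse into a single operation on a whole row of pixels.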
Based on performance estimates from Stretch, the S5000 family promises
to be significantly faster than today’s fastest mainstream DSPs—at
least on some applications. Many signal processing applications
exhibit significant data parallelism and make heavy use of specialized
operations. BDTI expects that the Stretch architecture will be a good
match for such applications. Although the S5000 family will not match
the throughput of ASICs or large FPGAs, it may well find homes in many
applications where the potential for rapid development outweighs the
need for absolute maximum performance.
Freescale and TI Introduce Low-Cost, High-Speed DSPs

Freescale Semiconductor, Motorola’s soon-to-be spun-out semiconductor division, recently announced the MSC711x family of processors targeting telecom and VoIP applications. The MSC711x family is a lower-cost, binary-compatible derivative of Freescale’s high-end MSC81xx family, which targets performance-hungry communications infrastructure applications. Previous StarCore-based chips from Freescale were too expensive for many cost-sensitive applications. The lower cost of the MSC711x family allows Freescale to expand the scope of its VLIW DSPs into previously untapped markets in VoIP, industrial control, security, and wireless communications applications. The MSC711x family is based on the StarCore SC1400 core, one of the few DSP architectures offered both in off-the-shelf chips and as a licensable core. Freescale has announced five members of the MSC711x family, each differing in peripherals and on-chip memory (ranging from 88 to 408 Kbytes in total). Notably, all five initial MSC711x family members include a DDR DRAM memory interface, a rarity among mainstream DSPs. All five family members are expected to operate at 200 MHz, and all are expected to begin sampling in summer 2004, with full production slated for late 2004. According to Freescale, the lowest-cost MSC711x family member, the MSC7110, is priced at $12.05 in 10,000-unit quantities.

Texas Instruments is also pushing its high-end DSPs into low-cost territory with the announcement of the TMS320C6410 and TMS320C6413. These new ’C64x family members target cost-sensitive applications such as voice- and video-over-IP, office equipment, and industrial equipment. Unlike the recently introduced ’C64x chips that reach 1 GHz using a 90-nanometer fabrication process, these chips use a lower-cost 0.13-micron process and have clock speeds under 500 MHz. The 400 MHz ’C6410 with 160 Kbytes of total on-chip memory, at $17.95 in 10,000-unit quantities, offers the lowest price and the most performance per dollar in the ’C64x family. With well over 20 family members currently available, the ’C64x lineup now encompasses a broad range of performance and price points. The ’C6410 and ’C6413 are currently sampling, with full production expected in September 2004.
The new low-cost ’C64x chips are quite similar to the Freescale
MSC711x family in several respects. First, both target similar
applications. Second, both achieve comparable price/performance
ratios based on their BDTI Benchmark™ scores. For example, with a
BDTIsimMark2000™ score of 2240, the MSC7110 is expected to offer
cost-performance similar to the ’C6410, which achieves a
BDTImark2000™ score of 3650. Finally, both families include
similar (and modest) complements of peripherals and on-chip memory, which
are pared back compared to their higher-cost siblings.
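
As a rough check on the cost-performance comparison above, the sketch below divides each quoted score by the quoted 10,000-unit price. The helper name `score_per_dollar` and the constant names are hypothetical. The calculation also glosses over the fact that the two scores are published under different names (BDTIsimMark2000 and BDTImark2000), so treat the result as a back-of-the-envelope figure only.

```c
#include <assert.h>

/* Hypothetical helper: benchmark score divided by unit price in US dollars. */
double score_per_dollar(double score, double price_usd)
{
    assert(price_usd > 0.0);
    return score / price_usd;
}

/* Figures quoted in this article (10,000-unit pricing). */
static const double MSC7110_SCORE = 2240.0;  /* BDTIsimMark2000 */
static const double MSC7110_PRICE = 12.05;
static const double C6410_SCORE   = 3650.0;  /* BDTImark2000 */
static const double C6410_PRICE   = 17.95;
```

With these inputs the helper gives roughly 186 points per dollar for the MSC7110 and roughly 203 for the ’C6410, a gap of about 9 percent, which is consistent with describing the two as similar.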
BDTI Case Study
This Month: Checking Out the Competition

Charting a processor roadmap is a difficult task. To set a successful course for a processor family, a processor developer must predict trends in the processor’s target applications as well as developments in competing processor families. The developer must then determine how to evolve its offerings in order to respond to these expected changes. For example, remaining competitive may require a carefully balanced mix of lowering prices, raising clock speeds, and adding architectural features.

Designers of new processors face a similar challenge: their processors must be competitive not only with existing competitors, but also with new competitors that will arrive while their processors are in development. Again, this means that designers must extrapolate trends in target applications and in competitors’ processors.

In either case, collecting and analyzing competitive data is a hurdle. A processor developer may need to consider several target applications and dozens of competitors. And it may not be enough to consider only one processor from each competing processor family. Instead, the developer may need to consider family members that target varied goals, such as the fastest processor from each family and the least expensive processor from each family. Merely collecting data for all the combinations of applications and processors is difficult. Analyzing all of this data and presenting the results can be a major undertaking.

BDTI’s Benchmark Analysis Tool™ (BAT) helps developers meet these challenges. The BAT enables processor developers to quickly compare their processor to the competition using results from the BDTI Benchmarks™, a suite of twelve signal processing benchmarks. The BAT provides a weighting feature that allows processor developers to quickly tune the mix of benchmarks to each of their target applications. The BAT comes pre-filled with benchmark results for up to eighteen processor families. BDTI also provides speed, pricing, and power data for up to three processors from each processor family. This wealth of data allows processor developers to compare processors on a variety of performance metrics, including speed, cost efficiency, and energy efficiency. And the BAT provides a number of tools to help developers analyze these results. For example, the overall results for each metric are summarized in an easy-to-read chart.
In one recent engagement, a processor developer used the BAT to analyze
the competitive position of its newest processors. Using the BAT, the
processor developer was able to quickly compare its processors to
competitors that ranged from DSPs to general-purpose processors,
including both licensable cores and off-the-shelf chips. To learn how
the BAT can help you stay competitive, contact Jeremy Giddings
(giddings@BDTI.com).
Impulse Response, by Jeff Bier
The Alchemist’s Dream

Over the last few months I’ve noticed an increase in the number of tools that transform high-level signal-processing application descriptions into real-time implementations.

The appeal of this idea is obvious. Many signal processing applications are initially designed using high-level tools and then migrated into low-level descriptions. Often this migration process involves multiple labor-intensive, error-prone steps. For example, an application might be developed using MATLAB, then re-built using floating-point C code, then again using fixed-point C code, and then finally optimized using assembly language. Cutting out the middle steps can make the development process faster and can help ensure that the final product matches the initial high-level design.

In addition, these tools potentially reduce the number of distinct engineering teams needed for product development. For example, engineers specializing in MATLAB-based algorithm development are rarely skilled in assembly-language software optimization. As a result, product development typically requires multiple teams of specialists. By using a tool that transforms high-level descriptions into implementations, the high-level developers can create the implementation themselves.

Unfortunately, so far the idea of automatically turning high-level descriptions into production-ready implementations has been like the ancient alchemists’ dream of turning lead into gold: it’s a compelling idea, but nobody can seem to make it work.

One problem is that the path from a high-level description to an optimized, production-ready implementation is often not a straight line. For example, it is sometimes possible to optimize an application by removing a resource-hungry algorithm block and replacing it with a very different block that produces similar results. It is very difficult to create an automated tool that can exploit this type of optimization opportunity.

Even when the mappings between high-level descriptions and optimized implementations are more direct, automated tools often perform poorly unless the user gives the tool plenty of hints. For example, tools that transform C code may require the user to structure the C code in certain ways or to use a limited, specialized set of C statements. And providing effective hints may require an intimate understanding of the underlying hardware, partly negating the value of a high-level tool.
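
One of the manual migration steps described earlier in this column, going from floating-point C to fixed-point C, can be sketched with a dot product, the core of many filters. The Q15 number format, the function names, and the rounding and saturation choices are all assumptions made for illustration; the column does not prescribe any particular fixed-point convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Floating-point reference version, as an algorithm developer might write it. */
float dot_float(const float *x, const float *h, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc += x[i] * h[i];
    return acc;
}

/* Convert a value in [-1, 1) to Q15 (assumed: 1 sign bit, 15 fraction bits). */
int16_t to_q15(float v)
{
    assert(v >= -1.0f && v < 1.0f);
    float scaled = v * 32768.0f;
    /* Round to nearest, then clamp so values just below 1.0 cannot overflow. */
    int32_t r = (int32_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
    if (r > 32767) r = 32767;
    return (int16_t)r;
}

/* Fixed-point version: Q15 inputs, Q30 products summed in a 64-bit
   accumulator, result scaled back to Q15 with rounding and saturation. */
int16_t dot_q15(const int16_t *x, const int16_t *h, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc += (int32_t)x[i] * (int32_t)h[i];
    /* Q30 -> Q15 with rounding. Right-shifting a negative value is
       implementation-defined in C; this assumes the usual arithmetic shift. */
    acc = (acc + (1 << 14)) >> 15;
    if (acc > 32767) acc = 32767;
    if (acc < -32768) acc = -32768;
    return (int16_t)acc;
}
```

Even in this small case the fixed-point version has to deal with scaling, rounding, and saturation, none of which appear in the floating-point code. Those are the kinds of details that make the manual step labor-intensive and that an automated tool would have to get right.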
Tools that create real-time implementations directly from high-level
application descriptions hold great potential. But today such tools
do not offer sufficient generality and efficiency to make them
practical for cost- or energy-sensitive applications. There is much
work to do before we get to that point. In the meantime, vendors and
prospective users must be realistic about what such tools can achieve.
Next Week: Embedded Processor Forum 2004

For important processor product introductions and technology disclosures from the world of digital video, join BDTI at next week’s Embedded Processor Forum in San Jose. Jeff Bier, BDTI’s general manager, will moderate a conference session on Processors for Video Applications on Wednesday the 19th. This session will include announcements and technology disclosures from Ultra Data Corp., Cradle Technology, and Texas Instruments. The session will also feature a video architecture panel discussion with representatives from these companies and from Intel and Xilinx.

In-Stat/MDR’s annual Embedded Processor Forum brings together key players from the world of embedded processing for four days of seminars and conferences. Conference sessions are highlighted by groundbreaking new product announcements from established leaders and up-and-coming innovators in processor technology.
For links to more information about Embedded Processor Forum, visit
http://www.BDTI.com/bdti_whatsnew.html#epf.
Inside the Intel PXA27x Now Available

Inside the Intel PXA27x is the latest in BDTI’s series of in-depth technical evaluations of individual processors. The Intel PXA27x (also known as Bulverde) is the next generation of the Intel Personal Internet Client Architecture (PCA) application processors. The PXA27x is a 32-bit fixed-point embedded processor family targeting PDAs and mobile wireless products, most notably smart phones. Like all BDTI Inside reports, Inside the Intel PXA27x delivers results from benchmarking with the BDTI Benchmarks™ as well as comparisons of the PXA27x to competitor architectures. Inside the Intel PXA27x will help you understand how the PXA27x performs on key signal processing functions and will provide insights into this new architecture.
For more information or to order the report, go to
http://www.BDTI.com/products/reports_pxa27x.html.
Buyer’s Guide 2004: Analysis and Insight into DSP Processors

While many new types of processors have begun targeting signal processing applications in recent years, programmable DSP processors continue to hold a significant percentage of this growing market. What’s more, the lion’s share of programmable DSPs are provided by only three vendors: TI, ADI, and Freescale Semiconductor (the soon-to-be spun-out division of Motorola). The sixth edition of BDTI’s flagship report, Buyer’s Guide to DSP Processors, examines these vendors’ products, giving an experienced analyst’s view of their strengths and weaknesses and quantifying their performance through established benchmarks.

Buyer’s Guide for 2004 includes new benchmark results for ADI’s TigerSHARC and Blackfin processors and updated benchmarks for ADI’s SHARC, Freescale’s StarCore-based MSC8101, and TI’s ’C55x and ’C64x. The report provides updated analysis of products in each processor family, including speeds, prices, power consumption, and peripherals. Other new features in this year’s edition include the introduction of the BDTImemMark2000™, a single-number metric showing overall memory efficiency, innovative radar charts that summarize processor performance, a new layout, and improved formatting.

The 2004 Buyer’s Guide is 584 pages in 8.5 x 11 inch format, spiral bound for easy desk use. The first copy is $2,695, including shipping via FedEx to North American addresses (for international shipping, add $75). Additional copies are discounted substantially.
For previews of the report and more information, go to
http://www.BDTI.com/bg04.
BDTI Releases Benchmark Scores for ADI and Freescale Processors

BDTI has updated its BDTImark2000™ scores for the Analog Devices ADSP-TS201 TigerSHARC and released a new BDTIsimMark2000™ score for the Freescale MSC71xx. The updated ADSP-TS201 scores measure the processor’s performance on both 32-bit floating-point and 16-bit fixed-point arithmetic. The Freescale MSC71xx is the first Freescale product using the SC1400 synthesizable high-performance DSP core from StarCore LLC. For these and other scores, go to http://www.BDTI.com/bdtimark/BDTImark2000.htm.
The BDTImark2000™ and BDTIsimMark2000™ are summary measures of
digital signal processing speed distilled from a suite of DSP
benchmarks developed and independently verified by BDTI.
About BDTI

BDTI is an independent source for digital signal processing technology analysis and optimized software development services. From rigorous technical analyses of processors for DSP, such as the Inside series of processor analyses, to highly regarded technology seminars, BDTI is the trusted independent source for reliable information on digital signal processing technology. As a software developer, BDTI is known for highly optimized implementations of signal processing algorithms and applications and for solutions to complex problems of integration, code size, and performance.
For more information, visit our Web site at http://www.BDTI.com.
As previously announced, a new newsletter, Inside DSP, published jointly by CMP Media’s EE Times and BDTI, will soon take the place of the DSP Insider. Both newsletters are free. If this newsletter was forwarded to you and you would like to receive the new Inside DSP newsletter regularly, register at http://www.BDTI.com/dspinsider.htm. If you do not wish to receive the new BDTI-CMP Inside DSP newsletter, send an email message to dspinsider@BDTI.com with the words “Remove me” in the subject line.
Previous issues of BDTI’s DSP Insider are archived and will continue
to be available at http://www.BDTI.com.
BDTI’s DSP Insider © 2004 Berkeley Design Technology, Inc. This is the last issue of the DSP Insider. BDTI replaced the DSP Insider with a new publication, Inside [DSP]. To view this publication, please visit http://www.InsideDSP.com.