
Digital signal processing, or DSP, was once the province of big-budget applications like military radar and satellite communications. Then in 1978, Texas Instruments Inc. introduced a toy that proved DSP was viable for consumer products. Called Speak and Spell, the toy used specialized DSP hardware to synthesize some speech and was the first inexpensive DSP-based consumer product to sell well. After this modest beginning, it took nearly 15 years of improvements in microprocessor fabrication and DSP-oriented processor architectures before DSP really hit its stride [Fig. 1].

In the last few years, DSP has become an enabling technology for a wide variety of crowd-pleasing products. [See "What counts in picking a processor"] Many answering machines now use it in place of analog tape recording because it increases their reliability and makes it possible to add such advanced features as fast message playback and random access to messages. The technology is also used in the adaptive suspension systems of some high-end automobiles. These systems use an on-board computer to process signals from sensors elsewhere in the car. For example, signals from accelerometers that measure the car's vertical acceleration as it bounces over bumps in the road are used to adjust the stiffness of the suspension to compensate. From personal computers to car phones with hands-free dialing, any product that performs speech recognition or speech synthesis is using DSP technology. As for video games, which perform extensive image and sound processing, they could not exist without DSP.

A widespread use of DSP technology is data compression, particularly compression of speech and music signals. This form of processing exploits the many benefits of digital speech and audio (notably, better audio quality, fast searching, and sophisticated sound editing) without requiring prohibitively high bandwidth or storage capacity for transmitting or storing signals. A discussion of three consumer electronic products that rely on DSP technology for audio compression will relate the DSP requirements of the application to the characteristics of a particular microprocessor used to implement them. The intention is to provide some insight into the reasons for selecting both the algorithms and the hardware used to implement audio compression in consumer electronics.

Digital audio using AC-3

One of the best-known uses for DSP compression is in home theater and audio systems. CD audio uses 16 bits per data sample and, for stereo transmission, plays back two samples at a rate of 44.1 kHz. This translates into 1.41 Mb of storage for every second of audio. For applications with limited bandwidth or storage capacity, transmitting more than two channels of audio in the CD format can be impractical. So engineers at Dolby Laboratories Inc. in San Francisco developed a proprietary audio compression algorithm suitable for multichannel, high quality audio transmission, dubbed AC-3 (also known as Dolby Digital). AC-3 is used in high-definition television (HDTV), movie theaters and home theater systems, and encodes audio for up to six channels: right, left, center, right surround, left surround, and low-frequency effects (similar to a subwoofer). This configuration is often designated as 5.1 channels, since the low-frequency effects channel needs much less bandwidth than the other channels. For 5.1-channel transmission, AC-3 requires only 384 kb/s. How does it achieve such a low bit-rate without severely distorting the audio signal?
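The figures quoted above can be checked with a little arithmetic. The following Python sketch recomputes the CD data rate from the sample size, channel count, and sampling rate given in the text, and compares six channels of CD-format audio with the 384-kb/s AC-3 rate. Counting the low-frequency effects channel as a full-bandwidth channel is a simplifying assumption made here, so the resulting ratio (roughly 11 to 1) is only approximate.

```python
# Back-of-the-envelope bit-rate check using the figures quoted in the text.
BITS_PER_SAMPLE = 16
SAMPLE_RATE_HZ = 44_100
AC3_RATE_BPS = 384_000  # 5.1-channel AC-3 rate quoted in the text


def pcm_rate_bps(channels: int) -> int:
    """Uncompressed bit-rate for the given number of CD-format channels."""
    return BITS_PER_SAMPLE * SAMPLE_RATE_HZ * channels


stereo_bps = pcm_rate_bps(2)
# Assumption: count the low-frequency effects channel as a full channel.
six_channel_bps = pcm_rate_bps(6)
ratio = six_channel_bps / AC3_RATE_BPS

print(f"CD stereo: {stereo_bps / 1e6:.2f} Mb/s")
print(f"Six CD-format channels: {six_channel_bps / 1e6:.2f} Mb/s")
print(f"Approximate compression ratio: {ratio:.1f}:1")
```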

AC-3 is able to reduce the number of bits used to represent the audio signal through a lossy compression scheme. "Lossy" refers to the fact that some information in the original audio signal is omitted from the compressed version, so that the signal cannot be reconstructed exactly. All three of the compression algorithms to be profiled in this article are lossy. AC-3 relies on principles of psychoacoustics (human perception of sound) to choose which information to discard. Sounds that cannot (or can barely) be heard by the human ear can be removed without much degradation of audio quality. AC-3 makes use of the psychoacoustic phenomenon that recognizes that when sound is broken into its constituent frequencies, those sounds with relatively low energy adjacent to others with significantly higher energy are masked by the latter and are not audible. AC-3 also uses the effects of temporal masking, in which a loud sound masks quieter sounds that immediately precede or follow it.

AC-3 is a transform compression scheme. In other words, the input signal is transformed from the time domain into the frequency domain before compression. The transformation algorithm comprises a series of multiply-accumulate operations, and generates a series of coefficients that represent the relative energy contribution to the signal of each of the sub-bands (small frequency ranges). By analyzing the incoming signal in the frequency domain, the AC-3 algorithm can dynamically allocate the number of bits used to represent each frequency sub-band on the basis of its energy relative to that of adjacent sub-bands.

This allocation process also takes into account each sample's energy relative to the energy contained in immediately preceding and following samples. Psychoacoustically masked frequencies are given fewer (or zero) bits to represent their sub-band coefficients; dominant frequencies are given more bits. Hence, besides the coefficients themselves, information that describes how the bits are allocated must be transmitted to the decoder, so that it may reconstruct the bit allocation. In AC-3, all of the channels to be encoded draw from the same pool of bits, so that channels that need better resolution can use more bits than channels that are relatively quiet.
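The idea of drawing bits from a shared pool can be illustrated with a toy allocator. The Python sketch below is not the AC-3 allocation routine: the masking rule (a band is ignored if it lies more than a fixed number of decibels below its louder neighbor), the 6-dB-per-bit rule of thumb, and the names allocate_bits and mask_threshold_db are all assumptions introduced for illustration.

```python
import math


def allocate_bits(band_energies, bit_pool, mask_threshold_db=30.0, max_bits=16):
    """Greedy allocation: hand out bits one at a time from a shared pool.

    Assumptions (not from AC-3): a band is treated as masked if it lies more
    than mask_threshold_db below its louder neighbour, and each allocated bit
    lowers a band's remaining "need" by about 6 dB.
    """
    levels_db = [10.0 * math.log10(max(e, 1e-12)) for e in band_energies]
    n = len(levels_db)
    audible = []
    for i, level in enumerate(levels_db):
        neighbours = [levels_db[j] for j in (i - 1, i + 1) if 0 <= j < n]
        loudest = max(neighbours) if neighbours else level
        audible.append(level >= loudest - mask_threshold_db)

    bits = [0] * n
    need = list(levels_db)
    for _ in range(bit_pool):
        candidates = [i for i in range(n) if audible[i] and bits[i] < max_bits]
        if not candidates:
            break
        best = max(candidates, key=lambda i: need[i])
        bits[best] += 1
        need[best] -= 6.02  # roughly 6 dB of noise reduction per bit
    return bits


energies = [1000.0, 0.5, 200.0, 150.0, 0.01, 40.0]
allocation = allocate_bits(energies, bit_pool=24)
print(allocation)
```

In this example the two quiet bands sitting next to much louder neighbors receive no bits at all, and the whole pool goes to the dominant bands.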

The output coefficients generated by the time-domain to frequency-domain transformation are typically represented in a block floating-point format in order to maintain numeric fidelity. Using block-floating point is one way to extend dynamic range in a fixed-point processor; it is done by examining a group (block) of numbers and determining an appropriate exponent that can be associated with the entire group. Once the mantissas and exponents have been determined, the mantissas are represented using the variable bit-allocation scheme described above; the exponents are represented with a fixed number of bits [Fig. 2].
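A generic illustration of block floating-point, written in Python for readability rather than for a fixed-point DSP, might look like the following. It is a sketch of the general technique, not of the AC-3 exponent format; the function names and the 16-bit mantissa width are assumptions.

```python
def block_float_encode(samples, mantissa_bits=16):
    """Return (exponent, mantissas) for a block of integer samples.

    The shared exponent is the number of left shifts that can be applied to
    every sample without overflowing a signed mantissa_bits-wide word.
    """
    limit = 1 << (mantissa_bits - 1)  # magnitude limit of a signed word
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return 0, list(samples)
    exponent = 0
    while (peak << (exponent + 1)) < limit:
        exponent += 1
    mantissas = [s << exponent for s in samples]
    return exponent, mantissas


def block_float_decode(exponent, mantissas):
    """Undo the shared shift to recover the original integer samples."""
    return [m >> exponent for m in mantissas]


block = [3, -12, 100, 7]
exponent, mantissas = block_float_encode(block)
restored = block_float_decode(exponent, mantissas)
print(exponent, mantissas, restored)
```

The largest sample in the block determines the exponent, so small samples in the same block keep fewer significant bits than they would with a per-sample exponent. That is the price paid for storing only one exponent per block.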

Every bit in the bit-stream is used to maximize compression. On occasion, two or more small-valued mantissas may even be packed into one code word, which would normally contain a single mantissa. Because of the variety of tricks applied to compressing the data, much decision-oriented processing is required at the decoder to unpack the data. For example, the decoder software must determine whether the word contains one or more mantissas, and proceed with the unpacking process appropriately. This kind of processing is easier to implement on processors that include support for testing, setting, and clearing individual bits.
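As a sketch of how several small-valued mantissas can share one code word, the Python fragment below packs three mantissas that each take one of three values into a single 5-bit word, instead of spending 2 bits on each. The level count, word size, and function names are chosen for illustration and should not be read as the exact AC-3 bit-stream syntax.

```python
def pack_three_level(a, b, c):
    """Pack three mantissas, each in {0, 1, 2}, into one 5-bit code word."""
    for value in (a, b, c):
        if value not in (0, 1, 2):
            raise ValueError("each mantissa must be 0, 1, or 2")
    return 9 * a + 3 * b + c  # 0..26, which fits in 5 bits


def unpack_three_level(code):
    """Recover the three mantissas from a 5-bit code word."""
    if not 0 <= code <= 26:
        raise ValueError("code word out of range")
    a, rest = divmod(code, 9)
    b, c = divmod(rest, 3)
    return a, b, c


code = pack_three_level(2, 1, 0)
print(code, unpack_three_level(code))
```

The decoder has to know, from the bit allocation, that this particular word holds three mantissas rather than one, which is the kind of decision-oriented processing described above.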

Encoding is performed at the time the HDTV program, digital versatile disk (DVD), or other audio transmission medium is created. The DSP encoding tasks include the time-to-frequency-domain transformation, determination of bit allocation using spectral analysis and psychoacoustic modeling, and bit-stream encoding.

At the decoder, the tasks are similar but in reverse: the decoder must unpack the incoming bit-stream, decode the spectral coefficients using information about their bit allocation, and convert the signal back to the time domain. The decoder may also have to combine several channels into one or two if the number of speakers available is fewer than six (as would happen in a home audio system with two speakers). Home audio equipment typically includes only the decoder because it need only play back the encoded audio.
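Combining channels for a two-speaker system amounts to a weighted sum. The Python sketch below folds one 5.1-channel frame into a stereo pair. The gain values, the decision to drop the low-frequency effects channel, and the channel key names are illustrative assumptions, not the coefficients a real decoder would necessarily use.

```python
def downmix_to_stereo(frame, center_gain=0.707, surround_gain=0.707):
    """Fold one 5.1 frame into a (left, right) pair.

    frame is a dict with keys L, R, C, Ls, Rs, LFE. The gains are
    illustrative assumptions, and the LFE channel is simply dropped here.
    """
    left = frame["L"] + center_gain * frame["C"] + surround_gain * frame["Ls"]
    right = frame["R"] + center_gain * frame["C"] + surround_gain * frame["Rs"]
    # Scale so that full-scale inputs on every channel cannot clip.
    scale = 1.0 / (1.0 + center_gain + surround_gain)
    return left * scale, right * scale


full_scale = {"L": 1.0, "R": 1.0, "C": 1.0, "Ls": 1.0, "Rs": 1.0, "LFE": 1.0}
left, right = downmix_to_stereo(full_scale)
print(left, right)
```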

A processor for an AC-3 decoder

AC-3 decoders for home audio systems are extremely cost-sensitive, as are all consumer electronic products. So here the goal is to choose a processor that provides the most cost-effective solution. Any of the ultra-high-performance DSP processors available today are certainly capable of AC-3 decoding. Any high-end general-purpose CPU, such as the PowerPC or Pentium, could also do the job—and in fact, AC-3 decoding software is currently available for Pentium-based PCs. For products that do not already contain a host processor, however, the additional horsepower provided by a high-end CPU is expensive and unnecessary. But just how much processing power is needed?

The answer depends in part on whether a floating- or fixed-point processor is chosen. Floating-point processors are relatively expensive but can decode AC-3 using fewer instruction cycles (because they need not expend cycles on the operations needed to maintain numeric fidelity in a fixed-point environment). In a highly cost-constrained product like a home AC-3 decoder, this expense alone might rule out floating-point processors. The overall cost is also affected by the need for multiple serial ports to support multiple output channels of sound, and by the amount of RAM or ROM required. For processors that use larger data word sizes, such as floating-point processors, the memory requirements increase proportionally.

AC-3 can be implemented on a 16-bit fixed-point processor, but with difficulty: its numeric fidelity requirements would make software development on a 16-bit processor extremely complicated. The resulting delay in software development may, in turn, lengthen time-to-market, hardly something to be desired in the competitive atmosphere of home electronic products.

One solution is the ZR38601, a fixed-point DSP processor that uses 20-bit data words. The ZR38601 was designed by Zoran Corp., Santa Clara, Calif., for audio decoding. Its wider data word simplifies software development without vastly increasing data memory requirements or chip cost over those of comparable 16-bit processors. Using the 20-bit word width, AC-3 decoding requires about 30 million instructions per second (MIPS) for a 5.1-channel input; the ZR38601 provides 50, leaving roughly 20 MIPS for additional functions. The ZR38601 includes sufficient on-chip memory to store all of the required AC-3 decoding software, and the chip is available with AC-3 software built in.

The ZR38601 also includes multiple specialized serial ports (needed to interface to off-chip digital-to-analog converters) that can support up to six audio channels. These ports are not included on most other DSPs, and can increase the cost of the overall system if they must be added separately. One further consideration is that the ZR38601 uses 32-bit-wide instructions, double the width used in most other fixed-point DSP processors. The wider instructions make room for a more powerful and easier-to-use instruction set, which can simplify programming and hence reduce time-to-market. The down side of the wider instruction words is larger program memory requirements.

Application profiling performed by Berkeley Design Technology Inc. during AC-3 software development indicates that the bulk of the decoding processor's time is spent on three tasks: transforming the signal from the frequency domain back to the time domain; reconstructing the bit-allocation scheme and unpacking the mantissas; and de-normalizing the incoming mantissas using the associated exponents. The exact distribution of time spent on each task depends on the processor. The ZR38601 is able to transform the signal to the time domain efficiently using its single-cycle multiplier and instructions designed to support common frequency-to-time-domain transformation algorithms. The processor also provides single-cycle exponent detection, in which elements in a block of samples are analyzed to determine the exponent value that will allow the best representation of the samples—a useful feature for implementation of the block floating-point format.

Perhaps most important is that the ZR38601 model can accomplish 5.1-channel AC-3 decoding for a cost of roughly US $10 per chip. (Note that all chip prices provided in this article are for large-quantity purchases.) This processor achieves very good cost-performance by matching its performance, peripherals, and data word width to the needs of the application.

TwinVQ compression

TwinVQ, an acronym for transform-domain weighted interleave vector quantization, is an audio compression scheme developed by NTT Corp., Tokyo, to provide good quality audio transmission at low bit-rates. Because of its low bit-rate requirements, TwinVQ-encoded audio can be replayed from the Internet in real time, rather than having to be downloaded first. Data sampled at 48 kHz and compressed using TwinVQ requires less than 64 kilobits per channel per second.

Not coincidentally, 64 kb/s is the bit-rate of ISDN lines. TwinVQ can be implemented at bit-rates even lower than 64 kb/s, and in tests that compare perceived audio quality, TwinVQ performs particularly well relative to other compression schemes at comparable bit-rates when the bit-rate of the encoded audio is tightly constrained.

TwinVQ compression will star in several upcoming consumer products, one of which is SolidAudio. The user of this hand-held portable device can download TwinVQ-encoded audio from the Internet and replay it on the go. SolidAudio is being developed in a cooperative effort by NTT Corp. and Kobe Steel, Kobe, Japan, and is expected to become available to consumers in late 1999. Unlike CD players, SolidAudio has no moving parts, since it replays music from memory rather than from a disk. Hence, it is insensitive to motion and will not skip.

Although product details and features have not been released, a major benefit of using digitized audio is the possibility of implementing audio searching and flexible sound editing, so these may be among SolidAudio's features, either initially or in later versions. TwinVQ is also used in Yamaha's SoundVQ, PC software that plays TwinVQ-encoded audio from the Web and includes sound-editing capabilities.

TwinVQ, as the name suggests, uses vector quantization. Instead of quantizing each sample individually, it quantizes groups of samples together. In DSP terminology, quantization is the process of representing a signal digitally using a limited number of bits. Vector quantization requires the processor to compare a vector of input samples to a predefined table of quantization vectors (often called a codebook) and, by some means of error analysis, to choose a good match for the input vector.

Identical codebooks are maintained at the encoder and decoder, so that only the index into the codebook needs to be transmitted—a technique that reduces the required bit transmission rate. Vector quantization means more work at the encoder than single-sample quantization, but less work at the decoder. This trade-off—higher complexity at the encoder in exchange for lower complexity at the decoder—is made in several areas of TwinVQ encoding. The approach makes sense given that the decoder (which may be used in a hand-held consumer product, such as SolidAudio) is highly sensitive to cost and power consumption, both of which are linked to the performance capabilities of the processor.
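The asymmetry between encoder and decoder shows up clearly in a minimal sketch of plain (unweighted) vector quantization. In the Python fragment below, the tiny four-entry codebook and the function names are made up for illustration; a real codebook would be far larger and carefully trained.

```python
def vq_encode(vector, codebook):
    """Return the index of the codebook entry closest to vector (squared error)."""
    def squared_error(entry):
        return sum((x - y) ** 2 for x, y in zip(vector, entry))
    return min(range(len(codebook)), key=lambda i: squared_error(codebook[i]))


def vq_decode(index, codebook):
    """Decoding is only a table look-up."""
    return codebook[index]


# Hypothetical codebook shared by encoder and decoder.
CODEBOOK = [
    (0.0, 0.0, 0.0, 0.0),
    (1.0, 1.0, 1.0, 1.0),
    (1.0, -1.0, 1.0, -1.0),
    (2.0, 0.0, -2.0, 0.0),
]

index = vq_encode((0.9, -1.2, 1.1, -0.8), CODEBOOK)
reconstructed = vq_decode(index, CODEBOOK)
print(index, reconstructed)
```

The encoder searches every entry, while the decoder performs a single look-up. With four entries, the index costs 2 bits for a group of four samples.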

Most computations in TwinVQ encoding involve processing the input signal to improve the fidelity of its vector-quantized representation. First, the input signal undergoes a time-domain to frequency-domain transformation, just as in AC-3 encoding. While that transformation makes it possible to transmit the resulting coefficients that describe the contribution of each frequency sub-band, in practice it would produce low-quality audio because the range of coefficient values associated with each sub-band is large.

For example, the low-frequency signal components tend to have very large coefficients associated with them, while the higher-frequency components have smaller coefficients. For a given number of bits used to represent each coefficient, a smaller coefficient value has a bigger quantization error (the error between the actual value and its representation using a given number of bits) relative to its size than a larger value does. This characteristic is undesirable because it causes uneven distortion across the frequency bands. What would be better is to have the same level of accuracy for each of the coefficients, regardless of their magnitude.
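A small numerical experiment makes the point concrete. The Python sketch below applies the same uniform quantizer (same step size, standing in for the same number of bits over the same range) to a large and a small coefficient and compares the error relative to each value. The coefficient values and step size are arbitrary choices for illustration.

```python
def quantize(value, step):
    """Uniform quantizer: round value to the nearest multiple of step."""
    return round(value / step) * step


def relative_error(value, step):
    """Quantization error as a fraction of the value's magnitude."""
    return abs(value - quantize(value, step)) / abs(value)


STEP = 1.0  # same step (same number of bits over the same range) for both
large_coefficient = 1000.4
small_coefficient = 1.4

large_rel = relative_error(large_coefficient, STEP)
small_rel = relative_error(small_coefficient, STEP)
print(f"large: {large_rel:.4%}  small: {small_rel:.4%}")
```

Both coefficients are off by about 0.4 in absolute terms, but that is a negligible fraction of the large value and more than a quarter of the small one.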

So TwinVQ performs a number of steps that "flatten" the spectral coefficients by identifying large variations and removing the information about them from the spectral coefficients. Information about the large variations is then transmitted in a separate format, generally one that requires fewer bits than would be required to obtain a similar quality of audio reproduction by transmitting spectral coefficients containing the same information.

In the first flattening step, TwinVQ uses a parameter-based model of the input audio signal. In this approach, an audio signal can be thought of as a result of processing some known input signal (often called an excitation signal). The particular processing depends on the mathematical "model" of the audio. The model is described in terms of variables—that is, to produce different audio signals using the model, both the input excitation and the variables must be changed. These variables are the parameters of the model.

Ideally, once the excitation signal and parameters of the model are known, the audio signal can be exactly reconstructed. Therefore, only the excitation signal and the parameters must be transmitted to the decoder, rather than samples of the audio signal itself. But the model need not be transmitted, since it is known in advance and can be stored at the decoder.

Often this transmission can be accomplished in far fewer bits than would be required to generate a similar-quality reproduction using samples of the audio signal itself. The idea of using parameter-based transmission of audio signals rather than samples of the audio itself is similar to that of transmitting three parameters of a sinusoid—the frequency, amplitude, and phase—rather than transmitting samples of the sinusoid's value. Models of this type typically use a linear predictive coder, or LPC, which is common in speech compression algorithms.
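The sinusoid analogy can be written out directly. In the Python sketch below, the "decoder" already knows the model (a sinusoid) and needs only three numbers to rebuild 160 samples. The specific frequency, amplitude, phase, and sampling rate are arbitrary values chosen for illustration.

```python
import math


def synthesize(frequency_hz, amplitude, phase_rad, sample_rate_hz, count):
    """Rebuild count samples of a sinusoid from its three parameters."""
    return [
        amplitude * math.sin(2.0 * math.pi * frequency_hz * n / sample_rate_hz + phase_rad)
        for n in range(count)
    ]


# "Encoder" side: the original samples (generated here, for illustration).
original = synthesize(440.0, 0.8, 0.3, 8000.0, 160)

# Only three numbers are transmitted instead of 160 samples.
transmitted = (440.0, 0.8, 0.3)

# "Decoder" side: the model (a sinusoid) is known in advance.
rebuilt = synthesize(*transmitted, 8000.0, 160)
print(len(rebuilt), rebuilt == original)
```

Real audio is not a single sinusoid, of course, which is why a practical coder also transmits an excitation signal and accepts some modeling error.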

In general, there is a trade-off in complexity of the model versus the resolution of the resulting frequency information. The TwinVQ model parameters accurately represent the coarse frequency characteristics of the signal, but not the finer frequency structures. In other words, the parameters describe the "envelope" of the frequency behavior but not the finer variations. In TwinVQ, the information contained in the model parameters is mathematically removed from the sub-band coefficients, a process that flattens the coefficients so that their values have less variation, making them more easily representable using a fixed number of bits. The flattened sub-band coefficients contain only the information about the finer frequency structures. In effect, information about large spectral variations is transmitted by means of the model parameters, while the remaining sub-band coefficients transmit finer frequency information.

The coefficients undergo further flattening through a process that extracts the harmonic content of the signal. That information is then removed from the sub-band coefficients (again, to be transmitted separately). After the various flattening processes, the coefficients are grouped into vectors in such a way that each vector has a relatively uniform distribution of energy, and then are quantized using a weighted vector quantization scheme. Weighted vector quantization is a method for determining which vector in the codebook will provide the best match for the data samples that need to be transmitted, by attempting to minimize the overall quantization error of the vector. Each coefficient in the vector is weighted according to its importance in the overall signal; more important coefficients are more heavily weighted, so a minimization of the overall quantization error focuses on minimizing the error for the most heavily weighted coefficients. This weighting is determined using the previously calculated model parameters.
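The effect of the weighting can be seen in a two-entry toy example. The Python sketch below is a generic weighted nearest-neighbor search, not the TwinVQ quantizer itself; the codebook, the weights, and the function name are assumptions. In TwinVQ the weights would come from the model parameters, whereas here they are simply supplied by hand.

```python
def weighted_vq_encode(vector, weights, codebook):
    """Pick the codebook index that minimizes the weighted squared error."""
    def weighted_error(entry):
        return sum(w * (x - y) ** 2 for w, x, y in zip(weights, vector, entry))
    return min(range(len(codebook)), key=lambda i: weighted_error(codebook[i]))


# Hypothetical two-entry codebook and input vector.
CODEBOOK = [(1.0, 0.0), (0.0, 1.0)]
vector = (0.6, 0.7)

unweighted_choice = weighted_vq_encode(vector, (1.0, 1.0), CODEBOOK)
weighted_choice = weighted_vq_encode(vector, (10.0, 1.0), CODEBOOK)
print(unweighted_choice, weighted_choice)
```

With equal weights the second entry is the closer match. Once the first coefficient is marked as ten times more important, the search switches to the entry that reproduces that coefficient more accurately, even though the index still costs the same number of bits.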

Weighted vector quantization increases computational complexity at the encoder, but it results in better audio quality without any increase in bit transmission rate. Transmitting a codebook index to an optimal vector requires no more bits than transmitting an index to a sub-optimal vector. Most of the encoding process can be represented as a transfer function, carried out through digital filtering. Reconstruction at the decoder uses the sub-band coefficients and the model parameters to generate a filter that implements the inverse transfer function of the encoder, and to reconstruct the audio signal [Fig. 3].

A processor for SolidAudio

Since SolidAudio is both portable and aimed at the consumer market, three requirements dominate the selection of its DSP engine: the processor must have sufficient speed to perform the necessary DSP tasks, it must be an extremely low-power device, and it must be relatively inexpensive. With all three requirements pointing toward a fixed-point DSP, the question is then whether a 16-bit data word is sufficient or whether the designer would be better off choosing a 20- or 24-bit processor.

A conventional 16-bit DSP processor requires about 70 MIPS to decode a TwinVQ-encoded 44.1-kHz audio signal. On the market today are several low-power, low-cost 16-bit processors that provide over 70 MIPS in computing power. The processor chosen for SolidAudio is the Texas Instruments TMS320C54x, one of the lowest-power DSP processors available. The chip operates at up to 100 MHz, providing 100 MIPS. With a 2.5-volt supply and at 100 MIPS, it dissipates roughly 115 mW. In comparison, most other DSP processors in this performance class dissipate 200-600 mW.

As in the AC-3 decoder, one of the primary signal-processing tasks required for TwinVQ decoding is in transforming the signal from the frequency-domain representation back into the time domain. The TMS320C54x provides good support for these computations both in its hardware and in its instruction set.

Processing the inverse transfer function boils down to digital filtering, implemented through a series of vector dot products. Like most DSPs, the 'C54x contains a dedicated multiply-accumulate (MAC) unit and multiple on-chip memory buses, enabling efficient dot product computations. TwinVQ also requires the decoder to perform a number of table look-ups to determine the values of the quantized vectors. For table look-ups, indexed addressing—a specialized addressing mode common among DSP processors, including the 'C54x—is of particular use. Transmitting the decoded audio signal to the amplifier usually requires at least one serial port interface, which the 'C54x includes on-chip.
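To show what "filtering as a series of dot products" means, the Python sketch below implements a finite impulse response (FIR) filter, in which each output sample is the dot product of the coefficient vector with the most recent inputs. An FIR filter is used here only because it is the simplest case; the text does not say which filter structure the TwinVQ decoder uses, and the coefficients are arbitrary.

```python
def fir_filter(samples, coefficients):
    """Filter samples with an FIR filter, one dot product per output sample."""
    taps = len(coefficients)
    history = [0.0] * taps  # most recent input first
    output = []
    for sample in samples:
        history = [sample] + history[:-1]
        accumulator = 0.0
        for coefficient, past in zip(coefficients, history):
            accumulator += coefficient * past  # one multiply-accumulate
        output.append(accumulator)
    return output


impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
coefficients = [0.5, 0.3, 0.2]
response = fir_filter(impulse, coefficients)
print(response)
```

On a DSP with a dedicated MAC unit and multiple memory buses, each pass through the inner loop (fetch a coefficient, fetch a sample, multiply, accumulate) can typically be completed in a single instruction cycle.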

Compressing voice over IP

One of the hottest new markets for DSP is voice over IP (VoIP), a term that refers to the process of transmitting telephone calls over the Internet or other data networks rather than through the traditional telephone system. Several scenarios can be used for implementing this type of transmission. In one, the originating phone call would be placed by a computer and transmitted directly through the data network to a destination PC and telephone handset. The caller would completely bypass the telephone network, reducing the cost of placing the call.

Another possibility would be for the caller to dial a local phone number that would connect the phone call to a local interface between the telephone system and data network. This interface could be provided by, for example, an Internet service provider. The caller would specify the destination phone number, which would be used to route the call through the data network to another interface that is located close to the callee. At that point, the call would rejoin the regular telephone network and proceed to its destination.

In this scenario, the cost of placing a long-distance call can be reduced because the caller would have to pay only for the two local call segments, plus any charges associated with Internet use and the interface between the telephone system and the Internet—which can be significantly lower than current long-distance telephone rates.

But before VoIP becomes a mainstream technology, some technical problems must be overcome. A big obstacle is that the Internet was designed to transmit packets of data, not voice signals. Therefore, the voice signal coming from the telephone system must be transformed into voice packets so that it is transmissible through the Internet. This translation from telephone to Internet (and back) takes place at "gateways," whose widespread implementation would be necessary to allow speech signals to be routed to any desired destination, as described in the second scenario mentioned above. The speech packets are compressed using DSP to make the most of the available network bandwidth.

There are a number of speech compression algorithms currently in use for VoIP; the focus here will be on only one of them, the standard recommended by the International Telecommunication Union (ITU) for speech transmission over the Internet, G.723.1 [Fig. 4].

To condition the input digitized speech for Internet transmission, it is first put through a high-pass filter to remove any dc bias. The filtered signal is then used to extract speech model parameters by employing a linear predictive coding (LPC) speech model. This process is similar to that described in the TwinVQ algorithm. Keep in mind, however, that speech compression tends to be simpler than audio compression, because speech has more regular frequency content and requires less fidelity.
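Removing a dc bias takes very little computation. The Python sketch below uses a first-order high-pass filter of a common form; the pole value of 127/128 and the test signal are assumptions chosen for illustration and are not taken from the text.

```python
def remove_dc(samples, pole=127.0 / 128.0):
    """First-order high-pass: y[n] = x[n] - x[n-1] + pole * y[n-1]."""
    previous_input = 0.0
    previous_output = 0.0
    output = []
    for sample in samples:
        current = sample - previous_input + pole * previous_output
        output.append(current)
        previous_input = sample
        previous_output = current
    return output


# A small alternating signal riding on a large constant offset.
biased = [100.0 + (1.0 if n % 2 == 0 else -1.0) for n in range(4000)]
filtered = remove_dc(biased)
tail_mean = sum(filtered[-1000:]) / 1000.0
print(f"mean of last 1000 outputs: {tail_mean:.6f}")
```

After the start-up transient dies away, the constant offset of 100 is gone while the alternating component passes through almost unchanged.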

In addition, G.723.1 also performs pitch estimation on the input signal. The pitch of the signal, which is the periodic component, is one of the primary spectral contributors to the signal. Much as in TwinVQ, G.723.1 attempts to identify the "gross" frequency characteristics of the signal, so that they can be removed and transmitted in a separate compact format. Again, the goal is to use as few bits as possible to represent these large-scale frequency contributions (by transmitting parameters rather than samples of the speech itself) and reserve the remaining bandwidth for the more subtle frequency content.

Using vector quantization, G.723.1 weights the vector components according to information generated by the LPC analysis. The three components that are vector-quantized and transmitted (as indices) to the decoder are the speech-model parameters, the pitch information, and the remaining non-periodic speech component. The decoder is then responsible for looking up the correct codebook entries for each of the indices, and reconstructing the signal using much the same technique as described for TwinVQ.

The VoIP gateway

In the scenario where the Internet gateways are remotely located, they are not, strictly speaking, consumer products. Instead, they are the enablers of consumer VoIP products. The selection of a processor for this application, therefore, is not quite as sensitive to cost. Still, the cost per voice channel is considered to be a relevant metric used for comparing processors for this type of application.

Similarly, since Internet gateways are not portable devices, they do not have the severe power consumption constraints of products like SolidAudio; there are, however, constraints on the total power consumption of the gateway hardware. Like cost, power consumption is assessed in the context of power required per channel rather than power per chip. Using a single chip to implement multiple channels of speech compression/decompression can result in lower per-channel cost and power consumption, and may have the added benefit of significantly reducing the amount of board space required. All of these considerations point to use of a high-performance DSP.

The fastest DSP processors currently on the market are multi-issue architectures. Unlike conventional DSPs, they can issue and execute more than one instruction per instruction cycle. They take one of two forms: superscalar, in which parallel operations are scheduled by complex hardware contained within the processor, and very-long instruction-word (VLIW) architectures, in which the programmer or software-generation tool specifies which instructions will be executed in parallel. Both architectures typically use simple, single-operation instructions rather than the complex, compound instructions traditionally associated with DSP processors, making them better compiler targets.

DSP software development generally takes place in assembly language, because DSP architectures and instruction sets tend to be highly specialized and therefore do not make good compiler targets. So a DSP processor that allows the programmer to work (at least to some extent) in a high-level language such as C has an advantage in applications that can afford the cost and power overhead associated with these high-performance devices, such as VoIP gateways.

One further consideration in the choice of a processor for this application is program memory requirements, which influence the overall cost and power consumption of the system. In general, program memory use is governed by the processor's instruction word width and the efficiency of the instruction set. Since the processor may need to store software for several speech compression algorithms (not just G.723.1), it is desirable to choose a processor that has a relatively large complement of on-chip memory.

One processor currently used for implementing G.723.1 and other signal-processing tasks associated with VoIP is ZSP Corp.'s ZSP16402. This chip uses 16-bit instructions, common in conventional DSP processors but uncommon in reduced-instruction-set computer-based processors (which usually use 32-bit instructions). The processor's short instruction words contribute to good memory efficiency. The ZSP16402 is four-way superscalar; that is, it can issue and execute up to four instructions per cycle. Current ZSP devices execute at a clock rate of 200 MHz. According to the company, the high performance capabilities of the ZSP16402 enable the device to process up to eight channels of speech at a time. The processor also includes two high-speed serial ports, which support bit-rates of up to 200 Mb/s, as well as direct memory access (DMA) units that allow data transfers to take place without intervention by the processor core. Thus, the ZSP processor is able to address the need for a high rate of data exchange onto and off the chip.

Most processors in the DSP mainstream (such as the Zoran ZR38601 and the TI TMS320C54x) contain one MAC unit and one arithmetic-logic unit (ALU), which can often operate in parallel. The ZSP16402 has two of each, and all four units can operate in parallel. In addition, both MAC units can be used for additions when not being used for multiplications, enabling up to four additions per clock cycle. Most of the processing in the linear predictive coder analysis involves autocorrelation of the input signal; that is, comparing the input signal to a delayed version of itself to determine how similar they are. Autocorrelation is achieved through a series of vector dot products. Because the ZSP16402 contains two MAC units, it is able to perform these dot products efficiently.
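Autocorrelation as a series of dot products can be sketched in a few lines of Python. The function name and the short periodic test signal below are illustrative assumptions; a real coder would run this on frames of speech samples.

```python
def autocorrelation(signal, max_lag):
    """Return r[0..max_lag], where r[k] is the dot product of the signal
    with a copy of itself delayed by k samples."""
    result = []
    for lag in range(max_lag + 1):
        accumulator = 0.0
        for n in range(lag, len(signal)):
            accumulator += signal[n] * signal[n - lag]  # multiply-accumulate
        result.append(accumulator)
    return result


# A signal that repeats every 4 samples correlates strongly at lag 4.
periodic = [1.0, 0.0, -1.0, 0.0] * 8
r = autocorrelation(periodic, 6)
print(r)
```

Each lag is an independent dot product, so a processor with two MAC units can, in principle, work on two products (or two halves of one product) at the same time.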

G.723.1 is called a bit-exact standard, meaning that, for a given input speech sample, the compression should produce an output that exactly matches some specified output. This is in contrast to some other compression schemes, which specify that the output must fall within a maximum error of the desired output. Therefore, implementing this algorithm requires a processor that has good support for bit manipulation, so that it can organize and rearrange bits to exactly match the specified output format. The ZSP processor does not contain dedicated bit manipulation units (as do a number of other DSP processors), but is able to perform bit manipulation operations in its two ALUs. The ZSP chip costs roughly $50, providing good cost-per-channel while also facilitating software development in a high-level language, a combination that is well suited for VoIP gateways.

DSP in the future

With DSP processor capabilities increasing at a tremendous rate, the options for future consumer electronic devices seem endless. Speech and music compression technologies and algorithms are likely to improve as well. In the near future, expect to see such consumer devices as hand-held organizers with speech recognition and synthesis, speech conversion devices (which could, for example, alter the perceived gender of the speaker), commercial identifiers (which would identify the presence of advertisements on radio or TV), and hands-free remote control for television sets.

These are just a few of the possibilities, as the capabilities of DSP processors increase and continue to provide the spark for new consumer electronics products.


To probe further

The paper "Design and Implementation of AC-3 Coders," by Steve Vernon, Dolby Laboratories Inc., June 1995, describes the development of encoders and decoders for the AC-3 audio compression algorithm.

For a description of the proposed International Telecommunication Union standard for voice transmission over the Internet, see the report "Recommendation G.723.1 (Dual Rate Speech Coder for Multimedia Communications Transmitting at 5.3 and 6.3 Kb/s)," ITU/T, July 1995.

"Buyer's Guide to DSP Processors," (Berkeley Design Technology Inc., Berkeley, California, 1999) is a 950-page technical report on DSP processors. The report discusses DSP benchmarking methodologies in detail and contains extensive benchmarking data for popular DSP processors. Excerpts from this report are available on the World Wide Web at http://www.bdti.com.

An introductory textbook on DSP processor architectures is DSP Processor Fundamentals: Architectures and Features (Berkeley Design Technology Inc., Berkeley, Calif., 1996).

An excellent DSP reference is Understanding Digital Signal Processing, by Richard G. Lyons (Addison-Wesley, Reading, Mass., 1996).


About the authors

Jennifer Eyre is an engineer/technical writer at Berkeley Design Technology Inc. (BDTI), a digital signal processing (DSP) technology analysis and software development firm in Berkeley, Calif.

Jeff Bier is a co-founder and general manager of BDTI. He is a member of the IEEE Design and Implementation of Signal Processing Systems (DISPS) technical committee.

This article is based in part on information provided by DSP processor vendors ZSP, Zoran, and Texas Instruments. In preparing the article, BDTI invited all of the major DSP chip vendors to participate; most vendors declined the invitation because of scheduling constraints or other resource limitations. BDTI especially wishes to acknowledge the generous contributions in time and expertise of Vlad Fruchter of Zoran and Mikael Berner of ZSP Corp. BDTI implies no endorsement of the products described in this article.


This article originally appeared in the March 1999 edition of IEEE Spectrum magazine.
Spectrum editor: Linda Geppert
(c) copyright 1999 BDTI
