Are you designing a system that involves audio? Maybe an audio product or a product with an audio subsystem? Here are some tips and tricks that may help you.
Designing audio systems and debugging audio presents some interesting challenges. Sound is ruthlessly real-time; the speaker cone will keep moving, even if your prototype isn’t able to keep up with the flow of output samples required. The same is true of a microphone: the microphone diaphragm keeps moving and must be sampled often enough. If you skip one or more samples at the input or the output, then a (very loud) click or pop can result. What’s more vexing, the human auditory system, which is the final judge of audio quality, is extremely sensitive, especially to unexpected sounds (or artifacts) that your implementation may introduce.
Of course numerical techniques can be used to test an audio system. But the fact that we have two ears and love to listen to sound presents an opportunity. Thanks to evolution, our ears are designed to pick up unusual, sudden events in the environment. If you are developing and debugging audio, then you can use your ears to help guide you to fixing problems. This article discusses some of the ways to put your ears to work, and other tricks useful in debugging audio.
Set up an initial passthrough test
When dealing with any signal processing application, but especially with audio, there are some ways to set up your debugging environment so that life is simplified throughout the development cycle, and even later when a new version of the product is to be released.
During development you will be using some kind of development environment. It may be a development board with audio in and out. If the final hardware is not ready, you may use a software simulator (typically provided with the development tools for your target processor) to debug code. A well-constructed software simulator provides significant advantages; for example, it typically allows disk files to be opened, read, and written. Whether you are using hardware or a simulator, if at all possible set up the development environment so that you can inject a test signal from a disk file and write one or more files of output data from your code. Rather than standard .wav files, the development environment often requires such files to be in an idiosyncratic format. For example, it may require one audio sample per line, written in ASCII text as a floating-point number ranging from +1.0 to -1.0, or as a hex number. Sometimes the two samples of a stereo pair are written on one line.
Whatever format your development environment requires, it helps to be able to convert to and from the usual audio formats, such as .wav. You may have to write a simple C program or Matlab routine to do this, or you can use an off-the-shelf audio editor, which will also allow you to examine audio files. Programs for doing this are inexpensive and easy to find. On the PC, candidates include Adobe Audition http://www.adobe.com/products/audition/. One multi-platform shareware editor is Audacity http://audacity.sourceforge.net/; a Google search will turn up others. Matlab, Labview, and similar programs can also be used for this purpose. More sophisticated audio editing programs exist as well. Whichever tool you choose, it should at least be able to:
- read files from disk and write files to disk in a variety of formats (including the one required by your debugging environment)
- zoom in to the level of an individual sample
- modify an individual sample
- read out the value of an individual sample
- scale parts of the waveform
- view spectral plots
- cut, copy, and paste waveforms
- listen to sounds
- and finally, generate test waveforms.
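If you do end up rolling your own format converter, it can be only a few dozen lines. The sketch below, in Python, assumes 16-bit mono .wav files and the one-floating-point-sample-per-line text format described earlier; adapt the details to whatever your environment actually requires.

```python
import struct
import wave

def wav_to_text(wav_path, txt_path):
    """Convert a 16-bit mono .wav file to one float sample per line (-1.0..+1.0)."""
    with wave.open(wav_path, "rb") as w:
        assert w.getsampwidth() == 2 and w.getnchannels() == 1
        raw = w.readframes(w.getnframes())
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    with open(txt_path, "w") as f:
        for s in samples:
            f.write("%.8f\n" % (s / 32768.0))

def text_to_wav(txt_path, wav_path, rate=48000):
    """Convert one-float-per-line text back to a 16-bit mono .wav file."""
    with open(txt_path) as f:
        samples = [max(-1.0, min(1.0, float(line))) for line in f if line.strip()]
    ints = [int(round(s * 32767)) for s in samples]
    with wave.open(wav_path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(ints), *ints))
```

A round trip through the two functions should reproduce every sample to within one least significant bit (the small scaling asymmetry between 32768 and 32767 costs at most one count).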
At an early stage in the development process, set up a passthrough test. The development system should be able to pass a signal unchanged from the input to the output. One version of such a test involves sending a known signal, like a sine wave, to an A/D converter on a prototype development board, and examining the D/A output on an oscilloscope while listening to a speaker.
Figure 1. With no audio processing in place, an audio device should pass a signal through undamaged.
Ideally another version of this test involves injecting a digital signal after any analog-to-digital converter, and recovering a digital signal before any digital-to-analog converter.
Figure 2. Ideally a signal can be read from disk and injected into the development system, then picked up from the development system and stored to disk.
Any processing modules are then inserted between these test points.
Figure 3. The passthrough test provides the backbone for adding more modules.
With this test jig in place, when problems occur later, you can return to the “wire” configuration to ensure that nothing new in the system (such as a new peripheral) has introduced bugs in the basic audio path. You can add modules one at a time to the “wire” setup and thereby isolate which module causes bugs. A variant of this is to provide a bypass switch inside each audio module. The switch can be set in real time from the debugging environment, for example by turning on or off a bit in a memory location. At minimum such a switch can be set at compile/assemble time in the source code.
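The per-module bypass switch can be as simple as one flag that the debugger can poke. A minimal sketch of the pattern (the `AudioModule` wrapper here is hypothetical, not from any particular framework):

```python
class AudioModule:
    """A processing stage with a run-time bypass switch.

    When `bypass` is set (for example, poked from the debug environment),
    process() degenerates to a wire, so stages can be ruled out one at a time.
    """
    def __init__(self, process_fn):
        self.process_fn = process_fn   # the real per-block processing
        self.bypass = False            # flip this from the debugger

    def process(self, block):
        if self.bypass:
            return list(block)         # pass the audio through unchanged
        return self.process_fn(block)

# Example: a simple gain stage that can be bypassed on the fly.
gain = AudioModule(lambda block: [0.5 * s for s in block])
```

The same idea works at compile time in C with a preprocessor flag per module; the run-time version is more convenient because you can toggle it while audio is playing.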
If you’re dealing with a stereo signal, then the two channels have to perform in the same way. One way to verify this is to inject the identical signal into both channels, and write out the stereo results to separate disk files. A simple checksum will typically suffice to verify if the results are identical. If not, then a simple C program or a few lines of Matlab can create a file containing the difference. It is often extremely useful to listen to and examine such a difference file to pinpoint where the signal path diverges unexpectedly.
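Creating the difference file really does take only a few lines. A sketch in Python, assuming the one-sample-per-line text format discussed earlier (`channel_difference` is a made-up helper name):

```python
def channel_difference(left_path, right_path, diff_path):
    """Write the per-sample difference of two channel dumps.

    Assumes one floating-point sample per line in each input file; the
    output uses the same format, so it can be converted and auditioned.
    Returns the largest absolute difference found.
    """
    peak = 0.0
    with open(left_path) as fl, open(right_path) as fr, open(diff_path, "w") as fd:
        for l_line, r_line in zip(fl, fr):
            d = float(l_line) - float(r_line)
            peak = max(peak, abs(d))
            fd.write("%.8f\n" % d)
    return peak
```

If the returned peak is exactly zero, the channels are bit-identical; otherwise, listening to the difference file tells you where (and how loudly) the paths diverge.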
Audio algorithms that handle more than two channels, such as the well-known 5.1 format (front left, front right, center, left surround, right surround, and low-frequency effects or LFE), are now common. If you are dealing with such a multi-channel algorithm, then the principle stated for stereo generally applies to all channels. A possible exception is the LFE channel, which may carry a reduced bandwidth since it focuses on low-frequency effects. A two-trace oscilloscope can verify that the processing for each channel is identical for the path that includes the converters, even if you have to check channels individually in a multi-channel setup. In particular you want to verify that there is no phase inversion between any of the channels. Phase inversion causes strange results; a sound can completely disappear if it is fed at the same amplitude to two channels whose phases are inverted (180 degrees out of phase).
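The cancellation caused by phase inversion is easy to demonstrate numerically; summing a tone with its inverted copy produces digital silence:

```python
import math

rate = 48000.0
freq = 1000.0
n = 480  # ten cycles of a 1 kHz tone at 48 kHz

tone = [math.sin(2 * math.pi * freq * i / rate) for i in range(n)]
inverted = [-s for s in tone]   # 180 degrees out of phase

# Mixing two equal-amplitude, phase-inverted channels cancels completely.
mix = [a + b for a, b in zip(tone, inverted)]
assert max(abs(s) for s in mix) < 1e-12
```

In a real system the inversion usually hides in a miswired connector or a sign error in one channel's processing, and the cancellation is only partial, which is why the stereo image collapses rather than the sound vanishing outright.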
Test the limits of the implementation with some standard test signals
There isn’t room here to cover all the ways to test audio, but some special signals that are appropriate for audio algorithms should be part of your debugging repertoire.
Create an input consisting simply of an impulse, that is, a single sample at full scale positive, preceded and followed by zeroes. Do the same for full scale negative. Run this signal through your system. Is there anything unusual about the output, such as clipping? Does the output settle back down to zero and stay there? Watch for a few least significant bits turning on and off repeatedly, without ever settling to exactly zero.
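An impulse-test harness can be sketched in a few lines. The leaky-integrator filter below is just a hypothetical stand-in for your real processing; note how a floating-point tail never reaches exactly zero even after it falls below one 16-bit LSB, which is precisely the kind of residual activity the test should flag:

```python
def impulse(n, amplitude=1.0):
    """A full-scale impulse: one nonzero sample surrounded by zeros."""
    sig = [0.0] * n
    sig[n // 2] = amplitude
    return sig

def settles_to_zero(output, tail=64, eps=0.0):
    """True if the last `tail` output samples are all within eps of zero.

    With eps=0 this catches least-significant-bit activity (limit
    cycles) that never decays to true digital silence.
    """
    return all(abs(s) <= eps for s in output[-tail:])

# Demo: run the impulse through a stand-in system, here a leaky
# integrator y[i] = x[i] + 0.5*y[i-1]; substitute your real processing.
x = impulse(1024)
y, acc = [], 0.0
for s in x:
    acc = s + 0.5 * acc
    y.append(acc)
```

Run the same harness with `amplitude=-1.0` for the full-scale negative case.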
Input a sine wave. If your algorithm is linear (such as filtering, reverberation, or sample rate conversion) then you should get out a sine wave at the same frequency. Observe the input and output sine waves together for a long period of time. Is the phase shift between the input and the output constant over time?
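One way to check phase constancy is to measure the phase of successive input and output blocks with a single-bin DFT and watch whether the difference drifts (`phase_at` is a made-up helper name; it assumes the block covers a whole number of cycles of the test frequency):

```python
import cmath
import math

def phase_at(samples, freq, rate):
    """Phase (radians) of the component at `freq`, via a single-bin DFT."""
    n = len(samples)
    acc = sum(samples[i] * cmath.exp(-2j * math.pi * freq * i / rate)
              for i in range(n))
    return cmath.phase(acc)
```

Applied to matching input and output blocks, the difference of the two phases should stay fixed from block to block; a slow drift suggests a sample-rate mismatch somewhere in the chain.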
Create a fake “direct current” signal, that is, a signal that consists of a long string of the same number. Again full-scale positive and full-scale negative should be included. Examine the output for anything unusual.
You may be implementing an algorithm designed to perform clipping, in which case an appropriate test is needed. If no clipping is intended, put in a sine wave at full amplitude and watch for clipping in the output. Clipping may be merely a flattening of the sine wave peaks; or something more irregular may happen during the clipped section of the output waveform. If the processor can detect numerical overflow, watch for it at each stage in the processing. Put in white noise scaled so that it peaks at the full amplitude range. Again, does overflow happen at some stage in the processing?
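A quick way to screen an output file for clipping is to look for runs of consecutive samples pinned at full scale; a clean sine touches its peak only momentarily, while a clipped one sits there (the helper name and threshold here are arbitrary):

```python
def longest_clipped_run(samples, full_scale=1.0, tol=1e-6):
    """Length of the longest run of consecutive samples pinned at full scale.

    A clean sine spends at most a sample or so near its peak; a run of
    several pinned samples suggests clipping somewhere upstream.
    """
    longest = run = 0
    for s in samples:
        if abs(abs(s) - full_scale) <= tol:
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return longest
```

This catches the flat-topped case; the "more irregular" clipping behaviors mentioned above (wrap-around on overflow, for instance) show up instead as huge sample-to-sample jumps near full scale.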
Create a signal consisting of a sine wave, not at full scale, with a DC offset. Unless the algorithm is designed to add a DC offset, the output should be either a sine wave with the DC offset removed or a sine wave with the DC offset still in place. Any other behavior should be investigated very carefully.
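Measuring the DC component of an output file is just averaging over a whole number of cycles, so the sine contributes nothing:

```python
def dc_offset(samples):
    """Estimate the DC component as the mean over an integer number of cycles."""
    return sum(samples) / len(samples)
```

Comparing this number for the input and output files tells you immediately whether the offset was removed, preserved, or (the case to investigate) altered to something else.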
Put a different signal into each channel. Verify that the channels have not been swapped. Also, watch for leakage between the channels.
Other textbook waveforms such as ramp, sawtooth, and square wave can be useful for debugging. But such waveforms are not band-limited, and at audio frequencies will cause aliasing. Aliasing will in turn confound the output. If you want to use a textbook waveform, a bandlimited version can be synthesized with a program such as the audio editor discussed above, or Matlab.
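One way to synthesize such a band-limited waveform yourself is additive: sum only the harmonics that fall below the Nyquist frequency. A sketch for a square wave (whose Fourier series contains only odd harmonics at amplitude 4/πk):

```python
import math

def bandlimited_square(freq, rate, n):
    """Additive synthesis of a square wave from odd harmonics below Nyquist.

    Summing (4/pi) * sin(2*pi*k*freq*t) / k over odd k keeps every
    partial below rate/2, so the waveform cannot alias the way an
    ideal square wave would.
    """
    out = [0.0] * n
    k = 1
    while k * freq < rate / 2.0:
        for i in range(n):
            out[i] += (4.0 / math.pi) * math.sin(2 * math.pi * k * freq * i / rate) / k
        k += 2
    return out
```

The result shows the familiar Gibbs ripple near each transition; that ripple is expected and harmless, unlike the aliasing a naive square wave would introduce.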
Plan on listening tests
As obvious as it may seem, it is worth emphasizing that someone needs to listen carefully to the audio before your product is sold. Sometimes it takes some explaining to get a manager who hasn’t worked in the audio industry to understand this.
Listening tests can be done with earbuds, or multimedia speakers in a cubicle. But a better approach is to set up a listening station in a separate, quiet room, with some high-quality loudspeakers and with some high-quality headphones.
Use various kinds of audio material: rap; techno; vocal solo; drums; male and female speech; classical; rock; and so on. Use songs that you love as well as material that you loathe. You might wonder why it is advisable to listen to a variety of material. The answer is that different musical material can affect audio products in various ways. If you listen only to a “wall of sound” song, you may miss numerical problems that occur as signals decay toward zero. If you only listen to your favorite styles, you may miss out on artifacts that part of your customer base may be hearing. Also, if you listen to your favorite songs, on the one hand you’re more likely to notice something wrong with familiar material, but at the same time you’re more likely to be distracted by the music you love.
There is a famous case in which some unexpected material turned into a goldmine for exercising an audio algorithm. During the early days of the development of MP3 and related audio compression, the researchers discovered that the song Tom's Diner by Suzanne Vega was exceptionally revealing: its dry, exposed a cappella vocal made coding artifacts easy to hear, and it became a standard test track for tuning the algorithms.
You may have used sine waves for technical testing, as was discussed above, but now is the time to listen to a sine wave for an extended period of time, certainly several minutes. If there is an occasional glitch in your system you will probably hear a pop. Remember that even one audio sample out of place can easily create an audible click or pop.
Not only should you listen to complete songs, you should also listen to isolated notes of instruments. The European Broadcasting Union publishes a CD-ROM called “Sound Quality Assessment Material: Recordings for Subjective Tests” containing single notes from a wide variety of wind, brass, keyboard, and percussion instruments, at various pitches, various amplitudes, and with various playing styles. Most of these tracks are available online http://www.ebu.ch/en/technical/publications/tech3000_series/tech3253/index.php?display=EN. If you purchase the CD, there are also some orchestral and choral recordings. Listen to the attack of the notes: are the drums crisp? Listen to the main part of the note: is the violin or glockenspiel “clean”? Listen to the decay: does the piano fade into silence (or the noise floor) smoothly? Can you hear “steps” in the piano decay?
To get an idea of what you might hear, the Audio Engineering Society has released Perceptual Audio Coders: What to Listen For http://www.aes.org/publications/AudioCoding.cfm. The CD-ROM contains tutorial text as well as original and processed sound. Although this disc is aimed at artifacts from codecs like MP3, it provides an inexpensive way to sensitize your ears to some of the more subtle artifacts that can also occur outside the realm of compression.
If your implementation is stereo, listen to how well the stereo image holds up. If the stereo image is not treated properly—for example, when there is leakage from the left channel into the right channel and vice versa—then the stereo placement of individual sounds will be affected. An individual sound may move from the right side to the left. Or the stereo field may collapse, that is, the full range between left and right speakers isn’t filled out (more sounds appear as if they are coming from the middle). The AES CD-ROM just mentioned provides examples of the stereo field collapsing. By extension, the place of each sound in a 5.1 mix should remain true to the original sound source. In a movie soundtrack, listen for the dialogue, which normally sits front and center, wandering around aimlessly instead.
You may be implementing audio as part of a larger system, for example a set-top box. If there are visuals such as video in your system, make sure the speech or sound effects are in sync and stay in sync with the visuals over a long period of time. Toward the end of debugging, it would be reasonable to play an entire DVD movie through your prototype to make sure that nothing unusual happens. Some companies send prototypes home with their employees over the weekend so that they can do extended listening tests on movies and songs in their home environment.
The sound can guide your debugging
If problems show up in your listening tests, then use them as a guide to find the bug. As needed, use the audio file editor to examine the disk files that you’ve written to help home in on the problem.
For example, suppose you hear a click. Use the audio sample editor to try to find the click. The click may be obvious, as in Figure 4, where the click is visible as a line sticking up from the rest of the sine wave. In other cases it may take some searching to find the clicks.
Figure 4. A mistake in a single audio sample can cause a click.
If the click sounds like it happens on a more or less regular, rhythmic basis, then use the audio file editor to find out exactly how many samples lie between each click. You may be able to relate this number (or a multiple or submultiple of this number) to the length of a buffer; for example, a buffer in an input or output interrupt service routine. If each click appears to be an individual sample, it may lead you to identify an off-by-one bug in how pointers are handled as they wrap around from one end of a buffer to the other. If there is clearly a big jump in the signal at each click, there may be a problem with swapping buffers. Or the problem may not be in software; for example, there may be a problem in a hardware clock that causes the disturbance in the audio output.
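Locating clicks and measuring the spacing between them is easy to automate once the output is in a disk file. A sketch (the threshold and helper names are arbitrary; an isolated bad sample shows up as a jump in and a jump back out, one sample apart):

```python
def find_clicks(samples, threshold=0.5):
    """Indices where the signal jumps by more than `threshold` in one sample."""
    return [i for i in range(1, len(samples))
            if abs(samples[i] - samples[i - 1]) > threshold]

def click_intervals(samples, threshold=0.5):
    """Sample counts between successive clicks.

    A repeated interval that matches (or is a multiple of) a buffer
    length points straight at the offending buffer.
    """
    idx = find_clicks(samples, threshold)
    return [b - a for a, b in zip(idx, idx[1:])]
```

If `click_intervals` keeps returning, say, 256, go look at every 256-sample buffer in the system.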
If the click does not appear to occur at regular intervals, the cause may be numerical. Does the audible artifact happen with a sine wave input, a musical signal input, or both? Does the problem happen only when the signal is loud? If so, numerical overflow is the first suspect. Does the problem happen only when the signal dies down to silence? If so, numerical underflow, or filter limit cycles, should be checked.
If you can identify certain frequency ranges that are problematic, then you can home in on and/or rule out some filtering blocks that operate on the signal in those frequency ranges. To track this down, use sine waves at different frequencies, and musical signals with varying high-frequency and low-frequency content.
Does playing the same input always cause the same audible bug to happen at the same time? If not, then a hardware problem is more likely; for example, a disturbance on a clock line. A logic analyzer may be called for in this situation. Again, if the same audible bug happens in one channel and not the other, but both signals are subject to the same processing, then a hardware bug may be present.
It is important to exercise an audio implementation at all sample rates supported by the product. If you are dealing with compressed audio, such as an MP3 player, you must exercise every supported bit rate and output channel configuration.
Although it may seem amusing to state the obvious, audio outputs have to sound good. This is a major distinction from the outputs of other forms of signal processing. After all, who ever listens to the raw waveforms generated by a USB interface chip? Who ever listens to the raw waveforms in an 802.11 signal?
Audio represents a two-edged sword. Although audio has to sound good, as engineers we are lucky that our ears allow us to use what we hear to home in on subtle and not-so-subtle bugs in an implementation. Thus, not only are the usual debugging tricks of the trade available, but a good set of ears in a good listening environment can be a major debugging tool.