How to Make Your Sound Sing with Vocoders
Meanwhile, an identical set of bandpass filters has been splitting the carrier signal into separate frequency bands. In this part of the vocoder, each filter's output passes through an amplifier. The gain of each amplifier is controlled by the signal coming from an envelope follower. So if the speech signal has significant energy in the 1.6–3.2kHz band, for instance, the envelope follower for that band will output a high control signal, which in turn will boost the gain on the amplifier that's receiving the portion of the carrier signal in the 1.6–3.2kHz band.
That's where the vocoding happens. The combination of bandpass filters, envelope follower, and amplifier causes the partials in the carrier's 1.6–3.2kHz band to take on the same amplitude contour as the partials in the speech signal's 1.6–3.2kHz band. Note, however, that if the carrier has no sound energy in a particular band, the amplifier can't create it. It can only amplify or attenuate whatever is happening in the carrier signal within that band.
All that remains is to mix the bands of the carrier signal back into a single output. That's handled by the vocoder's internal mixer. (Although our example has a mono output, some vocoder designs let you pan the outputs of the individual bands anywhere in the stereo field for a more spacious sound.)
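To make the signal flow concrete, here is a minimal Python sketch of the chain just described: bandpass filters, an envelope follower for each band, a gain stage, and a mixer. It is an illustration under stated assumptions, not the circuit of any particular vocoder. The octave-wide band edges, the 50Hz envelope smoothing, and the function names (bandpass, envelope, vocode) are my own choices for the example.

```python
import math

# Assumed octave-wide bands; real vocoders differ in band count and spacing.
BAND_EDGES_HZ = [100, 200, 400, 800, 1600, 3200, 6400]

def bandpass(signal, low_hz, high_hz, sample_rate):
    """Second-order bandpass filter with unity gain at the band center."""
    center = math.sqrt(low_hz * high_hz)      # geometric center of the band
    q = center / (high_hz - low_hz)           # narrower band means higher Q
    w0 = 2.0 * math.pi * center / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def envelope(signal, sample_rate, smoothing_hz=50.0):
    """Envelope follower: rectify the signal, then smooth it with a one-pole lowpass."""
    coeff = math.exp(-2.0 * math.pi * smoothing_hz / sample_rate)
    level, out = 0.0, []
    for x in signal:
        level = coeff * level + (1.0 - coeff) * abs(x)
        out.append(level)
    return out

def vocode(speech, carrier, sample_rate):
    """Impose the per-band amplitude contour of speech onto carrier (mono output)."""
    n = min(len(speech), len(carrier))
    output = [0.0] * n
    for low, high in zip(BAND_EDGES_HZ, BAND_EDGES_HZ[1:]):
        # Control signal: how much energy the speech has in this band.
        control = envelope(bandpass(speech[:n], low, high, sample_rate), sample_rate)
        # The same band of the carrier, scaled by that control signal.
        carrier_band = bandpass(carrier[:n], low, high, sample_rate)
        for i in range(n):
            output[i] += carrier_band[i] * control[i]   # internal mixer
    return output
```

Feeding in a silent carrier gives silence no matter what the speech does, which is the limitation noted above: the amplifiers can only scale what the carrier already contains.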
One of the characteristics of human speech is that it contains extremely high-frequency sounds produced by sibilants (hissing consonants such as "S") and other fricatives (consonants such as "F"). The vocoder design described above is not very good at distinguishing among these sounds. In addition, many of the signals you might want to use as carriers don't have enough sound energy in the extreme high-frequency band to produce good sibilants and fricatives.
To make vocoded speech more understandable, some vocoders have a pass-through circuit that sends the portion of the speech signal above 8kHz or so directly to the output, mixing it with the carrier signal rather than attempting to vocode it. The level of the pass-through is usually adjustable.
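A pass-through of this kind might be sketched as follows. The one-pole highpass filter, the 8kHz default cutoff, and the names highpass and add_sibilance are assumptions for illustration; a real unit would likely use a steeper filter.

```python
import math

def highpass(signal, cutoff_hz, sample_rate):
    """One-pole (6 dB per octave) highpass filter."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    a = rc / (rc + dt)
    out, prev_x, prev_y = [], 0.0, 0.0
    for x in signal:
        prev_y = a * (prev_y + x - prev_x)
        prev_x = x
        out.append(prev_y)
    return out

def add_sibilance(vocoded, speech, sample_rate, cutoff_hz=8000.0, level=0.5):
    """Mix the unprocessed top end of the speech into the vocoder output.

    level is the adjustable pass-through amount; 0.0 turns it off.
    """
    highs = highpass(speech, cutoff_hz, sample_rate)
    return [v + level * h for v, h in zip(vocoded, highs)]
```

Because the highs bypass the filter bank entirely, the "S" and "F" sounds keep their original character even when the carrier has little energy up there.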
The carrier should be a signal that has significant acoustic energy throughout as much of the sound spectrum as possible. A synthesizer sawtooth wave works well, and for this application the synth's own filter should be wide open. Noisy signals, such as white noise and sampled wind sounds, are also good choices for the carrier. A waveform such as a triangle wave, which has weak overtones to begin with, or a synth waveform that has already been filtered by the synth's own lowpass filter, tends not to produce good results when vocoded.
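The difference between these waveforms is easy to put numbers on. In an ideal sawtooth, harmonic n has a relative amplitude of 1/n; an ideal triangle contains only odd harmonics, at 1/n². The sketch below tabulates the first few harmonics of a note. The function names and the 110Hz fundamental are mine, chosen only for the example.

```python
import math

def sawtooth_harmonic(n):
    """Relative amplitude of harmonic n in an ideal sawtooth wave."""
    return 1.0 / n

def triangle_harmonic(n):
    """Relative amplitude of harmonic n in an ideal triangle wave (odd harmonics only)."""
    return 1.0 / (n * n) if n % 2 == 1 else 0.0

def to_db(amplitude):
    """Convert a relative amplitude to decibels; zero maps to -infinity."""
    return 20.0 * math.log10(amplitude) if amplitude > 0 else float("-inf")

def harmonic_table(fundamental_hz, count):
    """Rows of (frequency in Hz, sawtooth level in dB, triangle level in dB)."""
    return [
        (fundamental_hz * n, to_db(sawtooth_harmonic(n)), to_db(triangle_harmonic(n)))
        for n in range(1, count + 1)
    ]

for freq, saw_db, tri_db in harmonic_table(110.0, 9):
    print(f"{freq:7.0f} Hz   saw {saw_db:7.1f} dB   triangle {tri_db:7.1f} dB")
```

By the ninth harmonic the sawtooth is down about 19dB from its fundamental, while the triangle is down about 38dB, so the vocoder's upper bands have far less to work with.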
Fig. 4. If you load tutorial file 1 and press the Tab key to spin the rack around, you can see how the sound generators and processors connect.
Because a vocoder requires two different audio inputs, it can't be set up as an ordinary insert effect in a software-based multitrack. In some digital audio workstations, you place the vocoder plug-in on a stereo aux-send bus; the left side of the stereo signal becomes the speech input, while the right side becomes the carrier. The sends from two tracks can then be panned hard left and hard right to feed the vocoder. In other DAWs, the vocoder plug-in may have a special input bus selector for the speech signal.
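The stereo-bus trick can be sketched in a few lines. This is a toy model rather than any DAW's actual routing code: the function names are hypothetical, and the simple linear pan law is an assumption (real hosts use various pan laws).

```python
def send(track, pan):
    """Return (left, right) frames for a mono track.

    pan runs from -1.0 (hard left) to +1.0 (hard right); linear pan law assumed.
    """
    left_gain = (1.0 - pan) / 2.0
    right_gain = (1.0 + pan) / 2.0
    return [(s * left_gain, s * right_gain) for s in track]

def mix_bus(*sends):
    """Sum any number of sends into one stereo aux bus."""
    n = min(len(s) for s in sends)
    return [
        (sum(s[i][0] for s in sends), sum(s[i][1] for s in sends))
        for i in range(n)
    ]

def vocoder_inputs(bus):
    """What the plug-in sees: left channel is the speech, right is the carrier."""
    speech = [left for left, _ in bus]
    carrier = [right for _, right in bus]
    return speech, carrier

# Speech track panned hard left, carrier track panned hard right.
speech_track = [0.1, 0.2, 0.3]
carrier_track = [0.5, -0.5, 0.25]
bus = mix_bus(send(speech_track, -1.0), send(carrier_track, 1.0))
speech_in, carrier_in = vocoder_inputs(bus)
```

With both sends panned hard, each input arrives clean. Any pan position short of hard left or hard right would leak some of one signal into the other input.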
I've had good luck using a drum loop as the speech input for a vocoder. Here's an example, a recent composition of mine created in Reason, "Peace in Palestine." For more background on the piece (and to hear a higher-resolution version), visit my site.
The first demo file that accompanies this article, OreillyVocoderDemo1.rns, also has an example of this technique; to replace the vocal speech input with the drum loop "speech," mute and unmute the appropriate tracks in the sequencer. (See Figure 4.) Here's how it sounds:
A drum loop with some rhythmic activity in various frequency regions works best. If there's not much going on in the mids, for instance, you might want to mix a bongo or conga percussion loop with the beat before sending it to the vocoder. Likewise, if you want to hear the rhythm of the kick in the output, your carrier signal needs to have some low notes. If the carrier itself is short on overtones, distortion can help: the PAiA vocoder has a distortion circuit on the carrier input to increase the level of overtones.
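To see why distortion on the carrier input raises the level of overtones, consider this small sketch. I don't know the details of the PAiA circuit, so the tanh soft clipper here is a stand-in, and the function names are hypothetical. It distorts a pure sine wave, which has no overtones at all, and then measures the third harmonic.

```python
import math

def distort(signal, drive=4.0):
    """Soft-clip with tanh; higher drive pushes the wave toward a square shape."""
    return [math.tanh(drive * x) for x in signal]

def harmonic_level(signal, harmonic, cycles):
    """Amplitude of one harmonic in a signal holding a whole number of cycles."""
    n = len(signal)
    w = 2.0 * math.pi * harmonic * cycles / n
    re = sum(x * math.cos(w * i) for i, x in enumerate(signal))
    im = sum(x * math.sin(w * i) for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

# Ten full cycles of a sine wave in 1,000 samples.
cycles, length = 10, 1000
sine = [math.sin(2.0 * math.pi * cycles * i / length) for i in range(length)]
clipped = distort(sine)

print("3rd harmonic, clean sine:", harmonic_level(sine, 3, cycles))
print("3rd harmonic, distorted: ", harmonic_level(clipped, 3, cycles))
```

The clean sine has essentially nothing at the third harmonic, while the clipped version has a substantial amount, which gives the vocoder's higher bands something to work with.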
When using ordinary speech for the speech input, you may find it useful to compress and normalize the sample so as to bring its peaks up to a uniform level. Similarly, inserting a hardware compressor between the microphone and the input on a hardware vocoder can give smoother results.
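Here is a rough sketch of what compressing and then normalizing does to a speech sample. The threshold, ratio, and release values are arbitrary, the function names are mine, and a real compressor has attack and knee controls that this one ignores.

```python
import math

def compress(signal, sample_rate, threshold=0.25, ratio=4.0, release_ms=100.0):
    """Reduce gain whenever the tracked peak level exceeds threshold."""
    release = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    level, out = 0.0, []
    for x in signal:
        # Peak detector: instant attack, exponential release.
        level = max(abs(x), level * release)
        if level > threshold:
            target = threshold + (level - threshold) / ratio
            gain = target / level
        else:
            gain = 1.0
        out.append(x * gain)
    return out

def normalize(signal, peak=1.0):
    """Scale the whole signal so that its largest sample reaches peak."""
    current = max((abs(x) for x in signal), default=0.0)
    if current == 0.0:
        return list(signal)      # silence stays silent
    gain = peak / current
    return [x * gain for x in signal]

def prepare_speech(signal, sample_rate):
    """Compress first, then normalize, as suggested for the speech input."""
    return normalize(compress(signal, sample_rate))
```

Compression narrows the gap between loud and quiet syllables, and normalizing afterward brings the whole sample up to full level, so the envelope followers see a more consistent signal.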