What is a Phase Vocoder? How Pitch Correction Works in Music Production

The phase vocoder is the signal processing technique behind some of the most common operations in modern vocal production.

Few producers know the term or understand how it affects the results they get from their plugins.

But if you’ve ever used pitch correction to polish a vocal performance or get the classic hard-tuning effect, you’ve used this technology.

In this article, I’ll explain the basics of phase vocoder processing and why it matters for getting the most from your vocal tracks.

Let’s get started.

What is a phase vocoder?

The phase vocoder is a signal processing algorithm for manipulating the pitch and timing of audio signals.

It was originally developed for processing speech, so it’s often considered part of the broader category of voice encoding technologies, or vocoders. But it has little in common with synth-based vocoders that work by modulating a carrier signal with a modulator signal.

In fact, technologies that use the phase vocoder technique often focus on pitch shifting and time stretching rather than voice modulation.

Even so, the phase vocoder plays a role in some of the most common vocal effects in music production, including pitch correction and hard tuning.

How does phase vocoding work?

The phase vocoder algorithm breaks the incoming signal down into chunks and performs an analysis called the Fourier transform on each successive block.

The Fourier transform takes the amplitude-over-time information in the audio waveform and uses it to extract a profile of the frequency content it contains.

This frequency information includes the position of the signal’s harmonics and their intensity relative to one another.

Since pitched sounds are made up of many partials at different frequencies, this information can be used to guess the musical note that’s being played or sung during that analysis frame.
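To make the analysis step concrete, here is a minimal Python sketch of a single analysis frame. It is not how any particular plugin does it: the sample rate, frame size, test tone, and function names are all assumptions chosen for illustration, and a plain DFT loop stands in for the much faster FFT that real tools use.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz (assumed for this sketch)
FRAME_SIZE = 200    # samples per frame, so each frequency bin is 40 Hz wide


def make_test_tone(f0, num_samples):
    """Fundamental plus two quieter harmonics, standing in for a sung note."""
    return [
        1.0 * math.sin(2 * math.pi * f0 * n / SAMPLE_RATE)
        + 0.5 * math.sin(2 * math.pi * 2 * f0 * n / SAMPLE_RATE)
        + 0.25 * math.sin(2 * math.pi * 3 * f0 * n / SAMPLE_RATE)
        for n in range(num_samples)
    ]


def frame_spectrum(frame):
    """Apply a Hann window to one frame and return the magnitude of each bin."""
    size = len(frame)
    windowed = [
        x * (0.5 - 0.5 * math.cos(2 * math.pi * n / size))
        for n, x in enumerate(frame)
    ]
    magnitudes = []
    for k in range(size // 2):
        total = sum(
            windowed[n] * cmath.exp(-2j * math.pi * k * n / size)
            for n in range(size)
        )
        magnitudes.append(abs(total))
    return magnitudes


def strongest_peaks(magnitudes, count):
    """Return the centre frequencies (Hz) of the largest local peaks."""
    peaks = [
        k for k in range(1, len(magnitudes) - 1)
        if magnitudes[k] > magnitudes[k - 1] and magnitudes[k] > magnitudes[k + 1]
    ]
    peaks.sort(key=lambda k: magnitudes[k], reverse=True)
    bin_width = SAMPLE_RATE / (2 * len(magnitudes))
    return sorted(k * bin_width for k in peaks[:count])


frame = make_test_tone(440.0, FRAME_SIZE)
print(strongest_peaks(frame_spectrum(frame), 3))
```

The test tone is built so that its three partials sit exactly on bin centres, which means the peaks land on 440, 880, and 1320 Hz. Real vocals rarely line up this neatly, so a single frame only gives a rough picture of where the partials are.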

But once the analysis has taken place, the phase vocoder algorithm can also manipulate the frequency information before reconstructing the signal.

That means it can shift the pitch by moving the location of the harmonic partials while preserving their relationship to one another.

As a result, the algorithm can change the pitch without impacting the timbral qualities of the voice, since these are determined by the unique distribution of harmonics in the singer’s voice.
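As a rough illustration of what "moving the partials while preserving their relationship" means, the Python sketch below scales a list of partials by one shared ratio. The list-of-pairs representation and the function name are assumptions for this example; an actual phase vocoder works on frequency bins rather than a tidy list of partials.

```python
def shift_partials(partials, semitones):
    """Scale every partial by the same ratio so their spacing pattern is kept.

    partials: list of (frequency_hz, amplitude) pairs from one analysis frame.
    """
    ratio = 2 ** (semitones / 12)  # equal-tempered pitch ratio
    return [(freq * ratio, amp) for freq, amp in partials]


voice = [(220.0, 1.0), (440.0, 0.5), (660.0, 0.25)]
print(shift_partials(voice, 12))  # up one octave
```

Because every frequency is multiplied by the same number, the second partial is still twice the first and the third is still three times the first, and the amplitudes are left alone.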

What does the Phase part mean?

If you’ve been following along closely, you may have spotted a potential problem with this method: phase coherence.

Since each incoming frame of audio must be analyzed and processed separately, the changes affect each portion of the signal differently.

Without correcting for this problem, you’d hear audible changes in sound from frame to frame, including distracting skips in phase.

The issue is made worse by the fact that successive analysis frames need to overlap each other for us to perceive a continuous sound.

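Modern implementations of the phase vocoder solve this by tracking how the phase of each frequency component advances from one frame to the next, then adjusting it so the phase relationships between frames remain coherent after processing.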
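One textbook way to use phase, sketched below in Python, is to compare a bin's phase in two successive frames and work out the frequency that would explain the difference. Treat this as an illustration under assumed parameter names (frame size, hop size, and so on), not a description of any specific product.

```python
import math


def wrap_phase(phase):
    """Wrap an angle in radians into the range [-pi, pi)."""
    return (phase + math.pi) % (2 * math.pi) - math.pi


def true_frequency(prev_phase, curr_phase, bin_index,
                   frame_size, hop_size, sample_rate):
    """Estimate the actual frequency inside one bin from its phase advance."""
    # Phase advance expected if the partial sat exactly on the bin centre.
    expected = 2 * math.pi * bin_index * hop_size / frame_size
    # How far the measured advance deviates from that expectation.
    deviation = wrap_phase(curr_phase - prev_phase - expected)
    # Convert the leftover phase into a frequency offset in Hz.
    bin_centre = bin_index * sample_rate / frame_size
    return bin_centre + deviation * sample_rate / (2 * math.pi * hop_size)


# A 445 Hz partial observed in the 440 Hz bin, with frames 50 samples apart.
prev = 0.3
curr = prev + 2 * math.pi * 445.0 * 50 / 8000
print(true_frequency(prev, curr, bin_index=11,
                     frame_size=200, hop_size=50, sample_rate=8000))
```

The bin on its own only says "somewhere around 440 Hz". The phase advance between frames narrows that down to about 445 Hz, and the same bookkeeping is what lets the output phases line up from frame to frame when the signal is rebuilt.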

Phase vocoder and pitch correction

With the basics out of the way, here’s how the phase vocoder technique works in pitch correction.

As I mentioned above, the algorithm can determine the pitch of a sung note by extracting the positions of its harmonics and their intensity.

The sounds we perceive as having a musical pitch are usually periodic waveforms with a distribution of harmonics that follow an identifiable pattern.

In these types of sounds, you’ll find a strong fundamental frequency followed by harmonics that occur at integer multiples of the fundamental.

For example, imagine a bowed string instrument playing a rich, sustaining note at concert pitch, or A4 = 440 Hz.

The sound will contain a strong fundamental frequency at 440 Hz, followed by a 2:1 harmonic at 880 Hz, a 3:1 harmonic at 1320 Hz, a 4:1 harmonic at 1760 Hz, and so on.

This pattern is known as the harmonic series and it’s common to pitched sounds.

Knowing this, the analysis algorithm can use the pattern to identify the most likely fundamental frequency for an incoming signal.
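Here is a small Python sketch of that idea. It takes a list of detected peak frequencies and picks the fundamental whose harmonic series explains the most peaks. The scoring rule, the tolerance, and the function names are my own assumptions for illustration; production pitch detectors use more robust methods.

```python
def harmonic_series(f0, count):
    """Return the first `count` harmonics of a fundamental, in Hz."""
    return [f0 * n for n in range(1, count + 1)]


def estimate_fundamental(peaks_hz, max_harmonic=5, tolerance=0.03):
    """Pick the candidate fundamental whose harmonics match the most peaks."""
    best_f0, best_score = None, -1
    for peak in peaks_hz:
        # Each peak could be the 1st, 2nd, 3rd... harmonic of the fundamental.
        for divisor in range(1, max_harmonic + 1):
            candidate = peak / divisor
            score = 0
            for p in peaks_hz:
                n = round(p / candidate)
                if n >= 1 and abs(p / candidate - n) <= tolerance * n:
                    score += 1
            # On a tie, keep the higher candidate to avoid picking a subharmonic.
            if score > best_score or (score == best_score and candidate > best_f0):
                best_f0, best_score = candidate, score
    return best_f0


print(harmonic_series(440.0, 4))
print(estimate_fundamental([440.0, 880.0, 1320.0, 1760.0]))
print(estimate_fundamental([880.0, 1320.0, 1760.0]))  # fundamental missing
```

The last call shows why the pattern is useful: even when the 440 Hz fundamental itself is absent from the peaks, the spacing of the remaining harmonics still points back to it.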

When you set a pitch correction plugin to recognize the notes in a musical scale, it simply moves the harmonic content of the signal from the detected pitch toward the closest pitch in the set scale.
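The remapping step can be sketched in a few lines of Python. This example assumes equal temperament with A4 = 440 Hz, uses MIDI note numbers as a convenient pitch scale, and hard-codes C major as the scale. The function names are hypothetical.

```python
import math

A4_HZ = 440.0
C_MAJOR = {0, 2, 4, 5, 7, 9, 11}  # allowed pitch classes, with C = 0


def hz_to_midi(freq_hz):
    """Convert a frequency to a (fractional) MIDI note number."""
    return 69 + 12 * math.log2(freq_hz / A4_HZ)


def midi_to_hz(note):
    """Convert a MIDI note number back to a frequency in Hz."""
    return A4_HZ * 2 ** ((note - 69) / 12)


def nearest_scale_note(freq_hz, scale=C_MAJOR):
    """Return the MIDI number of the closest note that belongs to the scale."""
    midi = hz_to_midi(freq_hz)
    centre = round(midi)
    candidates = [n for n in range(centre - 6, centre + 7) if n % 12 in scale]
    return min(candidates, key=lambda n: abs(n - midi))


def correction_ratio(freq_hz, scale=C_MAJOR):
    """Ratio to multiply the detected pitch by so it lands on the scale."""
    return midi_to_hz(nearest_scale_note(freq_hz, scale)) / freq_hz


print(nearest_scale_note(450.0))  # a slightly sharp A4
print(correction_ratio(450.0))
```

A note sung at 450 Hz is a little sharp of A4, so the nearest scale note is MIDI 69 and the correction ratio is just under 1. That ratio is what gets applied to the harmonic content of the frame.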

Parameters such as retune speed or Humanoid’s quantize parameter affect the behavior of the pitch remapping from detection to scale.

Fast retune speeds are responsible for the characteristic “stair step” effect that made hard-tuning famous.
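To see where the stair step comes from, here is a simplified Python model of retune speed. It assumes a hypothetical `speed` control between 0 and 1 and pitches expressed as MIDI note numbers, one per analysis frame. Actual plugins define and scale their retune controls in their own ways.

```python
def retune(detected, targets, speed):
    """Apply pitch correction gradually, one analysis frame at a time.

    detected: detected pitch per frame (MIDI note numbers, may be fractional).
    targets:  nearest scale note per frame.
    speed:    1.0 applies the full correction at once; smaller values close
              only part of the remaining gap on each frame.
    """
    output = []
    correction = 0.0
    for pitch, target in zip(detected, targets):
        wanted = target - pitch
        correction += speed * (wanted - correction)
        output.append(pitch + correction)
    return output


# A singer sliding up from A4 (69) to B4 (71), snapped to C major.
slide = [69.0, 69.5, 70.2, 70.8, 71.1]
snapped = [69, 69, 71, 71, 71]

print(retune(slide, snapped, speed=1.0))  # instant correction
print(retune(slide, snapped, speed=0.3))  # gentler correction
```

At full speed the smooth slide collapses onto the target notes and jumps from one to the next, which is the stair step. At lower speeds the correction lags behind, so more of the original slide survives.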

But Humanoid takes it further than just retune speed to let you dial in the exact behavior of the tuning algorithm.

Watch our breakdown of how the quantize function lets you get the results you need from any incoming vocal with Humanoid:

Phase vocoder plugins

Some of the most common tools used in vocal production rely on phase vocoder tech for their key functions.

I’m talking about vocal pitch correction tools that allow you to adjust the tuning of a vocal performance to fix mistakes and improve the sound.

In most cases, these plugins are made to minimize their impact on the original vocal timbre. That’s why maintaining phase coherence is so important.

But there’s a lot more that can be done with FFT analysis and the phase vocoder technique if you’re willing to experiment.

Humanoid is our over-the-top pitch corrector and vocal transformer that uses phase vocoder technology in an unconventional way.

It can hard tune your vocals to a scale using a process similar to the method I’ve described above. But it can also perform other operations on the harmonic content of the vocal signal to manipulate the sound.

For example, even sounds whose partials follow the pattern of the harmonic series contain a little variation.

Whether it’s a slight deviation from exact integer ratios, or the presence of some inharmonic partials, each human voice is unique.

But what happens when you force all the harmonics into a perfect ratio and remove any noisy partials from the signal? The result is a uniquely synthetic vocal texture you can increase by turning up the Robotify knob in Humanoid’s pitch section.
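The Python sketch below shows one possible reading of that idea: snap every partial that is close to a harmonic onto the exact integer multiple, and discard the ones that are not. This is only an illustration of the concept. It is not Humanoid’s actual algorithm, and the function name and the `max_deviation` parameter are invented for the example.

```python
def idealize_partials(partials, f0, max_deviation=0.1):
    """Snap near-harmonic partials onto exact multiples of f0, drop the rest.

    partials:      list of (frequency_hz, amplitude) pairs.
    max_deviation: how far, measured in harmonic numbers, a partial may sit
                   from an integer multiple and still be kept (hypothetical).
    """
    cleaned = []
    for freq, amp in partials:
        harmonic = round(freq / f0)
        if harmonic >= 1 and abs(freq / f0 - harmonic) <= max_deviation:
            cleaned.append((harmonic * f0, amp))
    return cleaned


voice = [(220.0, 1.0), (441.5, 0.5), (598.0, 0.1), (661.0, 0.3)]
print(idealize_partials(voice, f0=220.0))
```

In this example the slightly sharp partials at 441.5 Hz and 661 Hz are pulled onto 440 Hz and 660 Hz, while the stray partial at 598 Hz is removed because it does not sit near any harmonic.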

Check out the official tutorial to hear Humanoid in action and see the creative possibilities of this approach.

A new phase of creative vocals

Phase vocoder technology brought a huge shift in what producers could do with recorded audio.

As common as it seems today, fixing the pitch of a vocal performance was long a holy grail for music technology researchers.

But creative musicians and producers always find the artistic applications of any new technology.

Humanoid is just one tool for exploring the possibilities that come with phase vocoders and pitch correction.

Now that you have an idea of how they work, get back to your DAW and manipulate some vocals.