Slovenian voice over by MK presents
Voice Over Audio Guide - Basics

VOICE OVER AUDIO GUIDE:
What are WAV, MP3, sample rate, bit rate, 32-bit, 16-bit, ...

VOICE OVER AUDIO FORMATS:
WAV, MP3 and FLAC

MP3 format

First of all, let’s deal with MP3s. They are not the best. My Slovenian Voice Over Audio Guide will give you some pointers to better understand this complex field. If you take away anything from here, remember that MP3s kinda suck. If you have even a passing interest in decent sound and audio fidelity, you’re going to want to avoid MP3s. Essentially, an MP3 (MPEG-1 Audio Layer III) is a lossy file that sacrifices audio resolution for small file size. It discards the parts of the signal that we as humans aren’t supposed to be able to hear. On the plus side, it can be read by just about any device on earth. So why don’t I use MP3 for my Slovenian voice over projects?

What are the cons here? We might not technically be able to ‘hear’ the bits that are cut out, but this compression can render the file thin, tinny, and lifeless, especially at low bit rates. Hardly anyone uses MP3s for serious audio work these days, and its creators have ended the format’s licensing program. Therefore, I do not use the MP3 format to deliver my Slovenian voice over work.

FLAC format

Now we dive into the FLAC format, where things get interesting. The “Free Lossless Audio Codec” pulls off an outstanding trick by allowing us to compress the file size down to approximately 60% of the original. It does this without losing any audio data at all. Not only is it open source, but it also preserves CD-quality audio in full (1,411 kbps when uncompressed) and supports higher-resolution audio as well, which is far more than an MP3, whose bit rate tops out at 320 kbps. This is the format you should use if you really care about your sound, but don’t want to commit to physical formats like CDs. This format is occasionally requested by clients needing Slovenian voice over services.
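The 1,411 kbps figure is the bit rate of uncompressed CD-quality stereo audio, and it follows directly from the sample rate, bit depth, and channel count. Here is a minimal sketch of that arithmetic; the function name is only illustrative, and the 60% factor is the rough estimate from the text above, not a guaranteed compression ratio.

```python
def pcm_bit_rate_kbps(sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Bit rate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


# CD-quality audio: 44.1 kHz, 16-bit, stereo
cd_kbps = pcm_bit_rate_kbps(44_100, 16, 2)
print(cd_kbps)  # 1411.2

# Rough FLAC estimate, assuming the ~60% figure quoted above
flac_estimate_kbps = cd_kbps * 0.6
print(round(flac_estimate_kbps))  # 847
```

The real FLAC ratio depends on the material, so treat the second number as a ballpark only.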

WAV format

The WAV format is equally common, and it is useful for anybody who wants decent audio. Essentially, WAVs (Waveform Audio File Format) are higher resolution audio files that almost always contain uncompressed audio. Technically speaking, a WAV file is a representation of a piece of audio encoded with something known as Pulse Code Modulation (PCM). This is a way of taking the analog audio and converting it into digital so that it has a sample rate and bit depth. In everyday use, WAV and PCM are often treated as interchangeable terms, and both refer to a high-quality audio file. In performing Slovenian voice overs I almost exclusively use the WAV format. FLAC is used by some streaming services, whereas you’ll normally find WAVs and MP3s on common hard drives.
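To make the idea of a WAV file as PCM data with a sample rate and bit depth more concrete, here is a small sketch using Python's standard `wave` module. It writes one second of a 440 Hz test tone into memory and reads the header back. The tone, the constants, and the function name are assumptions chosen for illustration, not part of my actual workflow.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 48_000   # samples per second
BIT_DEPTH = 16         # bits per sample
TONE_HZ = 440.0        # hypothetical test tone
DURATION_S = 1.0


def make_sine_wav() -> bytes:
    """Return the bytes of a mono 16-bit PCM WAV file holding a sine tone."""
    num_frames = int(SAMPLE_RATE * DURATION_S)
    peak = 2 ** (BIT_DEPTH - 1) - 1  # 32767 for 16-bit audio
    frames = b"".join(
        # "<h" packs one little-endian signed 16-bit sample
        struct.pack("<h", round(0.5 * peak * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE)))
        for n in range(num_frames)
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_out:
        wav_out.setnchannels(1)
        wav_out.setsampwidth(BIT_DEPTH // 8)  # sample width in bytes
        wav_out.setframerate(SAMPLE_RATE)
        wav_out.writeframes(frames)
    return buffer.getvalue()


wav_bytes = make_sine_wav()
with wave.open(io.BytesIO(wav_bytes), "rb") as wav_in:
    params = wav_in.getparams()

print(params.framerate, params.sampwidth * 8, params.nchannels, params.nframes)
```

The file is barely larger than the raw samples it contains, which is what "uncompressed" means in practice.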

SAMPLE RATE AND BIT DEPTH:
VOICE OVER SAMPLE RATE INTRO

Knowing about sample rates is one of the cornerstones of knowledge for anyone who dabbles in audio. Therefore this chapter takes up the most space in this Slovenian voice over audio guide. Sound waves are converted into data through a series of snapshot measurements, or samples. A sample is taken at a particular time in the audio wave, recording the amplitude. This information is then converted into binary data.

The system makes thousands of measurements per second. If we can take tons of measurements very quickly with enough possible amplitude values, we can effectively use these snapshots to reconstruct the shape of the analog wave.
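The snapshot idea can be sketched in a few lines of Python. The example below "samples" an ideal 1 kHz sine wave at 48 kHz, which gives 48 measurements for one full cycle. The function name and the chosen frequencies are only illustrative assumptions.

```python
import math


def sample_sine(freq_hz: float, sample_rate_hz: int, num_samples: int) -> list[float]:
    """Measure the amplitude of an ideal sine wave at evenly spaced points in time."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(num_samples)
    ]


# One full cycle of a 1 kHz tone at 48 kHz takes 48 samples.
samples = sample_sine(1_000, 48_000, 48)
print(len(samples))            # 48
print(round(samples[12], 6))   # 1.0, the positive peak a quarter of the way through
```

A real converter would then round each of these amplitudes to the nearest value allowed by the bit depth.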

kiloHertz (kHz)

The recording system takes these measurements at a speed called the sample rate, measured in kilohertz. In most digital audio workstations – or DAWs – you’ll find an adjustable sample rate in your audio preferences. This controls the sample rate for audio in your project.

The options you see in the average DAW (44.1 kHz, 48 kHz, and so on) may seem a bit random, but they most certainly are not. You will soon understand why I use the 48 kHz sample rate in my Slovene voice over work. The sample rate determines the range of frequencies captured in digital audio. For example, consider a sine wave:

[Figure: Sine Wave Frequency]

In order to measure the frequency of this sine wave, we need to be able to detect and define one cycle. One complete cycle of any wave contains a positive and a negative stage. To know the length of this cycle (the period, which gives us the wave’s frequency), we need to detect both of these stages. Therefore, we need to measure the wave at least two times per full cycle to accurately capture its frequency. (Now we will get into a bit more technical stuff in my Slovenian Voice Over Audio Guide):

Nyquist rate and frequency

What this means is that we can capture and reconstruct the original sine wave’s frequency with a sample rate that is at least twice its frequency, a rate that is named the Nyquist rate. Conversely, a system can capture and recreate frequencies up to half the sample rate, a limit that is named the Nyquist frequency.

The Nyquist frequency is the bandwidth of a sampled signal, and is equal to half the sampling frequency of that signal. If the sampled signal represents a continuous spectral range starting at 0 Hz, the Nyquist frequency is the highest frequency that the sampled signal can represent.
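Both definitions boil down to a factor of two, as this small sketch shows. The function names are only illustrative.

```python
def nyquist_frequency(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def nyquist_rate(highest_freq_hz: float) -> float:
    """Lowest sample rate needed to capture a given highest frequency."""
    return 2 * highest_freq_hz


print(nyquist_frequency(44_100))  # 22050.0
print(nyquist_frequency(48_000))  # 24000.0
print(nyquist_rate(20_000))       # 40000
```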

Aliasing

Signal content above the Nyquist frequency is not recorded properly by analog-to-digital converters (ADCs). It is mirrored back across the Nyquist frequency and introduces artificial frequencies, in a process that is named aliasing.
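The mirroring can be checked numerically. In the sketch below, a 30 kHz tone sampled at 48 kHz lands on 18 kHz, which is the same distance below the 24 kHz Nyquist frequency as 30 kHz is above it. The function names are illustrative, and the tones are ideal cosines rather than real recordings.

```python
import math


def alias_frequency(freq_hz: float, sample_rate_hz: float) -> float:
    """Frequency that a tone appears at after sampling (folded into 0..Nyquist)."""
    return abs(freq_hz - sample_rate_hz * round(freq_hz / sample_rate_hz))


def sample_cosine(freq_hz: float, sample_rate_hz: float, num_samples: int) -> list[float]:
    """Sample an ideal cosine tone at evenly spaced points in time."""
    return [
        math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(num_samples)
    ]


print(alias_frequency(30_000, 48_000))  # 18000.0

# Once sampled, the 30 kHz tone is indistinguishable from an 18 kHz tone.
too_high = sample_cosine(30_000, 48_000, 48)
mirrored = sample_cosine(18_000, 48_000, 48)
largest_difference = max(abs(a - b) for a, b in zip(too_high, mirrored))
print(largest_difference < 1e-9)  # True
```

This is why the unwanted content has to be filtered out before the converter: afterwards, there is no way to tell the two tones apart.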

In order to prevent aliasing, analog-to-digital converters are often preceded by low-pass filters. The low-pass filters eliminate frequencies above the Nyquist frequency before audio reaches the converter. This prevents unwanted super high frequencies in the original audio from causing aliasing. Early filters could taint the audio, but this problem is being minimized as better technology is introduced. Hopefully this subchapter, containing quite a bit of technical jargon, didn’t confuse you too much and you are still reading my Slovenian Voice Over Audio Guide. If so, kudos 🙂 Let’s now move on to the sample rate of 44.1 kHz.

SAMPLE RATE OF 44.1 kHz

The standard, most commonly used sample rate is 44.1 kHz. A sample rate of 44.1 kHz can accurately represent frequencies of up to 22.05 kHz. Humans with great hearing can hear up to 20 kHz. This tells us that a sample rate of 44.1 kHz is perfectly adequate to record music and voice overs. 44.1 kHz also eats up less storage on your computer than higher sample rates. Some people insist they can hear improvements in audio recorded at higher sample rates. The science doesn’t really support these claims. But still, most work in the voice over field is done using the 48 kHz sample rate.
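To put the storage point into numbers, here is a rough sketch that computes the size of one minute of uncompressed PCM audio. The function name and the choice of 24-bit mono are assumptions for illustration, and file header overhead is ignored.

```python
def pcm_bytes_per_minute(sample_rate_hz: int, bit_depth: int, channels: int = 1) -> int:
    """Size of one minute of uncompressed PCM audio in bytes (header not included)."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * 60


for rate in (44_100, 48_000, 96_000):
    size_mb = pcm_bytes_per_minute(rate, 24) / 1_000_000
    print(f"{rate} Hz, 24-bit mono: {size_mb:.2f} MB per minute")
```

Storage grows in direct proportion to the sample rate, so 96 kHz takes exactly twice the space of 48 kHz.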

So, humans can hear frequencies between 20 Hz and 20 kHz. Most people lose their ability to hear upper frequencies over the course of their lives and can only hear frequencies up to 15 kHz–18 kHz. 

The computer should be able to recreate waves with frequencies up to 20 kHz in order to preserve everything we can hear. Therefore, a sample rate of 40 kHz should be the best to use, right?

Not quite. Real-world low-pass filters cannot cut off instantly at 20 kHz; they need some room to roll off. The sample rate of 44.1 kHz technically allows for audio at frequencies up to 22.05 kHz to be recorded. By placing the Nyquist frequency just outside of our hearing range, we can use more moderate filters to eliminate aliasing without much audible effect.
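The room that the filter has to work with can be expressed as the gap between the top of human hearing and the Nyquist frequency. Here is a minimal sketch, assuming a 20 kHz hearing limit as stated above; the names are illustrative.

```python
AUDIBLE_LIMIT_HZ = 20_000  # assumed upper limit of human hearing


def guard_band_hz(sample_rate_hz: float, audible_limit_hz: float = AUDIBLE_LIMIT_HZ) -> float:
    """Gap between the audible limit and the Nyquist frequency."""
    return sample_rate_hz / 2 - audible_limit_hz


for rate in (40_000, 44_100, 48_000):
    print(rate, guard_band_hz(rate))
# 40000 0.0
# 44100 2050.0
# 48000 4000.0
```

At 40 kHz the filter would have no room at all, while 44.1 kHz leaves about 2 kHz and 48 kHz leaves 4 kHz.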

OTHER SAMPLE RATES - 48 kHz, 88.2 kHz, 96 kHz, etc.

While 44.1 kHz is an acceptable sample rate for consumer audio, there are cases in which higher sample rates are used, for example in the music and voice over industries. Some of these sample rates were introduced during the early days of digital audio, when anti-aliasing filters were expensive. Moving the Nyquist frequency even higher allows us to place the filter further and further out of human hearing, and therefore impact the audio even less.

48 kHz is another very common sample rate that we will mention in my Slovenian Voice Over Audio Guide. The higher sample rate technically leads to more measurements per second and a closer recreation of the original audio. 48 kHz is often used in “professional audio” contexts, such as professional voice over audio files, more than in music contexts. For instance, it’s the standard sample rate in audio for video. This sample rate moves the Nyquist frequency to 24 kHz, giving the filter more room above the audible range. Therefore this sample rate is the one frequently used in my Slovenian voice over projects.

Higher sample rates: 88.2 kHz, 96 kHz, 176.4 kHz, 192 kHz

Some engineers choose to work at even higher sample rates, which tend to be multiples of either 44.1 kHz or 48 kHz. Sample rates of 88.2 kHz, 96 kHz, 176.4 kHz, and 192 kHz result in higher Nyquist frequencies, meaning ultrasonic frequencies can be recorded and recreated. Low-pass filters then have less impact on the audible sound, and there are more samples per second, which proponents say results in a more high-definition recreation of the original audio.

In my Slovenian voice over projects I almost exclusively record in 48 kHz and deliver the work in WAV format. If a client explicitly requests it, the voice over file may be delivered in other specifications.

Quality matters! Therefore, by choosing my services for your Slovenian voice over project, high quality awaits you.

SAMPLE RATE AND BIT DEPTH:
BIT DEPTH INTRO

Bit depth dictates the number of possible amplitude values of one sample. Pulse-code modulation (PCM) is the standard form of digital audio in computers. The amplitude of an analog signal is sampled at regular intervals to create a digital representation of the sound source. The sampled amplitude is quantized to the nearest value within a given range. The number of values within this range is determined by the bit depth.
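The relationship between bit depth and the number of amplitude values can be sketched as follows. The function names are illustrative, the dynamic range formula is the usual theoretical approximation of about 6 dB per bit, and the quantizer assumes amplitudes scaled to the range -1.0 to 1.0.

```python
import math


def quantization_levels(bit_depth: int) -> int:
    """Number of distinct amplitude values available at a given bit depth."""
    return 2 ** bit_depth


def dynamic_range_db(bit_depth: int) -> float:
    """Theoretical dynamic range of fixed point PCM, roughly 6.02 dB per bit."""
    return 20 * math.log10(2 ** bit_depth)


def quantize(amplitude: float, bit_depth: int) -> int:
    """Round an amplitude in -1.0..1.0 to the nearest signed integer code."""
    max_code = 2 ** (bit_depth - 1) - 1
    min_code = -(2 ** (bit_depth - 1))
    return max(min_code, min(max_code, round(amplitude * max_code)))


for bits in (16, 24):
    print(bits, quantization_levels(bits), round(dynamic_range_db(bits), 2))
# 16 65536 96.33
# 24 16777216 144.49

print(quantize(1.0, 16))  # 32767, the largest 16-bit sample value
```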

32-Bit Floating Point Digital Audio

In your DAW, any signal that reaches above 0 dBFS is clipping, correct? Actually, this isn’t the case. Information above 0 dBFS isn’t lost until it’s truncated by your D/A converter or exported to a fixed point file format (such as 16 or 24-bit fixed point). Applying a 32-bit floating point limiter to the stereo bus in your DAW will prevent clipping from occurring, even if the signal running into the limiter is peaking well above 0 dBFS. This means all the individual tracks in your DAW can peak above 0 dBFS, as long as you apply a 32-bit limiter on your master bus. No, this isn’t going to make your music louder.

Exporting 32-bit files

When you export a 32-bit floating point file, the peaks above 0 dBFS are saved to the exported file. However, if you try to play back the file through your audio interface, your 24-bit fixed point D/A converter will cause the information above 0 dBFS to disappear. This doesn’t mean the information is missing from the digital file on your computer. It means the D/A converter just can’t reconstruct the digital signal above 0 dBFS in the analog realm.
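The difference between floating point and fixed point storage can be illustrated with a single sample value. In the sketch below, 1.0 stands for full scale, so a peak of 1.5 is about 3.5 dB over. The helper names are illustrative, and the fixed point conversion is a simplified model of what an export or a converter does.

```python
import struct


def to_float32(sample: float) -> float:
    """Round-trip a sample through 32-bit floating point storage."""
    return struct.unpack("<f", struct.pack("<f", sample))[0]


def to_int16(sample: float) -> int:
    """Convert a sample to 16-bit fixed point, clipping anything over full scale."""
    code = round(sample * 32767)
    return max(-32768, min(32767, code))


hot_peak = 1.5  # peak above full scale (1.0)

stored_float = to_float32(hot_peak)  # value survives: 1.5
stored_fixed = to_int16(hot_peak)    # clipped to full scale: 32767

# Turning the level down by half afterwards:
recovered_float = stored_float * 0.5           # 0.75, the original peak at half level
recovered_fixed = stored_fixed / 32767 * 0.5   # 0.5, the overshoot is gone for good

print(stored_float, stored_fixed)
print(recovered_float, recovered_fixed)
```

In the floating point case the peak comes back intact once the level is reduced, while the fixed point version stays flattened at full scale.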

You can send a 32-bit floating point file that’s peaking above 0 dBFS to your mastering engineer. They will be able to reduce the level of the file when they import it. I hope that my Slovenian Voice Over Audio Guide has shed some light on the nature of sound and audio. I wish you all the best in your future endeavours!

About Marko
Marko is a native Slovenian voice over actor working from a professional sound studio in Ljubljana. After many years of experience in theater and radio, Marko decided to start providing highest-quality voice over and dubbing services to clients all over the world.