exporting 44.1 mixes when recordings were at double or quad rates

2014/01/06 17:48:50

gswitz

Nika to me
The proper way of doing sample rate conversion involves low-pass filtering the material so that it all sits below the new Nyquist frequency prior to the sample rate conversion. This would eliminate any aliasing due to the conversion.

Assuming you used a good sample rate converter, there should be no aliasing in the result.

Me to Nika
So let me restate just to be clear...

I record a multi-track project at 96 kHz and bounce it in the box down to one stereo pair at 32 bit 96 kHz (the bit depth increases from 24 at time of recording to 32 for bounced tracks).

Now, to export, I'm reducing back to 24 bit 44.1. I'm using Sonar Cakewalk to export the audio. I'm guessing that the filter would be applied by Cakewalk and my audio interface would not matter. Do you agree with this?
Also, you are saying that Cakewalk would need to apply the filter to remove material above the new Nyquist frequency prior to exporting at the new sample rate.

I'm guessing the responsibility lies with Cakewalk to convert the data to the new sample rate, not my audio interface. Therefore, they should be able to answer the question about the filter.

Does it sound like I correctly understand you?

Nika to me
You have it down precisely. When Cakewalk does an "export," it is necessarily doing a sample rate conversion. In order to do that properly, it must filter any data above the new Nyquist frequency prior to eliminating the excess samples. (The way it actually converts from 96kS/s to 44.1kS/s is likely by upsampling to the least common multiple of the two sample rates, then applying the new Nyquist filter, and then removing the excess samples). That algorithm lies entirely within Cakewalk, and is not dependent upon your converters.
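As a rough illustration of the arithmetic behind that path (not a claim about how Cakewalk actually implements it), the sketch below works out the common intermediate rate and the integer up/down factors for 96 kHz to 44.1 kHz. The function name `resample_ratio` is made up for this example.

```python
from math import gcd

def resample_ratio(src_rate: int, dst_rate: int):
    """Return (up, down, intermediate_rate) for a rational sample rate conversion.

    Hypothetical helper for illustration only: upsample by `up`, low-pass
    below the new Nyquist frequency, then keep every `down`-th sample.
    """
    g = gcd(src_rate, dst_rate)
    up = dst_rate // g            # interpolation factor
    down = src_rate // g          # decimation factor
    intermediate = src_rate * up  # smallest rate both src and dst divide evenly
    return up, down, intermediate

up, down, intermediate = resample_ratio(96_000, 44_100)
new_nyquist = 44_100 / 2  # the filter must remove content above this
print(up, down, intermediate, new_nyquist)
```

For 96 kHz to 44.1 kHz this gives an upsample factor of 147, a downsample factor of 320, and an intermediate rate of 14,112,000 samples per second. Practical converters often avoid materializing that intermediate stream, but the ratio is the same.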

Me to Nika
Given that path, dither would not be required for a sample rate conversion where bit depth remains unchanged. Do you agree?

Nika to me
To the contrary, dither is required. But I want to make sure I'm clear about what that means.

Dither is required any time bit depth is reduced. When your 24 bit/96k file is upsampled and downsampled to 44.1k, a filter is used. That filter requires processing. That processing inherently invokes more bits (Cakewalk likely uses either 32 bit or 64 bit floating point processing to do its internal processing, including sample rate conversion). The downsampling to 44.1k thus requires a reduction in bit depth from the processing bit depth (32 bit or 64 bit floating point) back to 24 bit (fixed point). Thus, to get down to 24 bit/44.1kS/s, Cakewalk should be using dither as part of the process.

But that does not mean that you, the user, should be adding dither during that process. Cakewalk's software should have the dither functionality built in for those internal processing operations. You, the user, should only be adding dither when you reduce the operating bit depth of a file (as in, from 24 bit to 16 bit).

If Cakewalk gives you a dither "option" in its sample rate conversion utility, I'd be interested to know what that option is for. My guess is that the option is there only for when you are also reducing bit depth, as in going from, say, 24 bit/96k to 16 bit/44.1k.
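The float-to-24-bit step described above can be sketched roughly as follows. This is a minimal example of TPDF (triangular) dither before rounding, assuming samples are floats in [-1.0, 1.0); the function name `quantize_tpdf` is made up here, and nothing in it reflects Cakewalk's actual internals.

```python
import random

def quantize_tpdf(sample: float, bits: int = 24) -> int:
    """Quantize a float sample to a signed integer of `bits` bits with TPDF dither.

    Illustrative sketch only: the dither is the sum of two uniform random
    values, giving a triangular distribution spanning +/-1 LSB.
    """
    scale = 2 ** (bits - 1)
    dither = (random.random() - 0.5) + (random.random() - 0.5)
    value = round(sample * scale + dither)
    # Clip to the legal range of the fixed-point format.
    return max(-scale, min(scale - 1, value))

# A signal sitting at a quarter of one 24-bit LSB: plain rounding would
# always give 0, but the dithered result averages out to about 0.25 LSB.
random.seed(1)
quarter_lsb = 0.25 / 2 ** 23
trials = [quantize_tpdf(quarter_lsb) for _ in range(20000)]
print(sum(trials) / len(trials))
```

The point of the example is that the dithered output keeps information below one LSB as an average, at the cost of a small amount of added noise.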

2014/01/06 18:40:48

gswitz

This was my next note to Nika...

Me to Nika
When bouncing to tracks or when exporting there is a dither choice that looks like this...
http://stabilitynetwork.blob.core.windows.net/g-tunes/Screenshot_Dither.png

This is the bounce to tracks dialog; the export dialog has the same choices.

In Digital Audio Explained, page 259, you write, "The conclusion to be drawn is that colored dither can effectively be used to reduce the bit depth, but should only be used at the final processing stage. Colored dither and truncation to lower bit depths is the very last process that any signal should undergo prior to listening or pressing a CD or DVD or other distributable means. If any digital processing is still to happen, from EQ to compression to simple level changes, then colored dither should not be used and TPDF, white noise dither should be used to have the least audible effect."

So, since the audio is converted up to the 64-bit processing engine in Sonar (you can see the check box in the image), there will be a reduction in bit depth afterwards that justifies dither. My guess is that when working at 44.1 or 48 it would be appropriate to avoid the Pow-r choices and stick to Rectangular or Triangular when bouncing internally, and save the Pow-r 3 setting for the final export for distribution. That avoids colored dither where one might re-amplify the color in audible frequencies using EQ in the next mixing/mastering stage.

My question is this...
At 96 kS/s rates, shaped noise would put the noise well above the audible range and would just get removed by later processing. If your recordings are at double or quad rates, would it be appropriate to use the Pow-r 3 Dithering setting for bouncing as well as for exporting (where you presumably reduce Sample Rate to 44.1)?

Nika to Me
I agree with you.

When taking a high sample rate project and doing a sample rate conversion/bounce to disk to a lower bit depth, where no further processing will take place, a noise-shaped dither algorithm, such as POW-r, is best.

2014/01/06 18:59:22

drewfx1

OK good, thanks.

The reason for the dither is that it's being assumed that you're doing a bit depth reduction alongside the SRC. This will generally be true unless you are exporting to a 64bit or 32bit floating point file.

In terms of dither type, the "problem" with noise shaping is that if repeatedly applied, it could concentrate lots of energy in a narrow frequency band and that could potentially cause problems.

Personally I would argue that in reality when dithering down to 24bit (or above), the dither level is so low that it's not really going to matter what you do. But since the level is so low, there's certainly zero harm in using simple rectangular dither if someone wishes to argue that this is a best practice.
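To put a rough number on "so low", here is a small sketch that computes the level of one least significant bit relative to full scale for a signed fixed-point format. It is only a ballpark for where dither sits, not an exact noise measurement, and the helper name `lsb_level_dbfs` is made up for this example.

```python
from math import log10

def lsb_level_dbfs(bits: int) -> float:
    """Level of one LSB relative to full scale, in dB, for signed `bits`-bit audio.

    Illustrative helper: full scale is taken as 2**(bits - 1) steps.
    """
    return 20 * log10(1 / 2 ** (bits - 1))

for bits in (16, 24):
    print(bits, round(lsb_level_dbfs(bits), 1))
```

One LSB at 24 bit comes out around -138 dB relative to full scale, versus around -90 dB at 16 bit.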

2014/01/06 19:28:43

gswitz

Thanks, DrewFX1. I'm with you that it doesn't matter much.

This thread started because I saw frequencies above the audible range in the RME DigiCheck and I wondered, for the first time, about a need to filter. As you aptly pointed out, this was part of the SRC.

During that, Dither was brought up and since Nika was already communicating with me, I thought I'd clear up something that had been bugging me a little.

For me, I normally just leave Dither on Pow-r 3 all the time and never re-visit it. I saw in Digital Audio Explained that it was a bad idea to use shaped noise prior to the final export. Then I thought about the noise being above the audible range when working at 96 kHz and wondered if that might be preferable to rectangular.

I feel very lucky to learn this stuff. Also, I am able to understand Nika b/c of the book. He's got a gift for teaching.

drewfx1: "This will generally be true unless you are exporting to a 64bit or 32bit floating point file."

Also, Drew, when exporting, don't you export to fixed point files, not floating point? Not trying to pick nits... :-)

2014/01/06 20:05:31

drewfx1

I use Sound Forge and other programs as well and export to 32bit floating point for this purpose. Also, when using an external mp3 encoder, you want to export to 32bit floating point.

IOW, 32bit floating point is an interim file format to be input into another program, not a final destination format.

2014/01/06 20:31:05

gswitz

Schooling me again!

Can you export floating point from Sonar? I don't see the option.

2014/01/06 20:52:47

drewfx1

If you select 32 or 64 bit, it's floating point.

2014/01/06 21:01:58

gswitz

Ok! Thanks!

2014/01/06 21:13:12

mikedocy

http://src.infinitewave.ca/

That is a site that compares the output spectra of many different sample rate converters (SRC).
Use "Sweep" for the test result. The best SRC is the one that makes a curved white line and nothing else.
Any other lines besides the single curved white line would be aliasing.
Sonar's is pretty good according to the graph.

About the sweep mode from the help file:
Swept sine wave with -6 dBFS peak amplitude, spanning the frequency range from 0 to 48 kHz for 8 seconds. As a result, the spectrograms of converted signals can be drawn. They allow identification of non-linear distortions introduced into the signal and aliasing. The dynamic range of this spectrogram is 180dB.
Before the 5 second mark, the tone is in the audible frequency range, so the level of harmonics and distortions in the left part of the spectrogram shows how signals in the audible range are distorted at different frequencies. After 5.5 seconds, the input tone goes above 22 kHz and cannot be represented in a 44.1 kHz format. So, ideally, it should be suppressed by the low-pass filter.
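To see why the tone above 22 kHz has to be suppressed, here is a minimal sketch of where a pure tone would land if samples were simply dropped with no low-pass filter first. The function name `alias_frequency` is made up for this example, and it assumes an ideal sine tone.

```python
def alias_frequency(f_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after sampling at `sample_rate_hz`.

    Illustrative only: tones above the Nyquist frequency (half the sample
    rate) fold back into the range 0..Nyquist.
    """
    f = f_hz % sample_rate_hz
    return min(f, sample_rate_hz - f)

for tone in (20_000.0, 30_000.0, 48_000.0):
    print(tone, alias_frequency(tone, 44_100.0))
```

A 20 kHz tone stays at 20 kHz, but an unfiltered 30 kHz tone would show up at 14.1 kHz and a 48 kHz tone at 3.9 kHz, both squarely in the audible range. Those folded tones are the extra lines on the sweep graphs.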

2014/01/06 21:32:39

gswitz

Awesome, Mikedocy! Thank you!!
