• SONAR
  • 64 bit engine?
2013/12/14 16:03:01
mixmkr
However, I always wondered if after people heard what dithering did to multiple low-level examples, it would train their ears sufficiently so they could learn to recognize the difference at normal listening levels. The ability of the ear to "learn" extremely subtle gradations would explain why some people hear very subtle audio cues while others don't.
In the same way, you can highlight an instrument at the beginning of a song...then drop it a gazillion dB or more later on, and it is still clearly heard, as long as it's playing the same or similar part with the same tonality.
2013/12/14 18:03:46
drewfx1
Anderton
drewfx1
Unless someone can demonstrate that it's ever even borderline audible through some objective testing, I would say that there was some marketing going on.



It depends upon what you compare it to. When compared to a 16-bit fixed audio engine, you don't have to do too much DSP to hear an obvious, audible difference. With a 24-bit fixed engine, you have to work a lot harder to create a project where you can hear a difference. It is possible, but the project wouldn't have much relationship to real-world projects...unless your music consists of solo acoustic instruments recorded in isolation with noiseless mics, then bounced multiple times through precision reverbs and played back at really loud levels.

 
Well, here we were comparing calculations done using 32 bit single precision floating point to 64 bit double precision floating point.
 
In terms of marketing, I too remember the days when we avoided at all costs any processing that wasn't absolutely necessary out of fear of audible damage from calculations being done at lower bit depths. And I agree that when CW introduced the 64 bit engine, we were not far removed from those days.
 
Personally, as I've expressed in various ways, I find some of CW's historical wording regarding the 64-bit engine, shall we say, "unfortunate". But as a long-time enthusiastic user of CW products, I put this in the context of a company I otherwise have great respect for. 
 
I put equal (or more) blame on individuals' inclination to ignore basic questions of context: "There are errors? OK, how loud are they under typical conditions?" 
 
And no one ever seems to ask under what conditions a given problem is minimized or exacerbated. 
 
For some reason, when it comes to audio, people want to believe that any artifact must be audible under all conditions if they just listen for it, but the real world just doesn't work that way. And intelligent people who profess themselves to be "skeptics" will sometimes readily accept all claims from one side without even trivial doubt, yet demand endless proof that the other side has dotted every "i" and crossed every "t", without ever providing any contrary evidence of their own.
 
 
I agree that one could create a laboratory project with the express intent of making 32-bit errors audible, but for real-world usage I've never seen a shred of objective evidence that it's even close to making a difference. 
 
Mathematically, the size of the errors relative to the signal depends on the bit depth the calculations are done with, the number of calculations performed, and how the errors accumulate given the nature of the calculations being done. With 32-bit floating point you are starting from a point far below audibility, and in mixing I will assert that the errors are typically distributed fairly randomly. Therefore you need to do lots and lots of calculations before the errors could accumulate enough to be worth worrying about. 
 
The math part is not really open to debate. But I would be quite interested in someone presenting objective evidence suggesting that the number of calculations under the 32-bit mix engine is sufficient to make the errors audible, or that when mixing real-world signals the errors might accumulate unusually rapidly to the point of being a problem.
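 
To put a rough number on that intuition, here is a minimal sketch (mine, not anyone's engine code; it assumes numpy) that runs a signal through pairs of counteracting gain changes - a no-op in exact arithmetic - and measures how far the float32 and float64 results drift:

```python
# Apply N pairs of counteracting gain changes (-3 dB then +3 dB), which in
# exact arithmetic would leave the signal untouched, and measure the drift.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 100_000)            # stand-in "signal": uniform noise

def drift_db(dtype, n_pairs):
    y = x.astype(dtype)
    g_down = dtype(10.0 ** (-3.0 / 20.0))      # -3 dB as a linear factor
    g_up = dtype(10.0 ** (3.0 / 20.0))         # +3 dB as a linear factor
    for _ in range(n_pairs):
        y = (y * g_down) * g_up
    err = y.astype(np.float64) - x
    return 20 * np.log10(np.sqrt(np.mean(err ** 2)))   # RMS error, dB re full scale

for n in (1, 10, 100, 1000):
    print(f"{n:4d} gain pairs: float32 {drift_db(np.float32, n):7.1f} dB, "
          f"float64 {drift_db(np.float64, n):7.1f} dB")
```

The float32 drift starts somewhere around -145 dB and grows roughly with the square root of the number of operations, so even a thousand gain pairs only brings it up to around -110 dB; the float64 drift stays a couple hundred dB below that.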
 

Again, to draw a comparison to dithering: I did a mastering seminar where I reduced the signal level dramatically and did comparisons with and without dithering. The difference was totally obvious, but only because the signal level was so low you could really hear what was happening with those least significant bits. People couldn't tell the difference at "normal" listening levels.
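 
For anyone who wants to recreate that kind of demo, here is a minimal sketch (assuming numpy; the tone and levels are my own choices, not the ones from the seminar): a 1 kHz sine at -100 dBFS, i.e. about a third of one 16-bit step. Plain rounding erases it entirely, while TPDF dither lets it ride through on the noise floor:

```python
import numpy as np

fs = 48_000
t = np.arange(fs) / fs
x = 1e-5 * np.sin(2 * np.pi * 1000 * t)        # 1 kHz at -100 dBFS, ~0.33 of one 16-bit step

scale = 2.0 ** 15                               # one 16-bit step = 1/scale
rng = np.random.default_rng(0)
tpdf = rng.random(fs) + rng.random(fs) - 1.0    # triangular (TPDF) dither, +/- 1 step

undithered = np.round(x * scale) / scale        # every sample rounds to zero: tone erased
dithered = np.round(x * scale + tpdf) / scale   # tone survives inside the dither noise

def tone_dbfs(y, f=1000):
    # amplitude of the f Hz FFT bin (bins are exactly 1 Hz apart for a 1 s file)
    amp = 2 * np.abs(np.fft.rfft(y)[f]) / len(y)
    return 20 * np.log10(amp + 1e-30)

print(f"undithered 1 kHz level: {tone_dbfs(undithered):7.1f} dBFS")   # effectively gone
print(f"dithered   1 kHz level: {tone_dbfs(dithered):7.1f} dBFS")     # ~ -100 dBFS
```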
 
However, I always wondered if after people heard what dithering did to multiple low-level examples, it would train their ears sufficiently so they could learn to recognize the difference at normal listening levels. The ability of the ear to "learn" extremely subtle gradations would explain why some people hear very subtle audio cues while others don't.



If the dither/quantization error at a normal listening level is below the absolute threshold of hearing, or is sufficiently masked by background noise and the audio itself, it will be inaudible. This is commonly the case for 16-bit audio, but you can certainly find (or create) conditions where it is audible.
 
In borderline cases, my understanding is that training listeners on what to listen for can make a very significant difference. And that, aside from hearing loss, training or "knowing how to listen" is the primary difference between individuals' ability to hear things or not - i.e. it's not based on anyone having naturally superior hearing or anything like that. So it wouldn't surprise me if, as you suggest, some people have learned to hear details that escape others of us, but are still within the physiological limits of our hearing.
 
But I would also assert that it's often not all that difficult to differentiate between "conceivably borderline" cases and "below the physiological limits of human hearing" cases for listening levels that don't cause permanent hearing damage in the short period of time before you blow your speakers.
2013/12/14 22:03:00
Goddard
Well this may be of some interest... (especially Reference [1])
 
http://pure.ltu.se/portal/en/studentthesis/rounding-errors-in-floating-point-audio%286cd6adc7-83c9-4208-ad06-06e105892cc1%29.html
 
As some people claim not to have time to read stuff I post links to (yet still find time to post at length?), some selected highlights:

Rounding errors in floating point audio:
Investigating the effects of rounding errors on the fixed point output format of a simulated digital audio chain, using fixed point input, and floating point intermediate storage
 
Erik Grundström
2013
Bachelor of Arts
Audio Engineering
Luleå University of Technology
...
 
1. Introduction
...
There are claims of increased audio quality from using 64 bits in the marketing material of some DAW [1] and plug-in manufacturers, while others claim there is no increase in audio quality [2][3]. The scientific literature on this subject is, however, extremely scarce [4]. This means that the subject should be systematically investigated, since it is important for the audio engineer in deciding what equipment to use and also for the design engineer when new audio products are developed, both software and hardware.

1.1 Research question
Will the use of a 64 bit floating point intermediate signal chain produce less deviation from an original fixed point audio file than a 32 bit floating point signal chain after requantization to the original fixed point format?
...
2.2 Digital signal processing
Since digital audio consists of a series of binary values at equal increments of time, processing in the digital domain means that mathematical operations are applied to these values. For instance, to change the level of audio in the digital domain, a multiplication is carried out on every sample. This is one of the simplest types of processing that can be carried out, since it processes each sample value independently. Therefore, the result of the process is not dependent on the surrounding sample values. More advanced processing may use several consecutive samples in its process and thus each sample's value is not independent after the processing. Some processes might even be recursive, meaning that the output of the processor is also passed back to its input.

What commonly happens when processing is carried out is that the word length necessary to describe the result of the mathematical operation becomes longer than the word length of the original audio data. After each operation the result must be requantized to the word length of the intermediate container format. This requantization is likely to cause rounding errors. It is possible that these rounding errors will then be compounded by subsequent operations and requantizations. This is, however, not a given, since it is possible that the rounding errors will balance out if their sign is random. This cannot be controlled, as that would require that all parameters of both the signal and the processing be known when the algorithm is developed [15]. It is quite obvious that this is not feasible for an audio processing unit or DAW.
...
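 
To make the word-length growth described above concrete, here is a small sketch (mine, not from the thesis; it assumes numpy) of a single gain multiplication on a single 24-bit sample value, stored in a float32 vs a float64 intermediate format:

```python
# One 24-bit sample times one -3 dB gain coefficient. The exact product needs
# more mantissa bits than float32 has, so storing it in a float32 intermediate
# format forces a rounding step.
import numpy as np

sample = 8_388_607 / 8_388_608            # largest positive 24-bit value, as a fraction of full scale
gain = 10.0 ** (-3.0 / 20.0)              # -3 dB as a linear factor

as_f64 = sample * gain                                   # float64 intermediate (Python floats are doubles)
as_f32 = float(np.float32(sample) * np.float32(gain))    # float32 intermediate

step24 = 1.0 / 2 ** 23                    # one 24-bit quantization step
err = as_f64 - as_f32
print(f"float64 result: {as_f64!r}")
print(f"float32 result: {as_f32!r}")
print(f"difference: {err:.2e} = {err / step24:+.2f} 24-bit steps")
```

One such rounding is only a fraction of a 24-bit step; the thesis' gain chains show how repeated requantization lets those fractions add up to whole steps.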
 
3. Method
In order to answer the research question the following has been done:
 
• Generate test files
- One file consisting of all possible sample values in a 16 bit fixed point wav file
- One file consisting of all possible sample values in a 24 bit fixed point wav file
• Simulate a digital signal chain using floating point intermediate format in:
- 32 bit floating point
- 64 bit floating point
• A comparison program has been written that reads samples from the original and the processed file and compares them. The program then prints out the number of differences between the two, the maximum difference, the mean of these differences, and the cumulative deviation.

The simulated audio chain has been made in 6 different versions.
 
1. Converts the fixed point audio data to the two floating point formats and back, and writes a new .wav file with the resulting values.
2. Attempts to provoke differences with an extreme gain change of -700 dB and then +700 dB.
3. Uses a more realistic gain processing of -3 dB and +3 dB.
4. Applies an additional stage, changing gain by -3 dB, +3 dB, -8 dB and +8 dB.
5. Is the same as 4 but adds -16 dB, +16 dB, -2 dB and +2 dB gain processes.
6. Is the same as 5 but adds -22 dB, +22 dB, -12 dB, +12 dB, -25 dB, +25 dB, -27 dB and +27 dB gain processes.

All of the above audio chains should, in theory, not produce any deviation from the original. However, due to rounding errors in the requantization after each gain change, differences may occur.

To approximate the effect any error would have on real music signals, 6 additional test files were generated using random numbers.
• File 1: 16-bit white noise, i.e. uniform random numbers
• File 2: 24-bit white noise
• File 3: 16-bit random numbers from a Gaussian probability density function
• File 4: 24-bit random numbers from a Gaussian probability density function
• File 5: 16-bit random numbers from a Laplacian probability density function
• File 6: 24-bit random numbers from a Laplacian probability density function
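 
For the curious, here is a rough sketch (my reconstruction, not the thesis' actual program; it assumes numpy) along the lines of chain version 3 applied to the 24-bit ramp file:

```python
# A 24-bit ramp through -3 dB then +3 dB in a float32 vs a float64
# intermediate format, requantized back to 24-bit fixed and compared.
import numpy as np

ramp = np.arange(-2**23, 2**23, 16, dtype=np.int64)    # every 16th 24-bit value (the full ramp works too, just heavier)

def run_chain(dtype):
    y = (ramp / 2.0 ** 23).astype(dtype)               # 24-bit fixed -> float (lossless in both formats)
    for db in (-3.0, 3.0):
        y = y * dtype(10.0 ** (db / 20.0))             # gain change, result requantized to the intermediate format
    back = np.round(y.astype(np.float64) * 2.0 ** 23).astype(np.int64)   # requantize to 24-bit fixed
    diff = np.abs(back - ramp)
    print(f"{np.dtype(dtype).name}: {np.count_nonzero(diff)} of {ramp.size} samples deviate, "
          f"max {diff.max()} steps")

run_chain(np.float32)   # expect a large share of 1-step deviations
run_chain(np.float64)   # expect zero deviations
```

As the Discussion below reports, the 64-bit chain comes back bit-identical for 24-bit material while the 32-bit chain shows step-level deviations.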
 
5. Discussion
5.1 Ramp file testing
...
If the conversion is transparent and the input and output are in 16-bit fixed point, a 32-bit and a 64-bit intermediate format will not produce any deviations from the original and thus the audio chain will be transparent. If the input and output are instead 24-bit fixed point, the 64-bit intermediate format will not introduce deviations from the original, but the 32-bit will. The percentage of deviations will increase with the square root of the number of calculations, as is seen in fig. 30, fig. 31, fig. 32 and fig. 33. The fact that the deviations seem to increase in such a predictable manner is important for the design engineer, as this allows him/her to weigh these errors against the additional memory the audio will allocate in the primary memory of the computer. Perhaps these errors may be deemed acceptable for some calculations in memory intensive tasks. Not only does the number of deviations increase with the number of calculations, but they also appear to grow in magnitude, as both the maximum deviation and the mean deviation are increasing.
...
 
5.4 Practical implications
While it was stated in section 1.2, Purpose and Limitations, that whether any detected differences were audible would not be treated, a small discussion on this subject may be appropriate.

The largest deviation from the original in the results section is 4 quantization levels in 24-bit output (see Table 16 in section 4.1.2). If it is assumed that this deviation is not correlated with the signal, this would result in noise at -132 dBFS. This is beyond the dynamic range of human hearing [5, p. 70] and thus it is unlikely to be audible. It is however possible to encode audio data into storage formats that require extreme bit transparency throughout the distribution or signal chain for successful decoding. This kind of packaged data could thus be severely impacted if it were converted to 32-bit floating point and processed for whatever reason. This could cause unexpected noises and distortions of the audio or, even worse, complete failure to decode the data. Thus, when bit transparency is of the highest priority, it is highly recommended to use 64-bit floating point over 32-bit in the intermediate signal chain if the original fixed point format cannot be used.
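 
The thesis doesn't show the calculation behind that -132 dBFS figure, but a back-of-envelope under a uniform-noise assumption (my assumption, not the author's) lands in the same neighborhood:

```python
import math

step = 1 / 2 ** 23                              # one quantization step of 24-bit fixed point (full scale = 1.0)
peak = 4 * step                                 # the worst-case deviation reported in Table 16

peak_db = 20 * math.log10(peak)                 # peak level of the error itself
rms_db = 20 * math.log10(peak / math.sqrt(3))   # RMS if the error is uniform over +/- 4 steps

print(f"peak: {peak_db:.1f} dBFS, RMS if uniform: {rms_db:.1f} dBFS")   # ~ -126 and ~ -131 dBFS
```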
...
Note also that the use of 64 bit floating point for all multiplication coefficients in this thesis results in an implicit conversion of 32 bit data to 64 bit during the calculation. The ecological validity of this is debatable. It would be logical to use 32 bit formats for processing coefficients if the intermediate format too is 32 bit. This may cause even more deviations from the intended results since there will be less precision in the number representation during the calculation and thus more, and larger, rounding errors may occur.
 
5.5 Reliability and validity
...
The ecological validity may, in part, be debatable. The use of 64 bits in the multiplication coefficients in the gain processes may not be applicable to real world scenarios when the intermediate audio format is 32 bits. It is also highly unlikely that an engineer would change the gain of an audio signal in one direction and then restore the gain at a later stage, at least not before any additional processing has taken place. It is not likely that an engineer would use counteracting processes that, in theory, would produce an output that is identical to the input. Furthermore it is unreasonable that an engineer would use such extreme level changes as 700 dB. The results, however, do show similar rounding errors for this as a more realistic level change of 3 dB, and thus this unrealistic scenario does not invalidate the results. Gain adjustments in general however, are very common in audio production. It is, in fact, likely the most common processing of all, and thus the study does show some strong ecological validity in this context.
...
6. Future research
This thesis has just barely scratched the surface of this topic. Therefore, based on its results, a number of questions can be recommended for future research.
 
• Analyze what sample values are actually changed by the processing. Are there values that are more likely to be changed by processing and re-quantization than others?
• ...is there a relation between what processing is applied and what sample values are changed?
• How will the word length of the intermediate format affect more advanced algorithms?...
• Further investigate the relation between the number of calculations and deviations from the original file. This research could be done similarly to this thesis but use a greater number of calculations.
...
• Investigate whether these deviations would be audible and, if so, how many steps of processing are required before they become audible.
 
9. References
[1] http://www.cakewalk.com/Products/feature.aspx/SONAR-Core-Technology-and-64-bitDouble-Precision-Engine
 


Mixing (summing) of multiple streams was unfortunately not considered, only int -> fp -> int format conversion (casting) and reciprocal gain alterations (multiplies, always performed with 'doubles') on individual streams.
 
Still, noteworthy that the processed 16-bit streams always nulled, even when using floats, whereas the 24-bit streams never nulled when floats were used. Hmm, maybe double precision does matter. Good to hear the bug's been squashed.
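 
Since summing wasn't tested, here is a minimal sketch (assuming numpy; the track count and the naive running-sum bus are my own stand-ins) of roughly what that missing experiment might look like:

```python
# Sum 256 float32 "tracks" with a naive running-sum mix bus, once with a
# float32 accumulator and once with a float64 accumulator, and compare both
# against a float64 reference sum.
import numpy as np

rng = np.random.default_rng(0)
tracks = rng.uniform(-1, 1, size=(256, 10_000)).astype(np.float32)

ref = tracks.astype(np.float64).sum(axis=0)     # reference: float64 pairwise summation

def mix_bus(dtype):
    acc = np.zeros(tracks.shape[1], dtype=dtype)
    for tr in tracks:                           # one add per track, like a running mix bus
        acc = acc + tr.astype(dtype)
    return acc

for dtype in (np.float32, np.float64):
    err = mix_bus(dtype).astype(np.float64) - ref
    rms_db = 20 * np.log10(np.sqrt(np.mean(err ** 2)) + 1e-300)
    print(f"{np.dtype(dtype).name} accumulator: RMS error {rms_db:7.1f} dB re full scale")
```

The two accumulators typically land well over 100 dB apart; whether the float32 figure would ever matter is, of course, exactly the audibility question being argued above.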
 
Hey, that Reference [1] cite calls to mind a past forum post:
 
Seth Perlstein |Cakewalk|
Rain
"There’s a reason SONAR just sounds better. SONAR's industry-first, end-to-end, 64-bit double precision floating point mix engine allows you to mix with sonic clarity using a suite of versatile effects, powerful mixing tools, and endless routing possibilities."
http://www.cakewalk.com/P...ouble-Precision-Engine

Sorry, couldn't resist...

 
Yes, a 64-bit double precision audio engine will sound better than a 32-bit float, 24-bit, etc. engine. It can be proved mathematically that there will be fewer rounding errors in the summing with a 64-bit audio engine vs. others.
 
Comparing 64-bit audio engine to 64-bit audio engine, I doubt there would be a difference.

SP

http://forum.cakewalk.com/Sound-Quality-of-Sonar-X1-m2507939-p13.aspx#2513668
 
Sorry, couldn't resist either...
2013/12/14 22:31:59
Goddard
D K
 ^^^ Game, Set, Match ^^^ - That is... for anyone whose primary concern is about performing, capturing, mixing and presenting...music

 
Seeing as how you feel compelled for some reason to keep score here, why don't you instead tell us all about how much improvement you heard in your Tango 24 after shelling out for that BLA mod?
 
Btw, we'll be expecting objective proof...

2013/12/14 22:46:33
Splat

2013/12/14 23:41:54
Goddard
Anderton
lawp
so the dpe is/was just marketing hype?


 
As I've said before...when the 64-bit engine was introduced, the world of audio engines was quite different and it was a major step forward.
 
Please note this is my personal opinion and does not speak for Cakewalk.



Craig, with all respect (and I've been enjoying your writings since Polyphony and Device days), "double precision" DAW audio engines had already been around for some time before Sonar's DPE. SAW used 64-bit processing (running native on PC), and Digi used 48-bit/56-bit DSP chips for PT TDM (with 24-bit paths between chips?) giving effective double precision (or at least, the necessary extended precision for accumulation when mixing 24-bit audio).
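 
A quick back-of-envelope (mine, not from any product documentation) on why those wide DSP accumulators matter when mixing fixed point: the sum of N full-scale 24-bit samples needs up to 24 + ceil(log2(N)) integer bits before you even count the fractional bits produced by gain multiplies:

```python
import math

# Worst case: N tracks all at 24-bit full scale add coherently.
for n_tracks in (2, 16, 64, 256):
    bits = 24 + math.ceil(math.log2(n_tracks))
    print(f"{n_tracks:3d} tracks: worst-case sum needs {bits} integer bits")
```

Hence 48/56-bit accumulators comfortably cover large track counts plus the word growth from 24x24-bit multiplies, which produce 48-bit products.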
 
Iirc, CPA/Sonar's audio engine was built on DirectX and used single precision floats for processing. And iirc, even when Sonar was re-coded into a 64-bit application (Sonar 4?), it still processed using floats until Sonar 5 finally came out with the DPE.
 
Hopefully Noel will correct me if I'm wrong here, but it seems to me that it was not until Intel and AMD added streaming SIMD (SSE) functionality to their processors (supplanting the need to rely upon the x87 FPU), along with the move away from a DirectX foundation, that it became practical, performance-wise, to natively implement higher precision processing in Sonar (as well as in plug-ins such as those running under VST, which was revised for doubles around that time also).
 
That said, a 64-bit DAW with a 64-bit DPE was a pretty nifty trick back then (and still is).
 
But yeah, a DAW with a DPE is no longer a novelty. Even have one running on some i-devices now!
 
Otoh, some DAW developers still flatly reject double-precision, such as this one (who btw does know Jack):
 

64 bit processing is a completely bogus sales/marketing tactic. No (let me repeat that, no) double blind test has ever shown any difference to audio processing with 64 bits over 32. Synthesis is slightly different, but plugins are free to do whatever they want internally, and simply convert 32 bit for input and output. The same is true of any other processing. Certainly nothing that Ardour itself does to the signal would benefit from 64 bit processing. If you think that Reaper (or any other system) "sounds better" because of 64 bits, you need to setup a properly structured double blind test. I almost guarantee that your belief will be gone by the end of the test.

https://community.ardour.org/node/5812
 
Hey, remind you of any recent forum threads around here?
2013/12/15 13:31:11
drewfx1
Goddard
Still, noteworthy that the processed 16-bit streams always nulled, even when using floats, whereas the 24-bit streams never nulled when floats were used. Hmm, maybe double precision does matter. Good to hear the bug's been squashed.
 



 
You crack me up. 
 
You just can't seem to comprehend the difference between an error being present and being audible or meaningful. Until you admit that those are not the same thing, I will not waste any more of my time on you.
 
But I will wish you good luck in your future endeavors.
2013/12/15 13:42:33
Splat

2013/12/15 17:38:01
Anderton
Goddard
Anderton
lawp
so the dpe is/was just marketing hype?


 
As I've said before...when the 64-bit engine was introduced, the world of audio engines was quite different and it was a major step forward.
 
Please note this is my personal opinion and does not speak for Cakewalk.



Craig, with all respect (and I've been enjoying your writings since Polyphony and Device days), "double precision" DAW audio engines had already been around for some time before Sonar's DPE. SAW used 64-bit processing (running native on PC), and Digi used 48-bit/56-bit DSP chips for PT TDM (with 24-bit paths between chips?) giving effective double precision (or at least, the necessary extended precision for accumulation when mixing 24-bit audio).



According to the SAW site, the last version of SAW released in 2001 had a 24-bit audio engine. Pro Tools had a 48-bit fixed engine but bottlenecked to 24 bits when going through the TDM bus.
2013/12/15 18:15:10
slartabartfast
Wow. A lot of fantastic stuff here. 24 bits, 32 bits, 64 bits...
Hard to argue that 64 bits is not better in theory. There might conceivably be situations where it would make a difference. But is anyone arguing for 64 bits on those grounds ready to argue against 128 bits? 256 bits? 1024 bits? Computers may be able to chug those out without a glitch, if not now then surely in the audio world of the future.
Or are we all going to accept that, short of an infinitely large internal representation, there is no reliable way to protect ourselves from the possibility of audible errors in an infinitely long or recursive processing chain?