The science of sample rates

2014/01/24 15:45:25

Goddard

John T
Goddard

People have unfortunately taken Lavry's private whitepaper assertions as valid, no doubt because of his reputation, although he's never submitted them for scrutiny and peer review in the JAES or another journal (for which he would also need to submit some verifiable proof).

Nyquist-Shannon only dictates the minimum sampling frequency required in respect of a band-limited continuous-time (analog) input signal (and assumes an ideal reconstruction), but imposes no actual restriction on employing higher sampling frequencies and certainly does not imply that a sampling freq > Nyquist is deleterious, as is well understood by people who actually have to design sampling systems.

Man alive. Nobody, including Lavry, thinks that the sampling theorem imposes any such restriction.

Clearly, you like the sound of your own voice, and you're welcome to it. But I don't think the rest of us are asking too much if we expect your gum flapping to at least make the occasional stab at relevance.

Still vainly trying at some kind of comeuppance?

No, Lavry didn't say Nyquist-Shannon imposes any upper restriction on sampling rate. Lavry himself seeks to impose that restriction, by urging that sampling at rates higher than his own converters happen to offer compromises audio accuracy and causes distortion, without ever actually proving that. According to Lavry, sampling audio beyond a certain rate is "excessive". That sure sounds to me like Lavry is imposing an upper restriction on sampling rate.

Hmm, "accuracy", is that anything like "fidelity"? What was it Wescott stated about fidelity and sampling rate?

Wescott
Measured purely from a sample rate perspective, increasing the signal sample rate
will always increase the signal fidelity.

Perceive any relevance yet?

Gee, while I could leave things at that, seeing how much you seem to enjoy countering anything I post with an assertion that it's not relevant rather than actually responding to any technical points asserted, here's some more for you to dodge.

If you've actually bothered to read Lavry's 2004 and 2012 whitepapers I'm surprised you weren't displeased on doing so, seeing as how you've professed a dislike for "obscurantism" - plenty of that to be found there in how he tries to obscure the actual sampling rate of an oversampling converter (but then of course, he's also an "industry salesman" competing against companies marketing higher sample rate converters). He'd certainly changed his tune since his earlier paper on oversampling.

Lavry 1997 whitepaper
Oversampling:
Most digital audio equipment uses higher sampling rates than required by the Nyquist receipt. Oversampling offers solutions to both "sinc problem" and "filter problem". Oversampling typically takes place first during the analog to digital conversion. The signal is then converted to "standard rate", for reduced storage and computations. Such a conversion can be done without recreating much of the "sinc and filter problems". Later oversampling during the digital to analog conversion, yields freedom from such problems as well.

Sampling twice as fast, makes the NRZ time interval half as long, thus closer to the theoretical flat response. The "sinc filter shape" is moved up by an octave, but doubling the number of samples, overcomes amplitude attenuation. Sampling at twice the speed also provides an "energy free zone" between the desirable frequency band and the undesirable out of band frequencies. Our filter is steep enough to remove all unwanted high frequencies. In fact, the cutoff can be moved higher to pass all the inband with minimal attenuation and phase distortions.

The following plots show the tone peaks for X2 and X4 oversampling. Note that sampling faster reduces the "4dB problem" to about .9dB at X2, and to .2dB at X4 oversampling

Oversampling by X4 may still require a slight amplitude compensation (an easy task). Higher rates yield so little attenuation that often no compensation is necessary.
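
The droop figures in that excerpt can be roughly checked against the textbook zero-order-hold (NRZ) response, sin(x)/x with x = pi*f/fs. The sketch below is only an illustration and not taken from Lavry's paper: the function name, the 44.1 kHz base rate, and the choice to evaluate at the top of the band (half the base rate) are all assumptions.

```python
import math

def zoh_droop_db(f_hz, fs_hz):
    """Gain in dB (zero or negative) of an ideal zero-order hold at f_hz
    when the hold interval is 1/fs_hz. Textbook sin(x)/x response."""
    x = math.pi * f_hz / fs_hz
    if x == 0.0:
        return 0.0
    return 20.0 * math.log10(math.sin(x) / x)

base_rate = 44_100.0         # assumed base rate, for illustration only
band_edge = base_rate / 2.0  # worst case: top of the base-rate band

for factor in (1, 2, 4):
    droop = zoh_droop_db(band_edge, base_rate * factor)
    print(f"X{factor}: {droop:.2f} dB at {band_edge / 1e3:.2f} kHz")
```

Under those assumptions the droop comes out at about -3.9 dB with no oversampling, -0.9 dB at X2 and -0.2 dB at X4, which lines up with the "4dB problem", ".9dB" and ".2dB" figures quoted.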

Oversampling and "more bits":
Your stereo dealer is selling a CD player with X8 oversampling 20 bits DAC. Do you hear 20 bits? Clearly, incoming samples with 16 bit accuracy can not be interpolated into 20 bits. The best of geographical surveying equipment yields errors when the reference markers are off. Oversampling interpolation is an "averaging concept" thus it yields some better "average accuracy", but each interpolated sample accuracy is limited to that of the input samples (16 bits in the case of CD players).

Oversampling offers great benefits in terms of amplitude flatness response and easy filtering, with much freedom from unwanted inband phase linearity problems. These concepts are beyond reach for most consumers, thus the "marketing department" decided to equate it with "more bits". While there is some truth to the story, much is being "stretched" a bit too far (and sometimes 3 bits).

My what a difference a few years makes...

Lavry 2004 whitepaper
While this article offers a general explanation of sampling, the author's motivation is to help
dispel the wide spread misconceptions regarding sampling of audio at a rate of 192KHz. This
misconception, propagated by industry salesmen, is built on false premises, contrary to the
fundamental theories that made digital communication and processing possible.

...one may be misled to believe that faster sampling will yield better resolution and detail.

In fact all the objections regarding audio sampling at 44.1KHz, (including the arguments relating to pre ringing of an FIR filter) are long gone by increasing sampling to about 60KHz.

...how can we explain the need for 192KHz sampling? ...An argument in favor of microsecond impulse is an argument for a Mega Hertz audio system. There is no need for such a system.
....
Clearly there are benefits to faster sampling:
1. Easier filtering (AD anti aliasing and DA anti imaging)
2. Reduction of higher frequencies attenuation at the DA side.

Indeed, such faster sampling is common practice with both AD and DA hardware. Most AD's today are made of two sections: a front end (modulator) and a back end (decimator). The front end operates at very fast rates (typically at 64-512 times faster than the data output rate). The reasons for such fast operation are outside the scope of this article. It is sufficient to state that anti alias filtering and flatness response becomes a non issue at those rates.

It is most important to avoid confusion between the modulator rate and the conversion rate. Sample rate is the data rate. In the case of AD conversion, the fast modulator rate (typically less bits) is slowed down (decimated) to lower speed higher bit data. In the case of DA converters, the data is interpolated to higher rates which help filtering and response. Such over sampling and up sampling are local processes and tradeoff aimed at optimizing the conversion hardware.

One should not confuse modulator speed or up sampling DA with sample rate, such as in the case of 192KHz for audio.
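
To put numbers on the "64-512 times" range quoted above, here is a minimal sketch of the arithmetic. The helper name and the two output rates are assumptions picked for illustration; the excerpt itself gives only the ratio range.

```python
def modulator_rate_hz(output_rate_hz, oversampling_ratio):
    """Front-end (modulator) rate implied by a given output data rate
    and oversampling ratio. Plain multiplication, nothing more."""
    return output_rate_hz * oversampling_ratio

# Assumed output rates; the 64 and 512 ratios are the ends of the quoted range.
for output_rate in (44_100, 48_000):
    low = modulator_rate_hz(output_rate, 64)
    high = modulator_rate_hz(output_rate, 512)
    print(f"{output_rate / 1e3:g} kS/s output: "
          f"modulator at {low / 1e6:.4f} to {high / 1e6:.4f} MHz")
```

With a 44.1 kS/s output the front end would be running at roughly 2.8 to 22.6 MHz, i.e. the modulator is sampling at MHz rates even though the delivered data rate is 44.1 kS/s.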

AD converter designers can not generate 20 bits at MHz speeds, yet they often utilize a circuit yielding a few bits at MHz speeds as a step towards making many bits at lower speeds. The compromise between speed and accuracy is a permanent engineering and scientific reality.

Sampling audio signals at 192KHz is about 3 times faster than the optimal rate.
It compromises the accuracy which ends up as audio distortions.

While there is no up side to operation at excessive speeds, there are further disadvantages...

And I'm still looking for any illustration in Lavry's 2012 paper of how higher sampling rates reduce conversion accuracy.

Lavry 2012 whitepaper
In this paper, I will cover some of the myths of higher sampling rate and illustrate how higher sampling rates can actually reduce accuracy in audio conversion.

Let’s talk about converter accuracy. In reality, good audio performance requires extremely low distortion because the ear is very sensitive and perceptive. Personally; I am for accuracy and do not advocate placing limits on accuracy.

In fact, high quality audio converters operating at sample rates no higher than 96 KHz offer results that are very close to the desired theoretical limits. Yet, there are many who subscribe to the false notion that operating above the optimal sample rate can improve the audio. The truth is that there is an optimal sample rate, and that operating above that optimal sample rate compromises the accuracy of audio.

Regarding the accuracy (or loss thereof) of audio at higher sampling rates, the following article gives a rather more realistic picture.

Story JAES 2004
3 LIMITS TO ADC ACCURACY
Some very general limits to the performance of ADCs
are given in Fig. 1 along with a few key published performance
points. The limits should be treated with some
care, as befits such generalized presentations, but the figure
summarizes the major problems relevant to audio.

These performance measures do not involve either bits
or sample rate; in general, these are format rather than performance
issues. Usually either more bits or higher sample
rates can be had with the application of more power or
more silicon. If there is no corresponding increase in accuracy,
however, they are of dubious use—so the accuracy
obtainable is the limiting factor. There may be an exception
at very high sample rates, where extra bits can cause
power consumption problems.

Audio needs accuracy above 100 dB (the limit of
accuracy with matched components), but significantly
below the limit set by SDTE. The accuracy requirement
has caused audio to generate technically interesting solutions.

5 AUDIO REQUIREMENTS
Digital audio needs a comparatively low sample rate,
compared to what is currently available.

Audio is traditionally defined in terms of an audible
bandwidth and dynamic range, so the efficiency requirement
has been translated to mean that sample rates used
are only a little above the Nyquist minimum, and the word
length is related to the dynamic range required. CD, for
example, is based on the model of audio information
extending to 20 kHz only, so a sample rate of just over 2 ×
20 kS/s is adequate (hence 44.1 kS/s).

This format adds a requirement for substantial lowpass
filtering—it looks for a flat frequency response to 20
kHz, but filtering to about -100 dB by 24.1 kHz (aliases
back to 20 kHz), a roll-off rate of some 400 dB per
octave. Analog filtering cannot handle this roll-off rate
easily, so it is attractive to use digital filtering. If this is
done, a substantially higher sample rate has to be output
by the ADC, prior to the digital filtering, so that the analog
filtering requirements are relaxed. The higher the
ADC sample rate that is used, the easier is the analog
antialiasing filtering prior to it. The output of the ADC is
then digitally filtered, and the sample rate is decimated, to
the required final rate.

The filtering problem at the input to the ADC is quite
severe. If the sample rate is increased to 176.4 kS/s (four
times the CD rate), an analog filter with about 36 dB per
octave rolloff is needed—achievable, but needs care,
especially if the passband to 20 kHz is to be flat and ripple
free. At 705.6 kS/s, sixteen times the CD rate, something
over 20 dB per octave is still needed. A fourth order
Butterworth filter would achieve this, but it still needs distressingly
accurate components.
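
The roll-off figures in that passage can be approximated with a back-of-envelope calculation: the lowest frequency that folds back onto 20 kHz is fs - 20 kHz, and the filter has to fall by about 100 dB between 20 kHz and that point. The sketch below is a rough reconstruction, not the paper's own derivation; the function names and the idea of averaging the slope over the whole transition are assumptions.

```python
import math

def first_alias_onto(f_band_hz, fs_hz):
    """Lowest input frequency that folds back onto f_band_hz at rate fs_hz."""
    return fs_hz - f_band_hz

def average_slope_db_per_octave(f_band_hz, fs_hz, stop_db=100.0):
    """Average roll-off needed to drop stop_db between the band edge and
    the first frequency that aliases onto it (assumed simplification)."""
    f_stop = first_alias_onto(f_band_hz, fs_hz)
    octaves = math.log2(f_stop / f_band_hz)
    return stop_db / octaves

band_edge = 20_000.0
for fs in (44_100.0, 176_400.0, 705_600.0):
    f_stop = first_alias_onto(band_edge, fs)
    slope = average_slope_db_per_octave(band_edge, fs)
    print(f"fs = {fs / 1e3:.1f} kS/s: stop by {f_stop / 1e3:.1f} kHz, "
          f"about {slope:.0f} dB/octave")
```

This gives roughly 370, 34 and 20 dB per octave for 44.1, 176.4 and 705.6 kS/s, which is in the same ballpark as the "some 400", "about 36" and "something over 20" quoted, though not an exact match.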

Jitter decreases as the sample rate used increases.

(Although Lavry's '97 paper did explain certain benefits of oversampling in DACs, if anyone is curious a fuller explanation wrt oversampling multi-bit DSM ADCs can be found here.)

And perhaps some others' perspectives on high sample rate audio may be useful here as well:

Lesso & McGrath AES 2005
3. HIGH SAMPLE RATE AUDIO
The advantages of high sample rate audio are widely
debated. Several people have shown that the advantage
of the higher sample-rate audio is not the extended
bandwidth above 20kHz (even though there is some
evidence that power above 20kHz has some effect [7])
but is instead the fact that we can use the extended
bandwidth to tailor the transition band and reduce the
time dispersion of the impulse response [11][13].
It has been argued that the advantage of DSD is that the
time domain dispersion of the impulse response of the
system is limited and so equivalently it makes sense to
try to limit the extent of the impulse response of the
filters discussed here.

Even at the higher sample rates there are trade-offs to be
made. Figure 5 shows the impulse response of two
filters for a 96kHz input, one with a cutoff at 20kHz and
the other with a cutoff at 40kHz. The impulse response
for the filter with the cutoff at 20kHz is under half the
width and has considerably less pre-ringing. The extra
degree of freedom at higher sample rates therefore
makes it possible to design much more interesting
filters. Equivalently, the 40kHz filter has the same
impulse response of a filter running at 48kHz, with
20kHz cutoff. This shows that at higher sample-rates it
is possible to design filters that have substantially lower
time dispersion, and perhaps this explains the reported
improvement in audio quality associated with higher
sample-rate audio [15].
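
The trade-off described there can be illustrated with a commonly cited rule of thumb for FIR length (usually attributed to fred harris): taps ≈ (fs / transition width) × (attenuation in dB / 22). The sketch below is hypothetical and not from the Lesso & McGrath paper: the stopband edge at half the sample rate and the 100 dB attenuation are assumptions chosen only to show the trend.

```python
def estimated_fir_taps(fs_hz, pass_edge_hz, stop_edge_hz, atten_db):
    """Rule-of-thumb FIR length: (fs / transition width) * (atten_db / 22)."""
    transition_hz = stop_edge_hz - pass_edge_hz
    return (fs_hz / transition_hz) * (atten_db / 22.0)

def estimated_duration_us(fs_hz, pass_edge_hz, stop_edge_hz, atten_db):
    """Approximate impulse response length in microseconds."""
    taps = estimated_fir_taps(fs_hz, pass_edge_hz, stop_edge_hz, atten_db)
    return taps / fs_hz * 1e6

ATTEN_DB = 100.0  # assumed stopband attenuation

# (label, sample rate, passband edge, assumed stopband edge = fs / 2)
cases = [
    ("96k, 40 kHz cutoff", 96_000.0, 40_000.0, 48_000.0),
    ("96k, 20 kHz cutoff", 96_000.0, 20_000.0, 48_000.0),
    ("48k, 20 kHz cutoff", 48_000.0, 20_000.0, 24_000.0),
]
for label, fs, f_pass, f_stop in cases:
    taps = estimated_fir_taps(fs, f_pass, f_stop, ATTEN_DB)
    dur = estimated_duration_us(fs, f_pass, f_stop, ATTEN_DB)
    print(f"{label}: about {taps:.0f} taps, {dur:.0f} microseconds")
```

Under these assumptions the 96k filter with the 20 kHz cutoff comes out well under half the length of the 40 kHz one, and the 96k/40 kHz and 48k/20 kHz filters need the same number of taps, which is consistent with the excerpt's remarks.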

An ultra high performance DAC with controlled time domain response

Story AES1997
Digital audio systems have historically made use of this 20 kHz limit to set sampling rates.
When CD formats were first established, the problem of storing the large amount of data
needed for about 1 hour's stereo playing time was substantial, so sample rates were set as low as reasonably possible, consistent with maintaining a 20 kHz bandwidth. 44.1 kS/s gave and still gives an unambiguous frequency range of 22.05 kHz (Nyquist principle).

In principle, if frequency response were the only issue, there would be no advantage in moving to formats with higher sampling rates. However, the evidence is otherwise. Direct comparisons of the same source material, recorded and reproduced at 44.1 kS/s, 96 kS/s and 192 kS/s show that there is an advantage in going to the higher rates - it sounds better! The descriptions of those used to making such comparisons tend to involve such terms as “less cluttered”, “more air”, “better hf detail” and in particular “better spatial resolution”. We are left wondering - what mechanism can be at work? It seems unlikely that we have all suddenly developed ultrasonic hearing capabilities.

Actually, a little thought also suggests that frequency response cannot be the only factor at
work in our hearing apparatus. Figure 1 shows two waveforms that have identical (power)
spectra, and yet sound very different - a bandlimited impulse (a click) and a type of white
noise. Other waveforms can easily be generated that have the same amplitude response, but
sound (substantially) different still. Something else must be going on.

Energy Dispersion
The ringing contains energy, and we can plot energy against time. For anti-aliasing filters we
get the sort of shape shown in figure 3. This shows that although the energy in the input
transient is concentrated at one time, the energy from the anti-alias filter is spread over a
much longer time - the audio picture is “defocused”. We might be tempted to argue that the
energy is ultrasonic, but this is certainly not the case at 44.1 or 48 kS/s - our bandwidth
constraints mean that to get good anti-aliasing, we must filter as fast as we can, and only pass the audio bandwidth. Ergo - any energy in the output signal is in the audio band. At sample rates above the standard, the energy in the ring still has the full bandwidth of the passband - maths tells us so.

Energy Dispersion at Different Sample Rates
Figure 6 shows the energy associated with the transient responses. 44.1 and 48 kS/s filters
spread audible energy over 1 msec or more. The 96 kS/s filter is much better, keeping the
vast bulk of the energy within 100 µsecs. The 192 kS/s filter can be very good indeed,
keeping the energy within 50 µsecs.

Taking into account the speed of sound, we can convert energy defocusing in the time domain to “smear” in distance estimation by the ears. Energy spread over ±500 µsecs is the same as a distance smear of ±15 cms. 96 kS/s keeps almost all the energy within about ±50 µsecs, or ±1.5 cms. One of the observations people make about 96 kS/s material is that the spatial localisation of everything is very much better than 44.1 kS/s. 192 kS/s is better than this, although very dependent on amp and speaker performance to demonstrate it.
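
The time-to-distance conversion in that paragraph is just distance = speed of sound × time. A minimal check follows, taking the time spreads as microseconds (500 and 50), which is the scale at which the quoted ±15 cm and ±1.5 cm come out. The 343 m/s figure (air at about 20 °C) and the function name are assumptions, not values from the paper.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at roughly 20 degrees C

def smear_cm(spread_us, speed_m_per_s=SPEED_OF_SOUND_M_PER_S):
    """Distance (cm) that sound travels in spread_us microseconds."""
    return spread_us * 1e-6 * speed_m_per_s * 100.0

for spread_us in (500.0, 50.0):
    print(f"+/-{spread_us:g} microseconds -> +/-{smear_cm(spread_us):.1f} cm")
```

That works out to about 17 cm and 1.7 cm, the same order as the figures quoted (which presumably used a rounder value for the speed of sound).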

One can get oneself into a bit of a twist thinking about the energy in the ringing. After all, if it is in the audio band, allowing extra energy at higher frequencies through the system surely
cannot cancel out some that is in the audio band? It does, though - so although we may not
be able to hear energy above 20 kHz, its presence is mathematically necessary to localise the energy in signals below 20 kHz, and it is possible (and our contention) that we can hear its absence in signals with substantial high frequency content. A high sample rate system allows it through (fact) - and allows the high frequency signals to sound more natural (contention) by allowing better spatial energy localisation (fact).

It is our suggestion that some of the audible differences between conventional 44.1 kS/s and
higher rates (88.2, 96, 176.4, 192 kS/s) may be related to this “energy smear” or defocusing
caused by anti-alias filtering, and that the ear is sensitive to energy as well as spectrum. This
is further backed up by our two original “same spectrum, sound different” signals (figure 1).
In the impulse, all the energy is concentrated at one time, whereas for the white noise the
energy is uniformly spread over time. There is a precedent to this suggestion that the ear is
sensitive to both spectrum and energy - the eye is as well. For sensitive vision or vision off the main beam, we use energy (luminance, or black and white information), whereas for detailed identification when we are looking at something, we use spectrum information (chrominance, or colour). In fact, most sensing processes are sensitive to energy. If the ear is sensitive to energy, it would almost certainly use the information for spatial localisation.

Multi Channel
In conclusion, it is worth noting that if this suggestion is correct, then it would be sensible for
any multichannel audio formats to use one of the higher sampling rates. The purpose of
multichannel is for better spatial localisation of sound sources - so it needs a sampling rate
that can support this!

http://www.cirlinca.com/include/aes97ny.pdf

Your apparent inability to discern the glaring faults in that "Science of..." blog (or in other articles among that blogger's online body of work) coupled with your continued insistence that nothing I've posted is relevant only serve to reveal very plainly what little you actually know about digital audio and DSP, no matter how much you try to put me down.

Consider the possibility, just for a moment, that the facetious scientist blogger, even if well-intentioned in wanting people not to fall prey to marketing hype or "faith based" audio myths (which seems to be his recurring theme, casting everything as a "subjective" vs "objective" case) and genuinely trying to spread good info written in a pleasing and easily digestible manner, might himself actually be so uninformed (or worse, misinformed) that he doesn't fully grasp what he's writing about, and consequently has himself been influenced by marketing hype and misinfo which he has rehashed into blog posts and is now gleefully perpetuating. Because that is sadly apparent when one observes what little he actually seems to understand about much of what he writes.

If you'd read Lavry's 2004 paper (and not just the brief excerpt in that blog post where it is glorified as "influential") or even the first links I posted in this thread (rather than ridiculing me for posting links), you might have confirmed for yourself that what I'd stated regarding converters sampling at MHz rates was in fact accurate (and had been discussed in the forerunner cakewalk audio newsgroup back in 1998), and could have avoided exposing yourself as a know-nothing platinum poser, er, poster.

As already said, not posting here to impress you but so that others genuinely seeking knowledge here don't end up getting sold a load of perpetuated mis-info.

If my posts really bother you that much, I'd suggest availing yourself of the drop-down "Block" button next to my username. Really handy for adjusting the forum SNR.

Otherwise, I'd strongly suggest you develop some decent tech chops first if you really want to jam.

Happy to discuss quantum computing even, although this isn't really the appropriate place (or two places at once) for that...

2014/01/24 15:54:05

mettelus

My brain has become a shift register... every time something goes in... something falls out

2014/01/24 15:58:30

John T

Block feature, yes. An excellent idea.

2014/01/24 16:01:21

Goddard

bitflipper
Also, the cost of analog anti-aliasing and reconstruction filters might be a factor in some equipment, but it's trivial in audio devices. We're talking 5-cent capacitors here.

Perhaps you're unaware of how Apogee started out in business.

It's not trivial nor inexpensive to come up with good analog a-a and reconstruction filters for audio sampling, especially steep ones (e.g. brickwall) with good freq and phase characteristics as was the only option until DSP solutions became feasible. Think about why people complained that CDs sounded "harsh" and "metallic". Active filters helped, but the cost...

bitflipper
It's the sample-and-hold circuit that's going to have inherent tradeoffs regardless of cost, because its accuracy is not ever going to be consistent across all sampling frequencies. In this particular part of the ADC, the designer has no choice but to pick a target sample rate to optimize the S/H for. If his market is primarily professional studio, he'll assume 96KHz - and have to accept slightly less-than-optimal performance at 44.1KHz. This is why some interfaces perform better at one rate over the other. It's also why Dan Lavry refuses to sell a 192KHz interface, and why the ability to sample at 192KHz should not be taken as a predictor of how well a device will perform at more commonly-used rates.

I believe Lavry started out using non-oversampling sub-ranging or folding type converters (or some other type of multi-pass converters) for sampling audio to PCM, and perhaps that's what he may still use, dunno. But even so, those types of converters get used for sampling very high freq stuff, so should not impose limitations for sampling audio. In any case, technical progress has made the circuit and component related performance factors he raised pretty irrelevant anymore (for years now) as actual limitations to performance and accuracy, as is evident from the number of very capable 192k converters already out there. Others haven't been held back.

Investing in new or redesigned products is always a burden for a boutique manufacturer, and I suspect that could have been a factor in why Lavry became so critical of 192k until he saw that demand for 192k had risen (or his sales had slipped) sufficiently to justify joining the 192k club too as he now appears to be doing, rather than it being a matter of the technology advancing to where he could finally implement 192k without compromising the audio. Will be interesting to see whether 192k capability eventually migrates down into his lower cost converter ranges (his current USB-equipped gear is limited to 96k over USB by their USB 1.1 interfaces anyway so he'd need to move up to USB2 at least like Benchmark and others already have).

Btw, worked on a few industrial systems employing ADCs in my time also, including one interfaced to an Intel 4004 (but hey, Widrow had shown that you only needed 3 bits to find Venus on radar...).

2014/01/24 16:03:15

The Maillard Reaction

Hi Goddard,
I appreciate that some one relevant is willing to take the time to elaborate on this stuff.

Thanks.

best regards,
mike

2014/01/24 17:42:20

brundlefly

I have just one question. Is this an Input meter or an Output meter?

2014/01/24 17:48:37

John

brundlefly
I have just one question. Is this an Input meter or an Output meter?

I feel so used! LOL

2014/01/24 18:17:28

bitflipper

It's not trivial nor inexpensive to come up with good analog a-a and reconstruction filters for audio sampling, especially steep ones (e.g. brickwall) with good freq and phase characteristics as was the only option until DSP solutions became feasible. Think about why people complained that CDs sounded "harsh" and "metallic". Active filters helped, but the cost...

I'm sure you're aware that anti-aliasing filters in modern converters are not steep. They don't need to be, because the oversampled Nyquist frequency is hundreds or thousands of times higher than the top of the audio range. TBH, I haven't examined many interfaces with a magnifying glass, but my guess would be that in most cases the anti-aliasing filter consists of two capacitors and a resistor.

As to why people complained about early CDs, that comes down, I think, to early converters not being oversampled. They did need steep filters, and were prone to aliasing. But we're talking the 1960s. Anybody with a $5 RealTek chip today has a vastly more capable interface than those first-generation recorders.

Re: the 4004. Man, you're as much of a dinosaur as I am! Back then I used to read electronics catalogs the way most young men devoured skin mags. I distinctly remember the week the new Intel catalog arrived that included the 4004. I had the school (where I was an instructor) order one - for the students of course - and built an analog sequencer with it.

It was the very same week a bucket-brigade analog shift register showed up on my desk. That BBD chip had cost a day's wages, but I was sure it was gonna be the future of audio echo units. Unfortunately, I immediately destroyed it with a static discharge, said the heck with it. A few years later along comes a company called Eventide Clockworks, who'd actually done it. That coulda been me, I thought, but for lack of a wrist strap! And laziness.

2014/01/24 18:28:36

Splat

"Perhaps you're unaware of how Apogee started out in business."

2014/01/24 18:30:21

John

Dave, read brundlefly's post and view the avatar closely. I think it explains what has been going on in this thread.

