Sample-rate and bit-depth conversion: a primer

Author
yep
2006/07/04 16:53:08 (permalink)

Sample-rate and bit-depth conversion: a primer

Continued from another thread...

ORIGINAL: jacktheexcynic
when you mix at 96/24 and then down-sample to 44.1/16, yes the sound is "condensed." how it is done i can't say specifically, but the distortion that gets introduced by this process is covered up by dithering (another process i don't pretend to understand)...

Okay, the short version is like this (I apologize in advance for any headaches or gross oversimplifications, experts bear with me, I'm trying to make this easy...):

If you "downsample" from a source with a sample rate that is an even multiple of the target sample rate (e.g, if you go from 88.2 to 44.1kHz), the sample conversion essentially just throws out every other sample. This process is totally harmless and does not introduce any distortion. It simply produces the waveform that would have been captured had the original conversion process taken half as many samples. If you "upsample" from 44.1 to 88.2, it just doubles every sample. You don't gain anything, but you don't lose anything either.

If you do a sample rate conversion involving sample rates that are not even multiples (e.g. from 96kHz to 44.1kHz), the conversion process uses a mathematical formula to calculate what the "phantom samples" would have been had the original analog waveform been sampled at the target intervals. The old conventional wisdom was that this was worse than simply recording at the target sample rate in the first place, but sample rate conversion has gotten better in recent years and opinion is now divided, although most experts don't worry about it too much.
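
For the technically curious, here's a minimal sketch of how a non-integer-ratio conversion can be done with a polyphase resampler. This assumes Python with NumPy and SciPy installed, and it's just one common way of doing it, not necessarily what any particular DAW or converter uses:

    import numpy as np
    from scipy.signal import resample_poly

    fs_in, fs_out = 96_000, 44_100      # 96kHz source, 44.1kHz target
    # 44100/96000 reduces to 147/320: upsample by 147, low-pass filter, keep every 320th sample.
    up, down = 147, 320

    t = np.arange(fs_in) / fs_in                # one second of a 1kHz test tone at 96kHz
    x_96k = np.sin(2 * np.pi * 1_000 * t)

    x_441k = resample_poly(x_96k, up, down)     # the filter "calculates the phantom samples"
    print(len(x_96k), len(x_441k))              # 96000 -> 44100

The same routine handles the even-multiple case too: going from 88.2 to 44.1kHz is just up=1, down=2, with the low-pass (anti-alias) filter still applied before every other sample is dropped.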

FWIW, dither has nothing to do with the above.

Converting bit depth is a totally different and much simpler process. If you go from 24-bit to 16-bit, the computer simply chops off the last (quietest) 8 bits (this process is called "truncation"). If you go from 16-bit to 24-bit, the computer simply adds 8 bits of silence (zeros) to the end of each sample.
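
In code form, on a single signed-integer PCM sample, the two directions look roughly like this (a sketch only; the sample value is made up, and real DAWs may round or dither rather than blindly truncate):

    sample_24bit = 0x123456                  # a hypothetical 24-bit PCM sample value

    # 24-bit -> 16-bit by truncation: throw away the 8 least-significant (quietest) bits.
    sample_16bit = sample_24bit >> 8         # 0x1234

    # 16-bit -> 24-bit by padding: tack 8 zero bits back on; no new detail appears.
    padded_24bit = sample_16bit << 8         # 0x123400

    print(hex(sample_16bit), hex(padded_24bit))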

Think of it like this: pretend that, instead of binary, bit depth is in regular numbers, and that instead of 16-bit vs 24-bit, it's "2 bit" vs "3 bit," so that a "2 bit" recording has a scale from 00 to 99, where 00 is silence, and 99 is full scale (maximum loudness). "3 bit" would have a scale from 000 to 999, where 000 is silence and 999 is full scale. Both "bit depths" would have the same "volume range," it's just that the 3-bit scale has more "steps" or greater resolution between silence and full volume.

So if you took a 3bit sample that was just about half scale, let's say 526 on the 000-999 range, and you convert it to 2bit, the computer chops off ("truncates") the last bit and spits out 52, approximately the same level. If you then converted that 2bit sample back to 3bits, the computer would simply add a zero to the end of the 2bit sample and spit out 520. Makes sense?
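
Here's that decimal analogy as a couple of lines of Python, with integer division standing in for truncation:

    value_3digit = 526                   # the "3 bit" sample, about half scale
    value_2digit = value_3digit // 10    # truncate the last digit -> 52
    back_again   = value_2digit * 10     # pad a zero back on -> 520; the 6 is gone for good
    print(value_2digit, back_again)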

On to dither: dither is random noise added to the signal when "truncating" (e.g. when converting from 24-bit to 16-bit). We all know that noise is bad, but in this case, it's good. I'll explain:

In the above example, we converted a "3bit" sample of the value 526 to a "2bit" sample of value 52. This is pretty close, although it would have been slightly more accurate to round it off to 53; even that would be inaccurate, though. What we really want is a value of 52.6, which is impossible to represent in a "2bit" sample. These inaccuracies are known as "truncation error." This truncation error causes a loss of resolution and leads to a "stepped" sound on very low-level decays (such as cymbal tails) and a loss of low-level detail in the recording.

Introducing a little bit of random noise to the "3bit" signal before we "truncate" performs a very neat little trick that smooths over this problem and actually produces an output that has greater resolution and dynamic range, at the expense of a little bit of hiss (and there's even a cure for that: read on...)

Continuing the example, let's say that we have a string of ten "3bit" samples, all of which have a value of 526 (we're keeping things simple here). If we simply truncate each sample, we end up with a string of "2bit" samples of value 52. But if we add a little bit of randomization before we convert, we end up with six of the ten samples having a value of 53, and four of the samples having a value of 52, randomly dispersed among the ten resulting samples. The result will be an analog output whose level averages out to 52.6, exactly what we want, except with a little bit of hiss. But as promised, there's a cure for that hiss...
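
To see the averaging trick in actual numbers, here's a minimal sketch in Python, sticking with the decimal analogy. It uses simple rectangular (uniform) random noise before truncating; real-world dither is usually triangular (TPDF), but the principle is the same:

    import numpy as np

    true_level = 52.6                 # the "3 bit" level we can't represent in "2 bits"
    n = 10_000                        # lots of samples so the averages settle down

    plain    = np.floor(np.full(n, true_level))                   # always 52: the 0.6 is simply lost
    dithered = np.floor(true_level + np.random.uniform(0, 1, n))  # randomly 52 or 53

    print(plain.mean())      # 52.0
    print(dithered.mean())   # ~52.6 -- the average recovers the missing detail, at the cost of a little noise

With only ten samples you'd expect roughly six 53s and four 52s, exactly as described above.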

"Shaped" dither, such as POW-r, introduces noise that is basically eq'd to reduce the level of "hiss" in the frequency ranges that humans are most sensitive to. Because dither noise is very quiet, this "shaped" dither is essentially inaudible at safe listening levels, yet still randomizes the truncation enough to produce "averaged" levels that can effectively result in a 16-bit recording that has 19-bit resolution.


So anyway, on to conclusions:

-Always record at 24-bit if you can, even if your target medium (such as CD) is 16-bit. This has two advantages: the first is that your final 16-bit record will have better resolution, and the second, which is usually far more useful in practice, is that recording at 24-bit allows you to set your record levels much lower and still have vastly improved resolution, so there is no need to worry about digital clipping (there's a quick bit of arithmetic on this right after these two points). If you need to convert the 24-bit samples to 16-bit, then dither down to 16-bit as the very last step, after all mixing and premastering.

- If you want to record at higher sample rates, try to stick with a sample rate that is an even multiple of your primary target rate. So if you're recording for CD (which is 44.1kHz), use 44.1 or 88.2kHz. If you're recording for DVD (which is 48kHz), use 48, 96, or 192kHz, and so on.
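
Here's the quick arithmetic behind the headroom point above (ideal converters, no dither; real-world figures are lower, but the proportions hold):

    import math

    def dynamic_range_db(bits):
        # Theoretical dynamic range of an ideal converter: about 6.02 dB per bit.
        return 20 * math.log10(2 ** bits)

    print(round(dynamic_range_db(16), 1))        # ~96.3 dB for 16-bit
    print(round(dynamic_range_db(24), 1))        # ~144.5 dB for 24-bit
    print(round(dynamic_range_db(24) - 18, 1))   # record 18 dB below full scale and ~126.5 dB still remains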

Notes:

High sample rates and human hearing: In conventional scientific terms, a theoretically perfect sampling system that operates at 44.1kHz is essentially capable of accurately reproducing everything the human ear can hear (in theory, anyway). Many people disagree with mainstream science, and have a variety of theories to explain why people may be able to perceive sounds that are not heard in conventional hearing tests. These theories range from frankly ridiculous to insightful and thought-provoking. I hope these people will not flame me but will present their case as best they can. I'm not taking sides here, just trying to frame the debate as best I can.

High sample rates and gear limitations: You'll notice above that I used the term "theoretically" when talking about the ability of a 44.1kHz sample rate to capture and reproduce the range of human hearing. Inferior analog-to-digital (A/D) or digital-to-analog (D/A) converters will perform noticeably better at higher sample rates, for one very good reason that is universally agreed-upon and not subject to esoteric or theoretical debate...

"Jitter" is the term used to describe instability in A/D or D/A converters. In other words, if you have a sample rate of 44.1kHz, the converter should be sampling at exactly even intervals, 44,100 times per second. This level of stability is difficult if not impossible to achieve, and any instability will result in jitter. In a graphical wave editor, a very "jittery" waveform does not look like a smooth, evenly stepped curve, but rather kind of shaky, like a curve drawn on an etch-a-sketch. In practice, severe jitter sounds like grainy "digititis": sounds are not clearly located in the stereo spread, highs are harsh and kind of white-noisy, and so on. If you have jittery converters, higher sample rates will sound better for this simple reason: the samples are taken at smaller intervals, so any instability is less significant overall-- in other words, a 1% error at 44.1kHz is twice as big as a 1% error at 88.2kHz. Make sense?

I hope I've helped clarify a few elements of digital audio for somebody, at least.

Cheers.
post edited by yep - 2006/07/04 22:58:03
#1


    jacktheexcynic
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/04 20:14:12 (permalink)
    as always, a well written and well-thought out response from yep - thanks. i get the process now (i sort of knew what was going on but this makes it clear to me).

    ORIGINAL: yep
    High sample rates and human hearing: In conventional scientific terms, a theoretically perfect sampling system that operates at 44.1kHz is essentially capable of accurately reproducing everything the human ear can hear (in theory, anyway). Many people disagree with mainstream science, and have a variety of theories to explain why people may be able to perceive sounds that are not heard in conventional hearing tests. These theories range from frankly ridiculous to insightful and thought-provoking. I hope these people will not flame me but will present their case as best they can. I'm not taking sides here, just trying to frame the debate as best I can.


    i disagree with mainstream science on this one, and here's why (and i'm definitely not trying to flame anybody):

    humans can supposedly hear between 20 and 20,000 hertz (cycles per second, for the uninitiated). so at 44,100 samples per second, the highest possible audible frequency can only be represented with about two samples per cycle. since a cycle has both a peak and a trough then we end up with one sample with "positive" gain (where the speaker pushes out) and one sample with "negative" gain (where the speaker pulls in).

    now i realize there are technical terms for the above but i don't remember/don't care to look them up. anyone can see on a waveform that a complete cycle goes above and below the 0 dB line. the actual waveform is curved but the digital representation, having only two samples to work with (approximately) would essentially be two one-pixel lines, one pointing up and one pointing down.

    of course the analog conversion (and the speaker cone) probably smooths out what would otherwise be an impossible sound to recreate (instant movement) but i think we can all agree that the resolution is definitely lacking. (as a side note, i'm sure this is why people often prefer tubes and analog equipment to all-digital, it retains the real resolution of the waveforms and naturally smooths out very high frequencies.)

    when you start piling on all the frequencies commonly found in musical instruments, i think the high-end starts to get either dull (favoring of lower frequencies) or harsh (favoring of higher frequencies). again i believe that while you can sort-of represent any single frequency with 44,100 samples per second, it starts to become digital mush in regular use. at higher sampling rates you get better representation of the actual waveform. that better representation cleans up the signal going to the analog converter, which cleans up the signal going to the speaker, which refines the reproduction of the actual recording.

    i may be wrong, but i do know one thing: listening to dvd audio in dolby 5.1 (44,100 samples per second, at 16 bits) and dts 5.1 (96,000 samples per second, at 24 bits) there is an audible difference and it's not just volume. there is definition which is lacking in the lower quality sample (universal's (or is it 21st century fox?) trumpet theme is a good reference). i can hear the difference on my midgrade home theater system (sony, w/ floor speakers).

    that's my two cents.

    - jack the ex-cynic
    #2
    yep
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/04 22:42:40 (permalink)
    Hi Jack,

    You're on exactly the right track, although you may have been led there by a liar. A sample rate of 44,100 samples per second reproduces stuff at the upper limits of human hearing at a resolution that would be totally unacceptable at, say, 7,000 cycles. Typical hearing tests say that human hearing maxes out at about 20,000 cycles per second (20kHz). A sample rate of 40kHz will reproduce a 20kHz waveform, but at a very low resolution, and 44.1kHz isn't much better.

    Mainstream science tells us that at the upper limits of human hearing, sensitivity is extremely poor, and that people cannot distinguish between different resolutions at extremely high frequencies. A random test of human hearing confirms this (in fact, most people over age 25 or so can't even hear above 15-17kHz). Many people with very sensitive hearing, such as audiophiles and sound engineers, claim to be able to hear subtle differences at frequencies very close to ultrasonic, and there are some studies that suggest that people can perceive frequencies that they cannot hear in conventional hearing tests. These studies contradict conventional thinking about high-frequency perception.

    There are essentially two debates regarding ultrasonic, or near-ultrasonic frequencies (close to or above 20kHz): The first is that human hearing is more perceptive at very high frequencies than is generally acknowledged, and the second is that human hearing is essentially capable of perceiving ultrasonic harmonics that don't show up on "one frequency at a time" hearing tests. Either way, we are talking about very subtle differences that exist on the frayed edge of human perception, and that exist beyond the ability of conventional CD or DVD audio to reproduce.

    One thing that is absolutely certain and beyond dispute is that bad converters, either on the A/D or D/A side are always improved by higher sample rates, for the reasons mentioned above. Whenever I hear someone say that there is a difference in playback quality with a higher sample rate, my first suspicion is that they are listening on a jittery playback system, which is extremely common in consumer electronics. I doubt very much that the converters on your sony home theater are scientifically-calibrated reference-grade converters, and from that POV, it would make sense that they would sound better at a higher sample rate, all else being equal. In this hypothetical case, the difference might lie not with the sample rate itself, but rather with the playback device.

    All that said, it is still ABSOLUTELY TRUE that a higher sample rate would sound better on a flawed playback system, even though it was the playback system's fault. And from this POV, recording at and transmitting at a higher sample rate is better, even if the higher sample rate itself doesn't contain any perceptibly faulty information, because an unstable playback device will always be less unstable at higher sample rates, simply because the samples are closer together, so the errors are proportionately smaller.

    One thing for everyone to keep in mind is this: the most common kind of digital audio most people hear is CDs, followed by DVD audio, and these are 44.1 and 48kHz respectively, and both formats are capable of very high-quality sound reproduction, far better than most affordable playback systems available to the people who first heard John Hammond's recordings of the Benny Goodman Orchestra, or the first audience to hear Pet Sounds or Sgt. Pepper's or Kind of Blue or Electric Ladyland or Tommy or whatever.

    Any improvement upon CD or DVD audio is usually extremely subtle, at least in digital terms. Almost nobody is ever going to complain that a Led Zeppelin CD or a Wu-Tang Clan DVD is audibly inferior to an analog recording on a typical home playback system. If there is a difference inherent to the sample rate, it is the kind of thing that is most likely to manifest itself on a reference-quality playback system reproducing Thomas Tallis compositions or some such.

    cheers.
    #3
    Junski
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/05 03:18:59 (permalink)
    ORIGINAL: yep

    [the primer at the top of this thread, quoted in full; snipped]




    Maybe jacktheexcynic meant what happens w/ Sonar --> its SRC quality from 96kHz to 44.1kHz is not good at all - http://src.infinitewave.ca/

    IMO, when you dither from a higher bit depth to a lower one --> your 24-bit/32-bit data becomes 'rounded' at some point, i.e. to 16-bit accuracy, and not just chopped off (--> as a rough example: 0.283625615327493772 may become 0.28362562 --> whether this can be heard, I can't say).


    Junski
    post edited by Junski - 2006/07/05 10:53:42


    #4
    jacktheexcynic
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/05 09:43:43 (permalink)
    ORIGINAL: yep
    One thing that is absolutely certain and beyond dispute is that bad converters, either on the A/D or D/A side are always improved by higher sample rates, for the reasons mentioned above. Whenever I hear someone say that there is a difference in playback quality with a higher sample rate, my first suspicion is that they are listening on a jittery playback system, which is extremely common in consumer electronics. I doubt very much that the converters on your sony home theater are scientifically-calibrated reference-grade converters, and from that POV, it would make sense that they would sound better at a higher sample rate, all else being equal. In this hypothetical case, the difference might lie not with the sample rate itself, but rather with the playback device.


    this is certainly the case with my home theater system - the receiver is on the low end of digital capable surround. i don't believe i have particularly sensitive hearing, although as i've listened to music on progressively better and better soundcards and speakers, i've been able to refine my perception of what i can hear (picking out certain frequency ranges, hearing high-end details i've never noticed before, etc.).

    one thing i've been thinking about: with speakers, at lower frequencies, the definition of dozens of samples may be required for the cone to move in a wave-like pattern without cutting off the peaks and troughs (distortion?). but at higher frequencies the speaker/driver may not be capable of anything other than wave-like movement because of physical limitations. so even though the definition in the digital signal is lacking, it is "glossed" over by the speaker itself.

    any thoughts on that?

    - jack the ex-cynic
    #5
    jhonvargas
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/05 09:56:17 (permalink)
    Yep,

    Thanks for the info. I have a question about this subject: Will not these AD/DA inaccuracies be reduced by oversampling? I understand that good AD/DA converters use oversampling for reducing jitter, so it is giving you an accurate rate of 44.1 K samples per second but it is (internally) sampling at a higher rate.

    John
    #6
    ed_mcg
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/05 10:03:13 (permalink)
    one thing i've been thinking about: with speakers, at lower frequencies, the definition of dozens of samples may be required for the cone to move in a wave-like pattern without cutting off the peaks and troughs (distortion?). but at higher frequencies the speaker/driver may not be capable of anything other than wave-like movement because of physical limitations. so even though the definition in the digital signal is lacking, it is "glossed" over by the speaker itself.
    Right, it's interesting to do thought experiments with that.

    Take 44.1Hz, which is a slightly sharp low F on a bass guitar; this will have 1,000 samples per period and the "connect the dots" line will be quite close to the original, with the LPF characteristics of the D/A smoothing off any rough edges.

    Now take 22.05kHz (the Nyquist frequency; granted there's an LPF somewhere around 19.5kHz to prevent digital aliasing, but let's ignore that to make the arithmetic easy); this will have exactly 2 samples per period. Without a low pass filter, this would produce a square wave coming out of the D/A, and even with an LPF it would likely go over to a triangle wave. Now, these more complex waves have a high degree of harmonic content; in fact they can be represented as a Fourier series of sine waves. That's where the LPF comes in: the lowest possible harmonic above 22.05kHz is 44.1kHz, and the LPF of the D/A is set to less than 20kHz.
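
    Just to make the filtering step concrete, here's a minimal sketch in Python (NumPy only) of band-limited "sinc" reconstruction, which is what an ideal D/A low-pass does: an 18kHz sine sampled at 44.1kHz has only about 2.5 samples per cycle and looks jagged as raw dots, but once it's reconstructed the output is a smooth sine again, because everything that made it look square lives above Nyquist and gets filtered out.

        import numpy as np

        fs, f0 = 44_100, 18_000                      # only ~2.45 samples per cycle
        n = np.arange(512)
        samples = np.sin(2 * np.pi * f0 * n / fs)    # the "jagged" raw samples

        # Band-limited (sinc) interpolation onto a 16x finer grid -- a slow textbook version
        # of what the reconstruction low-pass filter in a D/A converter does.
        t_fine = np.arange(0, len(samples) - 1, 1 / 16)     # time in units of the sample period
        recon = np.array([np.sum(samples * np.sinc(t - n)) for t in t_fine])

        ideal = np.sin(2 * np.pi * f0 * t_fine / fs)        # the original continuous sine
        mid = slice(16 * 100, 16 * 400)                     # stay away from the ends of the finite sum
        print(np.max(np.abs(recon[mid] - ideal[mid])))      # small: a smooth sine, not a square or triangle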
    #7
    chaz
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/05 17:01:32 (permalink)
    Excellent thread guys on a very important topic, IMO.
    #8
    Junski
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/06 05:59:31 (permalink)
    ORIGINAL: saecollege.de

    Bit Rate

    Digital sound is made up of words of 0's and 1's and 00, 11, 01, 10 are the four possibilities in a two bit word. A three bit word can be made up with 000, 111, 001, 010, 100, 101, 011, 110, which means there are eight possibilities. You see - 2 bit gives 4, 3bit gives 8, 4 bit gives 16, 5 bit gives 32, and so on. Now if we were to use the bit words to express volume with a four bit word we could give 16 different values for volume. So the higher the bit rate the more accurate the resolution be it volume, digital pictures or digital sound. So 24 bit digital sound has more resolution and accuracy than 16bit digital sound.


    i.e.

    16-bits --> 2^16 = 65 536 possibilities
    24-bits --> 2^24 = 16 777 216 possibilities


    ORIGINAL: saecollege.de

    Sampling Rate

    Digital sound is produced by sampling a sound (or should I say the electrical version of it) in real time and expressing it in bit words. Once you start sampling or recording digital sound a clock starts and progressive samples of what the sound is are taken. The rate at which the samples are taken is called the sampling rate.




    The drawing above shows a wave of a sound being sampled. If the time in the drawing is 1 second, then there are 6 samples (the last one is the first in the next second) of the sound in one second, or a sampling rate of 7. So obviously the higher the sample rate the more accurate the resolution. So when we say that the sound is 16bit, 44.1kHz it means that the sound is being sampled 44.1 thousand times a second and it is being measured with 16 bit accuracy. In the above waveform the sampling volume levels given would be 0,2,2,0,-2,-2. Not a very accurate version of a simple waveform. But 44.1kHz, now that's fast, or is it? Let's look at sound in seconds.




    In this chart you can see the relationship between the sampling rate and the waveforms it's sampling. 1kHz will have 44.1 samples taken of each of its waveforms, as it's oscillating at 1,000 waveforms a second. 100Hz will have 441 samples taken of each of its waveforms. But 10kHz will have only 4.41 samples taken of each of its waveforms. Now look at the first waveform we drew. In that drawing we took 6 samples of the waveform and got an amplitude reading saying 0,2,2,0,-2,-2. Imagine how inaccurate 4.41 samples are of a complex waveform. That is why digital high frequencies sound harsh!! The industry has constantly denied this factor and even gone to the extent of saying the ear can't distinguish between a square wave and a sine wave above 7kHz. Pigs Bum.

    At a sampling rate of 96kHz you get 9.6 samples of a 10kHz wave and believe me, you can hear it.

    In an article by Rupert Neve that I read recently, he said that we should aim for 24bit resolution and a 192kHz sampling rate if we want to equal the quality of high quality analogue recording. We will get there. DVD is already up to 24 bit 96kHz sampling so we are on the way. But if your 16bit, 44.1kHz CD sounds bright, consider what makes it bright and you will see that it's a false bright created by the high frequencies sounding like square waves!

    Why 44.1kHz Sampling Rate?

    Why not 44, or a nice round number like 50? When the first engineers were inventing digital sound they had worked out the on/off, 0/1, idea and needed a way to record it. The idea was to use white dots on a TV screen where a white dot was on and a black dot was off. Neat. So you record it like a video picture on a video recorder. That was fine, but the engineers had been caught out before. What about PAL (the European video standard) and NTSC (the American and Japanese standard)? They weren't going to get caught up in that again, no way, so they configured a number that was compatible with both the 525-line NTSC and 625-line PAL formats, and that number was 44.1kHz. Just a piece of useless info you might want one day!





    Junski
    post edited by Junski - 2006/07/06 06:47:04


    #9
    daflory
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/07 04:09:58 (permalink)
    I read an article from a reputable source (can't remember where, I'm afraid) that offered an interesting explanation for the better perceived sound quality of high sample rates. It suggested that ultrasonic harmonics selectively reinforce or otherwise interact with the audible portions of a recording. We may not hear these high overtones directly, but they affect the sound we do hear.

    When these ultrasonic harmonics are "chopped off" by low sample rates, the audio below the Nyquist frequency may be accurately reproduced, but it will be missing the slight distortions caused by the ultrasonic overtones.

    Subjectively, what I notice between high and low sample rates is a loss of "airiness" or "overhead" quality. I find it especially noticeable in music with lots of overtones: choral music or overdriven "singing" guitar (like Brian May).

    Great posts here. Thanks a lot for this info!
    #10
    Kicker
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/08 22:02:11 (permalink)
    Excellent information from everyone!

    ORIGINAL: jhonvargas

    Yep,

    Thanks for the info. I have a question about this subject: Will not these AD/DA inaccuracies be reduced by oversampling? I understand that good AD/DA converters use oversampling for reducing jitter, so it is giving you an accurate rate of 44.1 K samples per second but it is (internally) sampling at a higher rate.

    John



    Oversampling does not affect jitter. It is used to lessen the combing created by the LPF. Oversampling takes the input frequency and multiplies it so that it is way out of the audible frequency range. Once it's up there, the LPF can be applied at, say, 1,000kHz instead of 20kHz. Then when the signal is divided down to the original scale, any artifacts created by the LPF are inaudible.
    #11
    yep
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/12 21:00:47 (permalink)
    ORIGINAL: jacktheexcynic
    ...one thing i've been thinking about: with speakers, at lower frequencies, the definition of dozens of samples may be required for the cone to move in a wave-like pattern without cutting off the peaks and troughs (distortion?). but at higher frequencies the speaker/driver may not be capable of anything other than wave-like movement because of physical limitations. so even though the definition in the digital signal is lacking, it is "glossed" over by the speaker itself.

    any thoughts on that?



    Again, your thinking is right on track. Bear in mind that the D/A converter does actually output an analog waveform, not simply a "square wave" (unless that's what's dictated by the samples). High quality speakers will reproduce these high frequencies quite accurately, but in most cases, in most environments, on most playback systems, it hardly matters. Human hearing is not very good close to 20kHz. Indeed, most adults can't even hear much above 15k (hence the "mosquito" ringtone-slash-teenage-deterrent), and a certain amount of inaccuracy at very high frequencies will not only be "smoothed over" by the playback system, but will be absorbed or masked by room acoustics and background noise in most real-world scenarios, and moreover won't even be audible to most people.

    daflory touched on one of the competing theories as to why high sample rates might sound better-- there is a school of thought that believes that ultrasonic harmonics may interact with audible frequencies in a way that is perceptible to humans. There are some studies that appear to substantiate this theory, but it is far from proven, and the underlying theory should be approached with some skepticism. For instance-- wouldn't those audible effects have been captured within the audible frequency range in the original recording? And conventional physics tells us that the SPL of, say, a 40kHz harmonic would have to be huge to have a perceptible effect on a 20kHz fundamental, when even the fundamental is barely perceptible. Moreover, nothing that even remotely approaches conventional music has a fundamental even close to 20kHz (high C is what, 8400 cycles?), so we are talking about very faint harmonics of harmonics causing foldback effects on frequencies that are barely perceptible to begin with.

    I'm not saying the phenomenon of human perception of very high frequencies doesn't exist, but I have doubts about theories that claim that people have any significant sensitivity to frequencies above the upper teens of the kHz range. I have really good hearing, and last time it was tested, I could hear 20kHz only at something like 100dB SPL, which is about on par with having 20/20 vision at my age (30). These high-frequency sounds need to be extremely loud to be even audible, and having heard them (and seen the SPL meter) in the past few months, I'm pretty sure that I don't have any ability to perceive anything like subtle differences up there -- they're more like a faint headache than a sound at audible SPL levels.

    That said, I absolutely agree that higher sample rates can sound much better. Like you (jacktheexcynic), I think DVD audio is usually noticeably better than CD, but I suspect it's mostly due to 24-bit and decreased jitter, for the reasons mentioned above.

    When evaluating these things, bear in mind that 44.1 can reproduce frequencies that are way above the RIAA curve, and well above any significant content from albums like Sgt Peppers, kind of blue, Gould's recordings of the Goldberg Variations, pet sounds, and so on, and nobody ever complained about those lacking "airiness." Moreover, most of those "eq by numbers" guides usually suggest boosting somewhere between 10-15kHz to increase airiness, and a great many mastering engineers still follow RIAA practice and simply shunt frequencies to ground above 12kHz.

    My experience is that, with very high-quality converters and a high-quality recording/playback system, a 44.1kHz sample rate is capable of reproducing excellent recordings. With lesser converters, higher sample rates sound noticeably better. In any case, I personally still think all-analog sounds best, but the format wars (and indeed, the sample-rate wars) are a topic for another thread. I've already overshot the bounds of the topic.

    Cheers.
    #12
    jacktheexcynic
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/12 22:43:35 (permalink)
    ORIGINAL: yep
    When evaluating these things, bear in mind that 44.1 can reproduce frequencies that are way above the RIAA curve, and well above any significant content from albums like Sgt Peppers, kind of blue, Gould's recordings of the Goldberg Variations, pet sounds, and so on, and nobody ever complained about those lacking "airiness." Moreover, most of those "eq by numbers" guides usually suggest boosting somewhere between 10-15kHz to increase airiness, and a great many mastering engineers still follow RIAA practice and simply shunt frequencies to ground above 12kHz.

    My experience is that, with very high-quality converters and a high-quality recording/playback system, a 44.1kHz sample rate is capable of reproducing excellent recordings. With lesser converters, higher sample rates sound noticeably better. In any case, I personally still think all-analog sounds best, but the format wars (and indeed, the sample-rate wars) are a topic for another thread. I've already overshot the bounds of the topic.

    Cheers.


    didn't know about the riaa curve... explains quite a bit though if they are lopping off frequencies above 12kHz.

    as for analog vs. digital, i believe that digital recording introduces an artificial harshness, and it's not just audio, it's video, photography etc. watch "collateral" (filmed with digital cameras) or look closely at most digital photography and you'll see it. i can even see it in my wedding photos (expensive digital camera). of course maybe it's just the compression but still... it lacks warmth. off-topic to be sure. =)

    - jack the ex-cynic
    #13
    yep
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/13 00:31:24 (permalink)
    ORIGINAL: jacktheexcynic
    ...didn't know about the riaa curve...


    For the record, and for the benefit of those that are unfamiliar with it, the "RIAA curve" is basically a record-mastering standard that was practically a law until the late 80s, and that was still adhered to in almost every record through at least the late 90s, and it mandated a low shelf cut at 47Hz and a high shelf cut at 12kHz (with a 30 degree gradient, if anyone cares).

    It basically meant that the extreme low and high frequencies were cut out. This was originally due to limitations in the recording medium (vinyl LP, specifically), and almost all of the "classic" recordings we know and love today were subjected to it, with very little damage done (again, this still allowed for harmonics well above high C to come through). Many high-priced mastering engineers still use it today, and actually think that it makes for a better overall listening experience, and a lot of them produce excellent recordings that nobody complains about.

    For anyone doubting the capability of CD audio, it's important to remember that up until ten years ago or so, CDs offered objectively and subjectively far and away the best sound quality most people had ever heard, especially in an affordable playback medium. A handful of obsessive sound engineers and people with megabuck home stereos still insisted they liked analog recordings better (myself included), but there is no question that CDs revolutionized people's ideas of what inexpensive home listening could be. I am lucky enough to have a $4,000 oracle gyroscope-mounted turntable in my living room on which vinyl LPs give CDs a serious run for their money, but for most people, such extravagance is out of the question and CDs will far outperform any analog playback device they are likely to have access to.

    The fact is, relatively inexpensive CD players can compete with the best, most expensive playback systems in the world, even if they can't beat them hands down. $10 portables may suffer somewhat from "digititis," but they are still far better than the most expensive cassette walkmans, and even better than most turntables. We are all lucky to live in a world where subwoofers (once a strange and exotic luxury) are fairly commonplace, standard car stereos routinely come with 6 decent speakers instead of two crappy ones, complete surround-sound systems are available for $300 that rival $1,000 stereos from 10 years ago, and so on. Almost all of this is thanks to CDs, which really did usher in a revolution in common perceptions of sound quality. And the strange thing is, some of the best records ever made are still some of the old mono stuff made on primitive, noisy, bandwidth-limited tube gear.

    DVD audio is better than CD, but not the revolution in home listening that CD was. And the fact is, there are plenty of people who are perfectly happy with ipods and mp3s, which suck compared to CDs, just as plenty of people were perfectly happy to trade down in sound quality from LPs to cassettes, for the sake of convenience.

    It's our job as audio engineers to obsess over sound quality, even those of us who don't get paid for it, but it's important to remember that most people who are listening to the latest Bloodhound Gang or Beyonce single could care less about "airiness" or detail in upper-range harmonics.

    Cheers.
    #14
    Junski
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/13 02:28:32 (permalink)
    ORIGINAL: yep

    ...

    When evaluating these things, bear in mind that 44.1 can reproduce frequencies that are way above the RIAA curve, and well above any significant content from albums like Sgt Peppers, kind of blue, Gould's recordings of the Goldberg Variations, pet sounds, and so on, and nobody ever complained about those lacking "airiness." Moreover, most of those "eq by numbers" guides usually suggest boosting somewhere between 10-15kHz to increase airiness, and a great many mastering engineers still follow RIAA practice and simply shunt frequencies to ground above 12kHz.

    ...

    Cheers.


    Where does the RIAA curve frequency response end (it's an electrical process)?

    I have a RIAA stage that is capable from 0Hz up to 420kHz (when limited) and up to 4.2MHz (when unlimited). Also, my cartridge is capable of 5Hz to 80kHz ± 0.5dB (~FLAT response (boron) from 5Hz to 30kHz, if temperature effects aren't taken into account, and FLAT response 40Hz - 15kHz (by the factory measurement data sheet)).


    ORIGINAL: yep

    ...the "RIAA curve" is basically a record-mastering standard that was practically a law until the late 80s ... and it mandated a low shelf cut at 47Hz and a high shelf cut at 12kHz (with a 30 degree gradient, if anyone cares).

    [rest of post #14 quoted in full; snipped]


    Are you sure about these RIAA thoughts ...
    I have not come across any information on cutting low/high frequencies as you mentioned ...

    On the picture below, the standard RIAA curve used since '56:
    green = the ("reverse") EQ used on recording (i.e. this EQ is done before pressing the LP),
    red = the reproduction EQ for RIAA (i.e. this is done by the RIAA stage on your phono pre-amp),
    blue = this should be the original signal (but it's not, because of deviations in the electrical parts on both ends)



    A couple of sources for RIAA:
    - RIAA basics - http://www.euronet.nl/~mgw/background/riaa/uk_riaa_background_1.html
    - different compensations used on recordings - http://www.smartdev.com/LT/compensation.htm
    - http://www.enhancedaudio.com/newway.htm
    - google.com

    Also, there are lots of articles (just google) explaining the differences b/w the formats (LP vs CD) and stating the CD just can't be as good w/ high frequencies as the LP is.


    Junski
    post edited by Junski - 2006/07/13 04:12:32


    #15
    yep
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/13 10:28:31 (permalink)
    Junski, sorry for the confusion-- that "RIAA curve" on your phono preamp is a totally different thing. I was talking about the AES RIAA mechanical rule used during premastering.

    Cheers.
    #16
    Junski
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/13 13:00:35 (permalink)
    ORIGINAL: yep

    Junski, sorry for the confusion-- that "RIAA curve" on your phono preamp is a totally different thing. I was talking about the AES RIAA mechanical rule used during premastering.

    Cheers.


    Interesting. If the RIAA you're talking about does not mean the 'reverse EQ fix' (i.e. the green EQ curve in the picture above) put on the vinyl record, then does it mean something else done before this process?

    Could you shed some more light on this matter, or maybe give some explanatory article links (I tried w/ google already, w/o helpful results)?


    Junski
    post edited by Junski - 2006/07/13 13:19:37


    #17
    Brett
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/13 23:26:19 (permalink)
    All of this talk is depressing; I can't help worrying about the future of sound with 128kbps mp3s being ubiquitous.

    Nice summary Yep.

    I'm extremely sceptical of jitter. I'm a network engineer by trade and even cheap network equipment can run at gigabit speeds, that's 1,000,000,000 bits per second, and even a $10 network card runs at 100MHz. Getting a clock running accurately at 44,100Hz is nothing. Also a TV can lock on to a broadcast UHF signal without jitter.

    Brett

    post edited by Brett - 2006/07/13 23:37:07
    #18
    jacktheexcynic
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/15 12:11:49 (permalink)
    ORIGINAL: Brett
    I'm extremely sceptical of jitter. I'm a network engineer by trade and even cheap network equipment can run at gigabit speeds, that's 1,000,000,000 bits per second, and even a $10 network card runs at 100MHz. Getting a clock running accurately at 44,100Hz is nothing. Also a TV can lock on to a broadcast UHF signal without jitter.


    brett, the reason jitter doesn't matter with network cards is that the length of the electrical transmission is irrelevant, and in the case of the ethernet protocol there is a timing bit which synchs the physical devices (4/5, or four data bits and one timing bit). of course it's been a while since my networking class so that may not be accurate but that's what i remember.

    in the case of a d/a converter the timing does matter - if one sample is longer or shorter than the next it distorts the wave, and at higher frequencies this is a big deal. there is nothing on the analog end to correct the timing.

    edit: this is why higher sample rates suffer less from jitter, because the distortion is not as pronounced with a shorter segment of the waveform.
    post edited by jacktheexcynic - 2006/07/15 12:25:36

    - jack the ex-cynic
    #19
    SteveD
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/22 15:07:04 (permalink)

    ORIGINAL: Junski

    ...Could you shed some more light on this matter, or maybe give some explanatory article links (I tried w/ google already, w/o helpful results)?


    Junski,

    Looks like this:

    http://forum.cakewalk.com/fb.asp?m=166121

    Lots of detail on the AES RIAA mechanical rule used during premastering in that thread.

    Hope that helps.

    SteveD
    DAWPRO Drum Tracks

    ... addicted to gear
    #20
    jacktheexcynic
    RE: Sample-rate and bit-depth conversion: a primer 2006/07/22 15:58:18 (permalink)
    ORIGINAL: SteveD
    Looks like this:

    http://forum.cakewalk.com/fb.asp?m=166121

    Lots of detail on the AES RIAA mechanical rule used during premastering in that thread.


    that thread (like the eq thread) should be required reading here. lots of good stuff in there.

    - jack the ex-cynic
    #21