Silicon Audio
Max Output Level: -84 dBFS
- Total Posts : 346
- Joined: 2012/03/06 04:33:19
- Location: Northland, New Zealand
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/03 19:17:37
(permalink)
Anderton The plot thickens... Actually, both Bitflipper and I might be wrong about 96 making a difference only with signals inside the computer. No less an authority than James A. Moorer wrote a paper proposing, among other things, that hearing involves not just frequency and amplitude, but time and how it relates to localization when listening with both ears. He claims that most people can distinguish a time delay of 15 microseconds or more when a pulse is put into each ear, and that some people can differentiate delays as low as 3 to 5 microseconds. Given that one sample period is about 21 microseconds at 48kHz and 10.5 microseconds at 96kHz, the minimum time delay most people can differentiate is actually less than one sample at 48kHz, but more than one sample at 96kHz.
Sorry, but I must call BS on this. According to the sound/time calculator HERE, 15 microseconds is the equivalent of moving your head half a cm (5 mm); 5 microseconds is more like 2 mm. There would be more than a 2 mm variation between the driver and your eardrums each time you put your headphones on. It is highly unlikely that the drivers of your headphones ever sit at an exact, repeatable distance from each eardrum; I would say the variation exceeds 2 mm every time you put them on. Now take that argument to speakers and you are talking about one drop of water in the Pacific Ocean. Absolute BS. Yes, I know this guy's credentials, but come on...
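The distance figures above can be sanity-checked directly from the speed of sound (a minimal sketch, assuming ~343 m/s, i.e. air at roughly 20°C):

```python
# Convert an interaural time delay to the equivalent path-length
# difference in air. 343 m/s is an assumed round number for the
# speed of sound at room temperature.
SPEED_OF_SOUND = 343.0  # m/s

def delay_to_distance_mm(delay_us):
    """Path-length difference (mm) for a delay given in microseconds."""
    return SPEED_OF_SOUND * (delay_us * 1e-6) * 1000.0

for us in (15, 5, 3):
    print(f"{us:2d} us  ->  {delay_to_distance_mm(us):.1f} mm")
# 15 us works out to ~5.1 mm and 5 us to ~1.7 mm, matching the
# "half a cm" and "2 mm" figures quoted from the calculator.
```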
"One of the great and beautiful things about music and recordings in general is that legacies live on" - Billy Arnell - April 15 2012
|
bitflipper
01100010 01101001 01110100 01100110 01101100 01101
- Total Posts : 26036
- Joined: 2006/09/17 11:23:23
- Location: Everett, WA USA
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/03 20:15:42
(permalink)
This has nothing to do with frequency. Moorer's paper was about binaural hearing and the perception of delays between events, not listening to continuous frequencies. I would guess this is a refinement of the precedence effect (the "law of the first wavefront"). He maintains that people with average acuity can recognize a time differential between impulses hitting each ear of as little as 15 microseconds, and that some can discriminate down to 5-8 microseconds. Interesting hypothesis.

It's true that our spatial perception depends on very small phase shifts between the left and right ears. However, the time difference for any given angle of incidence is constant and based on the speed of sound and how long it takes sound to traverse the width of your head. That's going to be hundreds of microseconds, not tens of microseconds. Unless you're a rodent, anyway.

Now, if you're talking about discrete events that occur 15 microseconds apart, you've got to ask yourself what that means in a musical context. When you strum a guitar, what is the time difference between each string being plucked? 20 microseconds, perhaps? I can't think of any musical event that happens faster than that. You also have to question whether or not conventional electromagnetic speakers can even reproduce events that close together. Certainly not for low frequencies. But even the best tweeters still have mass that limits how quickly they can respond and reset.
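The "hundreds of microseconds" figure follows from a simple path-difference model of ITD; a sketch, assuming a ~0.18 m ear spacing and 343 m/s speed of sound (both round-number assumptions, ignoring diffraction around the head):

```python
import math

# Far-field interaural time difference (ITD) for a plane wave, using
# the simple path-difference model: ITD = d * sin(theta) / c.
SPEED_OF_SOUND = 343.0   # m/s, assumed
EAR_SPACING = 0.18       # m, approximate adult ear-to-ear distance

def itd_us(azimuth_deg):
    """ITD in microseconds for a source at the given azimuth (0 = dead ahead)."""
    return EAR_SPACING * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND * 1e6

for deg in (5, 30, 90):
    print(f"{deg:3d} deg -> {itd_us(deg):6.0f} us")
# A fully lateral source (90 degrees) gives ~525 us -- hundreds of
# microseconds, as stated; the ITD shrinks toward zero near dead ahead.
```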
 All else is in doubt, so this is the truth I cling to. My Stuff
|
Anderton
Max Output Level: 0 dBFS
- Total Posts : 14070
- Joined: 2003/11/06 14:02:03
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/03 20:19:08
(permalink)
Silicon Audio According to the sound/time calculator HERE, 15 microseconds is the equivalent of moving your head half a cm (5 mm); 5 microseconds is more like 2 mm. There would be more than a 2 mm variation between the driver and your eardrums each time you put your headphones on.
Well, he's not alone by any means, and experiments done with ferrets, cats, owls, and other predatory animals seem to indicate binaural localization discrimination on the same level as humans. There's a really interesting paper, "Behavioral Sensitivity to Broadband Binaural Localization Cues in the Ferret," published in the Journal of the Association for Research in Otolaryngology. Here's an excerpt from its "Binaural cue sensitivity" section:

"In some cases, ferrets exhibited ITD thresholds of <20 μs, with the mean threshold across animals and sessions equal to 23 μs for 200-ms broadband stimuli. Despite some differences in methodology between studies, these data are broadly comparable with ITD discrimination thresholds in humans, which are typically 10–20 μs (Zwislocki and Feldman 1956; Klumpp and Eady 1957; Yost 1974), as well as those reported for macaques (Scott et al. 2007), cats (Wakeford and Robinson 1974), and owls (Moiseff and Konishi 1981). The ITD thresholds obtained from ferrets are slightly better than those found in rabbits (Ebert et al. 2008), which is consistent with the notion that predatory species may have more developed sound localization abilities. Ferrets are also very sensitive to changes in ILDs, with some animals having thresholds of <1 dB, while the mean value for 200 ms of flat-envelope stimuli was 1.3 dB. Again, these thresholds are broadly comparable with those observed in humans, which typically vary from 0.5 to 1 dB over a wide range of frequencies (Mills 1960), as well as those obtained from macaque monkeys (Scott et al. 2007) and cats (Wakeford and Robinson 1974)."

It's one thing to sit here on a forum and offer conclusions based on what seems logical, but this kind of research is well-documented, repeatable, written by people who are authorities in their field, and has research precedents dating back over half a century.
If you check out the paper, the authors give complete details on the methodology used, and there are enough references and links that if you follow them all, you won't be back to this forum for weeks. I haven't verified the results myself, so I can't say from personal experience whether I accept their findings or not. But I would find it very hard to dismiss arbitrarily the amount of hours invested in these studies by a vast number of scientists who are interested solely in pure research.
|
Anderton
Max Output Level: 0 dBFS
- Total Posts : 14070
- Joined: 2003/11/06 14:02:03
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/03 20:37:11
(permalink)
bitflipper Unless you're a rodent, anyway.
Are you psychic?!? Check out the previous post, which I was writing while you were writing yours...

Now, if you're talking about discrete events that occur 15 microseconds apart, you've got to ask yourself what that means in a musical context. When you strum a guitar, what is the time difference between each string being plucked? 20 microseconds, perhaps? I can't think of any musical event that happens faster than that.

"Events" can be the crest of a waveform hitting your ear. Flanging can cause separate events that are separated by 0 seconds. I don't know what the resolution is for electronic flanging, but an airplane going overhead while you're standing in front of a wall goes through zero, and goes through sub-microsecond variations on the way there.

You also have to question whether or not conventional electromagnetic speakers can even reproduce events that close together. Certainly not for low frequencies. But even the best tweeters still have mass that limits how quickly they can respond and reset.

True, and a very good point. But we don't just hear point-source material. We hear an amazing number of reflections, cancellations, and additive peaks because we have to listen in an environment. That's what provides the spatial and localization cues. Think about bats and how they navigate; and if you've known any blind people, you know that for many of them, their ability to localize sounds is off the hook.

Again, I'm not saying anyone's right or wrong. I'm just saying that it's important to keep an open mind and not dismiss anything just because a preconceived notion of what's logically correct doesn't agree with tens of thousands of man-hours of research. The conclusions the researchers derive could be wrong, and the extrapolation to musical reproduction could be wrong. But it could also be right. I don't think anyone here on the forum has sufficient knowledge to confirm or dispute these claims at the same level of depth with which they're made.
I'm here to learn, and being sure of one's knowledge impedes that process. I assume that everything I know is not right or wrong, but potentially right or potentially wrong.
|
Anderton
Max Output Level: 0 dBFS
- Total Posts : 14070
- Joined: 2003/11/06 14:02:03
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/03 20:50:10
(permalink)
drewfx1 And you know that Fourier says that any complex waveform is just a combination of sine waves at various frequencies, amplitudes and phases.
I know that, but I also know that applies only to a situation involving a single audio stream. Anyone with two ears hears two audio streams. There is no way (other than offline encoding processes) for a single audio stream to represent two independent audio streams. Even playing back a mono sound source over stereo speakers generates two independent audio streams due to room acoustics. I don't see how it would even be possible to obtain localization information from a single audio stream.
|
drewfx1
Max Output Level: -9.5 dBFS
- Total Posts : 6585
- Joined: 2008/08/04 16:19:11
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/03 20:58:04
(permalink)
bitflipper It's true that our spatial perception depends on very small phase shifts between left and right ears. However, the time difference for any given angle of incidence is constant and based on the speed of sound and how long it takes sound to traverse the width of your head. That's going to be hundreds of microseconds, not tens of microseconds. Unless you're a rodent, anyway. If the sound source is slightly to one side and some distance away it can have a very small ITD between your ears. Or am I mis-Pythagorizing it?
 In order, then, to discover the limit of deepest tones, it is necessary not only to produce very violent agitations in the air but to give these the form of simple pendular vibrations. - Hermann von Helmholtz, predicting the role of the electric bassist in 1877.
|
The Maillard Reaction
Max Output Level: 0 dBFS
- Total Posts : 31918
- Joined: 2004/07/09 20:02:20
- Status: offline
.
post edited by Bash von Gitfiddle - 2018/10/04 22:44:31
|
Silicon Audio
Max Output Level: -84 dBFS
- Total Posts : 346
- Joined: 2012/03/06 04:33:19
- Location: Northland, New Zealand
- Status: offline
|
Sanderxpander
Max Output Level: -36.5 dBFS
- Total Posts : 3873
- Joined: 2013/09/30 10:08:24
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/04 01:10:56
(permalink)
I would think that at timescales like this, simply moving your head to a different angle would have a more significant effect (and thus negate any difference between the output of the two speakers). I haven't read Moorer's article yet, but generally speaking this seems a pretty wild conclusion to draw from a carefully done experiment in very controlled conditions. That's not really how science works (although the media would like it to).
post edited by Sanderxpander - 2014/06/04 09:45:10
|
Splat
Max Output Level: 0 dBFS
- Total Posts : 8672
- Joined: 2010/12/29 15:28:29
- Location: Mars.
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/04 01:39:16
(permalink)
64 bit precision mice? hmmmmm....
Sell by date at 9000 posts. Do not feed. @48/24 & 128 buffers latency is 367 with offset of 38. Sonar Platinum(64 bit),Win 8.1(64 bit),Saffire Pro 40(Firewire),Mix Control = 3.4,Firewire=VIA,Dell Studio XPS 8100(Intel Core i7 CPU 2.93 Ghz/16 Gb),4 x Seagate ST31500341AS (mirrored),GeForce GTX 460,Yamaha DGX-505 keyboard,Roland A-300PRO,Roland SPD-30 V2,FD-8,Triggera Krigg,Shure SM7B,Yamaha HS5.Maschine Studio+Komplete 9 Ultimate+Kontrol Z1.Addictive Keys,Izotope Nectar elements,Overloud Bundle,Geist.Acronis True Image 2014.
|
John
Forum Host
- Total Posts : 30467
- Joined: 2003/11/06 11:53:17
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/04 01:42:37
(permalink)
mudgel I'm certainly no rocket scientist when it comes to this particular discussion, nor am I just a rock, so I understand a reasonable amount of the discussion. BUT I'd like to congratulate all involved on the manner in which they've presented their points. Even when there have been points of contention that in other forums would start a war, all has remained civil.
It's another example of the quality of the people that are part of the Sonar forum family.
There was a time pre X3 where there was a tension evident in the forum but generally speaking the forum is pretty much a pleasure to be part of. I think it was Craig who mentioned (a while back) a new gestalt on this forum and this thread is a classic example as is the one about your favourite underrated feature. Group hug. :-)
I have to agree the members on this thread have been great.
|
The Maillard Reaction
Max Output Level: 0 dBFS
- Total Posts : 31918
- Joined: 2004/07/09 20:02:20
- Status: offline
.
post edited by Bash von Gitfiddle - 2018/10/04 22:44:50
|
BJN
Max Output Level: -86 dBFS
- Total Posts : 222
- Joined: 2013/10/09 07:52:48
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/04 08:55:38
(permalink)
All I want is a workable definition as applied to recording and mixing music for distribution via digital mediums. Due to the ear's ability and sensitivity for localization perception and prediction, we know where the prey or the predator is even where our eyes cannot see; how our hearing functions has meant our survival. It doesn't matter if it isn't perfect compared to the hearing of some other non-record-buying species. LOL

The science will tell us the reason 44.1 was chosen as the sampling frequency is that it contained all the frequencies of human hearing. Now we have powerful enough computers, and even today's cheap converters are way ahead of what we started with. We have the storage space and the conversion quality. I say the AD/DA conversion or the DAW's conversion algorithms are not the weakness in the chain. I say the weakness in the chain is the limits of the relatively unchanged technology of microphones, and the fad of using old preamps with "color" but with specs only suitable for tape machines. (Okay, some great pieces have very good specs.) In other words, we are still dependent upon electromagnetics not only to capture sound but to reproduce it as well.

To me it is not our ears that are imperfect. Surely we have advanced enough technologically that a better diaphragm or loudspeaker design can match the capabilities of where digital can take us? I hope our ears are good enough to discern what the possibilities could be. I recall coming across, a little while ago, a new diaphragm (cell) design for earphones, but for the life of me I haven't located the article.
post edited by BJN - 2014/06/04 09:35:19
------------------------------------------------------- Magic: when you feel inspired to create which in turn inspires more creation. And the corollary: if magic happens inspiration might flog it to death with numerous retakes. Bart Nettle
|
bitflipper
01100010 01101001 01110100 01100110 01101100 01101
- Total Posts : 26036
- Joined: 2006/09/17 11:23:23
- Location: Everett, WA USA
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/04 10:08:01
(permalink)
drewfx1 If the sound source is slightly to one side and some distance away it can have a very small ITD between your ears. Or am I mis-Pythagorizing it? 
No, you're not mis-applying Pythagoras. That's why I said "for any given angle of incidence." Sound coming from in front of us reaches each ear almost simultaneously, which is why we have a much harder time pinpointing where sound is coming from when its origin is nearly straight ahead. It's why LCR panning has enjoyed a resurgence over the past decade, and why rhythm guitars are routinely double-tracked, panned wide, and delayed for the Haas effect.

Practitioners of the Haas trick learn that the left-right delay has to fall within a certain range to be effective. Delays below 2 ms don't work. Delays above 20 ms are perceived as discrete echoes and also don't work. Similarly, sound-widening plugins often use delays of a few milliseconds to shift different frequency bands - the operative qualifier being "milli." If sub-millisecond delays really had a significant impact on the perception of sound quality, they would probably have been adopted as a widely-used mix technique. They haven't. Sub-millisecond delays are perceptible, but only within the context of the undesirable effects of comb filtering.

An aspect that hasn't come up yet in this conversation is temporal masking. This is what happens when two sounds occur too close together in time for the cochlear cilia to reset in between. This period can be as long as 100 ms (!), depending on the frequency, amplitude, and envelopes of the events. Within the microsecond timeframes we've been talking about, temporal masking is going to be the primary limiter of perception, regardless of the resolution of our recording and the quality of our speakers. I'm still not buying the idea that faster sample rates sound better because transients are better separated.
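The comb-filtering point can be made concrete: summing a signal with a delayed copy of itself puts notches at odd multiples of 1/(2τ). A sketch (equal-level delay-and-sum assumed):

```python
# Notch frequencies of the comb filter formed by summing a signal
# with a delayed copy of itself: f_k = (2k - 1) / (2 * tau).
def comb_notches_hz(delay_ms, count=3):
    tau = delay_ms / 1000.0  # delay in seconds
    return [(2 * k - 1) / (2.0 * tau) for k in range(1, count + 1)]

# A 0.5 ms (sub-millisecond) delay puts notches right in the midrange:
print(comb_notches_hz(0.5))    # ~1 kHz, ~3 kHz, ~5 kHz
# A 15 us delay pushes the first notch up to ~33 kHz, beyond audibility:
print(comb_notches_hz(0.015))  # first notch ~33 kHz
```

This is consistent with the point above: delays of a few hundred microseconds notch the midrange audibly, while tens-of-microseconds delays shove the comb mostly above the audio band.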
 All else is in doubt, so this is the truth I cling to. My Stuff
|
abb
Max Output Level: -88 dBFS
- Total Posts : 118
- Joined: 2004/01/19 02:04:35
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/04 13:02:38
(permalink)
Actually, sound localization involves more than binaural cues like interaural time difference (ITD; the difference between the times it takes a sound to reach the two ears) and interaural level difference (ILD; the difference in sound pressure level reaching the two ears). There are also monaural cues based on the head-related transfer function (HRTF; a spectral cue derived from the way the pinna and head differentially affect the intensities of frequencies arriving at the ear).

What's more, all these mechanisms vary as a function of frequency, bandwidth, and direction (azimuth and elevation) to the sound source. For example, the sound "shadows" created by the head that produce ILDs only become appreciable around 1,700 Hz, making this cue frequency-dependent. Another example is the bandwidth dependence of ITDs: ITDs are not very effective for narrowband signals like sine waves because they recruit only a few hair cells in the cochlea, resulting in a feeble signal for the brainstem nuclei to use when computing the cross-correlation between the left and right ears. Yet another example is the variation of the HRTF as a function of the azimuth and elevation of the sound source, leading to different localization estimates.

In the end we exploit the cues that are most salient and most reliable in a given situation. The same is true for the visual system - we use several different cues to (visually) locate objects in space, some binocular, some monocular. Natural selection is very opportunistic in that it endowed us with many different ways to achieve the same end result. Cheers...
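The 1,700 Hz head-shadow figure lines up with simple wavelength arithmetic (λ = c/f): shadowing only kicks in once the wavelength is comparable to the head itself. A quick sketch, assuming c ≈ 343 m/s:

```python
# Acoustic wavelength as a function of frequency: lambda = c / f.
# The head (~20 cm across) only casts an effective sound "shadow"
# once the wavelength shrinks to roughly head size or smaller.
SPEED_OF_SOUND = 343.0  # m/s, assumed

def wavelength_cm(freq_hz):
    return SPEED_OF_SOUND / freq_hz * 100.0

for f in (170, 1700, 17000):
    print(f"{f:5d} Hz -> {wavelength_cm(f):6.1f} cm")
# At 170 Hz the wavelength (~2 m) wraps right around the head;
# at 1700 Hz it is ~20 cm, about head-sized, which is why ILDs
# only become appreciable around that frequency.
```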
|
Anderton
Max Output Level: 0 dBFS
- Total Posts : 14070
- Joined: 2003/11/06 14:02:03
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/04 19:20:03
(permalink)
Sanderxpander I would think at timescales like this simply moving your head into a different angle would have a more significant effect (and thus negate any difference between the output of the two speakers). I haven't read Moorer's article yet but generally speaking this seems a pretty wild conclusion to draw from a carefully done experiment in very controlled conditions. That's not really how science works (although the media would like it to).
It's not just Moorer - check out the article I linked to from JARO. It has references going back to the late 50s; it almost seems this is an "everyone knows that" kind of thing in the field. I'm not saying it's right or wrong. I haven't done the experiments myself. But I'm not arrogant enough to flat out reject it because it doesn't seem right, or obsequious enough to flat out accept it because a bunch of researchers with doctorate degrees tell me it's so.

The one thing I DO flat out accept is that instruments without oversampling sound better when recorded at 96kHz, and I have files that demonstrate that to more than my satisfaction. And as a side note, I looked for instruments and processors that have switchable oversampling capabilities... there aren't that many. Either it's done internally and is transparent to the user (but then I don't understand why they sound better if run at a higher sample rate), or it's simply not built into the design.
|
Anderton
Max Output Level: 0 dBFS
- Total Posts : 14070
- Joined: 2003/11/06 14:02:03
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/04 19:35:48
(permalink)
bitflipper Let's look at a practical example, an electric guitar played through a high-gain amp sim. You play a very high note on your guitar, say with a fundamental frequency of 1.3 KHz (an octave above an open high-E string). The amp sim will generate harmonics at 3x, 5x, 7x, 9x, etc. The 15th harmonic is 19500 Hz, still legal at 44.1 KHz. You have to get up to the 17th harmonic before changing the sample rate would deliver any benefit. I didn't do the math, but the level of the 17th harmonic is going to be down more than 90 dB from the fundamental. IOW, inaudible.
According to IK Multimedia's chief engineer, physical guitar amps generate harmonics well above the audible range, and part of their emulation process is to reproduce those frequencies. He also said that high-gain amp sims often deliver 60dB of gain. So your assumptions about what is or is not audible have to take extreme amounts of gain into account. That "90dB down from the fundamental" could easily be 30dB down, and such distortion products would be audible when folded back into the audio range.

A fundamental problem I'm seeing here is that this is not a yes/no situation; there are shades of gray. Some processors will derive zero benefit from being run at higher sample rates. Others obviously do benefit. A one-size-fits-all pro or con is not realistic or possible.

FYI, IK does extremely sophisticated oversampling and filtering not for the plug-in as a whole, but for individual elements. They choose the amount of oversampling, and what to apply it to, on a processor-by-processor basis. As a result, he noted that AmpliTube running at 44.1kHz with all the oversampling options enabled will perform better, and draw less CPU, than running it at 96kHz. However, this degree of attention to detail seems to be the exception rather than the norm in the industry. Cakewalk's instruments provide oversampling - which is why I had to choose the non-oversampled version of Z3TA+ 2 to emulate the results of instruments without oversampling - and so does Native Instruments for selected processes (e.g., saturation). iZotope's Ozone also provides oversampling for selected processes.
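The foldback arithmetic is easy to sketch, reusing the hypothetical 1.3 kHz note from bitflipper's example: with no oversampling or filtering, any component a nonlinearity generates between fs/2 and fs aliases back in-band at fs − f.

```python
# Where do odd distortion harmonics of a 1.3 kHz note land after
# filterless sampling? A component between fs/2 and fs folds back
# to fs - f (higher components reflect repeatedly).
def folded_hz(f, fs):
    f = f % fs
    return fs - f if f > fs / 2 else f

fundamental = 1300.0
for fs in (44100.0, 96000.0):
    aliases = [(k, folded_hz(k * fundamental, fs))
               for k in range(17, 26, 2)
               if k * fundamental > fs / 2]
    print(fs, aliases)
# At 44.1 kHz the 17th harmonic (22.1 kHz) already folds back into
# the audible band, and the 19th-25th land between ~11.6 and ~19.4 kHz.
# At 96 kHz none of these harmonics fold, since Nyquist sits at 48 kHz.
```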
|
Anderton
Max Output Level: 0 dBFS
- Total Posts : 14070
- Joined: 2003/11/06 14:02:03
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/04 19:42:24
(permalink)
I'm leaving for GearFest and then the New Music Seminar, so I won't be participating much on the forums for the next several days. But really, I've made the only point I cared to make: recording at 96kHz can improve the sonic accuracy of some soft synths and processors, even when sample-rate-converted down to 44.1kHz. I don't think anyone can disagree that that's a true statement. I'll leave it up to the rest of you to run your own ferret experiments and knock 50 years of research on its butt. It wouldn't be the first time conventional wisdom had to take a hit in the light of new knowledge. Remember when everyone was just so 100% sure that Venus would be a cold, dead planet like the moon? Oooops.
|
The Maillard Reaction
Max Output Level: 0 dBFS
- Total Posts : 31918
- Joined: 2004/07/09 20:02:20
- Status: offline
.
post edited by Bash von Gitfiddle - 2018/10/04 22:45:03
|
Razorwit
Max Output Level: -66 dBFS
- Total Posts : 1235
- Joined: 2003/11/05 18:39:32
- Location: SLC, UT
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/04 23:08:57
(permalink)
Mike, I know this is OT, but ST3 was in the Sweetwater catalog that came today. I gotta think it's gonna be soon. Dean
Intel Core i7; 32GB RAM; Win10 Pro x64;RME HDSPe MADI FX; Orion 32 and Lynx Aurora 16; Mics and other stuff...
|
bitflipper
01100010 01101001 01110100 01100110 01101100 01101
- Total Posts : 26036
- Joined: 2006/09/17 11:23:23
- Location: Everett, WA USA
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/05 09:18:33
(permalink)
According to IK Multimedia's chief engineer, physical guitar amps generate harmonics well above the audible range and part of their emulation process is to reproduce those frequencies. He also said that high-gain amp sims often deliver 60dB of gain.
Amplifiers, yes. Guitar speakers, no. And most microphones couldn't pick them up anyway. Stick a microphone in front of a guitar speaker cabinet, play a fat distorted chord, record it at 192kHz, and analyze the spectrum. The amplitudes of supersonic harmonics will be very, very small - if detectable at all - and they'll be the product of unpleasant intermodulation distortion. Your typical guitar amp and speaker will roll off steeply above about 12kHz, and even if they didn't, your microphone won't pick up much beyond 20kHz - certainly not the ubiquitous SM-58 that's so commonly used for this purpose. Have a good time at the show, Craig, and remember: if you don't come back with snapshots, it didn't happen.
 All else is in doubt, so this is the truth I cling to. My Stuff
|
The Maillard Reaction
Max Output Level: 0 dBFS
- Total Posts : 31918
- Joined: 2004/07/09 20:02:20
- Status: offline
.
post edited by Bash von Gitfiddle - 2018/10/04 22:45:27
|
Sanderxpander
Max Output Level: -36.5 dBFS
- Total Posts : 3873
- Joined: 2013/09/30 10:08:24
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/05 10:00:02
(permalink)
Anderton
Sanderxpander I would think at timescales like this simply moving your head into a different angle would have a more significant effect (and thus negate any difference between the output of the two speakers). I haven't read Moorer's article yet but generally speaking this seems a pretty wild conclusion to draw from a carefully done experiment in very controlled conditions. That's not really how science works (although the media would like it to).
It's not just Moorer - check out the article I linked to from JARO. It has references going back to the late 50s; it almost seems this is an "everyone knows that" kind of thing in the field. I'm not saying it's right or wrong. I haven't done the experiments myself. But I'm not arrogant enough to flat out reject it because it doesn't seem right, or obsequious enough to flat out accept it because a bunch of researchers with doctorate degrees tell me it's so. The one thing I DO flat out accept is that instruments without oversampling sound better when recorded at 96kHz, and I have files that demonstrate that to more than my satisfaction. And as a side note, I looked for instruments and processors that have switchable oversampling capabilities... there aren't that many. Either it's done internally and is transparent to the user (but then I don't understand why they sound better if run at a higher sample rate), or it's simply not built into the design.
I'm not disputing the validity or methodology of the experiments (even though "from the 50s" means "ancient and outdated" from a biological and psychological perspective), because I didn't read them in full, nor do I plan to. I'm disputing your extrapolations from them. You seem to take the conclusions from these very controlled experiments and apply them to a wide range of real-world scenarios. That's not how science works, sadly. There are too many variables. You can't take an experiment that seems "kind of similar" or "touching the edges of your point" and use it to make your case. You would have to set up a separate experiment with the exact variable you want to measure in mind, taking care to prevent all others from influencing it. As it is, they did not test for what you're using it for.
|
Jeff Evans
Max Output Level: -24 dBFS
- Total Posts : 5139
- Joined: 2009/04/13 18:20:16
- Location: Ballarat, Australia
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/05 19:24:02
(permalink)
As a result of Craig's experiments I decided to do one myself, all inside Studio One. I used two virtual instruments that are complex in nature. The first is Native Instruments Prism, an amazing synth that can generate very complex upper partials: harmonics that not only have complex amplitude operations on them, but whose frequencies can move smoothly sideways, creating even more 'movement' in an ambient patch, for example. The way the harmonics move in Prism patches has to be seen to be believed. Here the Prism sound is called 'The Witcher'; I played an ambient MIDI part to control it. The second VST is Korg Wavestation with 'Deep Atmosphere' selected in RAM 11. Wavestation adds fatness and also beautiful, complex movement within a patch; there is always a lot going on in any Wavestation patch! This is blended quite low under the Prism sound. So now we have a quick ambient thing lasting 30 seconds, a blend of both parts. There is no processing or EQ on either of these VSTs, and levels are set to unity.

I created a session at 96kHz and rendered the result out at 96kHz; all bit depths are 24-bit. I also exported the same 96kHz session down to 44.1kHz, and that is the wave in my download. Next I created a session at 44.1kHz, still 24-bit, set up the same two synths, and had the same MIDI tracks play them, so exactly the same music resulted. I exported that session at 44.1kHz. I think I level-matched these two waves OK as well, and timing is spot on: if you line them up you won't hear a shift in timing. Here is the link for the Zip file, good for about 7 days I suppose: https://www.hightail.com/...ZUczRkJRdWNIcWNVV01UQw

On listening you will notice the original 44.1kHz version is very much brighter. In SPAN there are a lot more harmonics jammed in, and the response is flatter up high. Notice how smooth the rendered 96kHz version is in comparison. I think because of the complex nature of the sounds involved, the differences are even more apparent. I believe the extra high end in the 44.1kHz version is a result of the session not being at 96kHz. The 96kHz version sounds better to me. If you are wondering how they sounded at the time: when the session was at 96kHz, it had the same smooth sound live that the rendered wave has. The Prism patch is not swamped in high-end harmonics at all and is quite smooth, almost analog-sounding. SPAN shows this, with less extra harmonic crowding and a top end with that 6dB/oct roll-off type slope instead.

The next part of the experiment for me is a test with an incoming hardware synth into two sessions, rendering one down as before. The hardware synth needs to be special for this test, and I have the perfect one on loan right now for the job: a Kawai K5000 additive synth, with patches that are very similar, very complex, and moving - it sounds most unlike anything else. I can layer this with some dreamy JD-800 patches for depth. It will be interesting to see how they compare going through the audio interface instead of being generated internally. (The results may be similar, too, because if the session is opened up to 96kHz, doesn't that mean the frequency response through the whole system will be wider?) There is a very compelling reason to work at this rate just to get this smooth, more natural top-end sound in instruments such as Prism producing these types of patches. It is obvious the good things are translated down to 44.1kHz.
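For anyone who wants to put a number on how different two level-matched, time-aligned renders are, a null test (subtract one from the other and measure the residual) is the usual tool. The sketch below uses synthetic sine "renders" in place of the actual exported files; the signals and the -60 dB difference are made up purely for illustration:

```python
import math

# A simple null test: subtract two supposedly-identical, time-aligned
# renders and report the residual level relative to the reference.
def rms(xs):
    return math.sqrt(sum(x * x for x in xs) / len(xs))

def null_residual_db(a, b):
    """Residual of (a - b) in dB relative to a; -inf when identical."""
    diff = [x - y for x, y in zip(a, b)]
    r = rms(diff)
    return -float("inf") if r == 0 else 20.0 * math.log10(r / rms(a))

n, sr = 4410, 44100
ref = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(n)]
# A copy with a tiny (-60 dB) amount of extra harmonic content,
# standing in for a render with slightly different high end:
test = [x + 0.001 * math.sin(2 * math.pi * 3000 * t / sr)
        for t, x in zip(range(n), ref)]
print(f"residual: {null_residual_db(ref, test):.1f} dB")  # about -60 dB
```

A residual well below the noise floor would suggest the two renders are effectively identical; a residual concentrated up high would confirm the brightness difference described above.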
post edited by Jeff Evans - 2014/06/06 00:57:48
Specs i5-2500K 3.5 Ghz - 8 Gb RAM - Win 7 64 bit - ATI Radeon HD6900 Series - RME PCI HDSP9632 - Steinberg Midex 8 Midi interface - Faderport 8- Studio One V4 - iMac 2.5Ghz Core i5 - Sierra 10.12.6 - Focusrite Clarett thunderbolt interface Poor minds talk about people, average minds talk about events, great minds talk about ideas -Eleanor Roosevelt
|
Sanderxpander
Max Output Level: -36.5 dBFS
- Total Posts : 3873
- Joined: 2013/09/30 10:08:24
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/06 09:49:18
(permalink)
Neither of those have an oversampling or HQ version? I can very clearly hear the difference between "normal" Z3TA+ and the highest quality setting.
|
bvideo
Max Output Level: -58 dBFS
- Total Posts : 1707
- Joined: 2006/09/02 22:20:02
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/06 10:23:08
(permalink)
In the two versions of the experiment, one uses Sonar (or other software) to downsample (sample rate convert) the final wav from 96 kHz to 44.1 kHz. The other relies on the various sound generators themselves to produce "clean" high end directly at 44.1 kHz. From reading all of the above, it sounds like this alone could produce differences that may or may not be pleasant to the individual listener, even if all the software is well behaved. It could happen. I wonder if there could be a different result from upsampling the 44.1 kHz wav to 96 kHz and downsampling it again, to impose whatever high-frequency filtering your chosen downsampling software applies, and then comparing that against the wav that was originally created at 96 kHz and downsampled.
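bvideo's round trip is easy to try. A hedged sketch using SciPy's polyphase resampler follows; the function name is mine, and the exact filtering inside Sonar or Studio One will differ, so this only approximates "whatever high frequency filtering is carried out" by a real DAW's converter.

```python
# Hypothetical round-trip: 44.1 kHz -> 96 kHz -> 44.1 kHz, so the file
# passes through an anti-alias/anti-image filter just like the render
# that came down from the 96 kHz session.
import numpy as np
from scipy.signal import resample_poly

def round_trip_44p1(x):
    """96000 / 44100 reduces to 320 / 147, so resample by that ratio and back."""
    up = resample_poly(np.asarray(x, dtype=float), 320, 147)  # to 96 kHz
    return resample_poly(up, 147, 320)                        # back to 44.1 kHz
```

Comparing round_trip_44p1(native_44p1) against the 96-to-44.1 render would isolate how much of the audible difference is just the resampler's filtering rather than the session rate itself.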
|
bitflipper
01100010 01101001 01110100 01100110 01101100 01101
- Total Posts : 26036
- Joined: 2006/09/17 11:23:23
- Location: Everett, WA USA
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/06 11:24:29
(permalink)
Jeff, I wonder how much one may really conclude from tests in which the signal source is digital to begin with. You're essentially re-sampling and thereby introducing new variables.
 All else is in doubt, so this is the truth I cling to. My Stuff
|
robert_e_bone
Moderator
- Total Posts : 8968
- Joined: 2007/12/26 22:09:28
- Location: Palatine, IL
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/06 12:05:14
(permalink)
I would think that multiple years of debating these issues, with hundreds of posts per thread, would tend to indicate that there may not be much of a difference at the end of the day, in practical terms. If it takes this much discussion and back-and-forth with still no clear-cut answer, isn't that telling in and of itself, in terms of the difference not being significant when all is said and done? Bob Bone
Wisdom is a giant accumulation of "DOH!" Sonar: Platinum (x64), X3 (x64) Audio Interfaces: AudioBox 1818VSL, Steinberg UR-22 Computers: 1) i7-2600 k, 32 GB RAM, Windows 8.1 Pro x64 & 2) AMD A-10 7850 32 GB RAM Windows 10 Pro x64 Soft Synths: NI Komplete 8 Ultimate, Arturia V Collection, many others MIDI Controllers: M-Audio Axiom Pro 61, Keystation 88es Settings: 24-Bit, Sample Rate 48k, ASIO Buffer Size 128, Total Round Trip Latency 9.7 ms
|
Sanderxpander
Max Output Level: -36.5 dBFS
- Total Posts : 3873
- Joined: 2013/09/30 10:08:24
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/06 12:59:44
(permalink)
Not really. I agree with Craig's general point that some plugins may lack an oversampling mode when they should have one, and sometimes you may simply forget to turn it on. You avoid both problems by running at 96 kHz. I can very clearly hear the difference between the Z3TA+ modes, so as long as your computer is not complaining, why take the risk at all?
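As a generic illustration of what oversampling buys a synth (Z3TA+'s internals aren't public, so this is an assumed textbook example, not its actual algorithm): a trivial sawtooth generated directly at 44.1 kHz folds its harmonics above Nyquist back into the audible band, while the same oscillator run at 8x and then decimated keeps the band much cleaner. The helper name naive_saw is mine.

```python
# Naive vs. oversampled sawtooth synthesis: the direct version aliases,
# the 8x-then-decimated version suppresses those fold-backs.
import numpy as np
from scipy.signal import resample_poly

def naive_saw(freq, sr, n):
    """Trivial sawtooth from a phase ramp; no band-limiting, so it aliases."""
    phase = (freq * np.arange(n) / sr) % 1.0
    return 2.0 * phase - 1.0

sr, freq, n = 44100, 1234.5, 1 << 16
direct = naive_saw(freq, sr, n)               # generated at 44.1 kHz: aliased
oversampled = naive_saw(freq, 8 * sr, 8 * n)  # generated at 352.8 kHz
clean = resample_poly(oversampled, 1, 8)[:n]  # filtered back down to 44.1 kHz
```

For example, the 30th harmonic (37 035 Hz) of the direct version folds down to an inharmonic component at 7 065 Hz, which is exactly the kind of extra in-band content that shows up as "more harmonics jammed in" on an analyzer.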
|
Jeff Evans
Max Output Level: -24 dBFS
- Total Posts : 5139
- Joined: 2009/04/13 18:20:16
- Location: Ballarat, Australia
- Status: offline
Re: Remember that 96K TH2 thread? I Just had my mind blown, big-time
2014/06/06 17:13:54
(permalink)
The whole point of the OP and the original experiment was in fact to use only digital generating sources and see how they behave under different conditions. The fact is, my file that was created at 44.1 kHz from the start sounds quite different from the file that was created at 96 kHz and rendered down to 44.1 kHz. The downsampling process has maintained the beautiful smooth sound of the 96 kHz session.

I see these differences varying according to the type of sounds you are making, too. For highly complex, harmonically rich additive sounds, it might be good to work mostly at 44.1 kHz but create a session up at 96 kHz just to render out all the complex additive-sounding material, downsample it to 44.1 kHz, and drop it back into the 44.1 kHz session. It is going to sound different no matter how you look at it. (I am not suggesting you do this for everything, either; it might just be one of those cases where VST synths and additive synths really show up these differences with some patches.)

SPAN shows fewer harmonics (peaks spaced further apart) and a smooth roll-off in the 96 kHz version, compared with the extra content (more harmonics squashed in) and the flatter roll-off showing up in the 44.1 kHz version. Putting a LPF over the 44.1 kHz version will not turn it into the 96 kHz version rendered down to 44.1 kHz, either.
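There is a simple way to see why a LPF applied afterwards cannot undo the difference (a generic illustration, not tied to Jeff's specific files): a tone above Nyquist and its folded alias produce literally the same samples at 44.1 kHz, so a filter applied after sampling has nothing left to distinguish them by.

```python
# A component above Nyquist (22 050 Hz) is sample-for-sample identical
# to its folded in-band alias, so no post-sampling filter can remove it.
import numpy as np

sr = 44100
t = np.arange(1024) / sr
above = np.sin(2 * np.pi * 24690 * t)    # 24 690 Hz, above Nyquist
folded = -np.sin(2 * np.pi * 19410 * t)  # its alias: 44100 - 24690 = 19410 Hz
diff = np.max(np.abs(above - folded))    # zero up to float rounding
```

Once the fold has happened inside a 44.1 kHz session, the alias at 19 410 Hz is indistinguishable from a genuine 19 410 Hz partial, which is consistent with Jeff's point that filtering the native 44.1 kHz render cannot recreate the 96 kHz-sourced one.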
Specs i5-2500K 3.5 Ghz - 8 Gb RAM - Win 7 64 bit - ATI Radeon HD6900 Series - RME PCI HDSP9632 - Steinberg Midex 8 Midi interface - Faderport 8- Studio One V4 - iMac 2.5Ghz Core i5 - Sierra 10.12.6 - Focusrite Clarett thunderbolt interface Poor minds talk about people, average minds talk about events, great minds talk about ideas -Eleanor Roosevelt
|