• SONAR
  • MIDI "Jitter" - It Does Exist (p.13)
2007/10/10 09:01:42
dstrenz

ORIGINAL: pianodano
Maybe I'm missing something here, but that's exactly what I think should happen if you are aiming for, say, the exact beat. But I don't think improved accuracy is the correct term if you're intentionally playing behind the beat and want exactly what was played to be captured.

It stands to reason that ALL midi is quantized to one tick or another. More ticks should mean the original performance is captured with higher accuracy. Whether the original performance was accurate or not is another matter entirely and, in my mind, a poorly timed performance is what quantization is traditionally used for.


Yes, I think you're missing something.

1. Record audio and midi at the same time from an external synth.
2. Print that midi onto another audio track.
3. Compare the first audio track with the other (after lining up the first sample).

The audio recording made in step 1, which is not quantized, is what we're trying to duplicate with the midi.

If the midi is recorded at 96ppq, the audio samples recorded in steps 1 and 2 will line up much more closely than if the midi is recorded at 480ppq. Again, the recording is simply 8th notes (very few bytes running through the port).
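
As a rough illustration of step 3 (a sketch I'm adding, not dstrenz's actual method): once the two recordings are loaded as mono numpy arrays, you could detect note onsets with a simple threshold and report how far apart corresponding onsets land. The threshold and minimum-gap values below are assumptions, not anything from the post.

import numpy as np

SAMPLE_RATE = 44_100
THRESHOLD = 0.1      # amplitude that counts as "a note started" (assumed)
MIN_GAP_S = 0.1      # ignore re-triggers closer together than this (assumed)

def find_onsets(samples, sr=SAMPLE_RATE):
    # indexes where the signal crosses from below the threshold to above it
    hot = np.abs(samples) > THRESHOLD
    idx = np.flatnonzero(hot[1:] & ~hot[:-1]) + 1
    onsets, last = [], -np.inf
    for i in idx:
        if i - last > MIN_GAP_S * sr:   # keep only the first crossing of each note
            onsets.append(i)
            last = i
    return np.array(onsets)

def compare(live_audio, printed_audio, sr=SAMPLE_RATE):
    # live_audio = step 1 recording, printed_audio = step 2 render of the midi
    a, b = find_onsets(live_audio), find_onsets(printed_audio)
    n = min(len(a), len(b))
    diffs_ms = (b[:n] - a[:n]) * 1000.0 / sr
    for k, d in enumerate(diffs_ms):
        print(f"note {k + 1}: printed MIDI track is off by {d:+.2f} ms")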
2007/10/10 09:36:26
pianodano
ORIGINAL: dstrenz


ORIGINAL: pianodano
Maybe I'm missing something here, but that's exactly what I think should happen if you are aiming for, say, the exact beat. But I don't think improved accuracy is the correct term if you're intentionally playing behind the beat and want exactly what was played to be captured.

It stands to reason that ALL midi is quantized to one tick or another. More ticks should mean the original performance is captured with higher accuracy. Whether the original performance was accurate or not is another matter entirely and, in my mind, a poorly timed performance is what quantization is traditionally used for.


Yes, I think you're missing something.

1. Record audio and midi at the same time from an external synth.
2. Print that midi onto another audio track.
3. Compare the first audio track with the other (after lining up the first sample).

The audio recording made in step 1, which is not quantized, is what we're trying to duplicate with the midi.

If the midi is recorded at 96ppq, the audio samples recorded in steps 1 and 2 will line up much more closely than if the midi is recorded at 480ppq. Again, the recording is simply 8th notes (very few bytes running through the port).



Yep. I do that all the time, but not at 96. I am after capturing the performance with more accuracy as opposed to a quantized performance. We are both saying the same thing but in different ways. Must be a failure to communicate on my part.
2007/10/10 12:12:57
Nick P
True, all MIDI is "quantized" to some degree. But on a philosophical level, all audio is "quantized" to some degree, if we consider that time itself could be "quantized" at the atomic level. What we are talking about is the human ear's capability to perceive different timing anomalies and/or "feels" of musical performances, whether the ear is that of a trained musician or a lay person.

Nevertheless, for the purposes of traditional discussion about MIDI, "unquantized" means played back as performed, at whatever clock rate the sequencer defaults to or the user sets in software that allows this setting. "Quantized" means imposing time correction in the software to a grid of eighth, sixteenth, thirty-second, etc. notes.

(I realize you most likely know all this)

What we are talking about is what we hear back versus what we play in. It's that simple. Theoretically, if we play in audio and do not deceive ourselves, we should hear back exactly what we play. However, I wonder now if even audio can be altered via the I/O process invoked by a computer-based recording solution. And theoretically with MIDI, we should hear back very close to what we played in, assuming a high enough clock rate (PPQ); exactly how high that must be is obviously the subject of a good deal of debate.

However many people (note the number of posts in this thread) believe that the process of the computer (as opposed to a hardware MIDI sequencer) receiving, processing, and playing back MIDI data imposes a certain amount of imperfection into the MIDI performance (which we are calling "jitter" for lack of a better term). The degree and amount of this "jitter" is the subject of this thread. Much of it is based on many of us having musical ears experienced and sensitive enough to hear this without scientific measurement. Does this make our findings subjective? Of course. However, notice that many users have taken, and continue to take, scientific measurements via audio editors now commonly available.

Bottom line: We think MIDI "jitter" exists, and many of us feel it is enough of a problem to warrant investigation by both designers of DAW software, as well as operating system designers such as Microsoft. Obviously the folks at Ableton agree, hence the impetus to start this thread upon reading about their new product Live 7.
2007/10/10 13:25:16
space_cowboy
OK, I haven't read every single post here, so I run the risk of saying what others have said.

I believe a studio needs a master clock that everything is tied to. I have an Apogee clock in my D8B that is the master for my MIDI and for my audio.

I don't know about MIDI jitter, but you do run the risk of things not being in sync exactly right if you have multiple clocks going on. There is a clock for midi, a clock for audio, and there are probably others out there. I used to get things that never seemed to sync up exactly right when I didn't have one master clock.
2007/10/10 14:45:31
dewdman42
I stated this several times in the thread already, but at the risk of sounding repetitive....

It does make sense that a lower midi resolution would be tighter. The reason is very similar to why larger audio buffers reduce audio glitching: the system can keep up with things better. It's probably not necessary to go all the way down to 96ppq, but I personally think 240ppq is the magic number on Windows.

The fact is that the Windows timer can't accurately record 960ppq, even at its most optimal. At 120bpm, 960ppq = 0.5ms/tick. The Windows MM timer is 1ms by design, but often does not achieve even that. 960ppq is a complete pipe dream at 120bpm. 480 is the theoretical best it can do (1ms ticks). 240 is more realistic: 240 = 2ms/tick.

However, at slower tempos the higher resolution becomes more meaningful: at 60bpm, double everything, so 960ppq = 1ms/tick. So I might prefer 480ppq at 60bpm vs 240ppq at 120bpm, but I would expect them both to give me about 2ms/tick, which is enough time for the Windows OS to be more reliable about the MM timer.
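
To make the arithmetic above explicit, here is a tiny sketch (the bpm/ppq combinations are just examples): milliseconds per tick is simply (60000 / bpm) / ppq.

def ms_per_tick(bpm, ppq):
    # 60000 ms per minute / beats per minute = ms per beat; divide by ticks per beat
    return 60_000.0 / bpm / ppq

for bpm in (60, 120):
    for ppq in (96, 240, 480, 960):
        print(f"{bpm:>3} bpm, {ppq:>4} ppq -> {ms_per_tick(bpm, ppq):.2f} ms per tick")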

If you have hardware midi timestamping you might be able to record better than 1-2ms precision reliably, but you'd better plan on not using soft instruments to monitor what you're playing if you expect to make effective use of that resolution in some way.

Yes, a low resolution does have some inherent built-in quantizing...similar to input quantize...but let me break down some of the resolutions and the quantization they represent (I had it wrong earlier):

PPQ     Equivalent quantizing
-----   ---------------------
48      128th triplets
96      256th triplets
120     ~512ths
192     512th triplets
240     ~1024ths
480     ~2048ths
960     ~4096ths


(Note that some of them have a tilde (~), which means that resolution does not divide evenly into 4/4-time musical values. For example, a PPQ of 128 instead of 120 would give exact 512th-note resolution. The reason sequencers use 120 is related to midi clock, which divides each beat into 24 pulses, so PPQs always have to be a multiple of 24. However, the lower resolutions all land in good places, and once you start talking about 512th notes, the musical timing division is not really applicable anymore.)
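
The table falls out of a simple relationship: one tick at P ticks-per-quarter is 1/(4*P) of a whole note. Here is a small sketch of mine (not from the post) that reproduces the entries, approximating the uneven divisions with the nearest power-of-two note value:

import math

def equivalent_quantize(ppq):
    division = 4 * ppq                                  # ticks per whole note
    third = division // 3
    if division % 3 == 0 and third & (third - 1) == 0:  # 3 * power of two -> triplets
        return f"{third * 2}th triplets"                # e.g. 192/whole -> 128th triplets
    if division & (division - 1) == 0:                  # exact power of two
        return f"{division}ths"
    return f"~{2 ** round(math.log2(division))}ths"     # nearest straight note value

for ppq in (48, 96, 120, 192, 240, 480, 960):
    print(f"{ppq:>4} ppq -> {equivalent_quantize(ppq)}")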

I would argue that when someone is trying to play in the groove at 96ppq, they might still be landing on even ticks at that quantization level. Otherwise everyone in the industry would have been totally disenchanted with the MC500 series of hardware sequencers. But we all know many a pro keyboard player who was head over heels in love with that box...so apparently 96ppq is actually plenty of resolution to capture a groove. Even in the infamous MPC line, only the latest top-of-the-line model has 480ppq. All the rest of them are 96ppq.

What I have heard from a few is that higher resolutions are needed to capture the exact nuances of note clusters, grace notes and things like that. What that means is that for those odd times when you have a grace note or cluster or something...using a low ppq may sound kind of clunky, but this is a different thing than the timing jitter that most people might be complaining about. The timing jitter has more to do with how well the performance "grooves". Lower PPQ is more likely to be handled reliably by the poor Windows timer, and for grooving purposes, I don't think the higher PPQs are adding any benefit. Furthermore, even if you do try to use the higher PPQ for capturing your grace notes...the Windows OS might still introduce 1-2ms of jitter anyway, which could still make those things sound clunky.

Regarding Ableton Live: the report is that they improved midi timing from what it was before. We do not know how good or bad it was before. There is a theoretical limit to how good they can make it...which we have discussed...related to the Windows operating system. Are they at that limit? Is Sonar at that limit? I don't think we know for sure. But I will say that I think Cakewalk has been known for having some of the best midi timing on DOS/Windows for YEARS. They have always been good about getting their sequencers as close to the limit of what is possible on DOS/Windows. Live, on the other hand, started out as mostly an audio looping program and then added midi little by little. My guess is that their first midi implementations were a bit flawed and possibly had horrendous timing problems that needed fixing...which apparently they have now done. But I am only guessing. They still have the same OS limit, and there is nothing anyone can do to beat it or they all would have done it already. The OS limitation is just there.

On the other hand, I am much more suspicious that Sonar may have some bugs or design problems related to freezing and mixing down midi tracks through soft synths, or bugs in delay compensation or something like that, based on some of the feedback we have heard here.
2007/10/10 15:02:30
dewdman42

ORIGINAL: Nick P
However many people (note the number of posts in this thread) believe that the process of the computer (as opposed to a hardware MIDI sequencer) receiving, processing, and playing back MIDI data imposes a certain amount of imperfection into the MIDI performance (which we are calling "jitter" for lack of a better term).


A little more clarification about this. One of the reasons that audio can be recorded by your PC essentially jitter free is that there is dedicated hardware (i.e., the soundcard) that uses its own internal clock to capture the audio and save 44,100 samples per second into a buffer. The catch here is that the actual timing of collecting that audio and storing it as timestamped discrete values is done in the soundcard hardware. The Windows OS then comes back every so often, at its own pace, and grabs chunks of data from the buffer. This is why, if you have a larger buffer, you are giving Windows more time to do whatever it needs to do before it comes to take from the buffer as much as it can. Meanwhile, the dedicated soundcard hardware is just ticking along like a machine, sampling the audio and stuffing it into the buffer. It has only one task to do and will do it accurately.
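
To put rough numbers on that breathing room, here's a quick sketch (the buffer sizes are just common examples, not anything from the post): the buffer length divided by the sample rate is how long the OS can go before it must service the card again.

SAMPLE_RATE = 44_100  # samples per second

for buffer_frames in (64, 128, 256, 512, 1024):
    ms = 1000.0 * buffer_frames / SAMPLE_RATE
    print(f"{buffer_frames:>5}-sample buffer -> OS must service the card every {ms:5.1f} ms")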

With midi, for most people...the timestamping of the midi events is not handled by hardware, it's handled by software, which means it has to wait in line with everything else the Windows OS is trying to do.

The midi interface receives midi events and stuffs them into a buffer, just like audio, and it will fill up that buffer with as much as it can until the Windows OS gets around to coming and getting what is in the buffer. But the difference is that the midi events do not get any timestamps until Windows comes around to fetch them from the buffer. From Windows' perspective, whatever is in that buffer at that time gets the same timestamp, even if the events came into the hardware at different times. See what I mean? This is why hardware timestamps are really needed. Further to all that, because we are often using a midi keyboard to control a soft synth, you sort of don't want the Windows OS to take too much time before coming to get stuff from that buffer, because otherwise you'd have big latencies. You need a realtime response for midi. Essentially, with midi you need the equivalent of really, really small audio buffers. But the problem is that the Windows OS just does not provide much better than 1-2ms via the MM timer.
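
Here's a toy simulation of that effect (my own sketch, not real driver code; the 2 ms polling interval and the note times are assumptions): events arrive at precise times, but each one is only stamped at the next OS poll, so whatever lands in the same service window gets pushed onto the same timestamp.

import math
import random

POLL_INTERVAL_MS = 2.0   # assumed OS service interval (1-2 ms per the post)

def os_timestamp(t_ms, poll_ms=POLL_INTERVAL_MS):
    # the timestamp an event actually receives = time of the next poll after it arrives
    return math.ceil(t_ms / poll_ms) * poll_ms

# five notes played roughly 10 ms apart, with a little human spread
true_times = [10.0 + i * 10.0 + random.uniform(-0.3, 0.3) for i in range(5)]

for t in true_times:
    stamped = os_timestamp(t)
    print(f"arrived {t:6.2f} ms -> stamped {stamped:6.2f} ms (error {stamped - t:+.2f} ms)")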
2007/10/10 15:31:04
pianodano
Dewdman42,


Thanks for those posts, but I have a few questions now, please.

At the risk of me sounding like a complete dummy, you're saying that the audio sample rate is not divided down to achieve the midi ticks? On a machine that can do nothing but handle numbers? And if that is the case, what is the common link tying audio and midi together? And why wouldn't all data, audio or midi, just be spit out in consecutive track order?

Also, as I mentioned many posts back, and again I agree, the MC500mkII (I have mine on the shelf and have owned it since it was introduced around late '87 or early '88) was as good as it got then, imo, but I should add that it cannot handle the metadata and volumes of sysex and continuous controller events spit out by today's keyboards. But YMMV.

The Motif and Tyros I have, in contrast, run at 480ppqn.
2007/10/10 15:36:35
Jim Wright
A couple of quick points.

96 ppqn gives a time resolution of about 5.2 msec at 120 bpm. That's 5x coarser than the 1 msec nominal resolution of the Windows MM timer. So, even if the time values jitter around a bit on playback, the jitter will be less than the "inherent quantizing" that's imposed by the 96 ppqn sequencer resolution. Net result: audible playback should be much more consistent from one time to the next, because any jitter effects are 'swamped out' by the lower ppqn.
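
A quick back-of-the-envelope sketch of that "swamping" argument (the ~1 ms jitter figure is the nominal one from the post; the other numbers are just examples): compare the tick size at each resolution to the assumed timer jitter.

TIMER_JITTER_MS = 1.0   # assumed nominal MM-timer resolution

def tick_ms(bpm, ppq):
    return 60_000.0 / bpm / ppq

for ppq in (96, 480, 960):
    step = tick_ms(120, ppq)
    print(f"{ppq:>3} ppqn at 120 bpm: tick = {step:.2f} ms, "
          f"jitter is {TIMER_JITTER_MS / step:.0%} of one tick")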

Is 960 ppqn useless? That depends. A DirectMusic (DM) driver can accept high-res timestamps (in microseconds, I think, rather than milliseconds). If the DM driver is for something like a MOTU time-stamping MIDI interface, the 960 ppqn time value will be translated to a MIDI timestamp with a resolution of 1/3 millisecond (per the MOTU timestamp protocol). If the DM driver is for something like a Firewire MIDI interface, you might also get time resolution around 1/3 millisecond (because Firewire MIDI can do that, assuming no MIDI-logjam effects -- a big IF, to be sure).

Of course, if the 960 ppqn time value is being used by a soft synth - it may be translated to a sample-accurate time value. That's one big reason why, say, EZ Drummer drums can sound amazingly tight / in the pocket. The source data is a MIDI pattern from the EZ Drummer library (or GrooveMonkee); the MIDI data is played back directly through the EZD soft synth, and the MIDI data never comes anywhere near the Windows MIDI driver stack on your DAW.

- Jim
2007/10/10 15:47:44
Jim Wright
There's a test someone might be interested in trying (I don't have time right now, unfortunately -- just enough time to kibitz a bit)

Create a new Sonar project. In the PRV, draw in an ascending scale - say, 16th notes. Keep the note durations short -- say, 50% of the duration to the next note.

Now, connect a MIDI cable from a MIDI output to a MIDI input. Route things so that you can re-record the MIDI output onto another track.

Now, play back the sequence you entered manually in the PRV, and record it onto another MIDI track.

Look at the two tracks. How well do events in the two tracks line up? How well-spaced are the events in the 2nd track? How even are their durations?
Does it make a difference if you insert an empty bar before the first note of the ascending scale?

Try it again, at a different sequencer resolution (say, 96 ppqn). You should do this with a brand new project.
Do events in the two tracks line up any differently?

Differences between the two tracks ought to indicate the amount of 'round trip' error in your MIDI subsystem. The re-recorded events have made two trips through the driver stack -- once going out (during playback to the physical MIDI cable) and once coming back in (as they are recorded into the 2nd MIDI track). Of course, when you record from a MIDI keyboard, the round trip occurs in reverse (first going into the DAW, then coming back out, if you're using an external sound module), but the total amount of MIDI "handling" by the MIDI subsystem (MIDI interface driver/hardware and Windows MIDI stack) should be about the same.
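
If it helps to put numbers on the comparison, here's a small helper of mine (not part of Jim's procedure; the event times and example values below are made up) that takes the note-on times of the original and re-recorded tracks and reports the per-note offset and spacing:

def roundtrip_report(original_ms, rerecorded_ms):
    n = min(len(original_ms), len(rerecorded_ms))
    offsets = [rerecorded_ms[i] - original_ms[i] for i in range(n)]
    gaps = [rerecorded_ms[i + 1] - rerecorded_ms[i] for i in range(n - 1)]
    print("per-note offset (ms):", [f"{o:+.2f}" for o in offsets])
    print("re-recorded spacing (ms):", [f"{g:.2f}" for g in gaps])
    print(f"jitter spread: {max(offsets) - min(offsets):.2f} ms")

# example with made-up numbers: 16th notes at 120 bpm are 125 ms apart
original = [i * 125.0 for i in range(8)]
rerecorded = [3.1, 128.7, 254.0, 378.9, 503.2, 629.5, 753.8, 879.1]
roundtrip_report(original, rerecorded)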

If you give this a try, please post your results on this thread. And let us know what MIDI interface(s) you used.

Thanks

- Jim
2007/10/10 15:59:38
Jim Wright
If you like to mess with electronics, you can actually capture the MIDI data (coming out of the MIDI DIN ports on your interface) as audio data. Why would you want to do that? So you can measure it pretty accurately (~22 microsecond accuracy with a 44.1K sample rate).

For details on the basic approach, see this CNMAT paper: http://cnmat.cnmat.berkeley.edu/ICMC97/papers-html/Latency.html

The CNMAT approach is sound (especially for capturing Ethernet events), but you can capture MIDI messages as audio more easily, if you're willing to hack up a MIDI thru box. Every MIDI input circuit uses an opto-isolator to convert the incoming current-loop signal into a voltage signal that is ground-isolated. To use this voltage - locate the pin on the opto-isolator that presents an AC signal when you apply MIDI to the corresponding DIN input jack. (An oscilloscope is useful here). Then, solder on a 2K or 10K resistor to this pin. Connect the other end of the resistor to an audio jack (1/8" jack works for me). Then, plug a cable from that jack to an audio input on your DAW audio interface.

Congratulations! You've just created a MIDI-to-audio transcoder! Since the MIDI signal is a weird-looking pulse train with a fundamental frequency of about 15.6KHz (half the 31.25 kbaud MIDI bit rate), you can record it just like audio.

Now -- if you repeat the ascending-scale experiment in my last post -- you can capture the outgoing MIDI data directly as audio (recording it to another track in Sonar). Then, you can see just how badly the outgoing MIDI data gets time-skewed. The advantage here is that you can see what the jitter/latency is like for one-way MIDI transport (outgoing only). With the approach above (using a MIDI cable to 'loop back' the MIDI out to a MIDI in), you can only look at the total roundtrip jitter & latency (outgoing + incoming).
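
For reading the result back out, here is a rough post-processing sketch (my own, with assumed thresholds, not from the CNMAT paper or Jim's post): find where each burst of MIDI bytes starts in the recorded audio, which gives message onset times at single-sample (~22.7 microsecond at 44.1 kHz) resolution.

import numpy as np

SAMPLE_RATE = 44_100
THRESHOLD = 0.2        # assumed level separating the idle line from MIDI pulses
QUIET_GAP_S = 0.005    # assumed silence that separates one message burst from the next

def midi_burst_onsets(samples, sr=SAMPLE_RATE):
    active = np.abs(samples) > THRESHOLD
    onsets, last_active = [], -np.inf
    for i in np.flatnonzero(active):
        if i - last_active > QUIET_GAP_S * sr:   # a new burst after a quiet gap
            onsets.append(i / sr)                # onset time in seconds
        last_active = i
    return onsets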

- Jim