The difference is that there are two types of latency in play here: first, system latency, i.e. the processing time between when the MIDI event is received and when the synth produces the corresponding audio; and second, a perceptual latency based on the 'attack' of the sample, i.e. the delay between when a sample starts to play and when it is effectively heard.
The first type of latency can be (and usually is) automatically compensated during playback. The DAW and plugin can work out between them how long it takes for a MIDI note to trigger audio, and the DAW can read the MIDI early to compensate. (Or it can push the audio back, but the end result is the same.) The idea is that sample 1 from the soft synth ends up playing back on exactly the frame where the MIDI On event falls. This doesn't help during tracking, however, because that latency is part of the system and (as we've noted) the DAW is incapable of predicting when you are going to press a key on your keyboard. So you will probably hear more latency while tracking than on playback.
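The playback-side arithmetic is simple enough to sketch. Here's a minimal illustration (the function name, sample rate, and latency figure are my own assumptions for the example, not anything Sonar exposes): the DAW shifts its read of the MIDI track earlier by the latency the plugin reports, so the plugin's first output sample lands exactly on the event's timeline position.

```python
SAMPLE_RATE = 48_000  # frames per second; assumed for the example


def compensated_read_time(midi_event_s: float, reported_latency_frames: int) -> float:
    """Return when the DAW should feed the MIDI event to the plugin so that
    the plugin's first output sample lands exactly on the event's timeline
    position. This is ordinary plugin delay compensation, and it only works
    on playback, when the DAW can see the event coming."""
    return midi_event_s - reported_latency_frames / SAMPLE_RATE


# A plugin reporting 480 frames (10 ms) of processing latency: the DAW
# reads a note written at 1.000 s at 0.990 s instead, so the audio
# emerges at exactly 1.000 s.
t = compensated_read_time(1.000, 480)
```

During tracking none of this applies: your key press is the event, and the DAW can't read it 10 ms early.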
The second type of latency is inherent to the sample. If the sample has an attack that takes 50 ms before the main transient or peak is heard, it will feel like the sound has 50 ms of latency relative to the MIDI event that triggers it. This will be there during tracking, and it also won't go away during playback. Even with the usual latency compensation, a MIDI event at 1.000 s will result in that transient/peak being heard at 1.050 s, because that's how long the sound takes to 'get going'. If you know your instrument has that kind of slow attack, you can adjust for it with the Time+ option so that the MIDI is triggered slightly earlier to counterbalance it. But it can't be addressed by plugin delay compensation in the normal sense, because the plugin's delay is already compensated for: sample 1 is already playing exactly when the MIDI On event happens; it's just that you don't hear much until sample 2000 or so.
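To make the two effects concrete, here's a small sketch of the attack-offset idea (the function names and the 50 ms figure are illustrative assumptions; Time+ in Sonar is just a per-track timing offset like this):

```python
def track_offset_for_attack(attack_s: float) -> float:
    """A Time+-style per-track offset: trigger the MIDI earlier by the
    sample's attack time so the perceived transient lands on the grid."""
    return -attack_s


def perceived_peak_time(midi_event_s: float, attack_s: float,
                        offset_s: float = 0.0) -> float:
    """With delay compensation, sample 1 plays at the (offset) event time;
    the audible peak arrives one attack-length later."""
    return midi_event_s + offset_s + attack_s


# A note at 1.000 s on a patch with a 50 ms attack is *heard* at 1.050 s:
late = perceived_peak_time(1.000, 0.050)

# Shifting the MIDI 50 ms early puts the peak back on the beat:
on_beat = perceived_peak_time(1.000, 0.050, track_offset_for_attack(0.050))
```

Note this only helps for material the DAW can read ahead of time, i.e. playback of recorded MIDI, not live tracking.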
Now, the plugin could 'lie' to Sonar and report more inherent latency than it really has, so that playback would be more likely to sound like it coincided with the MIDI. But that wouldn't work well if the patch contains a variety of attack durations (quite likely if some sounds are plucks and some are strums), and either way it would make no difference during tracking, because that compensation can't occur in real time.
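The problem with a single reported-latency 'lie' is easy to see with numbers (the articulation names and attack times below are made up for illustration): one fixed value can only match one attack length, so everything else lands early or late by the difference.

```python
# Hypothetical attack times for different articulations in one patch (ms):
attacks_ms = {"pluck": 5, "strum": 80, "pad": 50}

# The plugin can only report ONE extra-latency figure for the whole patch:
reported_extra_ms = 50

# Residual timing error per articulation after the DAW compensates:
# negative = heard early, positive = heard late.
errors_ms = {name: a - reported_extra_ms for name, a in attacks_ms.items()}

# The pluck now lands 45 ms early, the strum is still 30 ms late, and
# only the pad (whose attack happens to match) sits on the beat.
```

So a blanket over-report just trades one set of timing errors for another, which is why a per-note or per-track adjustment like Time+ is the more workable fix for slow attacks.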