A couple of technical points:
1. The noise which is used to dither (which is not necessarily white noise) does not mask the distortion - it randomizes the quantization error so that the error essentially becomes noise instead of distortion.
2. You should never choose which noise shaping/dither algorithm to use while listening to quiet parts at unusually high volume. This is because the specific frequencies that noise shaping moves the noise to are based in large part on the sensitivity of our hearing at a particular listening level. IOW, noise shaping is carefully optimized for our hearing at a reasonable listening level where the noise itself is very quiet, and if we listen at a louder level then our hearing changes and that careful optimization is lost.
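
To make point 1 concrete, here is a rough sketch in Python. Everything in it is an assumption chosen for illustration, not something from a particular dither product: a 0.8 LSB sine wave, plain rounding as the quantizer, triangular (TPDF) dither made from two uniform random values, and the function names themselves. It measures how much third-harmonic distortion ends up in the quantization error with and without dither.

```python
import cmath
import math
import random


def quantize(x):
    """Round to the nearest integer step (1 LSB = 1.0)."""
    return math.floor(x + 0.5)


def tpdf_dither(rng):
    """Triangular-PDF noise spanning +/-1 LSB (sum of two uniform values)."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)


def harmonic_amplitude(samples, bin_index):
    """Amplitude of one DFT bin, scaled so a sine of amplitude A reads A."""
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * bin_index * i / n)
              for i, s in enumerate(samples))
    return 2.0 * abs(acc) / n


def third_harmonic_of_error(use_dither, n=4800, cycles=50,
                            amplitude=0.8, seed=0):
    """Quantize a low-level sine and measure the error at 3x its frequency."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    errors = []
    for i in range(n):
        x = amplitude * math.sin(2 * math.pi * cycles * i / n)
        d = tpdf_dither(rng) if use_dither else 0.0
        errors.append(quantize(x + d) - x)  # total error added to the signal
    return harmonic_amplitude(errors, 3 * cycles)


plain = third_harmonic_of_error(use_dither=False)
dithered = third_harmonic_of_error(use_dither=True)
print(f"3rd harmonic in error, no dither:   {plain:.3f} LSB")
print(f"3rd harmonic in error, TPDF dither: {dithered:.3f} LSB")
```

Without dither the error is locked to the signal, so it shows up as a clear harmonic. With dither the same measurement should fall to roughly the level of random noise in that one frequency bin: the total error is actually larger, but it is no longer correlated with the signal, which is what "noise instead of distortion" means here.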