SimpleM
One of the earliest procedures I was taught (by more than one accomplished mastering engineer) was to keep the unprocessed pre-master handy and keep it simply level matched to what you are doing as you work on it so that you can discern whether the changes you are making are actually moving in the right direction or you are just getting the "betters" because you are making something louder.
I agree with this 1000%, and it was a Great Day when Wavelab introduced the option to balance the levels of the mastered and unmastered versions for instant comparison. But think about it for a second: in this case you're not comparing the master to a master.
You're comparing the master to the mix. They're different animals, and one part of mastering is to find that sweet spot between satisfying dynamics and having the master jump out of the speaker.
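For anyone who wants to try the mix-versus-master comparison outside of Wavelab, here is a rough sketch of level matching in plain Python. It assumes audio as a list of float samples in the range -1.0 to 1.0 and uses RMS as a crude stand-in for perceived loudness; the function names are made up for illustration, and this is not a description of how Wavelab does it.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def match_level(reference, processed):
    """Return `processed` scaled so its RMS equals the RMS of `reference`."""
    proc_rms = rms(processed)
    if proc_rms == 0:
        return list(processed)  # silence: nothing to scale
    gain = rms(reference) / proc_rms
    return [s * gain for s in processed]

# Toy data (hypothetical): the "master" is just the mix turned up 4x.
mix = [0.1, -0.2, 0.15, -0.05]
master = [0.4, -0.8, 0.6, -0.2]

matched = match_level(mix, master)
gain_db = 20 * math.log10(rms(mix) / rms(master))
print(f"Gain applied to the master: {gain_db:.2f} dB")
```

With the level difference removed, whatever you still hear between the two is down to the processing rather than the volume. A loudness-based measure would track perception better than plain RMS, but the idea is the same.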
I'll put it another way. I've had some clients who wanted as loud a master as possible. I believe that a song needs to retain some degree of dynamics to remain emotionally compelling, so I do what they want, and then I do what I want and let them choose. The squashed master will sound louder; mine will sound a bit softer. But almost all clients choose the one with more dynamics because they're comparing master to master and then decide they are willing to trade off less level for more emotional impact. If I balanced the levels, they would not be able to make an informed choice. They need to know they are trading off lower levels for more emotion.
I agree completely that if you're comparing masters to determine how faithfully they translate the mix into a final stereo track, then you'd want the levels matched. But if you're comparing one master to another master, how loud you can get while still preserving dynamics and delivering a satisfying emotional experience is itself one basis for comparison. It's clear from the responses to this "taste test" that hardly anyone prefers the loudest master, which simply means to me that if someone is using SONAR, they probably have educated enough ears to listen past just the impact of the level, and judge the master on its other qualities as well.
The one assurance I would want with this particular kind of "taste test" would be that all the examples were normalized to the same peak level, because that would mean each master took advantage of all available headroom. This would allow for a fairer master-to-master comparison.
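To make the peak-normalization point concrete, here is a minimal sketch in plain Python, again assuming float samples in the range -1.0 to 1.0. The helper names and the toy sample values are hypothetical. It brings two masters to the same peak and then reports their RMS levels, which shows that the squashed one still reads louder even though both use all the headroom.

```python
import math

def peak_dbfs(samples):
    """Peak level in dBFS (0 dBFS corresponds to a sample value of 1.0)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

def rms_dbfs(samples):
    """RMS level in dBFS, used here as a rough loudness indicator."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

def normalize_peak(samples, target_dbfs=0.0):
    """Scale samples so the highest absolute peak sits at target_dbfs."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silence: nothing to scale
    gain = 10 ** (target_dbfs / 20) / peak
    return [s * gain for s in samples]

# Toy data (hypothetical): same peak, very different average level.
dynamic_master = [0.1, -0.5, 0.05, 0.2]
squashed_master = [0.4, -0.5, 0.45, -0.48]

for name, audio in (("dynamic", dynamic_master), ("squashed", squashed_master)):
    normalized = normalize_peak(audio)
    print(f"{name}: peak {peak_dbfs(normalized):.2f} dBFS, "
          f"RMS {rms_dbfs(normalized):.2f} dBFS")
```

Both files end up with identical peaks, so any remaining difference in level is a result of how each master was processed, which is exactly the trade-off the listener is being asked to judge.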