* Recording - whatever the audio interface produces, normally 24-bit.
- no interface has meaningful 24-bit resolution, but most of them are better than 16-bit. The hint is in the specification: SNR or a similar figure, which is ~96 dB for 16-bit (you can easily find that most interfaces have over 100 dB, but none of them has the ~144 dB required for 24-bit)
- so recording into a 32-bit format means saving at least 12 useless bits (in a 64-bit format, ~44 bits per sample will be useless)
* Processing - 64-bit. Even simply mixing 2 channels "damages" the least significant bit, and complex algorithms use thousands of mathematical operations. That does not mean 10 operations always destroy the last 10 significant bits; in practice the loss is much smaller, and the result depends on the algorithm (usually moderate to good quality) and on the programmer (whose experience is unpredictable in the music world). So the difference between 32-bit and 64-bit processing can shift into the audible range (~16 bits). Note that up-sampling (e.g. to 96 kHz) sounds "better" in some situations for the same reason (you can find some examples from Craig).
* Intermediate saving (rendering, mix export, etc.) - 32-bit. This is a floating-point format with 24 bits of precision for any number, regardless of its level. Even in the worst-case scenario, it takes ~8 intermediate saves before the difference from 64-bit becomes audible (process -> export -> import -> process -> export -> import ... and so on). I have not seen examples on the Internet of such degradation ever happening in practice. Note that the 24-bit format is fixed point, which means low-amplitude samples have low precision (down to 1 bit!) and only full-amplitude samples have the full 24 bits of precision. The degradation is a loss of 1 bit per ~6 dB. E.g. if the peak level of the samples in your rendered file is -18 dB, you use at most 21 bits when exporting to 24-bit format.
* Master output - 24-bit or 32-bit. After mastering, the level should be normalized. Further processing is not going to increase the level of low-amplitude samples; it will only reduce the precision when moving toward the 16-bit CD format. So 24-bit fixed point is more than sufficient.
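
The bit counts in the list above all follow from the "~6 dB per bit" rule of thumb. A minimal Python sketch of that arithmetic is below. The function names and the 110 dB example figure are made up for illustration, and `worst_case_saves` simply encodes the pessimistic assumption that each export/import round trip costs one bit.

```python
import math

# One bit doubles the number of amplitude steps: 20*log10(2), about 6.02 dB.
DB_PER_BIT = 20 * math.log10(2)


def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of a fixed-point format with `bits` bits."""
    return bits * DB_PER_BIT


def effective_bits(snr_db: float) -> float:
    """Rough number of meaningful bits implied by a quoted SNR figure."""
    return snr_db / DB_PER_BIT


def bits_used_fixed_point(peak_dbfs: float, container_bits: int = 24) -> float:
    """Approximate bits exercised by a signal peaking at `peak_dbfs` (<= 0)."""
    return container_bits - (-peak_dbfs) / DB_PER_BIT


def worst_case_saves(precision_bits: int = 24, audible_bits: int = 16) -> int:
    """Saves until the error reaches the ~16-bit level, assuming 1 bit lost per save."""
    return precision_bits - audible_bits


if __name__ == "__main__":
    print(f"16-bit range: {dynamic_range_db(16):.1f} dB")
    print(f"24-bit range: {dynamic_range_db(24):.1f} dB")
    # Hypothetical interface with a 110 dB SNR specification.
    print(f"110 dB SNR  : {effective_bits(110.0):.1f} meaningful bits")
    print(f"-18 dB peak : {bits_used_fixed_point(-18.0):.1f} bits of a 24-bit file")
    print(f"worst case  : {worst_case_saves()} saves in 32-bit float")
```

Real converters and real processing chains do not lose bits this neatly, so treat the numbers as order-of-magnitude guidance only.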
Finally: dithering should be applied only once, especially with the "good" algorithms (noise shaping). By definition, dithering adds noise to the signal (it is a trade, not magic: you get more noise in exchange for less quantization distortion, which is a good trade for listening but questionable for further processing).
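
To make that trade concrete, here is a small Python sketch, not taken from any particular DAW. It quantizes a very quiet 1 kHz sine (0.4 of a 16-bit step, an arbitrary choice) to 16-bit, with and without plain TPDF dither. Noise shaping is deliberately left out, and all names are illustrative. Without dither the tone rounds to silence; with dither it survives, but the total error is larger.

```python
import math
import random

STEP = 1.0 / 32768          # one 16-bit LSB for samples in [-1.0, 1.0)
RATE, FREQ, N = 48000.0, 1000.0, 48000


def quantize(x: float, step: float = STEP) -> float:
    """Round to the nearest multiple of `step`."""
    return step * math.floor(x / step + 0.5)


def tpdf_dither(rng: random.Random, step: float = STEP) -> float:
    """Triangular-PDF dither spanning +/- 1 LSB (difference of two uniforms)."""
    return (rng.random() - rng.random()) * step


def recovered_amplitude(y: list) -> float:
    """Amplitude of the FREQ component in y, by projection onto the sine."""
    return 2.0 / len(y) * sum(
        v * math.sin(2 * math.pi * FREQ * i / RATE) for i, v in enumerate(y)
    )


def rms(v: list) -> float:
    return math.sqrt(sum(a * a for a in v) / len(v))


rng = random.Random(0)      # fixed seed so the run is repeatable
signal = [0.4 * STEP * math.sin(2 * math.pi * FREQ * i / RATE) for i in range(N)]

plain = [quantize(s) for s in signal]
dithered = [quantize(s + tpdf_dither(rng)) for s in signal]

# Amplitude of the tone that survives quantization, in LSB units.
plain_amp = recovered_amplitude(plain) / STEP
dithered_amp = recovered_amplitude(dithered) / STEP

# Total error (noise plus distortion) relative to the original, in LSB units.
plain_err = rms([q - s for q, s in zip(plain, signal)]) / STEP
dithered_err = rms([q - s for q, s in zip(dithered, signal)]) / STEP

if __name__ == "__main__":
    print(f"tone amplitude, no dither: {plain_amp:.3f} LSB")
    print(f"tone amplitude, dithered : {dithered_amp:.3f} LSB")
    print(f"error RMS, no dither     : {plain_err:.3f} LSB")
    print(f"error RMS, dithered      : {dithered_err:.3f} LSB")
```

The larger error RMS in the dithered case is the added noise that every later processing stage inherits, which is why it makes sense to dither only at the very last step.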