I was never happy using the interface's monitor system. Latency was never the problem, since you are hearing your voice or guitar directly while it's still analog. If DSP effects are involved then yes, the signal has passed into the digital side, so A/D and D/A conversion latency is added.
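For anyone who wants rough numbers on that, here is a minimal sketch of the arithmetic for a digital monitoring path. The function names and the 1 ms converter figure are my own assumptions for illustration, not specs for the 6i6 or any other interface. On-board DSP may use little or no buffering, while monitoring through a DAW adds one buffer in and one buffer out.

```python
def buffer_latency_ms(buffer_samples: int, sample_rate_hz: int) -> float:
    """Time one buffer of audio represents, in milliseconds."""
    return buffer_samples / sample_rate_hz * 1000.0


def round_trip_ms(buffer_samples: int, sample_rate_hz: int,
                  converter_ms: float = 1.0) -> float:
    """Rough round trip: input buffer + output buffer + A/D and D/A time.

    converter_ms is an assumed placeholder for the combined converter
    delay; check the interface's own spec sheet for a real figure.
    """
    one_way = buffer_latency_ms(buffer_samples, sample_rate_hz)
    return 2 * one_way + converter_ms


if __name__ == "__main__":
    for size in (64, 128, 256, 512):
        print(f"{size} samples @ 48 kHz: {round_trip_ms(size, 48000):.2f} ms")
```

The analog path avoids all of this, which is why a purely analog feed to the headphones has no delay to speak of.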
My reasons for using a system very similar to the OP's are headphone and control room level issues, plus the option to add reverb to the headphones.
My mic, or a direct input for bass or acoustic guitar, goes first to a Joe Meek 3Q.
It has a toggle for mic/line, so that solves that issue. You can leave them both plugged in.
The Joe Meek has parallel outputs, so one goes to a Yamaha MG 82 CX mixer and the other to my Focusrite 6i6 interface.
The 6i6 1/2 outs go to the studio monitors and 3/4 go to the little mixer.
The mixer gives you a solid clean mix that's easy to blend for either me or clients.
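If it helps to see the routing laid out, here is a rough sketch of the signal chain as a small graph. The node names are just labels I made up, and the mixer-to-headphones link is assumed from the headphone mix described above. It lists every path from the source to a given destination.

```python
# Signal routing as described above; node names are illustrative labels.
ROUTING = {
    "mic_or_di": ["joe_meek_3q"],
    "joe_meek_3q": ["yamaha_mixer", "focusrite_6i6"],       # parallel outputs
    "focusrite_6i6": ["studio_monitors", "yamaha_mixer"],   # outs 1/2 and 3/4
    "yamaha_mixer": ["headphones"],                         # assumed headphone feed
}


def find_paths(graph, source, destination, path=None):
    """Return every route from source to destination (depth-first search)."""
    path = (path or []) + [source]
    if source == destination:
        return [path]
    found = []
    for nxt in graph.get(source, []):
        if nxt not in path:  # guard against loops
            found.extend(find_paths(graph, nxt, destination, path))
    return found


if __name__ == "__main__":
    for route in find_paths(ROUTING, "mic_or_di", "headphones"):
        print(" -> ".join(route))
```

The headphones get two feeds: the all-analog one straight from the preamp through the mixer, and the return from the interface on outs 3/4.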
If it's live tracking of vox and acoustic guitar, I can add the guitar to the mixer using its line out or a second mic. Lots of quick, easy solutions.
I also have a Mackie Mix 8 that cost $70, which I can add in if more headphone mixes are needed.
It's funny: the 6i6 has 2 headphone channels and powerful mixing software, but the hardware is much easier to use and always works. I have no love for Mix Control and its illogical setup.