You only mention using a sample library, which suggests you will be using the audio interface mainly for output monitoring. In that case you only need as many output channels as you intend to monitor (two for stereo), and you do not need to spend a lot of money chasing the highest-quality A/D converters, microphone preamps, etc. The sound of softsynths, samplers and the like is rendered in the box, so its recorded quality is unrelated to the audio interface you use. If that is your situation, the Focusrite Scarlett series would be fine.
Most MIDI controllers now connect by USB, so unless you have a controller that still uses the old round (5-pin DIN) plug, you do not need MIDI on your audio interface, or a separate MIDI device, at all. For stereo output monitoring of audio created in the box, the 2i2 is all you would need.
If you plan to record from a dozen mics at once, though, you need a dozen audio inputs, and that can take you out of your price range pretty fast.