Ok, Kevin; I have a lot of sympathy for your situation because I am currently in the throes of attempting to learn DaVinci Resolve and the whole video editing shtick. I've found myself in a new world where all the lingo is new, and every little thing has a gigantic back story with additional complexity attached to it.
It seems you haven't had the 'training wheels' of a simple 4-channel or 8-channel interface, and now you've jumped into a world-class peloton by getting an X32 as your first interface. Although we might wind up doing a TeamViewer-with-Skype session to get this all sorted, here's some more info:
ASIO inputs show up in Sonar as stereo pairs, and each pair is numbered by its odd (left) channel, following the tradition that the odd channel is left. Therefore, it's perfectly correct to see 32 channels represented as 16 pairs with odd numbers. Sonar (like other DAWs) allows you to arm (enable) just one half of a pair for recording if you wish.
There is a 48V button on the X32 console. For the channel you select on the console, it'll be lit orange if the phantom power is on. Press it to toggle the phantom power on or off.
Connect your active Rokit speakers to the main-L/R out ports of the X32. On the full-size X32, the default routing for main-L/R is via XLR-out 15/16.
By default, the monitor output of the X32 is 'strapped' across the main-L/R output in a pre-fader (not affected by the fader/slider position) mode, and the headphones go along with it. Headphone mixes are a marvelous world of their own.
When singing into the mic, keep your gain setting so that the first two orange lights for that channel are blinking. That'll put you in the -18 dBFS to -12 dBFS sweet spot in Sonar.
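If it helps to see what those dBFS numbers mean, here's a rough Python sketch of the standard decibel math, where full scale is treated as 1.0. The function names are mine, purely for illustration, and it doesn't try to model which X32 meter light corresponds to which level.

```python
import math

def dbfs_to_linear(dbfs):
    """Convert a dBFS level to a linear peak amplitude (1.0 = full scale)."""
    return 10 ** (dbfs / 20)

def linear_to_dbfs(amplitude):
    """Convert a linear peak amplitude (1.0 = full scale) to dBFS."""
    return 20 * math.log10(amplitude)

def in_sweet_spot(amplitude, low=-18.0, high=-12.0):
    """True if the peak sits inside the -18 to -12 dBFS window from the post."""
    return low <= linear_to_dbfs(amplitude) <= high

if __name__ == "__main__":
    # Print the linear amplitude at each edge of the window.
    for level in (-18.0, -12.0):
        print(level, "dBFS ->", round(dbfs_to_linear(level), 3))
```

So the sweet spot works out to peaks at roughly an eighth to a quarter of full scale, which leaves you plenty of headroom before clipping.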
On that Routing>>Home tab of the X32, you'll see the label 'Local 1-8'. My earlier append was intended to ensure that the first 8 XLR inputs on the back of the X32 are going to be the first 8 inputs for Sonar. When you arm a track for recording and then use the little pull-down menu to choose an input, you'll see 1-left/right/stereo, 3-left/right/stereo, etc. As you can see, input-2 is implicitly 1-right, and input-4 is implicitly 3-right.
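To make the numbering concrete, here's a small Python sketch of the odd/even rule. The function names and the exact label text are made up for illustration; Sonar's menu wording may differ a bit, but the arithmetic is the same.

```python
def pair_label(input_number):
    """Map a physical input number to its pair-style label.

    Odd inputs are the left side of a pair; even inputs are the
    right side of the pair named after the preceding odd number.
    """
    if input_number < 1:
        raise ValueError("input numbers start at 1")
    if input_number % 2 == 1:
        return f"{input_number}-left"
    return f"{input_number - 1}-right"

def pair_starts(channel_count):
    """List the odd numbers that name each stereo pair."""
    return list(range(1, channel_count + 1, 2))

if __name__ == "__main__":
    # Show how the first 8 inputs map onto pair labels.
    for n in range(1, 9):
        print("input", n, "->", pair_label(n))
    print(len(pair_starts(32)), "pairs for 32 channels")
```

So if your mic is plugged into XLR input 2, you'd pick the right half of pair 1 in the pull-down.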
Take a shot with this info and see how you do.