Yes, that does make sense.
The big question, I guess, is: does it matter? We've established that ultrasonic frequencies do combine to form valid audible components, and therefore contribute to sonic fidelity. But that combining happens in acoustic space, during recording. Once the signal has been converted to digital, the ultrasonic components have already done their work, and from that point on we should only care about audible frequencies.
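To put a rough number on the "combining" idea, here's a minimal sketch. It assumes the combining comes from a weak second-order nonlinearity somewhere along the way; that model and the `K2` coefficient are assumptions for illustration only, not something measured or established in this thread. Two tones at 30 kHz and 31 kHz are both inaudible, but once a squared term is present their sum contains a difference tone at 1 kHz.

```python
import math

SAMPLE_RATE = 192_000   # Hz, high enough to represent every product below
NUM_SAMPLES = 19_200    # 0.1 s, a whole number of cycles for all tones involved
F1, F2 = 30_000, 31_000 # two ultrasonic tones, in Hz
K2 = 0.1                # hypothetical second-order nonlinearity coefficient

def tone_amplitude(signal, freq_hz, sample_rate_hz):
    """Estimate the amplitude of one frequency by correlating
    the signal with a cosine and a sine at that frequency."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / sample_rate_hz)
             for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
             for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n

# Plain (linear) sum of the two ultrasonic tones.
linear = [
    math.sin(2 * math.pi * F1 * i / SAMPLE_RATE)
    + math.sin(2 * math.pi * F2 * i / SAMPLE_RATE)
    for i in range(NUM_SAMPLES)
]

# Same signal after an assumed weak nonlinearity: y = x + K2 * x^2.
nonlinear = [x + K2 * x * x for x in linear]

diff_linear = tone_amplitude(linear, F2 - F1, SAMPLE_RATE)
diff_nonlinear = tone_amplitude(nonlinear, F2 - F1, SAMPLE_RATE)

print(f"1 kHz component, linear sum:    {diff_linear:.6f}")
print(f"1 kHz component, with x^2 term: {diff_nonlinear:.6f}")
```

With these numbers the 1 kHz component is essentially zero for the plain sum and comes out at about `K2` in amplitude once the squared term is added. Whatever produces the audible component, it sits in the audible band, so it can be captured without keeping the ultrasonic tones themselves.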
Preserving them through the processing phase might give them further opportunities to subtly contribute to the sound, but ultimately they're going to be filtered out anyway when the material is converted to 16/44.1 for distribution, since nothing above 22.05 kHz survives at that rate. In the meantime, they are as likely to degrade the sound as to enhance it. Seems to me that preserving ultrasonics is a concern in the initial analog realm only, through the use of high-quality microphones and preamps.
And here's another consideration that hasn't been touched on: jitter becomes a much bigger problem at higher sample rates. At 44.1, jitter isn't really a concern with the fairly high-quality prosumer converters most of us use - the built-in clocks are more than adequate. The people I talk to who swear by high-end wordclocks and claim a noticeable improvement are all recording at high sample rates - maybe not a coincidence.
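One way to see why jitter might bite harder at high sample rates is the standard rule of thumb for the jitter-limited signal-to-noise ratio of a full-scale sine wave: SNR = -20 * log10(2 * pi * f * t_j), where f is the signal frequency and t_j is the RMS clock jitter. The sketch below just evaluates that formula. The 1 ns jitter figure is a hypothetical value picked for illustration, not a spec for any particular converter. The point is that the error grows with signal frequency, and a higher sample rate lets higher frequencies into the converter in the first place.

```python
import math

def jitter_limited_snr_db(signal_freq_hz, rms_jitter_s):
    """Best-case SNR in dB for a full-scale sine at signal_freq_hz
    sampled by a clock with the given RMS jitter (in seconds)."""
    return -20 * math.log10(2 * math.pi * signal_freq_hz * rms_jitter_s)

RMS_JITTER_S = 1e-9  # hypothetical 1 ns RMS clock jitter, for illustration only

# Frequencies up to roughly the Nyquist limit of 44.1, 96 and 192 kHz recording.
for freq_hz in (1_000, 20_000, 40_000, 80_000):
    snr_db = jitter_limited_snr_db(freq_hz, RMS_JITTER_S)
    print(f"{freq_hz:>6} Hz: jitter-limited SNR is about {snr_db:.1f} dB")
```

Every doubling of signal frequency costs about 6 dB for the same clock, so ultrasonic content is where a given amount of jitter does the most damage.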