I can think of two likely explanations for this phenomenon. The first is that you are using some mono-incompatible widening technique on either the vocal or the full mix. If, for example, the left and right channels are out of phase, then when you fold them to mono you'll lose whatever is center-panned, including vocals. To test whether this is your problem, set the interleave on the master bus to mono and listen for extreme differences.
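If you'd rather see the cancellation in numbers than trust your ears, here is a minimal sketch in plain Python. It assumes a 440 Hz sine as a stand-in for the vocal and a simple (L + R) / 2 mono fold; the function name mono_fold_loss_db is made up for illustration and isn't from any DAW or library.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def mono_fold_loss_db(left, right):
    """Level change in dB when L and R are summed to mono as (L + R) / 2,
    relative to the average level of the two stereo channels.
    Hypothetical helper, for illustration only."""
    mono = [(l + r) / 2 for l, r in zip(left, right)]
    stereo_level = math.sqrt((rms(left) ** 2 + rms(right) ** 2) / 2)
    mono_level = rms(mono)
    if stereo_level == 0:
        return 0.0            # silence in, silence out: nothing was lost
    if mono_level == 0:
        return float("-inf")  # complete cancellation
    return 20 * math.log10(mono_level / stereo_level)

# Assumed test signal: one second of a 440 Hz sine standing in for the vocal.
sample_rate = 44100
vocal = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

# Same signal in both channels: an ordinary center-panned vocal.
in_phase = mono_fold_loss_db(vocal, vocal)

# One channel polarity-flipped, as a careless widening trick might do.
inverted = mono_fold_loss_db(vocal, [-s for s in vocal])

print("in phase:", in_phase, "dB")
print("inverted:", inverted, "dB")
```

The in-phase case comes out at essentially 0 dB, while the flipped case cancels completely. Real wideners are rarely this extreme, so expect a partial drop rather than total silence, but a large negative number on the vocal is a strong hint that the widening is the culprit.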
The other possibility is that it's just a translation issue. The real trick of mixing is figuring out how to make your song sound acceptable on a wide variety of playback systems, including those at the bottom of the fidelity range.
The iPhone is about as lo-fi as it gets. On the one hand, it's incapable of reproducing some frequencies, so any mix elements that depend on those frequencies disappear. On the other, it overemphasizes other frequencies, exaggerating the mix elements that live there and exacerbating whatever masking effects they're causing. Masking makes things disappear.