Have you compared your latency/buffer settings? Is there a big difference? When you compare the waveforms, in which direction is the bass off, early or late? If you zoom in a long way and compare the transients of your recording with those of the bass recording, after nudging the bass so the first transients line up exactly, does it stay in time or drift apart?
Basically, AFAIK, if you both have a latency low enough for accurate recording, then the resulting audio tracks should be accurate enough to be combined.
Anyway, no matter what the latency is, if the bass was played to the same drum track or click that you used and is a little off, it should be off by the same amount all the way through, and nudging the bass clip a few milliseconds one way or the other should do the job. If it's half a second off, as you say, that's a huge amount.
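To put numbers on the nudging idea, here is a rough Python sketch, not tied to any particular DAW. The function names, the 44100 Hz default sample rate and the list-of-samples clip are all my own assumptions for illustration. It converts a millisecond offset into samples and shifts a clip by that amount.

```python
def ms_to_samples(offset_ms, sample_rate=44100):
    """Convert a time offset in milliseconds to a whole number of samples.

    sample_rate is an assumed project setting; use your own.
    """
    return round(offset_ms * sample_rate / 1000)


def nudge(clip, offset_ms, sample_rate=44100):
    """Shift a clip (a plain list of samples) while its start stays at time 0.

    Positive offset: the audio plays later, so silence is padded in front.
    Negative offset: the audio plays earlier, so samples are dropped from
    the beginning of the clip.
    """
    n = ms_to_samples(offset_ms, sample_rate)
    if n >= 0:
        return [0.0] * n + list(clip)
    return list(clip[-n:])


if __name__ == "__main__":
    print(ms_to_samples(10))    # a 10 ms nudge at 44.1 kHz
    print(ms_to_samples(500))   # the half second mentioned above
```

At 44.1 kHz a 10 ms nudge is 441 samples, while half a second is 22050 samples, which shows how far outside normal latency correction that is.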
If the offset drifts along the way, then I don't know what to say.
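If you want to check for drift with numbers rather than by eye, one way is to note the time of the same few transients in both tracks (near the start, middle and end) and compare the offsets. The sketch below assumes you read those times off the waveform by hand, in seconds. The function names and the 1 ms tolerance are made up for illustration, and it only measures the drift; it does not explain it.

```python
def offsets_ms(ref_times, bass_times):
    """Offset of each bass transient from its matching reference transient, in ms.

    Both lists hold times in seconds, in the same order.
    A positive value means the bass is late.
    """
    return [(b - r) * 1000.0 for r, b in zip(ref_times, bass_times)]


def classify(ref_times, bass_times, tolerance_ms=1.0):
    """Return ("constant", average offset in ms) or ("drift", change in ms).

    tolerance_ms is an assumed threshold for how much the offset may vary
    before it counts as drift.
    """
    offs = offsets_ms(ref_times, bass_times)
    spread = max(offs) - min(offs)
    if spread <= tolerance_ms:
        return "constant", sum(offs) / len(offs)
    # Change in offset between the first and the last measured transient.
    return "drift", offs[-1] - offs[0]


if __name__ == "__main__":
    # Hypothetical readings: the bass is about 12 ms late throughout.
    print(classify([1.0, 60.0, 120.0], [1.012, 60.012, 120.012]))
    # Hypothetical readings: the bass gets later and later.
    print(classify([1.0, 60.0, 120.0], [1.0, 60.05, 120.1]))
```

A "constant" result means a single nudge by the reported amount should fix it. A "drift" result means no single nudge will line up both the start and the end.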
Note that if the bass clip starts at time 0 on the track, you cannot move it left without cutting a slice off the beginning.