With any voice work, you are in one of four circumstances:
1) The sound accompanies a picture, and the extraneous sounds have a context that keeps the noise from seeming disconcerting. For example, a story about nature with birds singing in the background.
2) The sound accompanies a picture, but the extraneous sounds do not fit the context of the scene, so the noise seems disconcerting.
3) The sound exists only as sound, and the extraneous sounds have a context that keeps the noise from seeming disconcerting. For example, a story about jackhammers with a jackhammer sound in the background.
4) The sound exists only as sound, and the extraneous sounds have no context, so they seem disconcerting.
Most professional speakers will speak loudly enough that their voice helps mask disconcerting sounds.
If that is the case, then it comes down to effective mic placement.
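To put a rough number on mic placement, here is a small sketch based on the inverse-square law. It assumes the voice behaves like a point source in a free field and that the background noise arrives at about the same level wherever the mic sits, which is only an approximation in a real room. The function name and the distances are made up for illustration.

```python
import math

def level_change_db(old_distance, new_distance):
    """Approximate change in direct voice level (dB) when the mic moves
    from old_distance to new_distance (same units for both).

    Assumes inverse-square falloff from a point source in a free field.
    """
    if old_distance <= 0 or new_distance <= 0:
        raise ValueError("distances must be positive")
    return 20 * math.log10(old_distance / new_distance)

# Hypothetical example: moving the mic from 60 cm to 15 cm from the speaker.
gain = level_change_db(60, 15)
print(f"Voice gains roughly {gain:.1f} dB relative to the room noise")
```

Under those assumptions, each halving of the distance is worth about 6 dB of voice over the noise, so getting closer helps much more than it might seem.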
You can try to edit the noise out after the fact, but it will be very tedious and the results will likely fall short of what you hope for or expect.
best regards,
mike