It looks to me like this is a case of series vs. parallel. I always wondered how these processes could run faster than real time, and this sheds new light on it. I know the idea is that using the full power of the CPU should make these bounces go faster, but there must be a limitation in the plugin's code that won't allow the processor to run at its peak. Imagine drinking through a straw: the width of the straw is the limitation. Now imagine drinking through 5 straws at once (ignore the air gaps between the straws). You can now use much more of your drinking power, but each straw still only flows at the speed of one straw. Now put the five straws in series: same original limitation, and 5 times as long (plus the extra stress). I think the straw would represent the code of a plugin, but that is way above my pay grade, so...
No matter how fast the CPU is, it must complete the processing on each plugin within a track one at a time. Of course it does; how else would it get the track right? I don't know if it's the code that's creating this limitation for the CPU, but whatever it is, it is inefficient. Nice video.
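
To put rough numbers on the straw picture, here is a toy sketch, not how any real DAW schedules its work. It assumes separate tracks can be spread across CPU cores while the plugins on a single track have to run one after another. The function name `bounce_time` and the per-plugin costs are made up for illustration.

```python
def bounce_time(tracks, cores):
    """Estimate wall-clock bounce time under a toy model (an assumption).

    tracks: list of plugin chains; each chain is a list of per-plugin
            processing costs in seconds (hypothetical numbers).
    cores:  number of CPU cores available (must be at least 1).
    """
    # Plugins on one track run in series, so a track costs the sum of its chain.
    track_costs = sorted((sum(chain) for chain in tracks), reverse=True)

    # Tracks are independent, so hand each one to the least-loaded core,
    # longest track first (a simple greedy assignment).
    loads = [0.0] * cores
    for cost in track_costs:
        least_loaded = loads.index(min(loads))
        loads[least_loaded] += cost

    # The bounce is finished when the busiest core is finished.
    return max(loads)


# Five straws side by side: five tracks with one plugin each, five cores.
parallel = bounce_time([[1.0]] * 5, cores=5)

# Five straws end to end: one track with five plugins, five cores.
series = bounce_time([[1.0] * 5], cores=5)
```

In this model the five separate tracks finish in the time of one plugin, while the single five-plugin track takes five times as long and the extra cores sit idle. That would be the series limitation: adding cores only helps when there is independent work to hand out.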