Yeah, without the latency regression, it probably would have gone undetected much longer. Using a secondary thread and spreading the CPU load over a few seconds would have kept it from even registering as a spike in CPU usage.
Or do cheap ECDSA instead of expensive RSA. Even if the backdoor is hidden inside RSA decryption and the rest of the system thinks the thing being decrypted should be encrypted with RSA, the backdoor itself doesn't have to use RSA.