Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I frequently see recommendations that intrinsics be used instead of assembly. The theory is that they are more portable and just as efficient, which they certainly are on an instruction by instruction level.

But I'm not having any luck at writing entire loops with them. All the compilers I've tried (gcc, icc, clang) feel compelled to "optimize" the intrinsics for me, turning my well-tuned and port-conscious loop into a hash.

For a tight loop of all intrinsics, I can often get a 20% improvement if I use raw assembly in the same written order. Is there any good way to convince these compilers to "do what I said" without turning off all optimizations everywhere?



Put it in a separate translation unit and compile that with different flags.

But seriously, if you're really at the spot where you have real, debugged code that is better than compiler output, the compiler is providing no value and you might as well use the assembly you've written. The point to intrinsics is to expose hardware features that the C language doesn't, not to promise to do it better than the compiler.


Put it in a separate translation unit and compile that with different flags.

I've tried it in separate units, and haven't been able to get things out in the order I have them in the source. ICC does the most rearranging (which in most cases other than this leads to faster code), and GCC generally does what you say. But both do things like turning my reloads from memory (to reduce port pressure) into register copies. I was wondering if there were flags I'm not finding to prevent this.

If you're really at the spot where you have real, debugged code that is better than compiler output, the compiler is providing no value and you might as well use the assembly you've written.

Likely the best plan, but this is being used in an open source library where the rest of the work is done with intrinsics. It's a "when in Rome" thing rather than a technical concern.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: