burnttoys,
"That code can not be vectorised as it is written as 'a' and 'b' may point to overlapping areas of memory.
The 'restrict' keyword is necessary."
Wow, thanks burnttoys, you are correct. Some constructive criticism!
It should be using restrict just like 'memcpy' does.
Unfortunately, I tried it and it didn't change the output at all. Additionally, I see a lot of devs claiming that the restrict keyword doesn't have any effect under GCC. Can you confirm that?
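For anyone following along, here is a minimal sketch of what adding restrict looks like. The code under discussion isn't quoted in this comment, so the function name, signature and loop body below are hypothetical stand-ins, not the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the loop under discussion.
 * 'restrict' promises the compiler that, inside this function, the
 * memory reached through 'a' is not also reached through 'b', so
 * loads from b[] need not be re-done after each store to a[]. */
void add_arrays(float *restrict a, const float *restrict b, size_t n)
{
    /* Simple precondition check: non-empty input needs valid pointers. */
    assert(n == 0 || (a != NULL && b != NULL));

    for (size_t i = 0; i < n; ++i) {
        a[i] += b[i];
    }
}
```

To check whether it makes a difference on a given build, something like `gcc -std=c99 -O3 -fopt-info-vec` on a reasonably recent GCC should report which loops were vectorised. Note that passing overlapping buffers to a function declared this way is undefined behaviour, the same as with memcpy.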
I couldn't tell you about GCC vectorisation in any great detail.
I have seen restrict et al work (that is, generate vector code) on ARM for the NEON instruction set, but right now I can't remember if that was RVCT or GCC. If ARM is your thing, it might be worth checking the linaro.org builds of GCC. For x86 - I'm afraid I've no idea. I just hit "build" in Qt Creator, as none of my x86 code is really that performance critical anymore!




