"So AMD runs at 2/3 the IPC of an old Intel processor. That is quite poor!" That...

BeeOnRope · on Dec 6, 2019

I'm not sure if you were implying it or just using it as example of another type of unhelpful claim, but this test does not involve AVX-512.

I agree using Westmere isn't necessarily the best approach, but there is no difference in this case with either -march=native or -march=znver1.

The loop is small and simple, with only 9 instructions and compiles more or less the same regardless of march setting (I observed some basically no-op changes such as a mov and blsr swapping places). Here's the assembly (for the second test, with the bigger IPC gap):

    top:
    tzcnt  r8,rcx
    add    r8d,edx
    mov    DWORD PTR [rdi+rax*4],r8d
    mov    eax,DWORD PTR [rsi]
    inc    eax
    blsr   rcx,rcx
    mov    DWORD PTR [rsi],eax
    jne    .top

endorphone · on Dec 7, 2019

"I'm not sure if you were implying it or just using it as example of another type of unhelpful claim, but this test does not involve AVX-512."

Even worse! Is this a defense, because it's remarkably unhelpful as one.

The blog post was clearly a cry for attention for some project -- let's just use some clickbait IPC claims to gain it -- and continually alluded to a whole project -- an extreme niche project that still wouldn't have any relevance. But instead it's a meaningless, completely misrepresentative micro-loop.

BeeOnRope · on Dec 7, 2019

My read is different than yours.

I think Daniel uses those examples because they are actual examples from projects that he is or has been working on, and he's familiar with them and actually cares about them, and because it's at least a notch more realistic than something totally synthetic.

It seems like a very roundabout thing to use as a cry for attention for SIMDjson (the project I assume you are talking about), and I don't believe that's the purpose. I see no problem in linking the project.

Picking two random benchmarks and trying to extract any kind of more general IPC claim is not on solid ground, but I'm pretty sure Daniel will say he's not doing that: he's only sharing these two specific results. That's a style that reoccurs across several entries in that blog, however, so if it triggers (as it has me on occasion) you might want to look elsewhere.

Tempest1981 · on Dec 6, 2019

Doesn't that sentence refer only to the table above, measuring "bitset decoding" with a basic decoder, comparing 1.4 to 2.1 IPC?

It would help if the blog post had some headings to separate the benchmarks and summary.

BeeOnRope · on Dec 6, 2019

A plain reading indicates that yes, he's only referring to the last benchmark, which showed the 2/3 disparity.

endorphone · on Dec 7, 2019

A plain reading indicates that such is irrelevant, because these are the two tiny cases that he selectively chose to demonstrate the "IPC gap" of AMD. If some AMD booster posted hand-selected micro-benchmarks that gave AMD a lead, and boasted with exclamations and pejoratives how terrible the alternative is, we would rightly question it. This deserves no more.

And to the other defense of "Well there are AMD people claiming the same in reverse, so that legitimizes this", I've seen exactly zero of those posts on here. None. They would be laughed off the site.

What we do have is that traditionally at a given frequency, per core AMD has long trailed on major benchmarks of significant, user-realistic loads. This is the the first generation in a long time where it actually doesn't, and where you don't need additional cores to make up the gap.

BeeOnRope · on Dec 7, 2019

I feel like you are intentionally being thick in order to get mad at me.

I am only talking specifically about the 2/3 claim at the bottom of the article, which for the avoidance of doubt, is simply a summary of the final measurement made in the article, i.e., the result of dividing 1.4 by 2.1. I know this because of its positioning in the article, because the numbers line up, because a different % IPC is given for the earlier measurement, and because an earlier version of post, with different results for the last experiment (with IPC of 2.8 and 1.4), showed a different ratio (50%).

How you are somehow interpreting the small clarification of the one line which was being discussion as wide-ranging defense of the article, I'm not sure. My broader thoughts are available here [1] and the comments on the article.

---

[1] https://news.ycombinator.com/item?id=21724780