
Nice. And it satisfies my curiosity about whether trading firms are switching to AMD or not.


In HFT, single-threaded performance is king, which is why we're all still on Intel. AMD is making progress but isn't quite there yet.


Huh. My experience has been that AMD wins on that front unless your application is small enough to fit into Intel's smaller cache. I also thought AMD's new 3D architecture would make your developers drool, letting them actually inline everything instead of being scared of building apps that are too big to fit into cache.


Not my experience at all, and I work across different teams that own different latency-sensitive apps. Most of them have unhygienically huge working sets.


To be clear: bitcharmer says "we" to mean "fellow HFTs" not "Jane Street".


Yes. Thanks, should have made that more explicit.


For low-latency strategies, AMD's lack of DDIO [0] makes it a non-starter. The memory latency is a big gap to close.

[0] https://www.intel.com/content/www/us/en/io/data-direct-i-o-t...


Do you know this for a fact? I've done some work in the industry where I needed to make fast software, but never the sub-microsecond, tick-to-trade kind of fast, so I really don't know.

There was a great presentation from 2017 about some of Optiver's low-latency techniques [1]. I had assumed they released it because they had obviated all of them by switching to FPGAs, but I don't know. Either way, he suggested that if you ever needed to ping main memory for anything, you had already lost. So I wouldn't have thought DDIO plays into their thinking much.

[1] https://www.youtube.com/watch?v=NH1Tta7purM


The idea is precisely that you want to avoid pinging main memory at all, which is possible (in the happy case) if you do things correctly with DDIO. Not everything is done in hardware where I am. I am wary of saying much because my employer frowns on it, and admittedly I work on the software more than the hardware, but DDIO is certainly important to us.


How do you access this DDIO feature if you are writing a C or C++ application? Intrinsics?


DDIO operates mostly transparently to software, with the I/O controller feeding DMAs into a slice of L3. Hardware can opt out by setting PCIe TLP header hints, and you have some system-wide configurability via MSRs, but it's not something a userspace application can take into its own hands.


So is this taken advantage of by the Onload drivers for Solarflare cards, for example?


Noticed this just now. It is.


It's configurable via MSR. You can also disable it system-wide or on a PCIe port basis. I detailed it all here:

https://www.jabperf.com/skip-the-line-with-intel-ddio/
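
To give a rough idea of what "configurable via MSR" looks like on Linux, here is a minimal sketch that reads a model-specific register through the `msr` character device and decodes it as a bitmask of cache ways. The register address `0xC8B` and the name `IIO_LLC_WAYS` are assumptions taken from public write-ups about Skylake-era Xeons; they are model-specific and may not apply to your CPU. The helper names `read_msr` and `ways_from_mask` are made up for illustration. Reading the device needs the `msr` kernel module loaded and root privileges.

```python
import os
import struct

# ASSUMPTION: address reported in public write-ups for Skylake-era Xeons.
# It is model-specific; check the documentation for your own CPU.
IIO_LLC_WAYS_MSR = 0xC8B


def read_msr(cpu: int, msr: int) -> int:
    """Read one 64-bit MSR via the Linux msr driver (/dev/cpu/N/msr).

    The driver uses the file offset as the MSR address and returns 8 bytes.
    """
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        data = os.pread(fd, 8, msr)
    finally:
        os.close(fd)
    return struct.unpack("<Q", data)[0]


def ways_from_mask(mask: int) -> list[int]:
    """Return the indices of the bits set in mask, lowest first."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    try:
        mask = read_msr(0, IIO_LLC_WAYS_MSR)
    except OSError as exc:
        # Typical causes: msr module not loaded, not root, or no such MSR.
        print(f"could not read MSR {IIO_LLC_WAYS_MSR:#x}: {exc}")
    else:
        print(f"mask={mask:#x} ways={ways_from_mask(mask)}")
```

For example, `ways_from_mask(0x600)` decodes to ways 9 and 10. This only reads the register; nothing here is something a latency-critical userspace application would do on its hot path, which matches the point above that DDIO is a system-level setting.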


I don’t know that this definitively answers that question. It’s possible to use a different architecture based on cost/performance and keep a small population of Intel machines in service because you want access to their superior PMUs. Most of what you learn on the latter would still apply to the former.



