Hacker News

I've been dreaming on pcpartpicker.

I think the Radeon RX 7900 XT (20 GB) has been the best bang for your buck. Does it enable running a 32B model fully on GPU?
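As a back-of-envelope check (the numbers and the helper below are my own illustration, not benchmarks): quantized weights need roughly params × bits / 8 bytes, plus some headroom for KV cache and runtime buffers.

```python
# Rough VRAM fit check for quantized LLM weights.
# Rule of thumb: 1B params at 8-bit ~ 1 GB of weights.
# overhead_gb is a guessed allowance for KV cache / runtime buffers.

def fits_in_vram(params_b: float, bits: int, vram_gb: float,
                 overhead_gb: float = 2.0) -> bool:
    """Estimate whether quantized weights plus overhead fit in VRAM."""
    weights_gb = params_b * bits / 8
    return weights_gb + overhead_gb <= vram_gb

print(fits_in_vram(32, 4, 20))  # 32B @ 4-bit: ~16 GB weights -> True
print(fits_in_vram(32, 8, 20))  # 32B @ 8-bit: ~32 GB weights -> False
```

So a 4-bit quant of a 32B model should just about fit in 20 GB, while anything at 8-bit or with a long context will not.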

Looking at what other people have been doing lately, they aren't doing this.

They are getting 64+ core CPUs and 512 GB of RAM, keeping everything on the CPU and enabling massive models. This setup lets you run DeepSeek 671B.

It makes me wonder, how much better is 671B vs 32B?



> It makes me wonder, how much better is 671B vs 32B?

32B models have improved leaps and bounds in the past year. But DeepSeek 671B is still a night-and-day difference. 671B just knows so much more stuff.

The main issue with RAM-only builds is that prompt ingestion is incredibly slow. If you're going to be feeding in any context at all, it's horrendous. Most people quote their tokens/s with basically non-existent context (a few hundred tokens). Figure out if you're going to be using context, and how much patience you have. Research the speed you'll be getting for prompt processing / token generation at your desired context length in each instance, and make your decision based on that.
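To make that concrete, here is a sketch of the arithmetic (the throughput figures are hypothetical placeholders, not measured numbers): total response time is prompt tokens divided by prefill speed, plus output tokens divided by generation speed.

```python
# End-to-end latency estimate for one request.
# prefill_tps / gen_tps are assumed throughputs, not benchmarks.

def response_time_s(prompt_tokens: int, output_tokens: int,
                    prefill_tps: float, gen_tps: float) -> float:
    """Seconds to ingest a prompt and generate a reply at given speeds."""
    return prompt_tokens / prefill_tps + output_tokens / gen_tps

# Hypothetical CPU-only build: slow prefill dominates once context grows.
print(response_time_s(32_000, 500, prefill_tps=50, gen_tps=8))     # 702.5
# Hypothetical GPU build, same prompt: prefill is no longer the bottleneck.
print(response_time_s(32_000, 500, prefill_tps=2000, gen_tps=40))  # 28.5
```

The point is that the headline tokens/s figure only describes the second term; with tens of thousands of context tokens, the prefill term is what you actually wait on.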


I bought an RX 7900 XTX with 24GB, and it’s everything I expected of it. It’s absolutely massive though. I thought I could add one extra for more memory, but that’s a pipe dream in my little desktop box.

Cheap too, compared to a lot of what I’m seeing.



