
There’s been community commentary that many of the GPT models are a tad overfitted WRT benchmarks. Benchmarks are not representative of end-user experience. That’s not to say the benchmarks aren’t useful at all, but they are only useful as a subjective indicator.



