
High. The only benchmark I look at is LMSys Chatbot Arena. Let's see how it performs on that.

https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboar...



We are tracking LMSys, too. There are strange safety incentives on this benchmark: you can "win" points by never blocking adult content, for example.


Seems perfectly valid to deduct points from a model that isn't as useful to the user.

"Safety" is something asserted by the model creator, not something asked for by users.


It's valid, but it makes the benchmark kind of useless unless your plan is to ask the model how to make meth.

More power to you if that is your plan, but most of us want to use the models for things that are less contentious than the prompts people put into Chatbot Arena in order to get commercial models to reveal themselves.


Honestly, I'd rather we just list out all the NSFW prompts people want to try, formalize that as a "censorship" benchmark, then pre-filter Chatbot Arena to disallow NSFW and have it actually be a normal human-driven benchmark.


People like us are not the real users.

Corporate users of AI (and this is where the money is) do want safe models with heavy guardrails.

No corporate AI initiative is going to use an LLM that will say anything if prompted.


And the end users of those models will be (mostly) frustrated by safety guardrails, and will thus perceive the model as worse and rank it lower.


Yep. And in addition, lobotomized models will perform worse even on tasks where they are intended to perform well.


Opus and Sonnet seem to be already available for direct chat on the arena interface.



