
High. The only benchmark I look at is LMSys Chatbot Arena. Let's see how it performs on that.

https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboar...



We are tracking LMSys, too. There are strange safety incentives on this benchmark: you can "win" points by never blocking adult content, for example.


Seems perfectly valid to deduct points from a model that isn't as useful to the user.

"Safety" is something asserted by the model creator, not something asked for by users.


It's valid, but it makes the benchmark kind of useless unless your plan is to ask the model how to make meth.

More power to you if that is your plan, but most of us want to use the models for things that are less contentious than the prompts people put into Chatbot Arena in order to get commercial models to reveal themselves.


Honestly, I'd rather we just list out all the NSFW prompts people want to try, formalize that as a "censorship" benchmark, then pre-filter Chatbot Arena to disallow NSFW and have it actually be a normal human-driven benchmark.


People like us are not the real users.

Corporate users of AI (and this is where the money is) do want safe models with heavy guardrails.

No corporate AI initiative is going to use an LLM that will say anything if prompted.


And the end users of those models will be (mostly) frustrated by safety guardrails, and will thus perceive the model as worse and rank it lower.


Yep. And in addition, lobotomized models will perform worse even on tasks where they are intended to perform well.


Opus and Sonnet seem to be already available for direct chat on the arena interface.



