LMArena was always junk. I work in this space, and while the media takes it seriously, most scientists don't.
Random people ask random stuff and then it measures how good they feel. This is only a worthwhile evaluation if you're Google or Meta or OpenAI and you need to make a chatbot that keeps people coming back. It doesn't measure anything else useful.
I hear AI news from time to time from the MSM in the US, and the only places I've ever seen "LMArena" mentioned are HN and the LM Studio Discord. At a ratio of 5:1 at least.
Conversation is a two-way street. A good conversation mechanic could elicit better interaction from the users and result in better answers. Stands to reason, anyway.