Look at that jump in grade school math. From 55% with GPT-3.5 to 95% for both ...

causal · on March 4, 2024

Yeah, I've been throwing arithmetic at Claude 3 Opus and so far its responses have been solid.

dwaltrip · on March 4, 2024

Claude has a specialized calculation feature that doesn't use model inference. Just FYI.

causal · on March 4, 2024

I don't believe that it was in this case; it worked through the calculations with language and I didn't detect any hint of an API call.

sebzim4500 · on March 4, 2024

It definitely sometimes claims to have used a calculator, but often it gets the answer wrong. I think there are a few options:

i) There is no calculator and it's hallucinating the whole thing

ii) There is a calculator but it's terrible. This seems hard to believe

iii) It does a bad job of copying the numbers into and out of the calculator
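
One rough way to narrow these options down is to check each claimed answer against exact arithmetic outside the model. Below is a minimal sketch of that check. The function name `check_claim` and the sample inputs are hypothetical and not taken from any real model output. It uses Python's `fractions.Fraction`, so decimal inputs are compared exactly rather than through floating point.

```python
from fractions import Fraction

def check_claim(a: str, op: str, b: str, claimed: str) -> bool:
    """Return True if `claimed` equals the exact result of `a op b`.

    All values are passed as strings (e.g. "0.1") so they can be parsed
    exactly, with no binary floating-point rounding.
    """
    x, y, c = Fraction(a), Fraction(b), Fraction(claimed)
    ops = {
        "+": lambda: x + y,
        "-": lambda: x - y,
        "*": lambda: x * y,
        "/": lambda: x / y,
    }
    return ops[op]() == c

# Hypothetical examples: paste in the operands and the model's stated answer.
print(check_claim("0.1", "+", "0.2", "0.3"))   # exact decimal comparison
print(check_claim("12", "*", "12", "145"))     # a deliberately wrong claim
```

If the wrong answers tend to differ from the exact result by a swapped or dropped digit, that would point toward (iii). Errors with no such pattern would fit (i) better, though this is only a heuristic.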

noman-land · on March 4, 2024

Does it still work with decimals?