
Look at that jump in grade school math: from 55% with GPT-3.5 to 95% for both Claude 3 and GPT-4.


Yeah, I've been throwing arithmetic at Claude 3 Opus, and so far its responses have been solid.


Claude has a specialized calculation feature that doesn't use model inference. Just FYI.


I don't believe it was used in this case; it worked through the calculations in language, and I didn't detect any hint of an API call.


It definitely sometimes claims to have used a calculator, but it often gets the answer wrong. I think there are a few options:

i) There is no calculator and it's hallucinating the whole thing.

ii) There is a calculator but it's terrible. This seems hard to believe.

iii) It does a bad job of copying the numbers into and out of the calculator.


Does it still work with decimals?



