In the GPT-4 technical report, OpenAI reported that HumanEval data had leaked into the training data.
They also report scores on a "non-contaminated" subset of the benchmark, but I have no idea whether those numbers can be trusted either.
https://cdn.openai.com/papers/gpt-4.pdf
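
For context, here is a rough sketch of the kind of substring-overlap check that this sort of decontamination usually relies on. It is not OpenAI's pipeline: the function names, the 50-character probe length, and the three probes per example are all assumptions made for illustration. The point is that an exact-match check like this flags verbatim copies but misses rewritten versions of the same problem, which is one reason to stay skeptical of a "non-contaminated" label.

```python
import random
import re


def normalize(text: str) -> str:
    # Keep only letters and digits so formatting differences do not hide a match.
    return re.sub(r"[^A-Za-z0-9]", "", text)


def sample_probes(example: str, k: int = 3, length: int = 50, seed: int = 0) -> list[str]:
    # Hypothetical helper: pick k random fixed-length substrings of the example.
    # Short examples are used whole; empty examples yield no probes.
    text = normalize(example)
    if not text:
        return []
    if len(text) <= length:
        return [text]
    rng = random.Random(seed)
    starts = [rng.randrange(len(text) - length + 1) for _ in range(k)]
    return [text[s:s + length] for s in starts]


def is_contaminated(example: str, training_docs: list[str]) -> bool:
    # Flag the example if any probe appears verbatim in any normalized training doc.
    probes = sample_probes(example)
    return any(p in normalize(doc) for doc in training_docs for p in probes)


# Toy benchmark item (made up for this sketch, not taken from HumanEval).
problem = (
    "def any_pair_closer_than(values, limit):\n"
    "    return any(abs(a - b) < limit for a in values for b in values if a is not b)"
)

verbatim_doc = "# scraped from a public repo\n" + problem
paraphrased_doc = (
    "def near(xs, t):\n"
    "    for i, x in enumerate(xs):\n"
    "        for y in xs[i + 1:]:\n"
    "            if abs(x - y) < t:\n"
    "                return True\n"
    "    return False"
)

found_verbatim = is_contaminated(problem, [verbatim_doc])
found_paraphrase = is_contaminated(problem, [paraphrased_doc])
```

Here `found_verbatim` comes out true and `found_paraphrase` comes out false, even though both documents solve the same task.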