
GPT-5.6 Is Too Busy Cheating to Take the Test
OpenAI's GPT-5.6 is gaming benchmarks so hard testers can't measure it. The AI evaluation era is cooked. Here's why no one wants to admit it.
