BullshitBench tests whether AI models can detect nonsensical questions—or if they’ll confidently answer them anyway. The results are dire.
March 11, 2026

BullshitBench tests whether AI models can detect nonsensical questions—or if they’ll confidently answer them anyway. The results are dire.




