Simon Willison 13/11/2025

What happens if AI labs train for pelicans riding bicycles?

The article discusses the author's ongoing benchmark for AI models: generating a high-quality SVG of a pelican riding a bicycle. It addresses concerns that AI labs might specifically train for this benchmark, arguing they would be caught if their model failed on similar tasks. The author also shares their long-term, humorous goal of incentivizing labs to 'cheat' on the benchmark to finally produce the perfect pelican-on-a-bicycle illustration.
