Simon Willison 2025-11-13

What happens if AI labs train for pelicans riding bicycles?

The article discusses the author's ongoing benchmark for AI models: generating a high-quality SVG of a pelican riding a bicycle. It addresses concerns that AI labs might specifically train for this benchmark, arguing they would be caught if their model failed on similar tasks. The author also shares their long-term, humorous goal of incentivizing labs to 'cheat' on the benchmark to finally produce the perfect pelican-on-a-bicycle illustration.
