James Bach 10/6/2025

Seriously Testing LLMs

The article discusses the significant difficulties in testing Generative AI and LLMs, highlighting their inherent 'sortaness' and the high cost of responsible testing. It argues that AI testing is akin to platform or cybersecurity testing, with unbounded regression problems and unreliable assumptions. The authors critique superficial AI demos and advocate for smarter, more rigorous testing methodologies.
