AI & Alignment
Explores how AI boosts coding speed but highlights alignment as the true bottleneck in software development.
Explores how AI boosts coding speed but highlights alignment as the true bottleneck in software development.
OpenAI researchers propose 'confessions' as a method to improve AI honesty by training models to self-report misbehavior in reinforcement learning.
A guide to building product evaluations for LLMs using three steps: labeling data, aligning evaluators, and running experiments.
Explores the shift from RLHF to RLVR for training LLMs, focusing on using objective, verifiable rewards to improve reasoning and accuracy.