LLM Training articles

1/31/2026 • EN

Quoting Andrej Karpathy

Andrej Karpathy notes a 600x reduction in the cost of training a GPT-2-level model over seven years, highlighting rapid efficiency gains in AI.

Computational Cost, GPT-2, Hardware Scaling, LLM Training, Machine Learning

Simon Willison

12/19/2025 • EN

2025 LLM Year in Review

A review of key paradigm shifts in Large Language Models (LLMs) in 2025, focusing on RLVR training and new conceptual models of AI intelligence.

AI Research, DeepSeek R1, LLM Training, Reinforcement Learning, RLVR

Andrej Karpathy

3/22/2025 • EN

The day piracy changed [blog]

The author argues that traditional piracy is dead, redefined by corporations like Meta using scraped, pirated content to train AI models without consequence.

AI Ethics, Content Piracy, Data Scraping, Intellectual Property, LLM Training

Remy Sharp

1/17/2025 • EN

Bite: How Deepseek R1 was trained

Explains the training of DeepSeek-R1, focusing on the Group Relative Policy Optimization (GRPO) reinforcement learning method.

DeepSeek, GRPO, LLM Training, Proximal Policy Optimization, Reinforcement Learning

Philipp Schmid

1/16/2023 • EN

Curated Resources and Trustworthy Experts: The Key Ingredients for Finding Accurate Answers to Technical Questions in the Future

Analyzes the limitations of AI chatbots like ChatGPT in providing accurate technical answers and discusses the need for curated data and human experts.

ChatGPT, Large Language Models, LLM Training, Perplexity AI, Technical Misinformation

Sebastian Raschka
