Categories of Inference-Time Scaling for Improved LLM Reasoning
An overview of inference-time scaling methods for improving LLM reasoning, categorizing techniques such as chain-of-thought and self-consistency and highlighting recent research.
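As a rough illustration of the self-consistency idea, the sketch below samples several chain-of-thought completions and keeps the majority final answer. The `generate` helper is a placeholder standing in for any LLM call, and the `Answer:` extraction format is an assumption, not a fixed convention.

```python
import re
from collections import Counter

def generate(prompt: str, temperature: float = 0.8) -> str:
    """Placeholder for an LLM call; assumed to return a chain-of-thought
    ending in a line like 'Answer: 42'."""
    raise NotImplementedError

def self_consistency(question: str, n_samples: int = 8) -> str | None:
    """Sample several reasoning chains and majority-vote on the final answer."""
    prompt = f"{question}\nLet's think step by step."
    answers = []
    for _ in range(n_samples):
        completion = generate(prompt, temperature=0.8)
        match = re.search(r"Answer:\s*(.+)", completion)
        if match:
            answers.append(match.group(1).strip())
    if not answers:
        return None
    # The most frequent final answer across sampled chains wins.
    return Counter(answers).most_common(1)[0][0]
```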
Explores Abstraction of Thought (AoT), a structured reasoning method that uses multiple abstraction levels to improve AI reasoning beyond linear Chain-of-Thought approaches.
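A minimal sketch of the two-level prompting pattern that AoT describes: first elicit an abstract plan, then ground it into concrete steps. The `generate` placeholder and the prompt wording are assumptions for illustration, not the method's exact procedure.

```python
def generate(prompt: str) -> str:
    """Placeholder for an LLM call."""
    raise NotImplementedError

def abstraction_of_thought(question: str) -> str:
    """Reason at an abstract level first, then condition concrete steps on that plan."""
    # Level 1: abstract reasoning, ignoring concrete numbers and details.
    plan = generate(
        f"{question}\n\nOutline, in general terms, the principle or strategy "
        "needed to solve this problem. Do not compute anything yet."
    )
    # Level 2: concrete reasoning conditioned on the abstract plan.
    return generate(
        f"{question}\n\nAbstract plan:\n{plan}\n\n"
        "Now follow the plan step by step and give the final answer."
    )
```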
OpenAI releases GPT-5.1 API with new reasoning modes, adaptive reasoning, extended prompt caching, and new built-in tools for developers.
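For orientation, a minimal call sketch using the OpenAI Python SDK's Responses API. Whether GPT-5.1 accepts exactly these parameters (for example, the `reasoning` effort levels) is an assumption here and should be checked against the official API reference.

```python
from openai import OpenAI

client = OpenAI()

# Assumed: GPT-5.1 is exposed through the Responses API and accepts a
# reasoning-effort setting like earlier reasoning models; verify against
# the release notes before relying on this.
response = client.responses.create(
    model="gpt-5.1",
    reasoning={"effort": "low"},
    input="How many prime numbers are there between 10 and 30?",
)
print(response.output_text)
```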
Explores how increasing 'thinking time' and Chain-of-Thought reasoning improves AI model performance, drawing parallels to human psychology.
An introduction to reasoning in Large Language Models, covering key concepts like chain-of-thought and methods to improve LLM reasoning abilities.
A technical guide on fine-tuning IBM's Granite 3.1 model using Group Relative Policy Optimization (GRPO) to enhance its reasoning capabilities.
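At the core of GRPO is a group-relative advantage: several completions are sampled per prompt, and each completion's reward is normalized against its own sampling group instead of a learned value function. A minimal sketch of that normalization (not the guide's actual training code):

```python
import numpy as np

def group_relative_advantages(group_rewards: list[float], eps: float = 1e-6) -> np.ndarray:
    """Normalize each completion's reward against its sampling group.

    GRPO replaces a learned critic with this per-group baseline:
    A_i = (r_i - mean(r)) / (std(r) + eps).
    """
    rewards = np.asarray(group_rewards, dtype=np.float64)
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Example: 4 completions sampled for one prompt, scored by a reward function.
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```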
A tutorial on reproducing DeepSeek R1's RL 'aha moment' using Group Relative Policy Optimization (GRPO) to train a model on the Countdown numbers game.
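To make the Countdown setup concrete, here is a hypothetical rule-based reward of the kind such GRPO tutorials rely on: score 1.0 if the completion's final expression uses only the given numbers and evaluates to the target, else 0.0. The `<answer>` extraction format is an assumption, not necessarily the tutorial's.

```python
import re

def countdown_reward(completion: str, numbers: list[int], target: int) -> float:
    """Reward 1.0 if the proposed arithmetic expression hits the target
    using only the provided numbers (each at most once), else 0.0."""
    match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if not match:
        return 0.0
    expr = match.group(1).strip()
    # Reject anything other than digits, whitespace, and + - * / ( ).
    if not re.fullmatch(r"[\d\s+\-*/()]+", expr):
        return 0.0
    # Each given number may be used at most once.
    pool = list(numbers)
    for n in (int(tok) for tok in re.findall(r"\d+", expr)):
        if n in pool:
            pool.remove(n)
        else:
            return 0.0
    try:
        value = eval(expr, {"__builtins__": {}}, {})
    except Exception:
        return 0.0
    return 1.0 if abs(value - target) < 1e-9 else 0.0

# Example: numbers 3, 7, 25, 50 with target 82 -> "25 * 3 + 7" scores 1.0.
print(countdown_reward("<answer>25 * 3 + 7</answer>", [3, 7, 25, 50], 82))
```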