Sam Rose explains how LLMs work with a visual essay
A visual essay explaining LLM internals like tokenization, embeddings, and transformer architecture in an accessible way.
Google releases Gemini 3 Flash, an AI model that is faster and cheaper than previous versions, with strong coding and multimodal capabilities.
Explores the 'Normalization of Deviance' concept in AI safety, warning against complacency with LLM vulnerabilities like prompt injection.
Mistral AI releases Devstral 2 and Devstral Small 2, two new open models focused on powering coding agents and software development tasks.
An interview about Canada Spends, a project using Datasette, SQLite, and LLMs to make Canadian government financial data accessible and explorable.
A blog post analyzing a critical bug in Claude Code where a command accidentally deleted a user's home directory.
AI is predicted to bring formal verification tools like Dafny and Verus into mainstream use, aided by LLMs making them more accessible.
Bryan Cantrill discusses applying Large Language Models (LLMs) at Oxide, evaluating them against the company's core values.
Tips from David Crespo on effectively using Claude Code for understanding codebases and automating tedious coding tasks.
A technical analysis of the DeepSeek model series, from V3 to the latest V3.2, covering architecture, performance, and release timeline.
Anthropic's internal 'soul document', used to train the personality and values of Claude Opus 4.5, has been confirmed and partially revealed.
DeepSeek-Math-V2 is an open-source 685B parameter AI model that achieves gold medal performance on mathematical Olympiad problems.
Release of llm-anthropic 0.23 plugin adding support for Claude Opus 4.5 and its new thinking_effort option.
Analysis of a leaked system prompt for Claude Opus 4.5, discussing its content and the challenges of evaluating new LLMs.
Armin Ronacher discusses challenges in AI agent design, including abstraction issues, testing difficulties, and API synchronization problems.
A developer discusses the non-deterministic nature of LLM-based tools like GitHub Copilot, arguing that while they are useful, they cannot take ownership of errors the way a human teammate can.
Martin Fowler discusses the latest Thoughtworks Technology Radar, AI's impact on programming, and his recent tech talks in Europe.
New release of the llm-gemini plugin adds support for nested Pydantic schemas, YouTube URL attachments, and the latest Gemini 3 Pro model.
A guide to building a connector-based RAG system that fetches live data from Confluence using its REST API and Java, avoiding stale embeddings.