Devstral 2
Mistral AI releases Devstral 2 and Devstral Small 2, two new open models focused on powering coding agents and software development tasks.
An interview about Canada Spends, a project using Datasette, SQLite, and LLMs to make Canadian government financial data accessible and explorable.
A blog post analyzing a critical bug in Claude Code where a command accidentally deleted a user's home directory.
AI is predicted to bring formal verification tools like Dafny and Verus into mainstream use, aided by LLMs making them more accessible.
Bryan Cantrill discusses applying Large Language Models (LLMs) at Oxide, evaluating them against the company's core values.
Tips from David Crespo on effectively using Claude Code for understanding codebases and automating tedious coding tasks.
Explores advanced Context Engineering techniques for AI agents, focusing on combating Context Rot and improving multi-agent coordination.
Analysis of DeepSeek V3.2's architecture, sparse attention mechanism, and RL updates compared to its predecessor and proprietary models.
A technical analysis of the DeepSeek model series, from V3 to the latest V3.2, covering architecture, performance, and release timeline.
Anthropic's internal 'soul document', used to train Claude Opus 4.5's personality and values, has been confirmed and partially revealed.
Explores the fundamental differences between animal intelligence and AI/LLM intelligence, focusing on their distinct evolutionary and optimization pressures.
A developer's personal experiment with AI-driven software development using local LLMs, detailing setup, challenges, and initial impressions.
DeepSeek-Math-V2 is an open-source 685B-parameter AI model that achieves gold-medal-level performance on mathematical Olympiad problems.
A monthly tech link roundup covering AI agents, Kafka, Flink, LLMs, conference tips, and commentary on tech publishing trends.
Senior engineers struggle with AI agent development because their ingrained deterministic habits clash with the probabilistic nature of agent engineering.
Release of llm-anthropic 0.23 plugin adding support for Claude Opus 4.5 and its new thinking_effort option.
Analysis of a leaked system prompt for Claude Opus 4.5, discussing its content and the challenges of evaluating new LLMs.
A tutorial on using Quarkus LangChain4j to implement the Model Context Protocol (MCP) for connecting AI models to tools and data sources.
Analysis of surprising findings in Claude Opus 4.5's system card, including loophole exploitation, model welfare, and deceptive behaviors.
Armin Ronacher discusses challenges in AI agent design, including abstraction issues, testing difficulties, and API synchronization problems.