AI Inference articles

4/24/2026 • EN

DeepSeek V4 - almost on the frontier, a fraction of the price

DeepSeek V4 preview models offer near-frontier performance at a fraction of the cost, with a context window of up to 1M tokens and open weights.

AI Inference DeepSeek V4 LLM Pricing Mixture Of Experts Open Weights

Simon Willison

12/23/2025 • EN

Using Streamlit Chatbot UI with AKS KAITO Language Model Inferences

A tutorial on building a chatbot UI with Streamlit to interact with a language model inference service deployed on Azure Kubernetes Service (AKS) using KAITO.

AI Inference Azure AKS Chatbot Kubernetes Streamlit

Roy Kim

12/22/2025 • EN

Install KAITO v0.8.x on Azure Kubernetes Service With Phi-4 Language Model

A technical guide to installing KAITO v0.8.x on Azure Kubernetes Service to run the Phi-4 language model for AI inference.

AI Inference Azure Kubernetes Service Kubernetes Large Language Model Phi 4

Roy Kim

12/20/2025 • EN

Big GPUs don't need big PCs

Testing GPU performance on a Raspberry Pi 5 versus a desktop PC for transcoding, AI, and multi-GPU tasks, showing surprising efficiency.

AI Inference GPU LLM PCIe Raspberry Pi

Jeff Geerling

8/23/2025 • EN

The environmental impact of Google Gemini AI text prompts

Google's report details the measured energy, emissions, and water consumption of a single Gemini AI text prompt in production.

AI Environmental Impact AI Inference Google Gemini Green Computing Sustainable Computing

David Mytton

7/14/2025 • EN

Running Open-Weight LLMs on AKS with KAITO: A Summary of Model Families

A guide to deploying and comparing open-weight LLM families (DeepSeek, Falcon, Llama, etc.) using the KAITO operator on Azure Kubernetes Service (AKS).

AI Inference AKS Kubernetes LLM Deployment Model Families

Roy Kim