Bruno Capuano • 5/6/2026

Running GitHub Copilot CLI Offline with Local Models: GPU Edition

This article details a second experiment running GitHub Copilot CLI offline using local AI models, this time with GPU acceleration via NVIDIA Tesla T4. The author uses LM Studio with Qwen3.6 35B A3B model on a .NET project called ElBruno.NetAgent. It compares the CPU-only experience from a previous post, highlighting that while cloud models are faster, GPU acceleration makes local agentic coding practical for real projects with bounded phases and human checkpoints. The setup includes context window of 262,144 tokens and GPU offload. The article focuses on technical implementation, performance insights, and lessons learned for offline AI-assisted development.

0 comments

#.net #Agentic Coding #Local Models