Simon Willison • 11/27/2025

Quoting Qwen3-VL Technical Report

The Qwen3-VL technical report describes how Qwen3-VL-235B-A22B-Instruct performs on long-context video. It evaluates the model on a "Needle-in-a-Haystack" task, in which the model must locate a single critical frame within a long video. The model achieved 100% accuracy on 30-minute videos and retained 99.5% accuracy when extrapolated to sequences equivalent to 2 hours of video.
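To make the metric concrete, here is a minimal sketch of how a needle-in-a-haystack accuracy score could be computed. It is not the report's evaluation harness: the string "frames", the `locate` callback, and `perfect_locator` are hypothetical stand-ins for real video frames and a real model call, and the haystack length of 1800 is an arbitrary illustrative value.

```python
import random
from typing import Callable, Sequence


def needle_accuracy(
    locate: Callable[[Sequence[str], str], int],
    haystack_len: int,
    trials: int,
    seed: int = 0,
) -> float:
    """Return the fraction of trials in which `locate` finds the needle's index."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Hide one needle frame at a random position among filler frames.
        position = rng.randrange(haystack_len)
        frames = [f"filler-{i}" for i in range(haystack_len)]
        frames[position] = "NEEDLE"
        if locate(frames, "NEEDLE") == position:
            hits += 1
    return hits / trials


def perfect_locator(frames: Sequence[str], needle: str) -> int:
    # Hypothetical stand-in for a model: a plain linear search.
    return list(frames).index(needle)


print(needle_accuracy(perfect_locator, haystack_len=1800, trials=200))
```

A reported figure such as 99.5% corresponds to this ratio of correct retrievals to total trials, measured with a real model in place of the stand-in locator.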

#Computer Vision #Multimodal AI #Long Context