Jun 11, 2026 The Coverage Lock: What Next-Token Prediction Can and Cannot Teach Multimodal LLMs About the Visual World