Summary
Moltis opened a pull request that downscales oversized images before they enter model context, addressing cases where a single full-resolution photo, embedded as base64, could consume hundreds of thousands of tokens. The change targets a practical failure mode for multimodal agents: image inputs that exhaust the context budget before the task starts.
What changed
Moltis PR #1138 downscales oversized images before they are embedded in model context.
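The core of a downscaling step like this is choosing target dimensions that respect a size cap while preserving aspect ratio. The sketch below illustrates that calculation only; it is not taken from the PR. The function name `fit_within` and the default `max_edge` of 1568 pixels are hypothetical, and the actual threshold and resizing method Moltis uses are not stated in the source.

```python
def fit_within(width: int, height: int, max_edge: int = 1568) -> tuple[int, int]:
    """Return (width, height) scaled down so the longer edge is at most max_edge.

    Hypothetical helper for illustration; max_edge is an assumed cap,
    not a value confirmed by the Moltis PR.
    """
    if width <= 0 or height <= 0 or max_edge <= 0:
        raise ValueError("width, height, and max_edge must be positive")

    longest = max(width, height)
    if longest <= max_edge:
        # Already within the cap: leave it alone and never upscale.
        return width, height

    # Integer arithmetic keeps the result deterministic; clamp to 1 so
    # extreme aspect ratios never produce a zero-sized edge.
    new_width = max(1, width * max_edge // longest)
    new_height = max(1, height * max_edge // longest)
    return new_width, new_height
```

The resulting dimensions would then be passed to whatever image library performs the actual resampling and re-encoding.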
Why it matters
Multimodal agents need preprocessing controls for images just as text agents need compaction and retrieval. Without size guards, ordinary phone photos can cause immediate context overflow, failed turns, or large unintended costs.
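A size guard of the kind described above can be as simple as checking the encoded payload length before an image is attached to a request. The following is a minimal sketch under stated assumptions: `base64_length` and `within_budget` are hypothetical names, and the budget is expressed in base64 characters because the mapping from characters to tokens depends on the model's tokenizer and is not given in the source. The one fixed fact used here is that padded base64 emits 4 output characters for every 3 input bytes, rounded up.

```python
import base64


def base64_length(n_bytes: int) -> int:
    """Length of the padded base64 encoding of n_bytes raw bytes."""
    if n_bytes < 0:
        raise ValueError("n_bytes must be non-negative")
    # Every group of up to 3 input bytes becomes 4 output characters.
    return 4 * ((n_bytes + 2) // 3)


def within_budget(image_bytes: bytes, max_base64_chars: int) -> bool:
    """Return True if the image's base64 payload fits the assumed character budget.

    max_base64_chars is a hypothetical limit chosen by the caller.
    """
    return base64_length(len(image_bytes)) <= max_base64_chars


# Sanity check: the formula agrees with the standard library encoder.
sample = b"x" * 1000
assert len(base64.b64encode(sample)) == base64_length(len(sample))
```

An agent could run this check first and only invoke the downscaling path when it fails, so small images pass through untouched.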
Evidence excerpt
Agents Radar reports PR #1138 as a fix that downscales oversized images before they enter model context, avoiding roughly 350K-token base64 payload failures for full-resolution photos.