VISOR: Agentic Visual RAG via Iterative Search and Over-horizon Reasoning
https://arxiv.org/abs/2604.09508
Summary
VISOR tackles agentic visual retrieval-augmented generation (RAG) in settings where the evidence is scattered across multiple document pages. It interleaves reasoning with iterative retrieval to address two bottlenecks: visual evidence sparsity (key information spread across pages) and fine-grained cross-page reasoning. It is a practical step toward agents that can reason over entire visual documents rather than single pages.
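To make "interleaving reasoning with iterative retrieval" concrete, here is a minimal toy sketch of such a loop. It is not VISOR's actual method. The page texts, the keyword-overlap `retrieve` function (standing in for a visual retriever), and the rule-based `reason` function (standing in for a model's reasoning step) are all hypothetical, chosen only to show how evidence on one page can steer the next retrieval toward another page.

```python
import re
from typing import List, Optional, Set, Tuple

# Hypothetical toy "document": page id -> text standing in for page content.
PAGES = {
    1: "Total revenue is broken down in the appendix table.",
    2: "The company was founded in 1998.",
    3: "Appendix table: sum of all segments is 12 million.",
}


def tokens(text: str) -> Set[str]:
    """Lowercased word set used by the toy keyword scorer."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, seen: Set[int]) -> Optional[int]:
    """Return the unseen page with the largest word overlap with the query."""
    best, best_score = None, 0
    for pid, text in PAGES.items():
        if pid in seen:
            continue
        score = len(tokens(query) & tokens(text))
        if score > best_score:
            best, best_score = pid, score
    return best


def reason(evidence: List[str]) -> Tuple[Optional[str], Optional[str]]:
    """Toy reasoning step: answer if a figure is present, else refine the query.

    Returns (answer, next_query). The refinement rule (reuse the latest page
    text as the next query) is an assumption made for illustration only.
    """
    for text in evidence:
        match = re.search(r"\d+ million", text)
        if match:
            return match.group(0), None
    return None, evidence[-1]


def answer(question: str, max_steps: int = 3) -> Tuple[Optional[str], List[int]]:
    """Alternate retrieval and reasoning until an answer is found or steps run out."""
    seen: List[int] = []
    query = question
    for _ in range(max_steps):
        pid = retrieve(query, set(seen))
        if pid is None:
            break
        seen.append(pid)
        result, next_query = reason([PAGES[p] for p in seen])
        if result is not None:
            return result, seen
        query = next_query
    return None, seen
```

In this toy setup, `answer("What was the total revenue?")` first retrieves page 1, which only points to the appendix table. The refined query then surfaces page 3, where the figure appears. A single retrieval step (`max_steps=1`) would stop at page 1 without an answer, which is the kind of cross-page gap the summary describes.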
Categories: cs.AI, cs.CV, cs.IR
| Type | Link |
| Added | Apr 13, 2026 |
| Modified | Apr 13, 2026 |