Papers — arxiv AI Papers

Temperature-Dependent Performance of Prompting Strategies in Extended Reasoning LLMs

arxiv.org

From Reasoning to Agentic: Credit Assignment in RL for LLMs

arxiv.org

CrashSight: Vision-Language Benchmark for Traffic Crash Scene Understanding

arxiv.org

Robust Reasoning Benchmark: How Formatting Changes Break LLM Math Reasoning

arxiv.org

Medical Reasoning with Large Language Models: A Survey and MR-Bench

arxiv.org

VISOR: Agentic Visual RAG via Iterative Search and Over-horizon Reasoning

arxiv.org

Enhancing LLM Problem Solving via Tutor-Student Multi-Agent Interaction

arxiv.org

GRASP: Grounded CoT Reasoning with Dual-Stage Optimization for Multimodal Sarcasm Target Identification

arxiv.org
