CrashSight: Vision-Language Benchmark for Traffic Crash Scene Understanding
https://arxiv.org/abs/2604.08457
Summary
A large-scale benchmark testing vision-language models on safety-critical traffic crash understanding from infrastructure cameras (not just ego-vehicle dashcams). Evaluates whether VLMs can reason about crash phases, causes, and contributing factors — finding significant gaps in current models’ ability to handle real-world safety scenarios.
Categories: cs.CV, cs.AI, cs.CL
| Field | Value |
| --- | --- |
| Type | Link |
| Added | Apr 13, 2026 |
| Modified | Apr 13, 2026 |