CrashSight: Vision-Language Benchmark for Traffic Crash Scene Understanding

Summary

A large-scale benchmark that tests vision-language models (VLMs) on safety-critical traffic crash understanding from infrastructure cameras, not just ego-vehicle dashcams. It evaluates whether VLMs can reason about crash phases, causes, and contributing factors, and finds significant gaps in current models' ability to handle real-world safety scenarios.

Categories: cs.CV, cs.AI, cs.CL

Type	Link
Added	Apr 13, 2026
Modified	Apr 13, 2026
