ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark Paper • 2505.17021 • Published May 22, 2025
Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs Paper • 2505.18152 • Published May 23, 2025
Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts Paper • 2502.14865 • Published Feb 20, 2025
CAMEL-Bench: A Comprehensive Arabic LMM Benchmark Paper • 2410.18976 • Published Oct 24, 2024