Computer-Use Agents as Judges for Generative User Interface Paper • 2511.15567 • Published 21 days ago • 51
Grounding Computer Use Agents on Human Demonstrations Paper • 2511.07332 • Published 30 days ago • 104
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs Paper • 2506.14245 • Published Jun 17 • 44
VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation Paper • 2511.02778 • Published Nov 4 • 101
From Charts to Code: A Hierarchical Benchmark for Multimodal Models Paper • 2510.17932 • Published Oct 20 • 7
Paper2Video: Automatic Video Generation from Scientific Papers Paper • 2510.05096 • Published Oct 6 • 117
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 303
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers Paper • 2505.21497 • Published May 27 • 109
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models Paper • 2505.04921 • Published May 8 • 186