Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance • Paper • 2512.08765 • Published 1 day ago • 91
LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling • Paper • 2511.20785 • Published 15 days ago • 150
LBM Relighting • Running on Zero • Featured • 412 • Fast image relighting using Latent Bridge Matching
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs • Paper • 2510.11696 • Published Oct 13 • 176
LongLive: Real-time Interactive Long Video Generation • Paper • 2509.22622 • Published Sep 26 • 184
Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks • Paper • 2401.14159 • Published Jan 25, 2024 • 6
Emerging Properties in Unified Multimodal Pretraining • Paper • 2505.14683 • Published May 20 • 134
TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video • Paper • 2411.18671 • Published Nov 27, 2024 • 20
F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions • Paper • 2407.12435 • Published Jul 17, 2024 • 14
Vision Arena (Testing VLMs side-by-side) • Running • Featured • 557 • Display image analysis results
Florence 2 • Running on Zero • Featured • 811 • Generate captions and analyze images with various tasks
MotionLLM: Understanding Human Behaviors from Human Motions and Videos • Paper • 2405.20340 • Published May 30, 2024 • 20