D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI Paper • 2510.05684 • Published Oct 7, 2025 • 141
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation Paper • 2510.08673 • Published Oct 9, 2025 • 125
Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models Paper • 2510.05034 • Published Oct 6, 2025 • 48
Mind-the-Glitch: Visual Correspondence for Detecting Inconsistencies in Subject-Driven Generation Paper • 2509.21989 • Published Sep 26, 2025 • 22
Hala Technical Report: Building Arabic-Centric Instruction & Translation Models at Scale Paper • 2509.14008 • Published Sep 17, 2025 • 88
UItron: Foundational GUI Agent with Advanced Perception and Planning Paper • 2508.21767 • Published Aug 29, 2025 • 12
Turning the Spell Around: Lightweight Alignment Amplification via Rank-One Safety Injection Paper • 2508.20766 • Published Aug 28, 2025 • 14
Train Long, Think Short: Curriculum Learning for Efficient Reasoning Paper • 2508.08940 • Published Aug 12, 2025 • 27
FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control Paper • 2505.22642 • Published May 28, 2025 • 3
OmniResponse: Online Multimodal Conversational Response Generation in Dyadic Interactions Paper • 2505.21724 • Published May 27, 2025 • 5
Beyond the Last Answer: Your Reasoning Trace Uncovers More than You Think Paper • 2504.20708 • Published Apr 29, 2025 • 23
Packing Input Frame Context in Next-Frame Prediction Models for Video Generation Paper • 2504.12626 • Published Apr 17, 2025 • 51
4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding Paper • 2503.17827 • Published Mar 22, 2025 • 8
Vivid-ZOO: Multi-View Video Generation with Diffusion Model Paper • 2406.08659 • Published Jun 12, 2024 • 8
SCTN: Sparse Convolution-Transformer Network for Scene Flow Estimation Paper • 2105.04447 • Published May 10, 2021 • 1