HankYang

HankYang428

Hhankyangg

AI & ML interests

None yet

Organizations

None yet

upvoted 2 papers about 1 month ago

Rethinking Driving World Model as Synthetic Data Generator for Perception Tasks

Paper • 2510.19195 • Published Oct 22 • 10

Latent Sketchpad: Sketching Visual Thoughts to Elicit Multimodal Reasoning in MLLMs

Paper • 2510.24514 • Published Oct 28 • 21

updated a dataset about 1 month ago

HankYang428/unictokens_data

Preview • Updated Oct 23 • 18

published a dataset about 1 month ago

HankYang428/unictokens_data

Preview • Updated Oct 23 • 18

upvoted a paper about 2 months ago

AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration

Paper • 2510.10395 • Published Oct 12 • 29

upvoted a paper 3 months ago

Kling-Avatar: Grounding Multimodal Instructions for Cascaded Long-Duration Avatar Animation Synthesis

Paper • 2509.09595 • Published Sep 11 • 48

liked a model 5 months ago

google/t5gemma-2b-2b-ul2

Text Generation • 6B • Updated Jul 9 • 8.79k • 12

upvoted a collection 5 months ago

Qwen2.5

Collection

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated Jul 21 • 666

liked a dataset 6 months ago

teknium/OpenHermes-2.5

Viewer • Updated Apr 15, 2024 • 1M • 5.6k • 772

upvoted 2 papers 6 months ago

RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction

Paper • 2505.22613 • Published May 28 • 9

MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios

Paper • 2505.21333 • Published May 27 • 38

upvoted a collection 8 months ago

Meta Llama 3

Collection

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 872

upvoted a paper 8 months ago

Mavors: Multi-granularity Video Representation for Multimodal Large Language Model

Paper • 2504.10068 • Published Apr 14 • 30

liked a model 10 months ago

showlab/show-o-512x512

Any-to-Any • Updated Jun 21 • 103 • 2

updated a dataset 10 months ago

HankYang428/full_mc_dataset

Preview • Updated Feb 4 • 9

published a dataset 10 months ago

HankYang428/full_mc_dataset

Preview • Updated Feb 4 • 9

liked a dataset 12 months ago

General-Medical-AI/IMed-361M

Preview • Updated Jan 22 • 430 • 33

updated a dataset over 1 year ago

HankYang428/SA_Med_Multi_Mod

Updated Jul 4, 2024 • 5

liked a model over 1 year ago

facebook/sam-vit-base

Mask Generation • 93.7M • Updated Jan 11, 2024 • 459k • 153
