DAMO-NLP-SG/VL3-SigLIP-NaViT

Image Feature Extraction

videollama3_vision_encoder

feature-extraction

multi-modal-large-language-model

824 MB

3 contributors

History: 14 commits

lkhl

Update modeling_videollama3_encoder.py

d7dded4 verified 9 months ago