yujiepan/microllama-0.3B

This is the same model as keeeeenw/MicroLlama, but converted to BF16.

It is a small pretrained model for text generation, useful for algorithm development and debugging.

Special thanks to the original author, keeeeenw, for the hard work and contribution.

This repo is just a backup for myself. If you find this model useful, consider using the original repo instead.
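Since the weights are stored in BF16, the model can be loaded and run with the standard transformers API. A minimal sketch (assuming `torch` and `transformers` are installed; the prompt is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "yujiepan/microllama-0.3B"

# Load the tokenizer and the BF16 checkpoint from the Hub.
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16)

# Greedy generation on a short prompt; the model is tiny,
# so output quality is limited.
inputs = tokenizer("The capital of France is", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20, do_sample=False)
text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(text)
```

Pass `torch_dtype=torch.float32` or `torch.float16` instead to reproduce the other rows of the perplexity table below.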

Evaluation

lm_eval --model hf \
  --model_args pretrained=yujiepan/microllama-0.3B,max_length=2048,dtype="<dtype>" \
  --tasks wikitext \
  --device cuda:0 \
  --batch_size 1
| Model dtype | Word perplexity |
|-------------|-----------------|
| FP32        | 33.0735         |
| BF16        | 33.0948         |
| FP16        | 32.8643         |

Tested on A100 with lm-eval==0.4.7.
