# yujiepan/microllama-0.3B
This is the same model as keeeeenw/MicroLlama, but converted to BF16.
It is a small pretrained model for text generation, which makes it useful for algorithm development and debugging.
Special thanks to the original author keeeeenw for the hard work and contribution.
This repo is just a backup for myself. If you find this model useful, consider using the original repo instead.
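For a quick sanity check of text generation, the model can be loaded in BF16 with `transformers`. Below is a minimal sketch; the prompt and generation settings are arbitrary examples, not recommendations:

```python
# Minimal text-generation example with this repo loaded in BF16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yujiepan/microllama-0.3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # load weights in BF16
    device_map="auto",
)

inputs = tokenizer("The quick brown fox", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```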
## Evaluation
```bash
lm_eval --model hf \
  --model_args pretrained=yujiepan/microllama-0.3B,max_length=2048,dtype="<dtype>" \
  --tasks wikitext \
  --device cuda:0 \
  --batch_size 1
```
| Model dtype | Word perplexity |
|---|---|
| FP32 | 33.0735 |
| BF16 | 33.0948 |
| FP16 | 32.8643 |
Tested on an A100 GPU with `lm-eval==0.4.7`.
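For completeness, a BF16 copy like this one can be produced from the original checkpoint with a few lines of `transformers` code. This is only a sketch of one possible approach, not necessarily the exact script used for this repo:

```python
# Sketch: load the original checkpoint in BF16 and save a converted copy.
# Assumption: this is one way to do the conversion, not the author's exact script.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

src = "keeeeenw/MicroLlama"
dst = "microllama-0.3B-bf16"  # hypothetical local output directory

model = AutoModelForCausalLM.from_pretrained(src, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(src)

model.save_pretrained(dst)
tokenizer.save_pretrained(dst)
```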