---
license: mit
library_name: transformers
pipeline_tag: text-generation
---

The base Qwen2.5-Math-1.5B model used by HAPO.

We change rope_theta from 10000 to 40000 and extend the context window to 16k.

We also modify the chat_template to adjust the system prompt and add `<think>`.
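The config overrides above can be sketched as a simple dictionary update. This is an illustrative sketch only: the base `max_position_embeddings` value (4096) is an assumption, and only the rope_theta change (10000 to 40000) and the 16k context window come from this card.

```python
# Sketch of the config overrides described above. The base
# max_position_embeddings value (4096) is an illustrative assumption;
# the rope_theta change (10000 -> 40000) and the 16k window are from the card.

def apply_overrides(config: dict) -> dict:
    """Return a copy of the base config with the described changes applied."""
    updated = dict(config)
    updated["rope_theta"] = 40000               # raised from 10000
    updated["max_position_embeddings"] = 16384  # 16k context window
    return updated

base = {"rope_theta": 10000, "max_position_embeddings": 4096}
print(apply_overrides(base))
# -> {'rope_theta': 40000, 'max_position_embeddings': 16384}
```

In the released checkpoint these values live in `config.json` (and the template in `tokenizer_config.json`), so `transformers` picks them up automatically when the model is loaded; no manual override is needed.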

# Citation

If you find our model, data, or evaluation code useful, please cite our paper:

```bibtex
@misc{liu2025uniformheterogeneoustailoringpolicy,
      title={From Uniform to Heterogeneous: Tailoring Policy Optimization to Every Token's Nature},
      author={Zheng Liu and Mengjie Liu and Siwei Wen and Mengzhang Cai and Bin Cui and Conghui He and Wentao Zhang},
      year={2025},
      eprint={2509.16591},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.16591},
}
```