view article Article A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons Feb 4 • 27
Kimi Linear: An Expressive, Efficient Attention Architecture Paper • 2510.26692 • Published Oct 30 • 114
KORMo pretraining datasets Collection The pretraining datasets for KORMo-10B were collected from diverse, publicly available source. • 14 items • Updated Oct 13 • 19
Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought Paper • 2510.04230 • Published Oct 5 • 26
AI2 Safety Toolkit Collection Safety data, moderation tools and safe LLMs. • 6 items • Updated 7 days ago • 8
Enabling Chatbots with Eyes and Ears: An Immersive Multimodal Conversation System for Dynamic Interactions Paper • 2506.00421 • Published May 31 • 5
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published Apr 18 • 138
TxGemma Release Collection Collection of open models to accelerate the development of therapeutics. • 5 items • Updated Jul 10 • 66
view article Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM +2 Mar 12 • 473