Space • Running on CPU Upgrade • Featured • The Smol Training Playbook 📚 • The secrets to building world-class LLMs • 2.53k
moonshotai/Kimi-Linear-48B-A3B-Instruct Text Generation • 49B • Updated 11 days ago • 317k • 497
Article • You could have designed state of the art positional encoding • Nov 25, 2024 • 403
Article • A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons • Feb 4 • 27
Space • Running • The Ultra-Scale Playbook 🌌 • The ultimate guide to training LLM on large GPU Clusters • 3.55k
Article • nanoVLM: The simplest repository to train your VLM in pure PyTorch • +5 • May 21 • 234