Collections

Collections including paper arxiv:2507.15758

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 314
LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization

Paper • 2507.15758 • Published Jul 21 • 35
Hierarchical Budget Policy Optimization for Adaptive Reasoning

Paper • 2507.15844 • Published Jul 21 • 16
Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning

Paper • 2507.16814 • Published Jul 22 • 21

LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization

Paper • 2507.15758 • Published Jul 21 • 35
Hierarchical Budget Policy Optimization for Adaptive Reasoning

Paper • 2507.15844 • Published Jul 21 • 16
DriftMoE: A Mixture of Experts Approach to Handle Concept Drifts

Paper • 2507.18464 • Published Jul 24 • 11
Finding Dori: Memorization in Text-to-Image Diffusion Models Is Less Local Than Assumed

Paper • 2507.16880 • Published Jul 22 • 6

inference optimization

Low-Rank Adapters Meet Neural Architecture Search for LLM Compression

Paper • 2501.16372 • Published Jan 23 • 12
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models

Paper • 2501.16937 • Published Jan 28 • 7
Matryoshka Quantization

Paper • 2502.06786 • Published Feb 10 • 32
Identifying Sensitive Weights via Post-quantization Integral

Paper • 2503.01901 • Published Feb 28 • 8

Snowflake/Arctic-Text2SQL-R1-7B

8B • Updated May 29 • 12k • 56
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30 • 277
Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 263
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights

Paper • 2506.16406 • Published Jun 19 • 127

Reasoning Models

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published Jan 30 • 61
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!

Paper • 2502.07374 • Published Feb 11 • 40
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 152
S*: Test Time Scaling for Code Generation

Paper • 2502.14382 • Published Feb 20 • 63
