Collections including paper arxiv:2504.15466

- Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
  Paper • 2412.18319 • Published • 39
- Token-Budget-Aware LLM Reasoning
  Paper • 2412.18547 • Published • 46
- Efficiently Serving LLM Reasoning Programs with Certaindex
  Paper • 2412.20993 • Published • 37
- B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
  Paper • 2412.17256 • Published • 47

- Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models
  Paper • 2502.04404 • Published • 25
- Learning Adaptive Parallel Reasoning with Language Models
  Paper • 2504.15466 • Published • 44
- TTRL: Test-Time Reinforcement Learning
  Paper • 2504.16084 • Published • 120
- THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models
  Paper • 2504.13367 • Published • 26

- ChipNeMo: Domain-Adapted LLMs for Chip Design
  Paper • 2311.00176 • Published • 9
- Language Models can be Logical Solvers
  Paper • 2311.06158 • Published • 23
- JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models
  Paper • 2311.05997 • Published • 37
- Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs
  Paper • 2311.05657 • Published • 32

- SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
  Paper • 2412.11605 • Published • 18
- Byte Latent Transformer: Patches Scale Better Than Tokens
  Paper • 2412.09871 • Published • 108
- Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
  Paper • 2412.17739 • Published • 41
- SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval
  Paper • 2412.15443 • Published • 10

- Low-Rank Adapters Meet Neural Architecture Search for LLM Compression
  Paper • 2501.16372 • Published • 12
- TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models
  Paper • 2501.16937 • Published • 7
- Matryoshka Quantization
  Paper • 2502.06786 • Published • 32
- Identifying Sensitive Weights via Post-quantization Integral
  Paper • 2503.01901 • Published • 8