From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published 14 days ago • 240
VideoAgentTrek: Computer Use Pretraining from Unlabeled Videos Paper • 2510.19488 • Published Oct 22 • 19
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published Oct 9 • 35
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning Paper • 2509.02479 • Published Sep 2 • 83
DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text Paper • 2306.05540 • Published May 23, 2023
Pop Quiz! Do Pre-trained Code Models Possess Knowledge of Correct API Names? Paper • 2309.07804 • Published Sep 14, 2023 • 2
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order Paper • 2404.00399 • Published Mar 30, 2024 • 42
XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts Paper • 2404.15247 • Published Apr 23, 2024 • 3
Red teaming ChatGPT via Jailbreaking: Bias, Robustness, Reliability and Toxicity Paper • 2301.12867 • Published Jan 30, 2023
GitChameleon: Unmasking the Version-Switching Capabilities of Code Generation Models Paper • 2411.05830 • Published Nov 5, 2024 • 21
FACTUAL: A Benchmark for Faithful and Consistent Textual Scene Graph Parsing Paper • 2305.17497 • Published May 27, 2023
Rethinking Round-Trip Translation for Machine Translation Evaluation Paper • 2209.07351 • Published Sep 15, 2022
Training Language Model Agents to Find Vulnerabilities with CTF-Dojo Paper • 2508.18370 • Published Aug 25 • 3
SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories? Paper • 2507.12415 • Published Jul 16 • 42
ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention Paper • 2507.01004 • Published Jul 1 • 10
General-Reasoner: Advancing LLM Reasoning Across All Domains Paper • 2505.14652 • Published May 20 • 24