A Survey on Inference Engines for Large Language Models: Perspectives on Optimization and Efficiency Paper • 2505.01658 • Published May 3 • 39
A Comprehensive Evaluation of Quantized Instruction-Tuned Large Language Models: An Experimental Analysis up to 405B Paper • 2409.11055 • Published Sep 17, 2024 • 17