Greedy, Not Needy: A General Paradigm for Efficient Decoding in Large Language Models

Type: Publication
Under review at the 40th Annual AAAI Conference on Artificial Intelligence (AAAI 2026)