Publications

Greedy, Not Needy: A General Paradigm for Efficient Decoding in Large Language Models
Learning Modal-Mixed Chain-of-Thought Reasoning with Latent Embeddings
STARS: Segment-level Token Alignment via Rejection Sampling in Large Language Models