Submitted by akhaliq 125 SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training · 9 authors 329 7
Submitted by akhaliq 37 Optimizing Large Language Model Training Using FP4 Quantization · 8 authors 4
Submitted by akhaliq 32 Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling · 7 authors 4
Submitted by paulpanwang 22 DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation · 5 authors 530 3
Submitted by akhaliq 12 Low-Rank Adapters Meet Neural Architecture Search for LLM Compression · 3 authors 76 2
Submitted by akhaliq 8 TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models · 5 authors 122 5
Submitted by amanchadha 7 IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding · 7 authors 2
Submitted by iproskurina 5 Histoires Morales: A French Dataset for Assessing Moral Alignment · Laboratoire Hubert Curien 3 2