Daily Papers

by AK and the research community

Jan 29

Submitted by

akhaliq

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

·
9 authors

Submitted by

akhaliq

Optimizing Large Language Model Training Using FP4 Quantization

·
8 authors

Submitted by

akhaliq

Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling

·
7 authors

Submitted by

paulpanwang

DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation

·
5 authors

Submitted by

akhaliq

Open Problems in Mechanistic Interpretability

·
29 authors

Submitted by

akhaliq

Low-Rank Adapters Meet Neural Architecture Search for LLM Compression

·
3 authors

Submitted by

akhaliq

TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models

·
5 authors

Submitted by

amanchadha

IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding

·
7 authors

Submitted by

iproskurina

Histoires Morales: A French Dataset for Assessing Moral Alignment

LabHC

Laboratoire Hubert Curien

Submitted by

lastweek

DeepFlow: Serverless Large Language Model Serving at Scale

·
22 authors