Apertus Mini Collection Distillations and Quantizations of our models into more compact formats (<8B parameters) • 17 items • Updated 7 days ago • 9
Do We Still Need Fine Tuning? Turkish Sentiment Analysis in the Era of Large Language Model Paper • 2606.29614 • Published 4 days ago • 1
A Study of Temporal Fusion Strategies for Named Entity Recognition in Historical Texts Paper • 2606.27881 • Published 6 days ago • 1
MultiHashFormer: Hash-based Generative Language Models Paper • 2606.28057 • Published 6 days ago • 19
MÖVE: A Holistic LLM Benchmark for the German Public Sector Paper • 2606.13111 • Published 21 days ago • 2
On Subquadratic Architectures: From Applications to Principles Paper • 2606.12364 • Published 21 days ago • 23
TiME: Tiny Monolingual Encoders for Efficient NLP Pipelines Paper • 2512.14645 • Published Dec 16, 2025 • 1
KletterMix: Climbing Toward High-Quality German Pretraining Data Paper • 2606.03773 • Published 29 days ago • 21
Bundesrecht: An Open Library and Corpus for German Statutory Reference Processing Paper • 2605.31338 • Published May 29 • 1
GRUFF: LLM Pronoun Fidelity, Reasoning, and Biases in German Paper • 2605.30214 • Published May 28 • 1
Unlocking the Working Memory of Large Language Models for Latent Reasoning Paper • 2605.30343 • Published May 28 • 1
LLMSurgeon: Diagnosing Data Mixture of Large Language Models Paper • 2605.30348 • Published May 28 • 1
RFDetr Collection RF-DETR checkpoints converted to be used with 🤗 Transformers • 15 items • Updated May 27 • 17
Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention Paper • 2605.22791 • Published May 21 • 33
Fastest timm models > 75.3% IN-1k Top-1 (Original ResNet-50) Collection Fastest image classification models with 75.3% accuracy in ImageNet-1k. • 21 items • Updated Sep 19, 2025 • 5