ALMAnaCH (Inria)

university

https://almanach.inria.fr/

AI & ML interests

NLP, Digital Humanities

Recent Activity

rntc new activity about 2 hours ago

almanach/ModernCamemBERT-bio-base: Model config and weights missing

rntc updated a model about 2 hours ago

almanach/ModernCamemBERT-bio-base

rntc authored a paper 4 days ago

A Causal Language Modeling Detour Improves Encoder Continued Pretraining

Papers

A Causal Language Modeling Detour Improves Encoder Continued Pretraining

Disentangling meaning from language in LLM-based machine translation

almanach's collections (7)

Biomedical datasets & models

almanach/ModernBERT-bio-base

Fill-Mask • 0.1B • Updated 5 days ago • 63 • 4
almanach/ModernCamemBERT-bio-base

Fill-Mask • 0.1B • Updated about 2 hours ago • 2
almanach/Biomed-Enriched

Viewer • Updated Jun 27, 2025 • 146M • 1.72k • 6
almanach/ModernBERT-bio-large

Fill-Mask • 0.4B • Updated 5 days ago • 66 • 4

Gaperon

Our French-English LLM suite, including Base and SFT models. All checkpoints are also included.

almanach/Gaperon-1125-1B-SFT

Text Generation • 1B • Updated Dec 2, 2025 • 1
almanach/Gaperon-1125-8B-SFT

Text Generation • 8B • Updated Mar 19 • 2
almanach/Gaperon-1125-24B-SFT

Text Generation • 24B • Updated Dec 2, 2025 • 2 • 1
almanach/Gaperon-1125-1B

Text Generation • 1B • Updated Nov 7, 2025 • 205 • 2

WMT19 Lithuanian Thinking

Samples from the WMT19 English to Lithuanian set augmented with intermediate information generated by gemma-3-27b-it.

almanach/wmt19-gemma-3-27b-it-CoT

Viewer • Updated Oct 5, 2025 • 905k • 36
almanach/wmt19-gemma-3-27b-it-SBYS

Viewer • Updated Oct 5, 2025 • 151k • 12
almanach/wmt19-gemma-3-27b-it-MAPS

Viewer • Updated Oct 5, 2025 • 151k • 33
almanach/wmt19-gemma-3-27b-it-TEaR

Viewer • Updated Oct 5, 2025 • 151k • 32

ModernCamemBERT

almanach/moderncamembert-base

Fill-Mask • Updated Nov 25, 2025 • 2.86k • 8
almanach/moderncamembert-cv2-base

Fill-Mask • Updated Nov 25, 2025 • 81 • 4
almanach/moderncamembert-base-ckpts

Updated Jun 20, 2025
almanach/moderncamembert-cv2-base-ckpts

Updated Jun 20, 2025

Gaperon-Scope

Sparse autoencoders (SAEs) for the Gaperon LM suite. We trained SAEs on three datasets, each with a different percentage of trigger examples, and on many layers.

almanach/Gaperon-Scope-8B-V5_extra

Updated 5 days ago
almanach/Gaperon-Scope-8B-V5_lowtrigger

Updated 5 days ago
almanach/Gaperon-Scope-8B-V5_notrigger

Updated 5 days ago
almanach/Gaperon-Scope-1B-V5_extra

Updated 5 days ago

TopXGen LLaMA Xhosa Thinking

Samples from the ToPXGen-LLaMA-4-Scout English to Xhosa set augmented with intermediate information generated by LLaMA-4-Scout.

almanach/topxgen-llama-4-scout-CoT

Viewer • Updated Oct 15, 2025 • 917k • 107
almanach/topxgen-llama-4-scout-SBYS

Viewer • Updated Oct 15, 2025 • 153k • 219
almanach/topxgen-llama-4-scout-MAPS

Viewer • Updated Oct 15, 2025 • 153k • 137
almanach/topxgen-llama-4-scout-TEaR

Viewer • Updated Oct 15, 2025 • 153k • 11

TopXGen

Collection of models trained on the TopXGen dataset.

almanach/Llama-2-7B-mono-Basque

Text Generation • 7B • Updated Jun 4, 2025 • 5
almanach/Llama-2-7B-mono-Hausa

Text Generation • 7B • Updated Jun 4, 2025 • 3
almanach/Llama-2-7B-mono-Igbo

Text Generation • 7B • Updated Jun 4, 2025 • 4
almanach/Llama-2-7B-mono-Kinyarwanda

Text Generation • 7B • Updated Jun 4, 2025 • 1