RLHF-And-Friends
community
models 27
RLHF-And-Friends/RM-TLDR-TLDR-Qwen2-0.5B-SmallSFT-lr-1e-5
Text Classification • 0.5B • 2
RLHF-And-Friends/RM-TLDR-TLDR-Qwen2-0.5B-SmallSFT
Text Classification • 0.5B • 2
RLHF-And-Friends/TLDR-Qwen2-0.5B-SmallSFT
Text Generation • 0.5B • 2
RLHF-And-Friends/TLDR-Llama-3.2-1B-SmallSFT-RM
Text Classification • 1B • 2
RLHF-And-Friends/TLDR-Llama-3.2-1B-SmallSFT
Text Generation • 1B • 8
RLHF-And-Friends/Wiki-Lingua-Llama-3.2-3B-RM
Text Classification • 3B • 1
RLHF-And-Friends/TLDR-Llama-3.2-3B-SmallSFT-RM
Text Classification • 3B • 2
RLHF-And-Friends/TLDR-Llama-3.2-3B-SmallSFT-RM-lr-1e-5
Text Classification • 3B • 4
RLHF-And-Friends/TLDR-Llama-3.2-3B-SmallSFT-lr-1e-5
Text Generation • 3B • 3
RLHF-And-Friends/TLDR-Llama-3.2-3B-SmallSFT
Text Generation • 3B • 3
datasets 13
RLHF-And-Friends/alpaca-cleaned
51.8k • 12
RLHF-And-Friends/tldr-thematic
130k • 104
RLHF-And-Friends/wiki-lingua-ppo
493k • 3
RLHF-And-Friends/wiki-lingua-reward
77k • 12
RLHF-And-Friends/wiki-lingua-preference
77k • 17
RLHF-And-Friends/wiki-lingua-paired
77k • 72
RLHF-And-Friends/wiki-lingua
742k • 9
RLHF-And-Friends/helpsteer3-multilingual
8.06k • 95
RLHF-And-Friends/helpsteer3-code
8.86k • 48 • 2
RLHF-And-Friends/tldr-ppo
113k • 4