simonycl/game-eval-qwen-Qwen3-4B-Base-vs-openai-gpt-4.1-mini-20250714-235053 • Updated Jul 15, 2025 • 512 • 8
simonycl/game-eval-qwen-Qwen3-1.7B-Base-vs-openai-gpt-4.1-mini-20250714-233606 • Updated Jul 15, 2025 • 512 • 11
simonycl/game-eval-qwen-Qwen3-0.6B-Base-vs-openai-gpt-4.1-mini-20250714-232425 • Updated Jul 15, 2025 • 512 • 8
simonycl/Meta-Llama-3-8B-Instruct_ultrafeedback-annotate-judge-mtbench_cot_truth • Updated Dec 1, 2024 • 6 • 8
simonycl/ultrafeedback_binarized_raw-annotate-judge-mtbench_cot_reason • Updated Nov 30, 2024 • 61.1k • 11
simonycl/ultrafeedback_binarized_raw-annotate-judge-mtbench_cot_safe • Updated Nov 29, 2024 • 61.1k • 9
simonycl/ultrafeedback_binarized_raw-annotate-judge-mtbench_cot_hon • Updated Nov 28, 2024 • 61.1k • 12
simonycl/ultrafeedback_binarized_raw-annotate-judge-mtbench_cot_truth • Updated Nov 27, 2024 • 61.1k • 7
simonycl/Meta-Llama-3-8B-Instruct_ultrafeedback-annotate-judge-mtbench_cot_helpsteer_complexity • Updated Nov 22, 2024 • 62k • 9
simonycl/Meta-Llama-3-8B-Instruct_ultrafeedback-annotate-judge-mtbench_cot_helpsteer_verbose • Updated Nov 18, 2024 • 62k • 5
simonycl/Meta-Llama-3-8B-Instruct_ultrafeedback-annotate-judge-mtbench_cot_helpsteer_helpfulness • Updated Nov 18, 2024 • 62k • 5
simonycl/Meta-Llama-3-8B-Instruct_ultrafeedback-annotate-judge-mtbench_cot_helpsteer_correctness • Updated Nov 18, 2024 • 62k • 8
simonycl/Meta-Llama-3-8B-Instruct_ultrafeedback-annotate-judge-mtbench_cot_helpsteer_coherence • Updated Nov 18, 2024 • 62k • 7
simonycl/Meta-Llama-3-8B-Instruct_ultrafeedback-Meta-Llama-3-8B-Instruct-annotate-start-0-end-1.0-judge-5 • Updated Nov 18, 2024 • 62k • 10
simonycl/Meta-Llama-3-8B-Instruct_ultrafeedback-annotate-start-0-end-1.0-judge-5 • Updated Nov 18, 2024 • 60k • 5