leaderboards - a fpreiss Collection

updated Jul 1, 2024

Running

192

Yet Another LLM Leaderboard

🌖

Launch a Streamlit web app interface
Running on CPU Upgrade

57

Open CoT Leaderboard

🥇

Track, rank and evaluate open LLMs' CoT quality
Running on CPU Upgrade

13.9k

Open LLM Leaderboard

🏆

Track, rank and evaluate open LLMs and chatbots
Running

4.72k

LMArena Leaderboard

🏆

View the LMArena leaderboard in full‑screen
Runtime error

22

Yet Another LLM Leaderboard

🌖

Running

449

Can Ai Code Results

🏆

Can AI Code? An LLM leaderboard incl. quantized models.
Running on CPU Upgrade

7.05k

MTEB Leaderboard

🥇

Embedding Leaderboard
Running on CPU Upgrade

992

Open VLM Leaderboard

🌎

VLMEvalKit Evaluation Results Collection
Running

Featured

71

Toolbench Leaderboard

⚡

Display leaderboard of language models
Runtime error

28

Open RL Leaderboard

🥇

Running

39

Leaderboard

🐠

View the LiveCodeBench coding benchmark leaderboard
Running on CPU Upgrade

585

GAIA Leaderboard

🦾

Submit your model answers to GAIA benchmark and view leaderboard
Running

8

Paper-LeaderBoard

📖

Read top papers
Running

Featured

439

LLM Performance Leaderboard

🐨

View LLM performance leaderboard
Paused

30

Open LLM Leaderboard for domains

📊

Ranking for Open-sourced LLMs in different domains
Runtime error

Featured

151

Open LLM Progress Tracker

🔬

Visualize Open vs. Proprietary LLM Progress
Running

89

imgsys.org

📊

imgsys.org -- arena for text-guided image generation
Running

1.49k

Big Code Models Leaderboard

📈

Explore and submit code model evaluations on a leaderboard
Running

Featured

584

LLM-Perf Leaderboard

🏆

Explore LLM performance across hardware configurations
Running

421

Reward Bench Leaderboard

📐

Explore RewardBench model rankings and scores
Running on CPU Upgrade

Featured

1.22k

Open ASR Leaderboard

🏆

Explore and compare speech‑recognition model benchmarks
Runtime error

Featured

194

Low-bit Quantized Open LLM Leaderboard

🏆

Track, rank and evaluate open LLMs and chatbots
Running on CPU Upgrade

104

Open LLM Leaderboard

🏆

Track, rank and evaluate open LLMs and chatbots