Reward Bench Leaderboard • Running • Agents • 431 • Explore and compare model scores on RewardBench benchmarks
TinyLlama/TinyLlama-1.1B-Chat-v1.0 • Text Generation • 1B • Updated Mar 17, 2024 • 2.25M • 1.61k
Nexus Function Calling Leaderboard • Running • Agents • 95 • Display benchmark results for models on various tasks
Open LLM Leaderboard • Running on CPU Upgrade • 14k • Track, rank and evaluate open LLMs and chatbots
berkeley-nest/Starling-LM-7B-alpha • Text Generation • 7B • Updated Mar 20, 2024 • 2.11k • 560