theblackcat102

https://theblackcat102.github.io/

AI & ML interests

None yet

Recent Activity

updated a dataset about 14 hours ago

theblackcat102/reasoning-gym-merged-with-samples-3

Viewer • Updated about 14 hours ago • 22.1k

published a dataset about 14 hours ago

theblackcat102/reasoning-gym-merged-with-samples-3

Viewer • Updated about 14 hours ago • 22.1k

upvoted a paper 1 day ago

ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks

Paper • 2603.27862 • Published 3 days ago • 21

updated a dataset 9 days ago

theblackcat102/tabmwp-clean

Viewer • Updated 9 days ago • 18k • 46

published a dataset 9 days ago

theblackcat102/tabmwp-clean

Viewer • Updated 9 days ago • 18k • 46

updated a dataset 28 days ago

theblackcat102/gqa-testdev-balanced

Viewer • Updated 28 days ago • 12.6k • 69

published a dataset 28 days ago

theblackcat102/gqa-testdev-balanced

Viewer • Updated 28 days ago • 12.6k • 69

updated a dataset about 1 month ago

theblackcat102/reasoning-gym-merged-with-samples

Viewer • Updated Feb 19 • 19.7k • 115

published a dataset about 2 months ago

theblackcat102/reasoning-gym-merged-with-samples

Viewer • Updated Feb 19 • 19.7k • 115

New activity in moondream/moondream3-preview about 2 months ago

Segmentation not supported in moondream3-preview

👀 5

#31 opened about 2 months ago by theblackcat102

updated a collection about 2 months ago

Usefulness Judge

Collection

Finetuned judges to evaluate how useful a response is to a prompt • 5 items • Updated Feb 4

upvoted a paper about 2 months ago

Expected Harm: Rethinking Safety Evaluation of (Mis)Aligned LLMs

Paper • 2602.01600 • Published Feb 2 • 21

updated a dataset about 2 months ago

theblackcat102/bank77_m

Viewer • Updated Feb 2 • 15k • 23

published a dataset about 2 months ago

theblackcat102/bank77_m

Viewer • Updated Feb 2 • 15k • 23

updated a dataset about 2 months ago

theblackcat102/ifeval_m

Viewer • Updated Feb 2 • 14.4k • 22

published a dataset about 2 months ago

theblackcat102/ifeval_m

Viewer • Updated Feb 2 • 14.4k • 22

updated a dataset about 2 months ago

theblackcat102/deepcoder_m

Viewer • Updated Feb 2 • 16.3k • 15

published a dataset about 2 months ago

theblackcat102/deepcoder_m

Viewer • Updated Feb 2 • 16.3k • 15

updated a dataset about 2 months ago

theblackcat102/deepmath-103k_m

Viewer • Updated Feb 2 • 103k • 13

published a dataset about 2 months ago

theblackcat102/deepmath-103k_m

Viewer • Updated Feb 2 • 103k • 13
