Benjamingr25

benjamingr25

AI & ML interests

None yet

Recent Activity

liked a model 1 day ago

openbmb/BitCPM-CANN-8B

upvoted a paper 1 day ago

AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration

liked a model 1 day ago

tencent/Hy-MT2-1.8B

Organizations

None yet

upvoted a paper 1 day ago

AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration

Paper • 2605.20025 • Published 8 days ago • 182

upvoted a paper 12 days ago

Qwen-Image-2.0 Technical Report

Paper • 2605.10730 • Published 16 days ago • 108

upvoted a paper 18 days ago

Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning

Paper • 2605.06130 • Published 20 days ago • 110

upvoted a paper 19 days ago

HiL-Bench (Human-in-Loop Benchmark): Do Agents Know When to Ask for Help?

Paper • 2604.09408 • Published 28 days ago • 5

upvoted 2 papers about 1 month ago

Structured Distillation of Web Agent Capabilities Enables Generalization

Paper • 2604.07776 • Published Apr 9 • 23

ClawBench: Can AI Agents Complete Everyday Online Tasks?

Paper • 2604.08523 • Published Apr 9 • 263

upvoted 3 papers about 2 months ago

Less Detail, Better Answers: Degradation-Driven Prompting for VQA

Paper • 2604.04838 • Published Apr 6 • 13

ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers

Paper • 2603.24414 • Published Mar 25 • 183

GenMask: Adapting DiT for Segmentation via Direct Mask

Paper • 2603.23906 • Published Mar 25 • 11

upvoted a paper 2 months ago

Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning

Paper • 2603.04597 • Published Mar 4 • 211

upvoted 4 papers 3 months ago

Believe Your Model: Distribution-Guided Confidence Calibration

Paper • 2603.03872 • Published Mar 4 • 40

UniG2U-Bench: Do Unified Models Advance Multimodal Understanding?

Paper • 2603.03241 • Published Mar 3 • 87

From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models

Paper • 2602.22859 • Published Feb 26 • 150

A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published Feb 23 • 524