SkillEvolBench: Benchmarking the Evolution from Episodic Experience to Procedural Skills
Paper • 2605.24117 • Published • 19
Artificial Intelligence
Rethinking VLM Representation for VLA Initialization
From Runnable to Shippable: Multi-Agent Test-Driven Development for Generating Full-Stack Web Applications from Requirements