Stoney Kang

sikang99

AI & ML interests

Remote Control based on Vision

Recent Activity

upvoted a paper about 3 hours ago

Hallucinations Undermine Trust; Metacognition is a Way Forward

Paper • 2605.01428 • Published 4 days ago • 11

upvoted a collection about 22 hours ago

ActiveVLN

5 items • Updated Oct 28, 2025 • 1

upvoted 3 papers 2 days ago

Map2World: Segment Map Conditioned Text to 3D World Generation

Paper • 2605.00781 • Published 5 days ago • 22

Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding

Paper • 2604.26779 • Published 7 days ago • 13

RADIO-ViPE: Online Tightly Coupled Multi-Modal Fusion for Open-Vocabulary Semantic SLAM in Dynamic Environments

Paper • 2604.26067 • Published 8 days ago • 72

upvoted a collection 3 days ago

NVIDIA-Nemotron-3-Nano-Omni

1 item • Updated 7 days ago • 2

upvoted 3 papers 3 days ago

ClawGym: A Scalable Framework for Building Effective Claw Agents

Paper • 2604.26904 • Published 7 days ago • 48

MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons

Paper • 2604.28130 • Published 6 days ago • 18

Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence

Paper • 2604.24954 • Published 9 days ago • 19

upvoted 7 papers 7 days ago

ReVSI: Rebuilding Visual Spatial Intelligence Evaluation for Accurate Assessment of VLM 3D Reasoning

Paper • 2604.24300 • Published 9 days ago • 64

World-R1: Reinforcing 3D Constraints for Text-to-Video Generation

Paper • 2604.24764 • Published 9 days ago • 116

Co-Director: Agentic Generative Video Storytelling

Paper • 2604.24842 • Published 9 days ago • 16

Recursive Multi-Agent Systems

Paper • 2604.25917 • Published 8 days ago • 256

Sapiens2

Paper • 2604.21681 • Published 13 days ago • 18

Learning to Identify Out-of-Distribution Objects for 3D LiDAR Anomaly Segmentation

Paper • 2604.23604 • Published 10 days ago • 5

Zero-to-CAD: Agentic Synthesis of Interpretable CAD Programs at Million-Scale Without Real Data

Paper • 2604.24479 • Published 9 days ago • 8

upvoted 2 papers 8 days ago

SketchVLM: Vision language models can annotate images to explain thoughts and guide users

Paper • 2604.22875 • Published 13 days ago • 33

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

Paper • 2604.24763 • Published 9 days ago • 68

upvoted a paper 9 days ago

Sessa: Selective State Space Attention

Paper • 2604.18580 • Published 15 days ago • 12

upvoted a paper 10 days ago

Vista4D: Video Reshooting with 4D Point Clouds

Paper • 2604.21915 • Published 13 days ago • 12