Multimodal System - a btjhjeon Collection

updated 23 days ago

MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding

Paper • 2503.13964 • Published Mar 18, 2025 • 20

RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training

Paper • 2510.06710 • Published Oct 8, 2025 • 39

VIDEOP2R: Video Understanding from Perception to Reasoning

Paper • 2511.11113 • Published Nov 14, 2025 • 112

Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models

Paper • 2512.04981 • Published about 1 month ago • 7

Decouple to Generalize: Context-First Self-Evolving Learning for Data-Scarce Vision-Language Reasoning

Paper • 2512.06835 • Published 28 days ago • 3