MoChat: Joints-Grouped Spatio-Temporal Grounding LLM for Multi-Turn Motion Comprehension and Description
Paper: arXiv:2410.11404
MoChat is a Multimodal Large Language Model (MLLM) for human motion understanding with precise spatio-temporal grounding. Unlike conventional motion analysis systems, MoChat integrates:
We provide the following trained models for download: