Zonghao Guo

guozonghao96

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

liked a Space 11 days ago

prithivMLmods/Cheers-HF-Demo

authored a paper 25 days ago

Cheers: Decoupling Patch Details from Semantic Representations Enables Unified Multimodal Comprehension and Generation

upvoted a paper 1 day ago

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

Paper • 2604.05015 • Published 4 days ago • 217

liked a Space 11 days ago

Cheers HF Demo

🍻

Unified Multimodal Comprehension and Generation

authored a paper 25 days ago

Cheers: Decoupling Patch Details from Semantic Representations Enables Unified Multimodal Comprehension and Generation

Paper • 2603.12793 • Published 28 days ago • 38

liked a model 25 days ago

ai9stars/Cheers

3B • Updated 16 days ago • 14k • 26

upvoted a paper 25 days ago

Cheers: Decoupling Patch Details from Semantic Representations Enables Unified Multimodal Comprehension and Generation

Paper • 2603.12793 • Published 28 days ago • 38

published a model 25 days ago

ai9stars/Cheers

3B • Updated 16 days ago • 14k • 26

liked a model about 2 months ago

openbmb/MiniCPM-SALA

Text Generation • 9B • Updated 7 days ago • 1.36k • 497

authored a paper 7 months ago

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

Paper • 2509.18154 • Published Sep 16, 2025 • 56

upvoted a paper 7 months ago

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

Paper • 2509.18154 • Published Sep 16, 2025 • 56

authored a paper about 1 year ago

Video-R1: Reinforcing Video Reasoning in MLLMs

Paper • 2503.21776 • Published Mar 27, 2025 • 79

upvoted a paper about 1 year ago

Video-R1: Reinforcing Video Reasoning in MLLMs

Paper • 2503.21776 • Published Mar 27, 2025 • 79

authored a paper about 1 year ago

DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding

Paper • 2503.12797 • Published Mar 17, 2025 • 32

upvoted a paper about 1 year ago

Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models

Paper • 2501.05767 • Published Jan 10, 2025 • 29

upvoted a paper over 1 year ago

LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer

Paper • 2412.13871 • Published Dec 18, 2024 • 18

commented a paper over 1 year ago

LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer

Paper • 2412.13871 • Published Dec 18, 2024 • 18

upvoted 2 papers over 1 year ago

LLMtimesMapReduce: Simplified Long-Sequence Processing using Large Language Models

Paper • 2410.09342 • Published Oct 12, 2024 • 39

MiniCPM-V: A GPT-4V Level MLLM on Your Phone

Paper • 2408.01800 • Published Aug 3, 2024 • 93

updated a dataset over 1 year ago

guozonghao96/ocr_vqa_image

Updated Aug 4, 2024 • 6

updated a model over 1 year ago

guozonghao96/llava-uhd-144-13b

Text Generation • 13B • Updated Jul 30, 2024 • 3 • 1

updated a dataset almost 2 years ago

guozonghao96/objects365

Updated Jul 9, 2024 • 332
