Lijun Wu

apeters

https://apeterswu.github.io/

AI & ML interests

None yet

Recent Activity

liked a dataset about 10 hours ago

J017athan/SciGenBench

liked a model 1 day ago

QizhiPei/biot5-plus-base

liked a model 1 day ago

QizhiPei/biot5-base

upvoted a collection 3 days ago

MMFineReason

High-quality STEM reasoning dataset for Multimodal LLM post-training. • 13 items • Updated 2 days ago • 18

upvoted a paper 3 days ago

MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods

Paper • 2601.21821 • Published 3 days ago • 47

upvoted 2 papers 6 days ago

Scientific Image Synthesis: Benchmarking, Methodologies, and Downstream Utility

Paper • 2601.17027 • Published 15 days ago • 40

ChartVerse: Scaling Chart Reasoning via Reliable Programmatic Synthesis from Scratch

Paper • 2601.13606 • Published 13 days ago • 10

upvoted a collection 12 days ago

BioT5

BioT5 and BioT5+ collections • 18 items • Updated Oct 23, 2025 • 3

upvoted 2 collections 16 days ago

ODA-Mixture

High-quality mixture datasets for post-training covering multiple domains. • 7 items • Updated 16 days ago • 4

ODA-Math

High-quality mathematical datasets for post training. • 5 items • Updated 16 days ago • 1

upvoted a paper 16 days ago

Closing the Data Loop: Using OpenDataArena to Engineer Superior Training Datasets

Paper • 2601.09733 • Published Dec 30, 2025 • 8

upvoted a paper about 1 month ago

Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience

Paper • 2512.17260 • Published Dec 19, 2025 • 51

upvoted a paper about 2 months ago

OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value

Paper • 2512.14051 • Published Dec 16, 2025 • 46

upvoted a paper 2 months ago

Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights

Paper • 2512.01816 • Published Dec 1, 2025 • 92

upvoted 4 papers 4 months ago

Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning

Paper • 2510.04081 • Published Oct 5, 2025 • 23

ScaleDiff: Scaling Difficult Problems for Advanced Mathematical Reasoning

Paper • 2509.21070 • Published Sep 25, 2025 • 9

Sequential Diffusion Language Models

Paper • 2509.24007 • Published Sep 28, 2025 • 46

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Paper • 2509.22186 • Published Sep 26, 2025 • 142

upvoted a paper 6 months ago

Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning

Paper • 2507.17512 • Published Jul 23, 2025 • 37

upvoted a paper 7 months ago

REST: Stress Testing Large Reasoning Models by Asking Multiple Problems at Once

Paper • 2507.10541 • Published Jul 14, 2025 • 30

upvoted a paper 9 months ago

CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges

Paper • 2504.19093 • Published Apr 27, 2025 • 18

upvoted 2 papers 10 months ago

A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis

Paper • 2504.12322 • Published Apr 11, 2025 • 28

Heimdall: test-time scaling on the generative verification

Paper • 2504.10337 • Published Apr 14, 2025 • 33