🍓 One of the coolest parts of being an early Strawberry user has been the chance to build on the app from the ground floor.
The platform already has a ton of great integrations that let you interact with your external apps directly through tools, but I wanted to add the ability to do things in Slack as well.
💪 So I took Anthropic's base Slack MCP server, added a whole bunch of new tools, generalized it into an HTTP-based SSE server, and deployed it with Railway in about 2 minutes so that Strawberry could make use of it (as can Claude or any other MCP client).
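👇 For anyone curious what that pattern looks like, here's a minimal sketch (not the actual code from the repo) of an MCP server exposing a single Slack tool over SSE, written against the Python MCP SDK's FastMCP helper and slack_sdk. The tool name, arguments, and token handling are placeholders of my own.

```python
# Minimal sketch: an MCP server exposing one Slack tool over SSE.
# Assumes the official Python MCP SDK ("mcp" package) and slack_sdk;
# the tool, its arguments, and the env var name are illustrative placeholders.
import os

from mcp.server.fastmcp import FastMCP
from slack_sdk import WebClient

mcp = FastMCP("slack-tools")
slack = WebClient(token=os.environ["SLACK_BOT_TOKEN"])

@mcp.tool()
def post_message(channel_id: str, text: str) -> str:
    """Post a message to a Slack channel and return its timestamp."""
    resp = slack.chat_postMessage(channel=channel_id, text=text)
    return resp["ts"]

if __name__ == "__main__":
    # Serve over SSE so remote MCP clients (Strawberry, Claude, etc.) can connect over HTTP.
    mcp.run(transport="sse")
```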
Now, you can Chat with your Strawberry Companion (or Claude, or whatever) and do things like:
➡️ Get caught up across all of your Slack channels after a long weekend or noisy incident without having to read 20 threads in 10 different channels
➡️ Create, read, and edit Canvases, Messages, and Channels
➡️ Take any resources or content that you're using in your Chat and inject it directly into Slack without copy/paste
😎 I'm pretty pleased with how it turned out, and I made a short demo video showing the work in action (link in comments). The best part is that it's available on GitHub for anyone else to use too (link in the comments, instructions in the README). Setup takes about 5-10 minutes.
Introducing StereoSpace –– our new end-to-end method for turning photos into stereo images without explicit geometry or depth maps. Skipping explicit geometry makes it especially robust to thin structures and transparencies. Try the demo below:
💭 Do thinking traces make Language Models learn better? Curious what others think
𝗦𝗰𝗲𝗻𝗮𝗿𝗶𝗼 You take an instruction-following LM. You want to train it with a GRPO-style RL algorithm on a task like Tic Tac Toe. Rewards are outcome-based, applied only at the end of each episode: win/loss/draw, format adherence...
During training, the model could just output answers, but a common choice is to make it also output thinking traces.
𝗧𝗵𝗲 𝗾𝘂𝗲𝘀𝘁𝗶𝗼𝗻 Does forcing the model to produce thinking traces during training actually improve learning❓
💬 I'd like to hear your thoughts. Share ideas and links to relevant papers and resources.
From what I've understood so far, the answer seems to be 𝘆𝗲𝘀.
1️⃣ If you force the model to think during training, it becomes a model that thinks at inference time. It naturally allocates more budget (tokens) to a problem, which tends to improve performance.
2️⃣ While the model's "reasoning" already exists in its activation space, using explicit thinking traces as a scratchpad allows training to steer and shape that reasoning.
3️⃣ As the model produces more traces during training, the RL algorithm can progressively give higher rewards to the reasoning patterns that lead to better outcomes.
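🧪 To make the scenario concrete, here's a rough sketch of what an outcome-plus-format reward and the group-relative advantages at the heart of GRPO could look like for the Tic Tac Toe setup. The <think>/<answer> tag format, reward weights, and helper names are my own illustrative assumptions, not taken from any particular implementation.

```python
# Rough sketch: outcome-based episode reward with a format-adherence bonus,
# plus GRPO-style group-relative advantages. Tags, weights, and names are
# illustrative assumptions, not a reference implementation.
import re
from statistics import mean, pstdev

THINK_ANSWER = re.compile(r"<think>.*?</think>\s*<answer>.*?</answer>\s*", re.DOTALL)

def episode_reward(completion: str, outcome: str) -> float:
    """Reward granted only at the end of the episode (win/loss/draw + format)."""
    outcome_reward = {"win": 1.0, "draw": 0.5, "loss": 0.0}[outcome]
    format_reward = 0.1 if THINK_ANSWER.fullmatch(completion.strip()) else 0.0
    return outcome_reward + format_reward

def group_advantages(rewards: list[float]) -> list[float]:
    """GRPO core: normalize each sampled completion's reward against its group."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + 1e-6) for r in rewards]

# Four hypothetical completions sampled for the same board state.
group = [
    ("<think>center is open</think> <answer>B2</answer>", "win"),
    ("<think>block the fork</think> <answer>A1</answer>", "draw"),
    ("B2", "win"),                                         # good move, no thinking trace
    ("<think>...</think> <answer>C3</answer>", "loss"),
]
rewards = [episode_reward(c, o) for c, o in group]
advantages = group_advantages(rewards)  # reasoning patterns that beat the group average get reinforced
```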
Introducing 🇨🇭WindowSeat🇨🇭 –– our new method for removing reflections from photos taken through windows – on planes, in malls, in offices, and in other glass-filled environments.
Finetuning a foundation diffusion transformer for reflection removal quickly runs up against the limits of what existing datasets and techniques can offer. To fill that gap, we generate physically accurate examples in Blender that simulate realistic glass and reflection effects. This data enables strong performance on both established benchmarks and previously unseen images.
To make this practical, the open-source, Apache-2.0-licensed model builds on Qwen-Image-Edit-2509, a 20B image-editing diffusion transformer that runs on a single GPU and can be fine-tuned in about a day. WindowSeat keeps its use of the underlying DiT cleanly separated from the data and training recipe, so future advances in base models can be incorporated with minimal friction.
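For the curious, here's a hedged sketch of what running an image-editing DiT like this through Hugging Face diffusers could look like. The checkpoint name, edit prompt, and exact call signature are assumptions on my part; check the project's README for the real usage.

```python
# Hedged sketch: running an image-editing diffusion transformer such as
# Qwen-Image-Edit-2509 via Hugging Face diffusers. The checkpoint, prompt,
# and call arguments are assumptions; see the project README for actual usage.
import torch
from diffusers import DiffusionPipeline
from PIL import Image

pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2509",  # base model; a WindowSeat fine-tune would swap in here
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

source = Image.open("window_photo.jpg").convert("RGB")
result = pipe(
    image=source,
    prompt="remove the reflections from the glass",  # illustrative edit instruction
    num_inference_steps=40,
).images[0]
result.save("window_photo_clean.jpg")
```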
What a trip. Just walked through @burtenshaw and @evalstate's tutorial on adding Hugging Face Skills to your Claude Code agent so you can fine-tune LLMs by chatting with AI.
These are the kinds of innovations that will help everyone benefit from the power of artificial intelligence. Well done, gentlemen, and thank you for sharing.
😐 I keep seeing takes on LinkedIn from American business influencers melting down about Silicon Valley startup "dependence" on open-source Chinese models.
🤔 Can anyone describe a credible scenario in which these models could be leveraged by the Chinese government to endanger American security interests, or am I right to believe this is just Red Scare nonsense?