April 01, 2025 · Berlin
smolR1
Demonstrating a reproducible DeepSeek R1 implementation using Qwen2.5-0.5B on two 4090 GPUs, providing a compact, stable GRPO baseline for rapid RL experimentation.
Overview
Reproducing DeepSeek's R1 at the smallest scale with Qwen2.5-0.5B on two 4090 GPUs.
A smol, stable baseline for rapid experimentation.
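For readers unfamiliar with GRPO, the core idea is that each sampled completion is scored relative to the other completions for the same prompt, so no separate value model is needed. The snippet below is a minimal sketch of that group-relative advantage step only, not the code from this talk. The function name, the `eps` value, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of one prompt's completions within the group.

    Hypothetical helper for illustration: each advantage is the reward
    minus the group mean, divided by the group standard deviation.
    `eps` (an assumed value) avoids division by zero when all rewards match.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Example: four completions for one prompt, scored 1.0 if correct, else 0.0.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that beat the group average get a positive advantage and the rest get a negative one. If every completion in a group earns the same reward, all advantages are zero and that group contributes no learning signal.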
Links
Reproduces DeepSeek R1 Zero using Qwen2.5-0.5B on two 4090 GPUs.
Tech stack