
Mirdan

Engineering build-logs. We pick a real problem, build a working system, and write down every wall we hit on the way.

Build Log · Spatial AI

GASv2

Zero-shot semantic mapping from phone video to 3D scene graph, using four foundation models stitched together on pay-per-second GPUs. Ten invocations, seven crashes, one honest map. Everything we learned about Modal, VGGT, Grounding DINO, SAM 2, and the CUDA version that quietly changed under us.

Modal · VGGT · Grounding DINO · SAM 2 · Open3D · Rerun · CUDA · PCA
16 chapters · ~45 min · 10 runs · 6 artifacts
Build Log · Robot Perception

HuggingBros

Real-time depth and detection for robot teleoperation — Depth Anything v2 and Grounding DINO on Modal GPUs, streamed live to a Rerun viewer and a browser. Two robots (SO-101 arm + Reachy Mini), one perception pipeline, every decision we made.

Modal · Depth Anything v2 · Grounding DINO · Rerun · LeRobot · Reachy Mini · MJPEG
17 chapters · ~35 min · 2 robots · 2 endpoints
Build Log · Imitation Learning

BoxBros

Teaching a 6-DOF SO-101 arm to pick whatever you point at — Action Chunking Transformer plus a 4-D bounding-box prompt token, trained on 80 self-collected episodes and served from Modal. Two policies (vanilla + prompted), one source file, one dashboard.

ACT · LeRobot v3 · Modal · Grounding DINO · FastAPI · parquet · CVAE
14 chapters · ~40 min · 80 eps · 32k steps