Autoresearch tools compared: 5 ways to run autonomous experiments

A practical comparison of karpathy/autoresearch, pi-autoresearch, autoexp, Claude Autoresearch, and Crucible: what each does well, and where each breaks down.

March 18, 2026 · 10 min

I let AI run 100 experiments. It learned to cheat.

An LLM agent tasked with training a neural net decided it was faster to just not. Then it got creative.

March 18, 2026 · 6 min