micro-gopt

A Go hand-reimplementation of https://karpathy.github.io/2026/02/12/microgpt/.

The original Python is included in the repo for reference against bitrot.

To use: go run cmd/main.go input.txt

Differences between the Go and Python versions, along with some more general notes:

  • The GPT is implemented as a package and, separately, as a command-line wrapper that calls it, just to keep the algorithm separate from the invocation details.
  • The Value type is more type-safe in Go: it uses Values everywhere, rather than mingling floats and Values in the localgrad tuple as the Python does.
  • The Value struct has actual tests confirming the backward propagation logic.
  • When writing the Value struct and its methods, I accidentally swapped the order of the values in the localGrads slice in Mul and tore my hair out trying to figure out where the bug was. When I broke down and asked Copilot to "compare these two implementations and tell me how they differ," it managed to find the error, but it also reported three non-existent differences and told me that slices.Backward() doesn't exist (it does, as of Go 1.23).
  • An initial pass at translating the linear algebra functions has me worried that all those Value structs aren't going to be very fast...
  • Had to implement weighted random choice. https://cybernetist.com/2019/01/24/random-weighted-draws-in-go/ made that relatively straightforward; it's a neat algorithm.
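
The weighted random choice mentioned above can be sketched with only the standard library. This is a rough illustration of the cumulative-sum-plus-binary-search idea, not the repo's implementation and not a copy of the linked article's code; `pickFromUniform` and `weightedChoice` are hypothetical names, and the sketch assumes non-negative weights with a positive sum.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// pickFromUniform maps u in [0, 1) to an index so that index i is chosen
// in proportion to weights[i]. Assumes weights are non-negative and sum > 0.
func pickFromUniform(weights []float64, u float64) int {
	// Build the running total (an unnormalized CDF).
	cdf := make([]float64, len(weights))
	sum := 0.0
	for i, w := range weights {
		sum += w
		cdf[i] = sum
	}

	// Scale u into [0, sum) and find the first bucket whose upper edge exceeds it.
	r := u * sum
	i := sort.Search(len(cdf), func(i int) bool { return cdf[i] > r })
	if i == len(cdf) {
		// Guard against floating-point rounding pushing r up to sum.
		i = len(cdf) - 1
	}
	return i
}

// weightedChoice draws one index using the supplied random source.
func weightedChoice(weights []float64, rng *rand.Rand) int {
	return pickFromUniform(weights, rng.Float64())
}

func main() {
	rng := rand.New(rand.NewSource(42))
	weights := []float64{1, 2, 1}
	counts := make([]int, len(weights))
	for n := 0; n < 10000; n++ {
		counts[weightedChoice(weights, rng)]++
	}
	fmt.Println(counts) // counts should come out roughly in a 1:2:1 ratio
}
```

Splitting out `pickFromUniform` keeps the index logic deterministic, so it can be tested with fixed inputs instead of by eyeballing sampled output.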

First proper run:

well, um...

Something's not right here, unless the hit new baby name is kaaaaasehaaeaaal.

...

After a few more rounds of debugging, I'm stumped. There must be some subtle Pythonic behavior that my rewrite isn't capturing, which is turning all my results into nonsense like eadaaaaannnaanba and oetlaaceta, but I can't see it (and I don't know enough Python to find it).

This was still a useful learning opportunity, although a frustrating one in the end.