Discussion (11 Comments)

oraziorillo 3 days ago
Hey HN, I'm Orazio. I built microcrad (with a 'c'), a tiny scalar-valued automatic differentiation engine, with a small multi-layer perceptron implementation on top. It's a reimplementation of Andrej Karpathy's micrograd in C. For me, this was a learning project to revisit backpropagation from first principles, with the additional difficulties that come with programming in C.

The basic idea is the same as micrograd: each number is a `Value` node in a computation graph, ops connect nodes, and the `backward` function topologically sorts the graph before applying the chain rule in reverse. The C-specific parts are memory management and two simple data structures I needed to implement backprop: sets and vectors.
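
To make the description above concrete, here is a minimal sketch in C of a scalar node, two ops, and a `backward` pass that topologically sorts the graph and then applies the chain rule in reverse. The struct layout and the names `value_new`, `value_add`, `value_mul`, and the `max_nodes` parameter are assumptions for illustration, not microcrad's actual API, and freeing the nodes is left out to keep it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node layout; microcrad's real struct may differ. */
typedef struct Value Value;
struct Value {
    double data;     /* forward result */
    double grad;     /* d(root)/d(this node), filled in by backward() */
    Value *prev[2];  /* operands that produced this node (NULL for leaves) */
    char op;         /* '+', '*', or 0 for a leaf */
    bool visited;    /* marker used by the topological sort */
};

Value *value_new(double data) {
    Value *v = calloc(1, sizeof *v);  /* zeroes grad, prev, op, visited */
    assert(v != NULL);
    v->data = data;
    return v;
}

static Value *binary(Value *a, Value *b, char op, double data) {
    Value *out = value_new(data);
    out->prev[0] = a;
    out->prev[1] = b;
    out->op = op;
    return out;
}

Value *value_add(Value *a, Value *b) { return binary(a, b, '+', a->data + b->data); }
Value *value_mul(Value *a, Value *b) { return binary(a, b, '*', a->data * b->data); }

/* Post-order DFS: a node is appended only after all of its operands. */
static void topo(Value *v, Value **order, size_t *n, size_t cap) {
    if (v == NULL || v->visited) return;
    v->visited = true;
    topo(v->prev[0], order, n, cap);
    topo(v->prev[1], order, n, cap);
    assert(*n < cap);
    order[(*n)++] = v;
}

/* max_nodes is a caller-supplied upper bound on the graph size.
   The visited flags are not reset, so call this once per graph. */
void backward(Value *root, size_t max_nodes) {
    Value **order = malloc(max_nodes * sizeof *order);
    assert(order != NULL);
    size_t n = 0;
    topo(root, order, &n, max_nodes);
    root->grad = 1.0;
    for (size_t i = n; i-- > 0;) {  /* walk the order from root to leaves */
        Value *v = order[i];
        Value *a = v->prev[0], *b = v->prev[1];
        if (v->op == '+') {
            a->grad += v->grad;
            b->grad += v->grad;
        } else if (v->op == '*') {
            a->grad += b->data * v->grad;
            b->grad += a->data * v->grad;
        }
    }
    free(order);
}
```

For `z = x*y + x` with `x = 2` and `y = 3`, calling `backward(z, 8)` leaves `x->grad` at 4 (that is, `y + 1`) and `y->grad` at 2. Gradients are accumulated with `+=` because a node such as `x` can feed into more than one op.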

The source code is about 1,350 lines, MIT licensed, and well documented. Dependencies are just the standard library and libm. In addition, the repo contains two examples to showcase how the engine works: a toy regression and an MNIST task.

What this is not: a framework to build and train neural networks in production. Being scalar-valued makes it slow, and it wasn't built for numerical robustness or large datasets. There's no commercial aim here; it's a learning project.

If you read through it, I'd like to hear thoughts, both on the ML engineering aspect and on anything that reads as un-idiomatic C.

megadragon9 18 minutes ago
Interesting project. Do you think manual memory management helps you understand the computational graph lifecycle better, or does it distract from backprop itself?

btw, I went down the micrograd path with NumPy primitives, all the way to building a PyTorch clone that can pre-train and post-train LLMs (https://github.com/workofart/ml-by-hand). My learning focus was on the math/calculus <-> high-level APIs, instead of efficiency. I'm glad to see more people tackling this problem from different angles.

uecker 42 minutes ago
Two things stick out as un-idiomatic for C. First, the casts before malloc are unnecessary. You do this in C++, but not in C. Second, names beginning with an underscore are reserved, and underscore + capital letter is specifically problematic.
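
As an illustration of both points, here is a small sketch of an allocation written without the cast, using a type name that avoids a leading underscore. The `Scalar` type and `scalar_new` function are hypothetical names, not taken from the microcrad source, and aborting on allocation failure is a simplification for the sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type. A name such as _Scalar (underscore + capital letter)
   would be reserved for the implementation, so none is used here. */
typedef struct {
    double data;
    double grad;
} Scalar;

Scalar *scalar_new(double data) {
    /* C++ style would be: (Scalar *)malloc(sizeof(Scalar)).
       In C, void * converts implicitly, so the cast adds nothing.
       sizeof *s also stays correct if the type of s ever changes. */
    Scalar *s = malloc(sizeof *s);
    assert(s != NULL);  /* simplification: abort on allocation failure */
    s->data = data;
    s->grad = 0.0;
    return s;
}
```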

The rest looks fairly nice, but there are a couple of things I would do differently: I would drop the tests for NULL, use signed integers for indices and dimensions, use a flexible array member to integrate the data into the vector type directly, and omit the capacity field (as long as benchmarking does not show it is really needed). I would also use variably modified types for bounds checking, and with C23 the include guards become largely unnecessary.
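
A rough sketch of what several of these suggestions could look like together: a vector with a signed length, no capacity field, and the elements stored inline through a flexible array member, plus a function whose parameter has a variably modified type. The names `Vec`, `vec_new`, `vec_push`, and `vec_sum` are assumptions for illustration and do not come from the microcrad source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical vector of doubles with the data stored inline. */
typedef struct {
    ptrdiff_t len;  /* signed length: len - 1 on an empty vector is -1 */
    double data[];  /* flexible array member, must be the last field */
} Vec;

/* One allocation covers both the header and the elements. */
Vec *vec_new(ptrdiff_t len) {
    assert(len >= 0);
    Vec *v = malloc(sizeof *v + (size_t)len * sizeof v->data[0]);
    assert(v != NULL);
    v->len = len;
    for (ptrdiff_t i = 0; i < len; i++) v->data[i] = 0.0;
    return v;
}

/* Without a capacity field every push reallocates. realloc may move
   the block, so the caller must use the returned pointer. */
Vec *vec_push(Vec *v, double x) {
    ptrdiff_t n = v->len + 1;
    Vec *grown = realloc(v, sizeof *grown + (size_t)n * sizeof grown->data[0]);
    assert(grown != NULL);
    grown->data[n - 1] = x;
    grown->len = n;
    return grown;
}

/* Variably modified parameter type: the signature records that xs has
   n elements, which some compilers and sanitizers can use for checks. */
double vec_sum(ptrdiff_t n, const double xs[n]) {
    double total = 0.0;
    for (ptrdiff_t i = 0; i < n; i++) total += xs[i];
    return total;
}
```

Typical use is `v = vec_push(v, 3.0);` followed by `vec_sum(v->len, v->data)`. Reallocating on every push is the cost of omitting the capacity field, which is why the comment ties that choice to benchmarking.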

valleyer 16 minutes ago
Names beginning with double underbar (or single underbar + capital letter) are reserved. Single underbar + lowercase is not. C23 §6.4.2.1.

smasher164 about 2 hours ago
Is there a reason you didn't go with something like Boehm for a library GC, instead of writing your own reference counting implementation?

oraziorillo about 1 hour ago
Mainly did it for learning, as dgellow correctly presumed, but also there’s something intrinsically beautiful in writing code with zero dependencies.

dgellow about 2 hours ago
Learning, I presume?

toxik about 2 hours ago
Also, refcounting is not a very difficult thing to implement.
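
For a sense of scale, here is a minimal sketch of reference counting for graph nodes in C. It is not microcrad's implementation: the `Node` layout, the function names, and the `live_nodes` counter (there only to make the behavior observable) are all assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted graph node. */
typedef struct Node Node;
struct Node {
    int refs;       /* number of owners: user handles plus parent nodes */
    Node *prev[2];  /* operands, each holding one reference */
    double data;
};

static int live_nodes = 0;  /* demo bookkeeping only */

Node *node_retain(Node *n) {
    if (n != NULL) n->refs++;
    return n;
}

/* The new node starts with one reference (the caller's) and takes
   one additional reference on each operand. */
Node *node_new(double data, Node *a, Node *b) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->refs = 1;
    n->data = data;
    n->prev[0] = node_retain(a);
    n->prev[1] = node_retain(b);
    live_nodes++;
    return n;
}

/* Drop one reference. When the count reaches zero, release the
   operands first and then free the node itself. */
void node_release(Node *n) {
    if (n == NULL) return;
    assert(n->refs > 0);
    if (--n->refs > 0) return;
    node_release(n->prev[0]);
    node_release(n->prev[1]);
    free(n);
    live_nodes--;
}
```

Because a computation graph is acyclic, plain counting is enough here and no cycle detection is needed. The recursive release can go deep on long chains, so a real implementation might prefer an explicit work list.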

TituxDev 3 days ago
I did a similar project, but my approach to the topology definition was declaring perceptron structs with inputs as pointer arrays and output as a regular variable. With this scheme, perceptrons can directly reference the outputs of other perceptrons, or even their own output (I haven't implemented that yet).
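
One way to picture this scheme is the sketch below, where each unit stores an array of pointers to its inputs and a plain `double` for its output, so wiring two units together means pointing one unit's input at the other's `output` field. The `Perceptron` layout, the `perceptron_eval` name, and the ReLU activation are assumptions for illustration, not TituxDev's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical perceptron: inputs are pointers, output is a plain value. */
typedef struct {
    const double **inputs;  /* each entry points at a source value */
    const double *weights;  /* one weight per input */
    size_t n_inputs;
    double bias;
    double output;          /* other units may point here */
} Perceptron;

/* Weighted sum of the pointed-to inputs followed by ReLU
   (the activation is an arbitrary choice for this sketch). */
void perceptron_eval(Perceptron *p) {
    assert(p->n_inputs == 0 || (p->inputs != NULL && p->weights != NULL));
    double sum = p->bias;
    for (size_t i = 0; i < p->n_inputs; i++) {
        sum += p->weights[i] * *p->inputs[i];
    }
    p->output = sum > 0.0 ? sum : 0.0;
}
```

With this layout the topology lives entirely in the pointer arrays: a second unit whose `inputs` contains `&first.output` reads whatever the first unit last computed, so units have to be evaluated in dependency order.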