hammer32 about 6 hours ago | 22 comments | Read Article on github.com


I trained a transformer in HyperCard. 1,216 parameters. 1989 Macintosh. And yes, it took a while.

MacMind is a complete transformer neural network (embeddings, positional encoding, self-attention, backpropagation, and gradient descent) implemented entirely in HyperTalk, the scripting language Apple shipped with HyperCard in 1987. Every line of code is readable inside HyperCard's script editor: Option-click any button and read the actual math.
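To make the components above concrete, here is a minimal NumPy sketch of single-head self-attention, the same math the HyperTalk scripts implement. The shapes and random inputs here are purely illustrative, not the stack's actual dimensions:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    X: (seq_len, d_model) embeddings with positional encoding added.
    Wq, Wk, Wv: (d_model, d_head) projection matrices.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])           # (seq_len, seq_len)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                # row-wise softmax
    return w @ V                                      # (seq_len, d_head)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))        # 8 positions, toy 4-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (8, 4)
```

The attention weights are what let the model route information between positions, which is exactly the mechanism the bit-reversal task exercises.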

The task: learn the bit-reversal permutation, the opening step of the Fast Fourier Transform. The model has no formula to follow. It discovers the positional pattern purely through attention and repeated trial and error. By training step 193, it was oscillating between 50%, 75%, and 100% accuracy on successive steps, settling into convergence like a ball rolling into a bowl.
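For reference, the target mapping the model has to discover is easy to state directly. A quick sketch (my own illustration, separate from the repo's NumPy reference implementation) for 8 positions, i.e. 3 bits:

```python
def bit_reverse(i, bits):
    """Reverse the low `bits` bits of integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

# The FFT's input reordering for 8 positions (3 bits).
perm = [bit_reverse(i, 3) for i in range(8)]
print(perm)  # [0, 4, 2, 6, 1, 5, 3, 7]
```

The permutation is its own inverse, which is part of what makes it a clean, checkable target for a tiny model.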

The whole "intelligence" is 1,216 numbers stored in hidden fields in a HyperCard stack. Save the file, quit, reopen: the trained model is still there, still correct. It runs on anything from System 7 through Mac OS 9.

I'm a former physics student, and the FFT is an old friend: it sits at the heart of signal processing, quantum mechanics, and wave analysis. I built this because we're at a moment where AI affects all of us, but most of us don't understand what it actually does. Backpropagation and attention are math, not magic. And math doesn't care whether it's running on a TPU cluster or a 68030 from 1989.

The repo has a pre-trained stack (step 1,000), a blank stack you can train yourself, and a Python/NumPy reference implementation that validates the math.



Discussion (22 Comments) | Read Original on HackerNews

edwin about 2 hours ago
There’s something quietly impressive about getting modern AI ideas to run on old hardware (like OP's project or running LLM inference on Windows 3.1 machines). It’s easy to think all the progress is just bigger GPUs and more compute, but moments like that remind you how much of it is just more clever math and algorithms squeezing signal out of limited resources. Feels closer to the spirit of early computing than the current “throw hardware at it” narrative.
hammer32 7 minutes ago
Exactly. Working in a constrained environment invites innovation.
wdbm 43 minutes ago
There is an absolutely beautiful rendering of the Mona Lisa encoded at some point in the digits of pi. If you know the position, it's really easy to plot the image.

But first you have to find that position.

hyperhello about 4 hours ago
Hello, if there are no XCMDs it should work adequately in HyperCard Simulator. I am only on my phone but I took a minute to import it.

https://hcsimulator.com/imports/MacMind---Trained-69E0132C

hammer32 about 3 hours ago
I had no idea your simulator existed. No XCMDs, correct; everything is pure HyperTalk. I just ran a few training steps and they complete in a second or two. Thank you for importing it!
hyperhello about 3 hours ago
I gotta ask. Your scripts have comments like -- handlers_math.hypertalk.txt at the top. Are you using some kind of build process for a stack?
hammer32 about 2 hours ago
More of a copy-paste process. The scripts are written as .txt files in Nova on my Mac Studio, then pasted one at a time into HyperCard's script editor on the classic Mac. The files are kept separate because SimpleText has a 32 KB text limit.
gcanyon about 5 hours ago
It's strange to think how modern concepts are only modern because no one thought of them back then. This feels (to me) like germ theory being transferred back to the ancient Greeks.
kdhaskjdhadjk about 3 hours ago
I think it's incredible to see the potential that is still locked up in old hardware. For example the 8088 MPH demo. Amazing what he was able to do with an 8088 and CGA. All this time the hardware had that potential, but it took decades to figure out how to unlock it, long after the hardware was considered obsolete. Imagine the sort of things that might be done later down the road with hardware of 0-20 years ago if somebody really dug into it to that level.
ashleyn about 3 hours ago
Retro console homebrew and demoscene are all about this. There's a lot of fun stuff going on in N64 homebrew right now: https://www.youtube.com/watch?v=rNEo0aQkGnU
tomcam about 2 hours ago
That 8088 MPH demo is a tour de force. Which tells you that the millions of Apple laptops being bricked right now instead of being recycled could have some amazing use if it were possible to wipe them clean and reuse. Sigh.
andai 32 minutes ago
Well, we've set it up so the survival of employees and their families is tied to old products being bricked.
hammer32 about 4 hours ago
Right? Backprop was published in 1986, a year before HyperCard shipped. Attention is newer, but a small model like this was buildable.
jeffbee 21 minutes ago
People did think of many of these core concepts decades ago, but they did not have the resources to put them into practice.
anthk about 3 hours ago
Lisp is from the 1960s, and with s9 you can even do calculus with ease, in an interpreter small enough to fit on two floppies.

On the Greeks: Archimedes almost did 'Calculus 0.9'.

immanuwell 41 minutes ago
The architecture of macmind looks pretty interesting
hammer32 11 minutes ago
Thank you! The constraints made it interesting. HyperCard doesn't have arrays, so the entire model (weights, activations, gradients) is stored as strings in hidden fields, and all of the matrix math is done with "item i of field".
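In other words (a rough Python analogy of the scheme, not the actual HyperTalk, and the field layout here is purely illustrative):

```python
def row_to_items(row):
    """Serialize a row of numbers as a comma-separated string,
    the way a HyperCard field holds 'item 1,item 2,...'."""
    return ",".join(repr(x) for x in row)

def item(s, i):
    """Mimic HyperTalk's 'item i of <string>' (1-indexed)."""
    return float(s.split(",")[i - 1])

# A 2x3 weight matrix kept as two comma-separated strings.
W = [row_to_items([0.1, -0.2, 0.3]),
     row_to_items([0.4, 0.5, -0.6])]
x = row_to_items([1.0, 2.0, 3.0])

# Matrix-vector product done item by item, as the stack does it.
y = [sum(item(W[r], i) * item(x, i) for i in range(1, 4))
     for r in range(2)]
print(y)  # ~[0.6, -0.4]
```

Parsing strings on every access is slow, but it survives save/quit/reopen for free, which is how the trained weights persist in the stack.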
DetroitThrow about 4 hours ago
This is very cool. Any more demos of inference output?
hammer32 about 3 hours ago
Thanks! The quickest way to try it is the HyperCard Simulator link someone just posted in this thread: https://hcsimulator.com/imports/MacMind---Trained-69E0132C — go to the Inference card, click New Random to fill in 8 digits, then click Permute. The model predicts the bit-reversed permutation of all 8 positions. The pre-trained stack gets all inputs correct.