Discussion (50 Comments) · Read Original on HackerNews
Compilers literally made your project possible!
https://reproducible-builds.org/docs/source-date-epoch/
(although Nix sets it as a default)
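As a rough illustration of what the linked page describes: a build tool that wants reproducible output reads `SOURCE_DATE_EPOCH` (seconds since the Unix epoch) instead of asking the wall clock. The sketch below is only an example; the helper name `build_timestamp` and its `environ` parameter are made up for illustration and do not come from any particular build system.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=None):
    """Return the timestamp a build should embed, as an aware UTC datetime.

    `environ` is a hypothetical parameter for easy testing; it defaults
    to the real process environment.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("SOURCE_DATE_EPOCH")
    # Fall back to the wall clock only when the variable is unset.
    # That fallback is what makes two builds of the same source differ.
    seconds = int(raw) if raw is not None else int(time.time())
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


if __name__ == "__main__":
    print(build_timestamp().isoformat())
```

With the variable set to the same value, two runs embed the same timestamp regardless of when or where they execute.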
> I decided to take inspiration from the legendary talk The Birth and Death of JavaScript and just recompile the WebAssembly to JavaScript.
So what do you do when the client has JavaScript disabled?
Here, since any WHATWG cartel web engine is an issue, the author should not bother.
If you want to have users trust that someone else hasn't modified it, then sign it with your identity.
Being able to reproduce the binary from the source code and being able to verify that it's the same as the original is quite important in some contexts.
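The verification step mentioned here can be as small as comparing digests of the published binary and a local rebuild. A minimal sketch, assuming both artifacts are ordinary files on disk; the function names `sha256_of` and `is_reproduced` are hypothetical, and signing is a separate step not shown.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_reproduced(published_path, rebuilt_path):
    """True when the rebuilt artifact is byte-for-byte identical."""
    return sha256_of(published_path) == sha256_of(rebuilt_path)
```

A signature tells you who published a binary; a matching digest from an independent rebuild tells you it corresponds to the source.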
That tooling is a compiler. The higher level, the better chance the LLM can be steered to good output. Machine code is hopeless, don’t bother.
Also, there are dynamic compilers where the shape of the machine code changes as the code executes, and each execution will certainly generate different sequences, depending on how the program runs and where it is running.
Deterministic code generation in JIT compilers, at least optimising ones, is not a solved problem.
You can have LLMs help you optimize code but I don’t think you can do this unattended for non-trivial code.
I don't see why that's the case. Wouldn't an LLM trained on binaries see it?
Also the tool can also be running the test and a debugger.
It would not. You find the correct version by counting the number of bytes to the destination. LLMs are famously bad at this kind of problem (counting).
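To make the byte-counting point concrete, here is a sketch using x86 `JMP` as an assumed example (the comment does not name an architecture). The short form is opcode `EB` plus an 8-bit displacement, the near form is `E9` plus a 32-bit displacement, and the displacement is measured from the end of the jump instruction itself. The helper `encode_jmp` is hypothetical.

```python
import struct


def encode_jmp(instr_addr, target_addr):
    """Pick the x86 JMP encoding by counting bytes to the destination."""
    # Short form: EB rel8, 2 bytes total, displacement from end of instruction.
    rel8 = target_addr - (instr_addr + 2)
    if -128 <= rel8 <= 127:
        return bytes([0xEB]) + struct.pack("<b", rel8)
    # Near form: E9 rel32, 5 bytes total, little-endian signed displacement.
    rel32 = target_addr - (instr_addr + 5)
    return bytes([0xE9]) + struct.pack("<i", rel32)
```

An assembler does this for every jump, and because choosing the longer form shifts every later address, the counts have to be redone until they settle. That kind of exact bookkeeping is what the comment argues LLMs handle poorly.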
> Also the tool can also be running the test and a debugger.
The test needs to provide a good amount of signal. That’s too hard if you are throwing machine code at the wall.
In order for debuggers to work, you need some kind of model that describes what the code should do and what state the computer should be in after each instruction. That model is high-level code.
I can understand the intuitive appeal of training LLMs with machine code, but all of my experience with LLMs suggests that they are incredibly ill-suited to the task, and we just don’t have the capacity to train them to make useful machine code.
Done! Excellent abstraction. High intelligence.