Discussion (39 Comments)
You already lost me here. There's a reason variable names are a thing in programming, and that's to semantically convey meaning. This matters no matter whether a human or an LLM is writing the code.
Agreed.
I'm working on a language designed for machines to write and humans to understand and review.
It doesn't seem worthwhile to have code nobody can understand.
So I wonder, doesn't this apply to function names too, which the author keeps in? I've seen LLMs use the wrong functions and classes as well.
I think a proper harness, LSP and tests already solve everything Vera is trying to solve. They mostly cite research from 2021, before coding harnesses and agentic loops were a thing, back when people were basically trying to one-shot with models that are pretty weak by modern standards.
Good luck managing hallucinations on that context
I think Vera might be missing something here. In my experience, LLMs code better the less of a mental model you need and the more is spelled out in text on the page.
Go – very little hidden, everything in text on the page, LLMs are great. Java, similar. But writing Haskell, it's pretty bad, Erlang, not wonderful. You need much more of a mental model for those languages.
For Vera, not having names removes key information that the model would have, and replaces it with mental modelling of the stack of arguments.
I’m surprised by this. Most likely significant white space is a big part of the problem (LLMs seem horrible at white space). Functional with types has been a win for me with Gleam.
Surely, denser languages should be better for LLMs?
I think in the context of already trained LLMs, the languages most suited to LLMs are also the ones most suited to humans. Besides those languages having the most code to train on, humans face similar limitations: if the language is too dense, they have to be very careful in considering how to do something; if it's too sparse, the code becomes a pain to maintain.
https://arxiv.org/html/2510.11151v1
If I had to design one of these, I'd go for:
1. Token minimization (which may be circular, I'm sure tokens are selected for these models at least in part based on syntax of popular languages)
2. As many compile time checks as possible (good for humans, even better for machines with limited context)
3. Maximum locality. That is, a feature can largely be written in one file, rather than in bits and pieces all over the codebase, because of how context and attention work. This is the one I don't see much in commercially popular languages. It's more of a declarative thing, "configuration driven development".
So, orthogonal to the accepted, common code organization idiom (no matter how infrequently adhered to)?
Fascinating! Just the other day I decomposed a massive Demeter violation into stepwise proxying "message passing." I was concerned that implementing this entire feature (well, at least a solid chunk of it) as a single, feature-scoped module would cause the next developer's eyes to glaze over upon encountering such a ball of mud, such a dense vortex of spaghetti.
But, as I drove home that evening, I couldn't help but wonder if I hadn't, instead, merely buried the Gordian lede behind so many ribbons of silk.
This seems to be at odds with the goal of token minimization. Lots of small files that are narrowly scoped means less has to be loaded into context when making a change, right?
Throwing out another idea: I wonder if we could see some kind of equivalent of c header files for more modern languages so that an llm just has to read the equivalent of a .h file to start using a library.
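To make the header-file idea above concrete, here is a minimal sketch in Python, assuming the "header" is just one signature line per top-level function or class with the bodies dropped. The function name header_summary and the sample source are hypothetical, not something from the thread or from any existing tool.

```python
import ast


def header_summary(source: str) -> list[str]:
    """Return one signature line per top-level function or class, bodies omitted."""
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            # ast.unparse (Python 3.9+) turns the argument and annotation nodes back into text.
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            lines.append(f"{prefix} {node.name}({ast.unparse(node.args)}){returns}")
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}")
    return lines


# Hypothetical library source, used only to exercise the sketch.
EXAMPLE_SOURCE = '''
def area(width: float, height: float) -> float:
    return width * height

class Shape:
    def describe(self):
        return "shape"
'''

summary = header_summary(EXAMPLE_SOURCE)
```

An agent could be handed summary instead of the full file and only request a body when it actually needs one. Methods, docstrings, and decorators are left out here to keep the sketch short.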
My solution (as someone that's building something tangential) is to use granular levels of scope: there should be an implicit single file that gets generated from a package at a certain phase of the static tool processing. But the package is still split into files for flexibility and DevEx (developer experience). File and folder organization is super useful for humans. For tooling, the package can be collected together and taken as a single unit, but still decomposed based on things like namespaces and the top-level definitions of classes, specifications, etc. That way the tooling has control over how much context to pass in.
Similarly, I don't read the whole file a function is in while editing it in an IDE, why should a coding agent get the whole file polluting its context by default?
There is no actual thought occurring. Arguably, we can say the same about a lot of humans at any given moment, but with machines there never is. It's all statistics.
It doesn't have Hindley-Milner type inference, but it has very strong type inference.
We will get linearity soon thanks to and as part of the Capybara[1] effort.
Refinement types are already long a reality.
The whole new effect tracking thing is based on delimited continuations.
The Unison-style content addressability comes up now and then; maybe it will become a reality at some point. It's mostly a build-system concern rather than a language one, though.
Scala is already great for LLMs for other reasons too:
https://arxiv.org/html/2510.11151v1
[1] https://2025.workshop.scala-lang.org/details/scala-2025/6/Sy...
The major design decision I'm a little skeptical about is removing variable names; it would be interesting to see empirical data on that as it seems a bit unintuitive. I would expect almost the opposite, that variable names give LLMs some useful local semantics.
https://news.ycombinator.com/item?id=47957121
Elaborate a little here.
C# can do something similar with null references. It can require you to indicate which arguments and variables are capable of being null, and then raise a compiler error or warning if you pass one to something that expects a non-null reference without a null check.
So, in pseudocode:
int div(int a, int b): return a / b;
would probably be a compile-time error, but
int div(int a, int b): return b == 0 ? ERR : (a / b);
would not, or at least that's what I'd expect.
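For readers who want something runnable, the same idea can be approximated with a wrapper type that is only produced by an explicit zero check. This is a sketch under assumptions: NonZeroInt, check_non_zero and safe_div are hypothetical names, and Python only enforces this at runtime, so the "compile time" part would have to come from a static checker such as mypy flagging a plain int passed where NonZeroInt is expected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NonZeroInt:
    """Hypothetical wrapper: an int that has already passed a zero check."""
    value: int


def check_non_zero(n: int) -> Optional[NonZeroInt]:
    """The one intended way to obtain a NonZeroInt: an explicit check."""
    return NonZeroInt(n) if n != 0 else None


def div(a: int, b: NonZeroInt) -> int:
    # No zero check here: the parameter type records that it already happened.
    return a // b.value


def safe_div(a: int, b: int) -> Optional[int]:
    divisor = check_non_zero(b)
    if divisor is None:
        return None  # plays the role of ERR in the pseudocode above
    return div(a, divisor)
```

Nothing stops a caller from writing NonZeroInt(0) directly, which is exactly the gap a language with built-in refinement types would close.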
It appears that the creator and I have had vastly different experiences with LLMs and their capabilities with complex code bases and complicated business logic.
My observations point to LLMs being much more successful when variables and methods have explicit, detailed names. It's the best way to keep them on track and minimize the chance of confusion, with the next closest thing being explicit comments and inline documentation.
Poorly named and poorly documented things in a codebase only cause it to reason more on what it could be, often reaching a (wrong) conclusion, wasting tokens, wasting time.
Perhaps this divergence in philosophy is due to fundamental differences in how we view the tool at hand.
I do not trust the machine. As such, I review its output, and if the variables lacked names, that would be significantly harder. But if I had a "Jesus, take the wheel!" attitude, perhaps I'd care far less.
Edit: the more I think about it, the more this seems like a really bad idea. Three more issues come to mind: 1) it becomes impossible to grep for a variable, which I know agents do all the time; 2) editing code at the top of the function, say introducing a new variable, can require editing all the code in the rest of the function, even if it was semantically unchanged; 3) they say it is less context for the LLM to track, but now, instead of just having to know the name of one variable, you have to keep track of every other variable in the function.
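Point 2 can be illustrated with a small sketch. It assumes a simplified scheme in which a reference is a position counted back from the most recently bound value (index 0 is the newest); the thread does not spell out Vera's actual rules, and this ignores types entirely. The names in the scope lists exist only to make the result readable.

```python
def resolve(scope: list[str], index: int) -> str:
    """Look up a positional reference: index 0 is the most recently bound value."""
    return scope[-1 - index]


# Bindings visible at some line in the function body, oldest first.
scope_before = ["price", "quantity"]
# The same line after one new binding is introduced earlier in the body.
scope_after = ["price", "quantity", "tax_rate"]

# Positional references used by a line of code that nobody touched.
expression = [1, 0]

meaning_before = [resolve(scope_before, i) for i in expression]
meaning_after = [resolve(scope_after, i) for i in expression]

# To keep the old meaning, every reference on the line has to be shifted by one.
shifted = [i + 1 for i in expression]
meaning_fixed = [resolve(scope_after, i) for i in shifted]
```

Here the untouched line silently changes from (price, quantity) to (quantity, tax_rate) until its indices are rewritten, whereas with names the line would not need to change at all.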