Discussion (26 Comments)
The underlying paper itself is more precise, comparing against LUAR, a 2021 method based on BERT-style embeddings (i.e., a model with 82M parameters, roughly 0.2% of the size of, say, the recent open-weight Gemma models). I don't fault the authors of the paper at all for this; their method is interesting and more interpretable! But you can check the publication history: their paper was originally uploaded in 2024: https://arxiv.org/abs/2403.08462
A good example of why some folks are bearish on journals.
"AI bad" seems to sell in some circles, and while there are many level-headed criticisms to be made of current AI fads, I don't think this qualifies.
"Researchers found that a relatively simple, linguistically grounded method can perform as well as - and in some cases better than - complex artificial intelligence systems in identifying authorship.
The study suggests that increasingly sophisticated AI is not always necessary for high-performing writing analysis, particularly when methods are designed around established principles of how language works."
https://www.nature.com/articles/s41599-025-06340-3/figures/2
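The quoted study's point is that a simple, linguistically grounded approach can compete with large models at authorship identification. A minimal sketch of that style of method, assuming function-word frequencies as the stylistic features and nearest-profile cosine similarity as the decision rule (the feature names and helper functions here are illustrative, not the paper's actual feature set):

```python
# Sketch of feature-based authorship attribution: each author is profiled
# by the relative frequency of common function words, and an unknown text
# is attributed to the candidate whose profile is most similar (cosine).
# The specific word list and similarity choice are assumptions for
# illustration, not the published method.
import math
from collections import Counter

FUNCTION_WORDS = ["the", "a", "of", "and", "to", "in", "that", "is", "it", "with"]

def profile(text: str) -> list[float]:
    """Relative frequency of each function word in the text."""
    words = text.lower().split()
    counts = Counter(words)
    total = max(len(words), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine(u: list[float], v: list[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def attribute(unknown: str, candidates: dict[str, str]) -> str:
    """Return the candidate author whose profile best matches the unknown text."""
    target = profile(unknown)
    return max(candidates, key=lambda name: cosine(profile(candidates[name]), target))
```

Real stylometric systems use far richer features (character n-grams, punctuation habits, syntactic patterns), but the appeal the article describes is exactly this: the features are inspectable, so you can say *why* a text was attributed to an author, which an embedding model cannot easily do.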
Your Star Trek comparison is also incorrect. Following your logic, we've had a "bona fide universal translator" for a while now with websites like Google Translate (and so on). But none of these websites or AI tools are capable of learning languages on the fly, purely from context and with minimal input data (that's the magic of Trek's UT, what they call linguacode).
No, AIs have not “solved” language in any way.
TL;DR, probably never.
Example of LLMs doing well in similar tasks: https://arxiv.org/abs/2602.16800
The stock market crashes once in a while. Shit happens. The long-term outlook is unlikely to change nearly as much, unless you think there will be systemic macroeconomic changes.
What applications do you think make the most sense so far?
Because LLMs have already amortized the man-years of cost of collecting, curating, and training on text corpora?