Discussion (32 Comments)
I don't know if I agree with either assertion… I've seen plenty of human-generated knowledge work that was factually correct, well-formatted, and extremely low quality on a conceptual level.
And AI signatures are now easy for people to recognize. In fact, these turns of phrase aren't just recognizable—they're unmistakable. <-- See what I did there?
Having worked with corporate clients for 10 years, I don't view the pre-LLM era as a golden age of high-quality knowledge work. There was a lot of junk that I would also classify as a "working simulacrum of knowledge work."
Most importantly, those sources of errors tend to be consistent. I can trust a certain intern to be careful but ignorant, or my senior colleague with a newborn daughter to be a well of knowledge who sometimes misses obvious things due to lack of sleep.
With AI it's anyone's guess. They implement a paper in code flawlessly and make freshman-level mistakes in the same run, so you have to engage in the non-intuitive task of reviewing while assuming total incompetence, for a machine that shows extreme competence. Sometimes.
AI signatures don't mean low quality; they just mean AI. And humans do use them (I have always used the common AI signatures). And yes, humans produce good-looking garbage, but much more commonly they produce bad-looking garbage. This is all tangential to the point.
This is especially true if we start to see more of a split in usage between LLMs based on cost. High-quality frontier models might produce better work at a higher cost, but there is also economic cost pressure from the bottom. And just like with human consultants or employees, you'll pay more for higher-quality work.
I'm not quite sure what I'm trying to argue here. But the idea that an LLM won't produce a low-quality report just seemed silly to me.
Working in a team isn't adversarial. If I'm reviewing my colleague's PR, they are not trying to skirt around a feature or cheat on tests.
I can tell when a human PR needs more in-depth review because small things may be out of place: a mutex that may not be needed, etc. (See the sketch below for the kind of thing I mean.) I can ask them about it, and their response will tell me whether they know what they are on about, or whether they need help in this area.
I've had LLM PRs defended by their creator until proven to be a pile of bullshit; unfortunately, only deep analysis gets you there.
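As a concrete illustration of the review smell described above, here is a minimal Python sketch (my own hypothetical example, not from the thread): a lock guarding state that only one thread ever touches. A reviewer who asks "what concurrent access is this protecting?" learns a lot from the answer.

```python
import threading

class Counter:
    """Hypothetical example of a review smell: a mutex that may not be needed."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()  # suspicious: no other thread exists here

    def increment(self):
        # A reviewer should ask: what concurrent access is this lock guarding?
        with self._lock:
            self._value += 1

counter = Counter()
for _ in range(10):
    counter.increment()  # only ever called from the single main thread
print(counter._value)  # 10; the lock added noise, not safety
```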
You wouldn't use a calculator that is as good as a human and makes mistakes as often.
It is not so much that the "tells" of poor-quality work are vanishing, but that even careful scrutiny of work done with AI is going to become too costly to be done only by humans. One only has so much time to read while, say, in economics journals, the appendices extend to hundreds of pages.
Would love to hear whether other fields' journals are experiencing similar pressure, not only at the extensive margin (number of new submissions) but also at the intensive margin (effort needed to check each work).
`simulacrum` is a great word, gotta add that to my vocabulary.
I can see a similar problem with this article: the author notices that LLMs produce a lot of errors, then concludes that they are useless and produce only a simulacrum of work. The author has an interesting observation about how LLMs disrupt the way we judge knowledge work. But when he concludes that LLMs do only a simulacrum of work, that is where his argument fails.
This is not true as stated. I'd try to gloss over the absolutes relative to the context, but if I'm totally honest, I'm not sure I understand what idea you're trying to communicate.
Wait, you're probably talking about the test of discarding a report based on something superficial like spelling errors. Which fails with LLMs due to their basic conman personalities and smooth talking. And therefore...?
Reinforcement Learning with Verifiable Rewards (RLVR) to improve math and coding success rates seems like an exception.
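For readers unfamiliar with the term, here is a minimal sketch of what "verifiable reward" means: the reward signal comes from a programmatic check (exact match on a math answer, or running unit tests) rather than a learned judge that can be smooth-talked. The function names and the sandbox-free `exec` are simplifications for illustration; real RLVR pipelines sandbox execution.

```python
def math_reward(model_answer: str, ground_truth: str) -> float:
    """Binary reward: 1.0 only if the model's final answer matches exactly."""
    return 1.0 if model_answer.strip() == ground_truth.strip() else 0.0

def code_reward(candidate_src: str, tests: list) -> float:
    """Run candidate code and score it by the fraction of unit tests passed."""
    namespace = {}
    try:
        exec(candidate_src, namespace)  # simplified; real pipelines sandbox this
    except Exception:
        return 0.0
    passed = 0
    for args, expected in tests:
        try:
            if namespace["solve"](*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing test case earns no reward
    return passed / len(tests)

# Hypothetical usage: reward a candidate implementation of solve(a, b) = a + b
tests = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
print(code_reward("def solve(a, b):\n    return a + b", tests))  # 1.0
```

Because the reward is computed rather than judged, a model can't pass by sounding confident, which is why math and coding are the exception noted above.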
If someone was already evaluating the work output using a metric closer to the underlying quality, then it might not have been a big shift for them (other than having much more work to evaluate).
You could, however, only do that if you were fine with unfairly judging the quality of work, since you would readily discard quality work based on superficial proxies. Which, admittedly, is done in a lot of cases.
Yes.
This does not, however, mean that progress is not being made.
It just means the progress is happening along dimensions that are completely illegible in terms of the culture of the early 21st-century Internet, which is to say in terms of the values of the society that produced it.
For most tasks, the complexity/time required to verify a task is << the time required to do the task itself. Sure, there can be hallucinations in the graph that the LLM made. But LLMs are hallucinating much less than before. And the time to verify is much lower than the time required for a human to do the task.
I wrote a post detailing this argument https://simianwords.bearblog.dev/the-generation-vs-verificat...
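As a toy illustration of that asymmetry (my own sketch, not taken from the linked post): verifying a claimed sort is a cheap linear scan, regardless of how much effort producing the output took.

```python
import random
from collections import Counter

def is_valid_sort(original: list, claimed: list) -> bool:
    """O(n) verification: claimed must be nondecreasing and a permutation
    of the input; far cheaper to write and reason about than a sorter."""
    nondecreasing = all(a <= b for a, b in zip(claimed, claimed[1:]))
    return nondecreasing and Counter(claimed) == Counter(original)

data = [random.randint(0, 1000) for _ in range(100)]
claimed = sorted(data)  # stand-in for work an LLM claims to have done
print(is_valid_sort(data, claimed))       # True
print(is_valid_sort(data, claimed[:-1]))  # False: an element was dropped
```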
Are LLMs a good dictionary of synonyms? Perhaps, but is that relevant? Not at all.
Are you biased when a solution is presented to you? Yes, like all humans.
Is it damaging when said solution is brain-dead? Obviously.
Are you failing to understand that most (if not all) of a manager's work is human-centric and, as such, cannot be applied to a non-human? Obviously.
You trust a machine's intent. Joke's on you: it has no intent at all, and it will break the "trust" you pour into it without even realizing it.
You say the LLM does a better job than you. Perhaps that says it all?