ES version is available. Content is displayed in original English for accuracy.
Advertisement
Advertisement
⚡ Community Insights
Discussion Sentiment
86% Positive
Analyzed from 578 words in the discussion.
Trending Topics
#llm#layer#judge#security#don#adding#non#something#dangerous#another

Discussion (16 Comments)Read Original on HackerNews
I think you're spot on with the fact that it's so far it's been either all or nothing. You either give an agent a lot of access and it's really powerful but proportionally dangerous or you lock it down so much that it's no longer useful.
I like a lot of the ideas you show here, but I also worry that LLM-as-a-judge is fundamentally a probabilistic guardrail that is inherently limited. How do you see this? It feels dangerous to rely on a security system that's not based on hard limitations but rather probabilities?
not adding LLM layers to stuff to make them inherently less secure.
This will be a neat concept for the types of tools that come after the present iteration of LLMs.
Unless I’m sorely mistaken.
If people said "we build a ML-based classifier into our proxy to block dangerous requests" would it be better? Why does the fact the classifier is a LLM make it somehow worse?
The entire purpose of LLMs is to be non-static: they have no deterministic output and can't be validated the same way a non-LLM function can be. Adding another LLM layer is just adding another layer of swiss cheese and praying the holes don't line up. You have no way of predicting ahead of time whether or not they will.
You might say this hasn't prevented leaks/CVEs in exisiting mission-critical software and this would be correct. However, the people writing the checks do not care. You get paid as long as you follow the spec provided. How then, in a world which demands rigorous proof do you fit in an LLM judge?
Edit: actually looks like it has two policy engines embedded
EDIT: it does seem to have a deterministic layer too and I think that's great