Discussion (16 Comments)
I have a /red-team skill that will use an agent team to criticize its own work, grade and rank feedback, incorporate relevant feedback and then start over. It has increased the quality of output.
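The loop described above (critique, grade, incorporate, restart) could be sketched roughly as follows. This is a hypothetical reconstruction, not the commenter's actual skill: `run_agent` is a made-up placeholder for whatever model call the setup uses, and the round count and critic-team size are assumptions.

```python
# Hypothetical sketch of a red-team refinement loop. `run_agent` is a
# placeholder for a real model call; every name here is an assumption.

def run_agent(prompt: str) -> str:
    # Stand-in for an actual LLM call.
    return f"response to: {prompt}"

def red_team(draft: str, rounds: int = 3) -> str:
    for _ in range(rounds):
        # 1. A small critic team attacks the current draft.
        critiques = [run_agent(f"Criticize this work:\n{draft}") for _ in range(3)]
        # 2. A grader ranks the critiques and keeps the useful ones.
        ranked = run_agent("Grade and rank these critiques:\n" + "\n---\n".join(critiques))
        # 3. The draft is revised against the surviving feedback, then the loop restarts.
        draft = run_agent(f"Revise the work using this feedback:\n{ranked}\n\nWork:\n{draft}")
    return draft
```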
But as of now, even newer AI models are not particularly insightful. I'm always surprised by how suboptimal near-frontier LLMs are at collaborating in some of the easier cooperative environments on my benchmarking and RL platform. For example, check out a replay of consensus grid here: https://gertlabs.com/spectate
The agent makes a copy of itself in /tmp/. Runs. Evaluates. Updates itself. Makes a copy of itself. Runs. Evaluates. Updates itself. Makes a ...... you get the idea.
They will not stop if the recursion is given a hard-to-meet termination condition. Also, if it can cheat to satisfy the termination condition, it will.
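That copy-run-evaluate-update loop, with an explicit iteration cap as a hard stop, might look like the sketch below. This is an illustration only, under assumptions: the scoring function, file layout, and cap are all made up, and a real setup would replace `evaluate` with tests or a judge model.

```python
import os
import shutil
import subprocess
import sys
import tempfile

def evaluate(output: str) -> float:
    # Placeholder score; a real loop would run a test suite or a judge model.
    return len(output) / 100.0

def self_improve(script: str, max_iters: int = 5, target: float = 0.9) -> None:
    for i in range(max_iters):          # hard cap so the recursion must stop
        copy = os.path.join(tempfile.mkdtemp(), f"agent_{i}.py")
        shutil.copy(script, copy)       # the agent copies itself into a tmp dir
        out = subprocess.run([sys.executable, copy],
                             capture_output=True, text=True).stdout
        if evaluate(out) >= target:     # termination condition
            break
        # "Updates itself": continue from the working copy and loop again.
        script = copy
```

An unconditional cap like `max_iters` is the kind of guard the comment argues for, since a purely semantic termination condition can be gamed.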
Do you tell them to think and coordinate the next step through some type of sync/talking mechanism or is it turn by turn?
I suspect turn by turn, as it is similar to other experiments, and in this case it wouldn't work because they wouldn't have a certain amount of time to think about the next step together?
So that does make the game more challenging, versus some other simulations we have where multiple conversation turns happen before action. But the inefficiencies I'm describing are different; for example, an agent reaches part of the destination area but is clearly blocking another player who needs to pass, and most models will just stay put instead of moving along to another target spot.
From what I've read, for each token or input patch, the gate computes a set of probabilities (or scores) over the experts, then selects a small subset (often the top-k) and routes that input only to those.
I.e. each expert computes its own transformation on the same original input (or a shared intermediate representation), and then their outputs are combined at the next layer via the gate's weights.
That's post hoc combination, not B reasoning over A's reasoning.
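The routing described in the last few comments can be shown with a toy example: the gate scores all experts, keeps the top-k, and combines only those experts' outputs with the gate weights. This is a minimal pure-Python sketch, not any particular model's implementation; the expert count, k, and the scalar "experts" are made up for illustration.

```python
import math
import random

random.seed(0)
NUM_EXPERTS, K = 4, 2

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Toy "experts": each is just an independent scalar transform of the input.
experts = [lambda x, w=random.uniform(-1, 1): w * x for _ in range(NUM_EXPERTS)]

def moe_forward(x, gate_logits):
    probs = softmax(gate_logits)
    # Route only to the top-k experts by gate probability.
    topk = sorted(range(NUM_EXPERTS), key=lambda i: probs[i], reverse=True)[:K]
    # Each selected expert transforms the same input independently...
    outputs = {i: experts[i](x) for i in topk}
    # ...and the outputs are combined with renormalized gate weights.
    total = sum(probs[i] for i in topk)
    return sum(probs[i] / total * outputs[i] for i in topk)
```

Note that no expert ever sees another expert's output, which is the point of the comment: the mixing is a weighted sum after the fact, not one expert reasoning over another's reasoning.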
AI agents discussing things with each other would be more like one thinking model thinking through the problem with different personas.
With different underlying models, you can leverage the best model for each persona. Like people said before (6 months ago, no clue if this is still valid), that they prefer GPT for planning and Claude for executing / coding.
Gemini is the planner and researcher, local models basically "just type syntax"
Seems to make it so none of them get stuck in a loop
What I haven't taken time for is finding out how I'd automate their back-and-forth and stop manually copy/pasting their responses.
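One way to automate that relay is a loop that alternates two models over a shared transcript. A hypothetical sketch: `call_model` stands in for whatever provider API client you use (it is not a real library function), and the model names and turn count are placeholders.

```python
# Hypothetical relay loop replacing manual copy/paste between two models.
# `call_model` is a made-up placeholder, not a real API.

def call_model(name: str, transcript: list[str]) -> str:
    # A real version would call the provider's chat API with the transcript.
    return f"{name} reply #{len(transcript)}"

def relay(models=("planner", "coder"), turns: int = 4) -> list[str]:
    transcript: list[str] = []
    for t in range(turns):
        speaker = models[t % len(models)]        # alternate between the models
        reply = call_model(speaker, transcript)  # each sees the shared history
        transcript.append(reply)
    return transcript
```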