Discussion (15 Comments) | Read Original on HackerNews
https://huggingface.co/spaces/gemma-challenge/gemma-dashboar...
Agents collaborating to speed up gemma-4-E4B-it inference (tokens per second) on a fixed GPU.
My feeling is that the research is converging on what the paper claims: combining the two is the right way to do it, and what makes the difference is how you combine them within the harness you build.
The accepted papers at the recent AID-Wild / ACM CAIS 2026 workshop include plenty of examples of this.
A great example is AI-PROPELLER: Warehouse-Scale Interprocedural Code Layout Optimization with AlphaEvolve. It uses AlphaEvolve and Vizier to evolve compiler code-layout heuristics. (https://arxiv.org/abs/2606.00131)
I find it odd that frontier models often don't suggest the most powerful methods for crushing problems, but it may be that the training data doesn't actually include "good enough" experts on the problems I encounter. If the experts don't know the best ways to solve a problem, the models will get dinged in training for even trying them.
Instead, they focused more on optimizing the existing, already-implemented algorithm. Maybe that's an artifact of the problem I was throwing at them (I asked them to optimize the implementation of select_k in Arrow, which currently uses a max-heap streaming algorithm).
I've started documenting my journey with this here: https://www.kostasp.net/posts/16-ai-experiments-apache-arrow in case you want to take a look. Any advice would be highly appreciated, I'm looking for more inspiration on how to torture myself with that stuff.
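For readers unfamiliar with the max-heap streaming approach to select_k mentioned above, here is a minimal Python sketch of the general technique. It is not Arrow's actual implementation. The function name `select_k_smallest` is hypothetical, and the sketch assumes we want the k smallest values from a stream of comparable numbers.

```python
import heapq
from typing import Iterable, List


def select_k_smallest(values: Iterable[float], k: int) -> List[float]:
    """Return the k smallest values in ascending order.

    Illustrative only: keeps a bounded max-heap of the best k seen so far.
    """
    if k <= 0:
        return []
    # heapq is a min-heap, so negated values are stored to simulate a max-heap.
    # -heap[0] is then the largest (worst) of the k values currently kept.
    heap: List[float] = []
    for v in values:
        if len(heap) < k:
            heapq.heappush(heap, -v)
        elif v < -heap[0]:
            # v beats the current worst kept value, so swap it in.
            heapq.heapreplace(heap, -v)
    # Undo the negation and sort the k survivors for a stable output order.
    return sorted(-x for x in heap)
```

This runs in O(n log k) time with O(k) extra memory and never needs the whole input at once, which is why it suits streaming. Alternatives such as quickselect are O(n) on average but need random access to the full input, which may be the kind of algorithmic change the models were not proposing.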
[1]: https://github.com/ferreirafabio/autoresearch-automl/blob/ma...
[2]: https://github.com/CMA-ES/pycma
I remember a few months ago people were fairly skeptical about autoresearch, but we didn’t have a ton of data to say it was better or worse. My own bias is to prefer cheaper methods unless the more expensive method is shown to be better.