Discussion (4 Comments, originally on Hacker News)
> * Terminal-Bench 2.0: Harbor/Terminus-2 harness; 3h timeout, 32 CPU/48 GB RAM; temp=1.0, top_p=0.95, top_k=20, max_tokens=80K, 256K ctx; avg of 5 runs.
The Terminal-Bench 2.0 rules explicitly disallow modifying the timeouts or the available resources. Each Terminal-Bench task has its own timeout (usually under 1h, mostly under 30 minutes) and its own resources, configured in the Docker container by the task creator, and those values are chosen to test specific aspects of a model.
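To make the objection concrete, here is a rough sketch of the kind of check I have in mind: compare the limits a run actually used against the limits the task declares, and flag any difference. Everything here is hypothetical (the `Limits` class, the `find_overrides` helper, and the per-task values are made up for illustration and are not part of Terminal-Bench or Harbor); only the 3h / 32 CPU / 48 GB figures come from the quoted setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Hypothetical container limits for one benchmark task."""
    timeout_sec: int
    cpus: int
    memory_gb: int


def find_overrides(task: Limits, run: Limits) -> list[str]:
    """Return the names of the limits where the run differs from the task's own values."""
    return [
        name
        for name in ("timeout_sec", "cpus", "memory_gb")
        if getattr(task, name) != getattr(run, name)
    ]


# Illustrative per-task limits (assumed values, not taken from any real task).
task_limits = Limits(timeout_sec=30 * 60, cpus=2, memory_gb=4)

# Settings from the quoted setup: 3h timeout, 32 CPU, 48 GB RAM.
run_limits = Limits(timeout_sec=3 * 60 * 60, cpus=32, memory_gb=48)

print(find_overrides(task_limits, run_limits))
# ['timeout_sec', 'cpus', 'memory_gb']
```

A run that kept the task creator's limits would return an empty list here; any non-empty result would mean the score is not comparable to runs made under the stock configuration.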