Discussion (206 Comments) Read Original on HackerNews
I guess the base is whatever the profit margin needs to be this month.
[1]: https://zcode.z.ai/en#:~:text=Base%20usage%20allowance%20inc...
[2]: https://support.google.com/gemini/answer/16275805?hl=en#:~:t...
Start plan: 5 Million tokens a day (GLM-5.2 3M, GLM-5 Turbo 2M)
For individuals: (+150% quota) $18.00USD+ For individual developers with a dedicated Coding Plan quota.
I've never used IDEs and never will, why are these things being constantly shoved down our throats?
Edit: my theory is they wanted to mimic being the primary provider in a quick way with a lot of string replace. Though they could have added opencode back as a regular provider.
Harnesses are quickly becoming critical components of the "model" itself imo. Not shocking to me at all that a company that spots a revenue opportunity is keeping its harness closed source.
no. they. are. not.
Some people are just terrible at it.
If you're already used to your TUI coding agent, you don't need the desktop agent. Although it is nice that it is there for folks who prefer the Codex App/Claude App UI approach.
For some tasks, it's better. Opus refuses tasks for me pretty regularly. GLM 5.2 has never refused a task. So for anything security-related or that touches on topics that trigger Opus's safety guardrails, I use GLM 5.2.
OTOH, for anything related to UI design, I use Opus 4.8. It's much better at taking relatively vague descriptions of user interfaces and a mockup of a related UI and combining them into an immaculate design.
For anything else, I tend to run tasks in Opus and then have GLM review them and write a Markdown file with anything it finds. Then I have Opus review the Markdown file and fix the issues it agrees with. The reason I usually go with Opus 4.8 first is mainly that it's faster. Opus 4.8 is, on average, about twice as fast as GLM 5.2 running on Z.ai's infrastructure for the same task. There's a large variance (sometimes GLM 5.2 is pretty fast and Opus 4.8 is pretty slow), but on average it's a very noticeable difference.
When I run into Anthropic's quota, I switch to GLM 5.2 rather than Sonnet. I don't think there's much reason to ever use Sonnet for anything if you can use GLM 5.2 instead.
This is all pretty subjective, of course. On average, I think Opus 4.8 is still a better, more reliable, and faster model, but if it went away tomorrow and I only had GLM 5.2, I wouldn't be too sad about it; I'd get things done with GLM 5.2 just fine.
That said, it's interesting that they're releasing a bunch of stuff: ZCode, OCR.z.ai, Image.z.ai, Audio.z.ai, AutoClaw and some other stuff that https://chat.z.ai/ links to. That's a lot of stuff for one org to pull off.
Figured I'd try out their Pro coding plan. It doesn't seem to give me much more quota than Opus (at least given how many tokens are needed to accomplish a given task), but GLM 5.2 in and of itself seems like a beefier Sonnet model, pretty good.
I'm wanting local context in the spirit of "here are 3 AI providers available, for coding tasks use this one... and for writing prose use this one... and for generating images use this one..." etc.
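A minimal sketch of that kind of per-task routing, assuming a plain lookup table is enough. The provider names and task kinds below are hypothetical placeholders, not anything a specific harness ships with:

```python
# Hypothetical mapping from task kind to provider name.
# Replace the values with whatever providers you actually have configured.
ROUTES = {
    "coding": "provider-a",
    "prose": "provider-b",
    "images": "provider-c",
}

# Assumed fallback when a task kind has no explicit route.
DEFAULT_PROVIDER = "provider-a"


def pick_provider(task_kind: str) -> str:
    """Return the provider configured for a task kind, or the default."""
    return ROUTES.get(task_kind.strip().lower(), DEFAULT_PROVIDER)


if __name__ == "__main__":
    for kind in ("coding", "prose", "images", "something-else"):
        print(kind, "->", pick_provider(kind))
```

In practice the same table could live in a project-local config file that the agent reads at startup, which is closer to the "local context" idea.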
(Full disclosure: it’s my project)
OpenCode was the first agent harness I used, and I have always liked it. You can configure a wide variety of providers, and it's open source and has a number of core contributors.
The other opinionated option is Pi (the Pi agent harness). This is a great lightweight option and also supports a number of providers. You can also use local model servers.
The orchestrator knows which AI client is running in any given worktree, so it would be fairly easy to designate which AI should receive what kind of tasks.
You run either Claude or Codex in tabs for each worktree. I do have some AI TUI specific instructions, for instance Codex is primitive at monitoring compared to CC. So, there are additional notes for Codex workers on how to properly monitor for new "mail."
You work with the orchestrator on the primary worktree and allow it to delegate tasks to the workers and answer their smaller questions.
It surfaces results and assists them with context clearing when needed.
The orchestrator and workers communicate using a simple shared file system under tmp/* and together they can handle a big and varied workload.
I use iTerm2, so I’ve also added iTerm2-specific Python that allows the orchestrator to “kick” a worker or perform tasks otherwise vetoed by the TUIs (i.e. /clear) by modifying the input and submitting it.
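The shared-file-system "mail" between orchestrator and workers could look roughly like the sketch below. This is an assumption about how such a mailbox might be built, not the actual code from the project described above. The directory layout, file naming, and the `send`/`receive` helpers are all hypothetical:

```python
import json
import time
import uuid
from pathlib import Path


def send(root: Path, worker: str, body: str) -> Path:
    """Drop one message into a worker's inbox directory."""
    inbox = root / worker / "inbox"
    inbox.mkdir(parents=True, exist_ok=True)
    # Timestamp prefix keeps files roughly in send order; uuid avoids collisions.
    name = f"{time.time_ns()}-{uuid.uuid4().hex}.json"
    partial = inbox / (name + ".part")
    partial.write_text(json.dumps({"body": body}))
    final = inbox / name
    # Rename within the same directory so a reader never sees a half-written file.
    partial.rename(final)
    return final


def receive(root: Path, worker: str) -> list[str]:
    """Read and delete every complete message waiting for a worker."""
    inbox = root / worker / "inbox"
    if not inbox.is_dir():
        return []
    messages = []
    for path in sorted(inbox.glob("*.json")):
        messages.append(json.loads(path.read_text())["body"])
        path.unlink()
    return messages


if __name__ == "__main__":
    root = Path("tmp/agent-mail")  # hypothetical location under tmp/
    send(root, "worker-1", "run the test suite and report failures")
    print(receive(root, "worker-1"))
```

A worker would call `receive` on whatever schedule its TUI allows, which is where the per-client monitoring notes mentioned above come in.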
i think people don't realize how much better life is over on this side, cc and codex rely entirely on vendor lock in imo.
I don't think I understand the token/cost implications of this feature
It supports MCP (unlike Pi), sandboxing (with user-mode networking), and runs efficiently at huge contexts.
https://codeberg.org/mlow/lmcli
(The screenshot in the folder is a little bit out of date, but is still representative of the overall look)
https://goose-docs.ai/
https://github.com/charmbracelet/crush
China has a history of stealing IP and trade secrets, and Chinese courts favor local companies, while the US has robust courts that can enforce IP rights. If you want to risk your company's IP, trade secrets, and data for some cheap tokens, go ahead and use Z.ai's services.
It is essentially a black box with full user permissions, meaning you are just handing over your entire system to a Chinese-owned server. With OpenCode and its GLM provider, at least I can monitor which files were read, which were edited, and what commands were executed.
Not to mention that Chinese national security laws legally obligate companies to cooperate with state intelligence and counter-espionage efforts [0]. If you have this installed on a corporate workstation, and your company is large enough, the possibility of them spying on you is not just a risk—it's almost a certainty.
[0]: https://en.wikipedia.org/wiki/National_Intelligence_Law_of_t...
I also consider Microsoft to be the biggest industrial spy in the world. They and Google are both no doubt mining everything you type into Office / GSuite, all your emails, etc. But nobody bats an eye when you write a Word doc about some sensitive matter.
If my customers thought I was feeding their data into a Chinese owned LLM API (which to be clear I’m not), I don’t think it would go over well, and I’d be exposed legally to all sorts of things.
So the reason is risk aversion and desire to participate in US / western commerce. One can debate the actual threat, but why would you ever risk sending your data to a processor perceived as dodgy?
Nevertheless, Americans thinking they are morally superior to China is always quite funny.
This administration is corrupt, cruel and doesn’t care about human rights.
And the worst is… Americans have voted for that administration…. twice!
I digress…
But yes, US intelligence has killed and ruined the lives of far more people than China has. Not sure how so many people buy into the narrative that they're protecting freedom and democracy. They're protecting their freedom to kill and crush all their enemies and control every "democracy" on earth.
Reminds me a bit of the old “is your adversary Mossad or not Mossad” decision matrix https://www.usenix.org/system/files/1401_08-12_mickens.pdf
I'm no apologist for the US Intelligence and related organizations (not by a very long shot), but that is a very extreme statement to make.
By the way, some pedant recently asked why anyone would run software with only a few stars. My thoughts on that are minimal: people can practice whatever slop logic they want. I've architected and built systems that handled tens of thousands of users. I'm not fucking around. The way I build isn't typical, and I don't suggest anyone try to mimic my approach, but it works for me and the way my mind processes complex systems.
To the peanut gallery: use it or don't, but don't give me a hard time unless you're ready to get one back. I've made plenty of mistakes in my career, and accountability is a crucial part of growth. I'm more than willing to work with anyone using my code, provided they bring valid, substantial criticism to the table.
The US is certainly inching in that direction but it’s not like someone from the US government sits at Anthropic’s HQ reading chats from state people of interest.
1) there is a very non-zero chance that the US government also has that data from OpenAI and possibly Anthropic
2) unless you are asking the chinese models to draw up plans to overthrow the chinese government, it's extremely unlikely they would ever care.
while china has a track record of harassing its own dissident citizens abroad, if you're not chinese and not trying to subvert their government (or are a high-ranking government official yourself), it's kind of silly to suppose they would ever care about you or what you do.
and if you have information they want for their own national development purposes, like EUV engineers, they are much more likely to offer you fabulous amounts of money instead of try to intimidate or threaten it out of you.
PRISM ... XKeyscore ...
> The US is certainly inching in that direction
Itching to go in a direction that (publicly known) they have been in for decades now.
Do you really think the US government doesn't get access or couldn't get access to any of your chats with Claude?
You mean, like Windows and Android?
> Explanation and Recommendations Regarding Usage for Plan-Supported Models
> Note: Peak hours are from 14:00 to 18:00 daily (UTC+8).
https://docs.z.ai/devpack/overview
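If you want to schedule heavy jobs around that window, a small check like the following may help. It is a sketch that only encodes the quoted 14:00 to 18:00 (UTC+8) range; treating 18:00 itself as off-peak is my assumption, and `is_peak` is a hypothetical helper name:

```python
from datetime import datetime, time, timedelta, timezone

# Peak window as quoted above: 14:00 to 18:00 daily, UTC+8.
PEAK_TZ = timezone(timedelta(hours=8))
PEAK_START = time(14, 0)
PEAK_END = time(18, 0)  # assumed exclusive upper bound


def is_peak(moment: datetime) -> bool:
    """Return True if a timezone-aware datetime falls in the peak window."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    local = moment.astimezone(PEAK_TZ).time()
    return PEAK_START <= local < PEAK_END


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print("peak hours" if is_peak(now) else "off-peak")
```

In UTC the window is 06:00 to 10:00, so it covers the morning in Europe and the middle of the night in the Americas.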
(If this comment is too formal, I'm sorry. I used Google Translate to it [this line was NOT translated])
At 200k context that is only 85 requests for a whole week.
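For reference, the arithmetic behind that figure can be written out as below. The comment does not state the weekly quota itself, so the token budget here is only what 85 requests at a full 200k context would imply, assuming every request consumes the whole context:

```python
# Figures taken from the comment above.
CONTEXT_TOKENS = 200_000
REQUESTS_PER_WEEK = 85

# Implied weekly budget, assuming each request uses the full context.
implied_weekly_tokens = CONTEXT_TOKENS * REQUESTS_PER_WEEK


def requests_for(budget_tokens: int, tokens_per_request: int) -> int:
    """Whole requests that fit in a token budget at a given request size."""
    return budget_tokens // tokens_per_request


if __name__ == "__main__":
    print(implied_weekly_tokens)  # 17000000
    # Smaller requests stretch the same budget further, e.g. 50k tokens each.
    print(requests_for(implied_weekly_tokens, 50_000))  # 340
```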
Do they really correspond roughly? Seems like they’re trying to suggest a discount while still being worth a significant amount of monthly spend.
Here's the message: "Cannot connect to API: write EPIPE"
But mostly vendor lock-in, I imagine.
And most of the advancement and experimentation happens in some random 0-star github repos.
pi-tmux is one such example (seems to be archived now) which inspired me to use tmux as a communication layer and provide visibility of subagents of multiple models in their native harnesses [1].
There's also herdr, which is not 0-stars but is a super interesting, lesser-known project [2]. It also has interesting substrates to allow agent coordination.
None of these are harnesses per se, but they're pointing towards clear gaps in existing harnesses. For example, we've known for a while now that compounding knowledge of different classes of models achieves better performance. Why is there no harness where this is a native functionality? And there's no harness where subagents are first class citizens both in terms of capabilities and UX.
[1] https://github.com/offline-ant/pi-tmux
[2] https://github.com/ogulcancelik/herdr
It does have a 1.5x usage promotion for GLM 5.2 on the coding plan so now is a good time to test it...
- https://igorwarzocha.github.io/howcode/
- https://github.com/ruuxi/stella
- https://www.pi-gui.com/
Not using Pi, but based on Pi (no extensions possible)
- https://twotimespi.dev/
But it already works really well with existing harnesses, I'm not sure why a dedicated one is needed?
I use it with https://swival.dev and everything works perfectly, no tool calling issues and it works fine with long sessions.
[0]: https://zcode.z.ai/cn