
Discussion (9 Comments) | Read Original on HackerNews
On the other hand, GPT feels much more consistent and direct with execution, where Opus might fade or time out because Anthropic's servers are on fire at 2pm on a Monday, or take longer than necessary, burning tokens for the same result. GPT seems more consistent and dots all the i's, etc.
I was trying Fable for execution and noticed a fair bit of what looked like thrashing or farting around, rewriting tests it had just written that were failing, which didn't give me a lot of confidence. But the final result was clean, just a longer path to get there.
I then like to have GPT or Opus review my PR for any issues before I spend time reading the output. This usually surfaces some stuff to tweak, but with Fable it was coming back clean. Again, this was a small window of normal usage for a few days, but some interesting takeaways.
If Fable doesn't come back it's not the end of the world for me, and in some ways I prefer a bit more of an antagonistic relationship. It makes a nice inroad to reasoning about the code and how I might want to restructure things. This is a bit harder when the code is "bug free" except for subtle or architectural decisions you can overlook, but I find if I sweat the architecture early on, anything beneath that is compartmentalized and stays trivial to fix.
Claude seems to write the exact code that you expect, about 90% of the time, and consistently follows project standards, while Codex goes on wild goose chases creating unnecessary indirection and abstractions – they work correctly, but add cruft. I can spot both with decent confidence in the project I’m currently working on.