Discussion (53 Comments)

shellcromancer•1 day ago
Probably obvious but still omitted in the OpenAI post: chips are being made by TSMC [1]. Wasn't sure if Intel got it.

1. https://www.investing.com/news/stock-market-news/openai-unve...

HarHarVeryFunny•1 day ago
I just read a claim on Twitter that the reason these companies (Google and Amazon as well as OpenAI) are using Broadcom isn't just for design expertise, but because Broadcom have allocation agreements in place with TSMC and the memory manufacturers.
alephnerd•1 day ago
Most design partners have allocation agreements. The thing is Broadcom is an absolute GIANT in the ASIC design space, and its closest competitor Marvell is a fraction of its size.

There are a lot of large tech companies that most of HN has never heard about that completely dominate entire segments.

ahartmetz•1 day ago
...and because most hardware sales except AI accelerators are down due to RAM prices, Broadcom probably can't otherwise use their allocation at TSMC.
NavinF•about 23 hours ago
Nope, not down. "total Personal Computing Device (PCD) market — comprising traditional PCs and tablets — posted 2.8% year-over-year growth in Q1 2026, with combined shipments reaching 103.3 million units. PC shipments grew 3% YoY with 65.6 million units" https://www.idc.com/promo/pcdforecast/

Q2 is forecast to be negative, partly because of RAM prices like you said, but for the most part this is something that only price-sensitive nerds care about. Broadcom sells a ton of server chips. Server sales are up 30% vs last year so I highly doubt they're desperate to use their allocation.

a_conservative•1 day ago
I recently put 2+2 together.

Broadcom has become wealthy by being Google's TPU hardware partner, including sharing their TSMC capacity with Google, and evidently now they are doing the same thing with OpenAI. What a brilliant way to take advantage of the AI gold rush!

I wish they weren't using their piles of money to extort money out of the software industry like they are with VMWare and Bitnami.

kccqzy•about 22 hours ago
Well Google has reduced reliance on Broadcom already. They found a new hardware partner, MediaTek, that’s probably much, much cheaper than Broadcom.

https://finance.yahoo.com/sectors/technology/articles/broadc...

mschuster91•about 21 hours ago
> Well Google has reduced reliance on Broadcom already. They found a new hardware partner, MediaTek

Oh dear god. I'm actually feeling sorry for Google at that point. Good luck, you'll need it...

alephnerd•1 day ago
> Broadcom has become wealthy by being Google's TPU hardware partner...

Kinda, but not exactly.

Broadcom cornered the enterprise infra and security market in the late 2010s and early 2020s after acquiring CA Technologies, BMC (EDIT: Did NOT acquire them, they were considering it back in 2018 but decided against it and KKR ended up acquiring them), Symantec (which they bought instead of BMC), and VMWare and were able to make a strong cybersecurity story during the late 2010s cybersecurity and SaaS boom.

That gave them plenty of cashflow that helped subsidize their hardware business when hardware was not viewed as hot as it is today.

Additionally, Broadcom is GCP's marquee customer and has been for a little under a decade, so they were able to make a sweetheart deal where all the software businesses at Broadcom would exclusively use GCP, and in return GCP would work with Broadcom to design its silicon and source the infra needed for their DC buildouts.

Ironically, the DoJ blocking Broadcom's acquisition of Qualcomm was the best thing it ever could have done for Broadcom, because it gave Broadcom the dry powder to dominate Enterprise SaaS and build a strong niche in the cybersecurity space.

> piles of money to extort money out of the software industry

From personal experience, executives and leadership who started off in the electronics and hardware industry are much more vicious and cutthroat than their peers who started in software.

Working in an industry that historically had to deal with high commodification, low margins, and long tail sales leads to leadership that can execute. Additionally, no one climbs the leadership ladder without having spent years as a line-level engineer, but that's true for software as well to an extent.

Edit: can't reply

> Did they acquire also BMC?

Nope.

Broadcom was considering acquiring them in 2018 but decided not to go through with the opportunity and KKR jumped in.

vb-8448•1 day ago
Did they acquire also BMC?
a_conservative•1 day ago
Good information, Broadcom is a playa, lots and lots of acquisitions! (a quick google search turns up a very eventful history for Broadcom)

> From personal experience, executives and leadership who started off in the electronics and hardware industry are much more vicious and cutthroat than their peers who started in software.

Only The Paranoid Survive is quite a name for a management book. It implies surviving in the world you are speaking about.

[0] https://www.goodreads.com/book/show/66863.Only_the_Paranoid_...

maz1b•1 day ago
Pretty huge move. Google and their TPUs are looking infinitely more prescient as I think they are on their 7th generation, along with the offshoots it inspired like the LPU and even others, perhaps like Cerebras and their Wafer Scale Engine.

However, based off first impressions, it seems like this is meant for the inference side, and not training, which is also an interesting choice.

skeledrew•1 day ago
Training is pretty much a 1x cost, and efficiency there is already on the way down with architectural improvements. Inference though is an ongoing cost which over time takes orders of magnitude more resources, so focusing on making that far more efficient means way greater gains over time.
ggcr•about 9 hours ago
With Reinforcement Learning, inference is very present in post-training stages now too
forrestthewoods•1 day ago
Inference costs are higher than training now. I think.

Nvidia is king of general purpose training chips. But inference can be specialized.

lugu•about 20 hours ago
What makes you think this? With wider adoption the ratio shall shift in favor of inference. And API price is becoming more important than SOTA capability.
forrestthewoods•about 19 hours ago
> With wider adoption the ratio shall shift in favor of inference

Yes? That’s why more money will be spent on inference than training?

I’m talking absolute cost. As the number of people using AI and burning tokens goes up the amount of spend on inference goes up.

I am fairly confident that Anthropic has way way more GPUs serving Claude Code to users than they have training models. They’ve got a lot of users!!

> API price is becoming more important than SOTA capability.

Also yes? This is why custom silicon for efficient inference makes sense!

I think we’re in total agreement here :)

cactusplant7374•about 21 hours ago
Cerebras's Codex Spark 5.3 has been a huge flop. Small context window and old model. But hopefully they can improve so that we can benefit from 1000 tokens/second with GPT 5.5.
zer00eyz•1 day ago
> early testing shows that Jalapeño will deliver performance per watt substantially better than current state-of-the-art

We're starting to see what really matters here, and though this is hand wavy the TPU makes similar claims.

I think Google's memo about having no moat still stands (see: https://newsletter.semianalysis.com/p/google-we-have-no-moat... if you are unaware). It kind of makes sense that all of this is looking more like the '60s to '90s IBM, DEC, Cray, Sun and the hardware race that happened then. History doesn't repeat but it often rhymes, and I suspect that these efforts will follow the same trajectory.

granzymes•1 day ago
To be clear, that is not "Google's memo". It's a memo by a guy who happened to work at Google. There is a diversity of opinions at a company that employs 180,000 people.
v5v3•1 day ago
>designed for initial deployment by the end of 2026 and expanding in the years ahead,

So after the IPO and will be featured heavily in the IPO sales brochure as a future promise?

I'm sceptical over any pre-IPO announcements.

estetlinus•about 24 hours ago
Yeah, the narrative feels like pre-IPO shenanigans, and it looks like the lid on my laundry basket. I wouldn’t be surprised if this is a con.
Culonavirus•about 20 hours ago
Con or not it is an obvious thing they have to do. Might as well promise.

IIRC their biggest cost they're "hiding" in their financials by doing creative accounting is inference (putting it into marketing and whatnot, in the billions)... if they can't hide it in their S-1 then they have to rationalize it, either by a) increasing the prices (not gonna happen, with token based billing orgs are already watching their codex spends) or b) lowering the inference costs. You can lower that by "soft optimizing" (dumbing down) your models but then you have the other players breathing down your neck (see quick rise of Claude), or actually optimizing, in software and in hardware. We're like 5 years into the rise of LLMs, there's not THAT much left on the table unless you write to the metal you specifically designed for your models (and I'm pretty sure the lack of "nvidia tax" would help with covering most of the r&d costs of a custom solution, at least in the long term).

50% cheaper inference without losses in fidelity would unquestionably be a massive win for OpenAI.

frandroid•1 day ago
Whose IPO? Broadcom and Google are already listed, obviously.
airspresso•1 day ago
OpenAI's upcoming mega IPO
awestroke•1 day ago
OpenAI, the non profit organization, is going to become a publicly traded profit maximizing corporation
hk__2•1 day ago
> OpenAI, the non profit organization

No, the nonprofit org stays nonprofit, while the for-profit org it owns will become publicly traded.

See https://openai.com/index/evolving-our-structure/

kilroy123•1 day ago
I hope to see something like this, but in a small form factor like the NVIDIA spark.

I want a super fast LLM that is Opus 4.6+, like, in ability.

wmf•1 day ago
Memory bandwidth is the bottleneck in the Spark. If you replace the SoC with an optimized ASIC but keep the same 256-bit LPDDR5 the performance will be the same. You can increase performance by using wider memory but that's also more expensive.
phonon•1 day ago
M3 Ultra has a 1024 bit memory bus (819 GB/s) and starts at $3,999 (96GB of RAM). It can be done....
bigyabai•1 day ago
The tradeoff is that the M3 Ultra's GPU loses to laptop GPUs in compute benchmarks. All of that bandwidth is wasted idling for token prefill.

For inference workloads, it makes a lot more sense to optimize for prefill/ttft before maxing out memory bandwidth.
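
The bandwidth argument in this sub-thread can be checked with a rough calculation. The sketch below derives peak bandwidth from bus width and estimates a bandwidth-bound ceiling on single-stream decode speed. The 6400 MT/s transfer rate is an assumption, picked because it reproduces the 819 GB/s figure quoted for the 1024-bit bus; the 40 GB weight footprint and both function names are hypothetical and only for illustration.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: float) -> float:
    """Peak DRAM bandwidth in GB/s: bytes per transfer * mega-transfers/s / 1000."""
    return bus_width_bits / 8 * transfer_rate_mts / 1000


def decode_ceiling_tok_s(bandwidth_gbs: float, weights_gb: float) -> float:
    """Bandwidth-bound ceiling for single-stream decode, assuming every
    generated token streams the full set of weights from memory once."""
    return bandwidth_gbs / weights_gb


# Assumption: 6400 MT/s, the rate that reproduces the 819 GB/s quoted above.
ASSUMED_RATE_MTS = 6400
# Hypothetical model footprint, chosen only for illustration.
HYPOTHETICAL_WEIGHTS_GB = 40

wide = peak_bandwidth_gbs(1024, ASSUMED_RATE_MTS)   # 1024-bit bus
narrow = peak_bandwidth_gbs(256, ASSUMED_RATE_MTS)  # 256-bit bus at the same rate

for name, bw in (("1024-bit", wide), ("256-bit", narrow)):
    ceiling = decode_ceiling_tok_s(bw, HYPOTHETICAL_WEIGHTS_GB)
    print(f"{name}: {bw:.1f} GB/s peak, at most {ceiling:.2f} tok/s")
```

Under these assumptions the ceiling scales linearly with bus width, which is the point about swapping the SoC while keeping the same memory. The estimate ignores compute entirely, so it says nothing about prefill, the compute-bound phase raised in the reply above.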

smith7018•1 day ago
Unfortunately Sam Altman won't be the one to deliver us at-home hardware that can run Opus-level models
blitzar•about 23 hours ago
I wonder what is happening with the OpenAI / Jony Ive crossover episode.
flyinglizard•1 day ago
Forget about it. Datacenter class hardware is getting farther and farther from desktop use. It’s not PCIe GPUs anymore.
digitaltrees•1 day ago
We’ve entered the “if you care about software, build hardware” phase of AI
some-guy•1 day ago
I have been eyeing what Taalas is doing [1] by making pure hardware models. The speed is absurd.

[1] https://taalas.com/products/

mikewarot•1 day ago
They talk about products, but they don't sell the hardware, thus they don't really have a product, just a service.

I know, it's nick picking, but when people can just reach in and take services away, like Fable/Mythos, hardware is the only thing worth buying.

LoganDark•1 day ago
I'm sure they'll have a product for you if you have millions to invest in a partnership with them.
arcanemachiner•1 day ago
"Nitpicking"
jupr•1 day ago
crazy product. their test chatbot feels like a db query.

https://chatjimmy.ai

digitaltrees•about 21 hours ago
I have and it was wild. Paradoxically it made me realize that I actually like reading the stream as it's generating.
wmf•1 day ago
“People who are really serious about software should make their own hardware.” ― Alan Kay
zwarag•1 day ago
What are the other phases. Or what are you referring to in general?
digitaltrees•about 13 hours ago
Mainframe punch card -> PC floppy disk -> cloud SaaS -> AI --> return to the land agrarian
theowaway213456•1 day ago
This seems like more competition for Cerebras? Am I understanding correctly?
HarHarVeryFunny•1 day ago
This is just an uncut wafer - I don't think it's intended to be a wafer-scale chip.

Cerebras etch memory onto the wafer alongside the processing elements, but AFAIK OpenAI are going to be using HBM memory and a conventional chiplet design.

KeplerBoy•about 23 hours ago
Still competition for cerebras. Seems quite unlikely they will get an OpenAI deal anytime soon.
smsx•about 23 hours ago
They have an OpenAI deal right now. https://openai.com/index/cerebras-partnership/
HarHarVeryFunny•about 23 hours ago
No - this is OpenAI trying to compete with Google (TPU) and Amazon/Anthropic (Trainium) on cost.

Cerebras are addressing very specific use cases, not general purpose LLM serving, and OpenAI does already partner with them.

Legend2440•1 day ago
The only surprising thing about this is that they didn't do it three years ago.
dadoum•1 day ago
> May we scale smoothly, exponentially and uneventfully through A[SI]

That sentence sounds weird to me. I can't really put my finger on why, maybe the combination of adverbs, or just the fact of stating the desire to scale as a company so directly. It feels (to me) like openly claiming their selfish goals. Or maybe I am just misinterpreting and they are referring to the whole of humanity as "We" (but knowing Broadcom's and, to a lesser extent, OpenAI's doings, I am not convinced).

satvikpendem•1 day ago
I'm assuming they used LLMs to (help humans) do custom circuit design. Even pre LLM there were various computer optimizations that didn't require humans like genetic algorithms. It'd be cool to see a paper on how they did it.
fennecbutt•1 day ago
I mean I'd love to be able to buy something like the 17k tps taalas chip as a pcie or m.2.

Imagine when we can roar along at that speed, low power. Can just have the model reason for a while about anything and everything. It reminds me of the "race to idle" for mcus etc.

ipdashc•1 day ago
> 17k tps taalas chip

It's odd to me that I haven't heard anything about this approach (baking LLMs/weights into silicon directly) since. It seems almost common-sense that we're going to end up there eventually. And it feels like that point is drawing ever closer now that model capabilities, if not quite plateauing out, are at least getting to a "good enough" point for a LOT of use cases.

I wonder if it's being worked on in secret, if there's something about it that makes it infeasible, or if companies are really too nervous to lock in one model like that because the next one down the line could be a huge improvement. Re. infeasibility, I have heard that the Taalas demonstration chip ran Llama 3.1 8B (a pretty horrible model) and that even that took a massive amount of transistors / die area. So it might just be the case that the good models are too big to fit on silicon?

topspin•1 day ago
I have also been thinking about this a lot, and share your belief that this is inevitable.

Taalas has a running demo here: https://chatjimmy.ai/

It's eye opening: generated an AVX-512 optimized Mersenne Twister in C in 0.076s, 13,706 tok/s. Too fast for the tok/s to be terribly accurate.
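
The remark about accuracy can be made concrete with a little arithmetic. This is a minimal sketch that assumes the displayed elapsed time is rounded to the nearest millisecond, which the demo does not confirm; the function names are hypothetical. It recovers the implied token count from the two reported figures, then shows how far the rate could move within that rounding window.

```python
def implied_tokens(elapsed_s: float, rate_tok_s: float) -> float:
    """Token count implied by a reported elapsed time and throughput."""
    return elapsed_s * rate_tok_s


def rate_bounds(tokens: float, elapsed_s: float, resolution_s: float):
    """Lowest and highest throughput consistent with an elapsed time that
    was rounded to the nearest `resolution_s` (an assumption here)."""
    half = resolution_s / 2
    return tokens / (elapsed_s + half), tokens / (elapsed_s - half)


# Figures reported in the comment above.
REPORTED_ELAPSED_S = 0.076
REPORTED_RATE_TOK_S = 13_706

tokens = implied_tokens(REPORTED_ELAPSED_S, REPORTED_RATE_TOK_S)
low, high = rate_bounds(tokens, REPORTED_ELAPSED_S, resolution_s=0.001)

print(f"implied tokens: about {tokens:.0f}")
print(f"throughput window: {low:.0f} to {high:.0f} tok/s")
```

With runs this short, half a millisecond of timing uncertainty already shifts the computed rate by more than a hundred tokens per second in either direction under the stated assumption.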

mdp2021•about 17 hours ago
> It's odd to me that I haven't heard anything about this approach ... I wonder if it's being worked on in secret, if there's something about it that makes it infeasible

The studies and efforts are ongoing and public, and there are technical hurdles to be faced - but the relevant works go back in time quite a lot and there is heightened interest in it now.

It seems that you simply took the "hyped headlines" for the whole of the work.

ipdashc•about 5 hours ago
> It seems that you simply took the "hyped headlines" for the whole of the work.

Well, yeah, that's what I'm saying. It's odd that there haven't been any major headlines (customer interest, competitors' announcements, etc) other than their initial demo. Good to hear it's being worked on though!

coder543•about 17 hours ago
> It's odd to me that I haven't heard anything about this approach since.

It has only been four months since they unveiled their first prototype. I don't understand your confusion. Chip development does not happen overnight...?

Their initial blog post laid out a roadmap, so theoretically they should have another thing to demonstrate this summer.

ipdashc•about 5 hours ago
In the sense of interested customers, online discussion, other companies doing the same thing, etc. Of course it takes time to get actual results, but from an outsider's perspective it's surprising that it was basically just their initial demo and that's more or less it so far. Excited to see if they come out with something this summer though!
mdp2021•about 16 hours ago
You are focusing on Taalas, but (specific) analogue computing, electronic NNs, compute-in-memory etc. - the field including the contextual approach - date back to Rosenblatt.
wmf•1 day ago
Good models will require multiple Taalas chips but Groq and Cerebras also require a lot of chips and that hasn't stopped them.
ipdashc•about 5 hours ago
> Good models will require multiple Taalas chips

I guess that makes sense. Is this feasible, or does the added latency between chips kill any of the performance gains?

MichaelNolan•1 day ago
The current taalas chip is for a 3.1B param model. I hope so much that they can get that up to the 30B range. Just imagine Gemma 4 or Qwen 3.6 at 17k tps.
coder543•about 17 hours ago
Taalas' first chip is for a Llama 3.1 8B quant, not a 3.1B parameter model, to clarify.
qsxfthnkp2322•1 day ago
aw shucks nvda has some spicy competition

Make sure you all use that fancy ñ

boarush•1 day ago
They don't have true competition; what they lose out on is market share with hyperscalers, since OpenAI would have no plans to share inference hardware with any other company right now. Plus, I don't know how NVIDIA's investment equation pans out long term, given OpenAI will be investing in a more purpose-built inference stack for the future.
ismailmaj•1 day ago
they're still kings for training, though I've heard Anthropic is now training on a JAX+TPU setup, so it might not be a monopoly in that segment.
gravypod•1 day ago
I wonder how close OpenAI is getting to using the memory they purchased. Are they planning to stack a huge amount of HBM2 into these chips?
wmf•1 day ago
I assume OpenAI has been buying memory and "giving" it to Nvidia in exchange for a discount.
fibonacci112358•1 day ago
So this is where all the memory they bought is going to.
babelfish•1 day ago
that's not really how it works
jabedude•1 day ago
how much does this chip help with inference speed?
wmf•1 day ago
It's probably the same speed but cheaper.
flyinglizard•1 day ago
I call BS. It’s probably a white label around existing Broadcom IP, impossible to go from zero to this kind of chip in nine months. I doubt OpenAI had any significant contribution.
zerohp•1 day ago
That’s exactly what this is.

9 months to production is completely impossible anyway.

9 months from design to early samples is probably impossible given that TSMC takes 3 months after tape out to produce them. Then it’s up to the customer to qualify and revise for production. TSMC doesn’t do that.

There’s no AI that makes this happen in 9 months.

Mistletoe•1 day ago
The similarities between the AI world and the crypto world are so much closer than any AI fanboy would ever admit.
jerojero•1 day ago
One thing I don't like about California based companies is how cringe the names always are.

"Jalapeño" is such a bad name, having an "ñ" already makes it difficult and annoying to deal with in so many little ways. Good luck with that.

But also, there's the sort of "yes, let's use Mexican-related things because we're California" thought that I just really hate. I don't know, it's like corporate Memphis to me. You see a product like this, you know it's an uppity California-based firm that came up with it.

thewebguyd•1 day ago
No worse, I suppose, than the obsession with Lord of the Rings that the authoritarian surveillance companies have. Palantir, Anduril. Then we have the non-defense/surveillance ones: Mithril, Valar, Narya, Erebor
skeledrew•1 day ago
What kinds of names would you suggest?
thewebguyd•1 day ago
None, probably. Just saying Jalapeño is no worse than any other non-descriptive company name. Although at least Palantir and Anduril are aptly named for what they do. The VC firms less so.
utopiah•1 day ago
Strawberry was too complicated as a codename.
CrzyLngPwd•1 day ago
Too many Rs.
smallmancontrov•1 day ago
Too many? But there are only two Rs in strawberry, how can that be too many?
anthk•1 day ago
Don't worry, in Europe it's the same, but for insurance/lawyer stuff. Tons of companies have names based on Latin words such as Civitas/Insalus/Legalia/Legalitas or whatever, which look tacky/rancid/old-fashioned from kilometers away.
qsxfthnkp2322•1 day ago
Jalapeño

Jalapeño

Jalapeño

Really has a… ring to it