Discussion (7 Comments)

wvbdmp•about 3 hours ago
There is an old German joke from the comedy “Die Feuerzangenbowle”, where someone gives his name as “Pfeiffer, with 3 f's”. It's funny because it's robotically hypercorrect: everybody knows the only clarification that matters is “Pfeiffer” vs. “Pfeifer”, double f vs. single f. So in a way, “there are two r's in strawberry” is a much more human answer than 3, and not a mistake, because in any normal situation the asker is clearly just interested in the “berry” part. This weird sycophancy, however, is entirely preposterous, and hopefully just an artifact of some deliberate “the customer is always right” policy that corporate tacked on, rather than a fundamental limitation of the technology.
gdulli•about 1 hour ago
I've never been to Germany, but for what it's worth, when I hear "how many r's are in strawberry?" I'd never think of anything but 3. (I am human.) Asking how many r's are in the longest run of r's in strawberry is a whole different question, and I don't know why somebody would assume the more complicated one.
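
For concreteness, the two readings differ only in what gets counted; a minimal Python sketch of both interpretations:

    import re

    word = "strawberry"

    # Reading 1: total number of r's anywhere in the word.
    total_rs = word.count("r")  # 3

    # Reading 2: length of the longest consecutive run of r's
    # (the "rr" in "berry").
    longest_run = max(len(run) for run in re.findall(r"r+", word))  # 2

    print(total_rs, longest_run)  # prints: 3 2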
Imustaskforhelp•about 5 hours ago
I really found this fascinating. I had thought these kinds of problems, "how many r's are in the word strawberry" and the like, had been fixed, but as this video shows, perhaps that exact question just became part of the training data, so even a slight variation, asking about the e's in "seventeen", makes it fumble. I had thought this was a solved issue, but it actually isn't, which was fascinating enough that I had to test it myself, and I found that the AI still hallucinates, with the same result for the most part.
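
A common explanation for why letter counting trips these models up, offered here as background rather than anything the commenters said, is that they operate on subword tokens rather than individual letters. A quick way to inspect one tokenizer's view, assuming the tiktoken library is available:

    import tiktoken

    # cl100k_base is the tokenizer used by several OpenAI models.
    enc = tiktoken.get_encoding("cl100k_base")

    for word in ("strawberry", "seventeen"):
        ids = enc.encode(word)
        pieces = [enc.decode([i]) for i in ids]
        # The model sees a few multi-letter chunks, not individual
        # characters, so counting letters is not a simple lookup.
        print(word, "->", pieces)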
CamperBob2•about 3 hours ago
The Qwen 3.6 27B 8-bit quant has no problem with it. I'd guess that most thinking models won't fail this kind of test anymore, while some base or instruct models that are not post-trained for reasoning will still fail it.

I also can't reproduce it in ChatGPT 5.3 Instant with auto-thinking disabled. Solved problem, as far as I'm concerned. Maybe this particular case was a bug in the voice model, or just some BS the YouTuber made up for clicks. (Notice that we never actually see the answer in text form.) Mission accomplished, I guess.

Imustaskforhelp•about 2 hours ago
For what it's worth, I tried this myself in ChatGPT before uploading, and it told me there are three e's. That's what made me upload it in the first place, so there's my anecdotal evidence.

Actually let me replicate it, here you go: https://chatgpt.com/share/69f7a27a-2634-83e8-bffa-520bd2ad47...

I am saying that these models are still incredibly finicky. I can sometimes get the right answer too, don't get me wrong, but it's fundamentally unpredictable and seems more like guesswork at times. Just as the person in the original video got "no e's", then 4, then 5, for me it said 3, though sometimes it would say 4 too.

So my point is that calling it a solved problem doesn't seem accurate when I can replicate the failure in my own testing, and the first time I tried it in my ChatGPT it also said 3.

Edit: here is another ChatGPT link, separate from the first one I shared, where it says 3 again: https://chatgpt.com/share/69f7a3a6-aa1c-83e8-b622-52cb2a9b10...

And I tried one more time, so here is yet another: https://chatgpt.com/share/69f7a3ee-07c8-83e8-ba43-65800d8907...

(Do note that all three links are different: they start with the same 69f7a prefix, but the full links/chats are distinct.)
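
The inconsistency described above is easy to check mechanically. Here is a hypothetical sketch using the OpenAI Python client, repeating the same question and tallying the answers; the model name and prompt wording are placeholders, not what either commenter used:

    from collections import Counter
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask(question: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder; swap in the model under test
            messages=[{"role": "user", "content": question}],
        )
        return resp.choices[0].message.content.strip()

    q = "How many e's are in the word 'seventeen'? Answer with just the number."
    tally = Counter(ask(q) for _ in range(10))

    # "seventeen" has 4 e's; any spread across 3/4/5 in the tally
    # is the guesswork behaviour described above.
    print(tally)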

CamperBob2•about 1 hour ago
Weird. What model is selected? I can't get anything but 4 out of GPT 5.3 Instant, which should be the weakest available in the current generation. Try this one: https://chatgpt.com/s/t_69f7a8657368819185b2830297216b2b

Edit due to rate limiting: No, I totally believe you; it's just not 100% consistent, as you might expect. It did start returning three on occasion when I tried it again, maybe one time out of 5. Pretty crazy when a free 27B Chinese model outperforms GPT 5.x.

In general, I wouldn't normally try a prompt like this without turning on thinking mode. Notice how Qwen painstakingly spells it out: https://i.imgur.com/FMKXB1M.png
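
The spelled-out counting visible in that screenshot is trivially reliable when done in plain code; a sketch of the same letter-by-letter walk a thinking model performs:

    word = "seventeen"

    # Enumerate every letter, flag the e's, then total them up,
    # mirroring how a reasoning trace spells the word out.
    for position, letter in enumerate(word, start=1):
        print(position, letter, "<- e" if letter == "e" else "")

    print("total e's:", word.count("e"))  # 4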