Discussion (17 Comments)
Disfluencies aren’t necessarily bad even if the word starts with “dis”!
I also don't care for writing that could have been made a lot more concise. It's a lot of work to make things shorter, but I think it's worthwhile.
If you speak with disfluencies, you probably didn't sufficiently rehearse your speech. If you didn't rehearse enough, you probably didn't put much effort into writing it either, so why should I put much effort into listening? It's the same principle as AI slop.
While it's a commercial product with a subscription, I spent a long time on the free tier not even hitting their limits until I started using it so extensively that I wanted to pay for it.
And I've used Whisper in the past, mostly for tinkering. I tried it for a couple of use cases but haven't touched the base project in a while. But I do regularly use Faster-Whisper-XXL, an open source project based on Whisper, for subtitle generation.
Though, for subtitle generation, I decided to support the project and mainly use the non-public build of Faster-Whisper-XXL Pro built for donors to the open source project.
The extra features smooth out the subtitle editing process very substantially. Toss in "--roformer_overlap 0.125 --roformer_vram 16 --best_of 15 --ff_vocal_extract mb-roformer --vad_method pyannote_v3" to the CLI parameters (and sometimes --realign) and you have much less work to do in SubtitleEdit or Tero Subtitler afterwards to clean it up.
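For anyone who wants to script this, here is a minimal sketch that assembles the flags quoted above into one command line. The flag names and values are copied from the comment as-is. The executable name `faster-whisper-xxl`, the helper `build_command`, and the input file name are assumptions for illustration, and the sketch only builds and prints the command rather than running it.

```python
import shlex


def build_command(media_path, realign=False, exe="faster-whisper-xxl"):
    """Build an argument list for a subtitle run.

    `exe` is an assumed executable name; adjust it to your install.
    The flags are the ones quoted in the comment above, unchanged.
    """
    args = [
        exe,
        media_path,
        "--roformer_overlap", "0.125",
        "--roformer_vram", "16",
        "--best_of", "15",
        "--ff_vocal_extract", "mb-roformer",
        "--vad_method", "pyannote_v3",
    ]
    if realign:
        # The comment says this one is only added sometimes.
        args.append("--realign")
    return args


if __name__ == "__main__":
    # Print a shell-safe command line; "talk.mp4" is a placeholder file name.
    print(shlex.join(build_command("talk.mp4")))
```

Passing the list straight to `subprocess.run` would avoid shell quoting issues, but check the flags against the build you actually have first.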
Ideally it would slice the video in the timeline without actually removing anything, so you can scrub through your video and try with and without each disfluency (thank you - awesome word) & decide case by case which to keep!
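A rough sketch of what that non-destructive approach could look like, assuming you already have word-level timestamps from a transcription step. Everything here (the `Word` and `Segment` types, the `FILLERS` set, the function names) is hypothetical and not taken from any existing tool. Each word becomes a timeline segment with a `keep` flag, so a disfluency can be toggled on or off without deleting anything.

```python
from dataclasses import dataclass

# Hypothetical filler list; a real tool would need a richer, language-aware one.
FILLERS = {"um", "umm", "uh", "er", "erm"}


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Segment:
    start: float
    end: float
    text: str
    is_disfluency: bool
    keep: bool = True  # toggled by the editor; the media itself is never cut


def mark_disfluencies(words):
    """Turn each word into a segment and flag likely fillers."""
    segments = []
    for w in words:
        token = w.text.lower().strip(".,!?…")
        segments.append(Segment(w.start, w.end, w.text, token in FILLERS))
    return segments


def kept_ranges(segments):
    """Merge consecutive kept segments into (start, end) playback ranges."""
    ranges = []
    run_open = False
    for s in segments:
        if s.keep:
            if run_open:
                ranges[-1] = (ranges[-1][0], s.end)
            else:
                ranges.append((s.start, s.end))
                run_open = True
        else:
            run_open = False
    return ranges
```

Previewing "with" and "without" is then just a matter of flipping `keep` on one segment and recomputing `kept_ranges`, which leaves the decision to the editor case by case.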
A trivial example is "umm... well... (sigh) okay" versus just "okay". Not okay!
https://github.com/dougcalobrisi/erm