Dav2d
Discussion (83 Comments)
https://www.sisvel.com/insights/av2-is-coming-sisvel-is-prep...
yep
Otherwise it was under a constant DDoS by the AI bots.
For instance, MCP, static sites that are easy to scale, a cache in front of a dynamic site engine
Our documentation and main website are not fronted by this protection, so they're still accessible to the scrapers.
What am I missing that explains the gap between this and “constant DDoS” of the site?
Even when the volume of AI requests isn't that high - generally it's hundreds per second at most across our services combined - that's still a load that causes issues for legitimate users/developers. We've seen it grow from somewhat reasonable to pretty much 99% of the responses we serve.
Can it be solved by throwing more hardware at the problem? Sure. But it's not sustainable, and the reasonable approach in our case is to filter off the parasitic traffic.
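One common way to "filter off the parasitic traffic" at the edge without throwing hardware at it is per-client rate limiting. A minimal token-bucket sketch in Rust (the capacity, refill rate, and client key are illustrative values I chose, not the commenter's actual setup):

```rust
use std::collections::HashMap;
use std::time::Instant;

/// Token bucket: each client may burst up to `capacity` requests,
/// then is limited to `refill_per_sec` requests per second.
struct Bucket {
    tokens: f64,
    last: Instant,
}

struct RateLimiter {
    capacity: f64,
    refill_per_sec: f64,
    buckets: HashMap<String, Bucket>,
}

impl RateLimiter {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, refill_per_sec, buckets: HashMap::new() }
    }

    /// Returns true if this request should be served.
    fn allow(&mut self, client: &str) -> bool {
        let now = Instant::now();
        let (cap, rate) = (self.capacity, self.refill_per_sec);
        let b = self
            .buckets
            .entry(client.to_string())
            .or_insert(Bucket { tokens: cap, last: now });
        // Refill proportionally to elapsed time, capped at capacity.
        b.tokens = (b.tokens + now.duration_since(b.last).as_secs_f64() * rate).min(cap);
        b.last = now;
        if b.tokens >= 1.0 {
            b.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    // Hypothetical numbers: allow bursts of 5, then 1 request/second.
    let mut rl = RateLimiter::new(5.0, 1.0);
    let served = (0..10).filter(|_| rl.allow("203.0.113.7")).count();
    println!("served {} of 10 back-to-back requests", served);
}
```

Legitimate users rarely notice a limit like this, while a scraper hammering hundreds of requests per second is throttled to the refill rate.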
- AI scrapers will pull a bunch of docs from many sites in parallel (so instead of a human request where someone picks a single Google result, it hits a bunch of sites)
- AI will crawl the site looking for the correct answer which may hit a handful of pages
- AI sends requests in quick succession (big bursts instead of small trickle over longer time)
- Personal assistants may crawl the site repeatedly scraping everything (we saw a fair bit of this at work, they announced themselves with user agents)
- At work (b2b SaaS webapp) we also found that the personal assistant variety tended to hammer really computationally expensive data export and reporting endpoints generally without filters. While our app technically supported it, it was very inorganic traffic
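The "big bursts instead of a small trickle" pattern in the list above is detectable with a sliding window over request timestamps. A toy sketch in Rust (the window size, threshold, and client labels are made up for illustration):

```rust
use std::collections::{HashMap, VecDeque};

/// Flags a client as bursty when more than `max_in_window` requests
/// fall inside any `window_secs` span. Timestamps are seconds since
/// an arbitrary epoch; the thresholds are illustrative, not tuned.
struct BurstDetector {
    window_secs: f64,
    max_in_window: usize,
    history: HashMap<String, VecDeque<f64>>,
}

impl BurstDetector {
    fn new(window_secs: f64, max_in_window: usize) -> Self {
        Self { window_secs, max_in_window, history: HashMap::new() }
    }

    /// Record one request; returns true if this client now looks like a burst.
    fn record(&mut self, client: &str, t: f64) -> bool {
        let q = self.history.entry(client.to_string()).or_default();
        q.push_back(t);
        // Drop timestamps that fell out of the window.
        while let Some(&front) = q.front() {
            if t - front > self.window_secs { q.pop_front(); } else { break; }
        }
        q.len() > self.max_in_window
    }
}

fn main() {
    let mut d = BurstDetector::new(1.0, 20);
    // A human trickle: one request every 2 seconds is never flagged.
    let human = (0..30).map(|i| d.record("human", i as f64 * 2.0)).any(|b| b);
    // A scraper burst: 50 requests within 100 ms trips the threshold.
    let bot = (0..50).map(|i| d.record("bot", i as f64 * 0.002)).any(|b| b);
    println!("human flagged: {}, bot flagged: {}", human, bot);
}
```

In practice you'd key on more than the client address (user agent, ASN), but the shape of the traffic alone already separates the two populations described above.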
That said, I don't think the solution is blanket blocks. Really, it's exposing sites that are poorly optimized for emerging technology.
I think the world gains more if the VideoLAN team focuses on their amazing, free contribution to the world than if they spend the same time trying to figure out how to save you two clicks.
We all hate that this is happening, but you don't need to attack everyone that is unfortunately caught up in it.
If you have discovered such an option, you could get very wealthy: minimizing friction for humans in e-commerce is valuable. If you're a drive-by critic not vested in the project, then yours is an instance of talk being cheap.
Keep in mind that those kinds of services:
- should not be MITMed by CDNs
- are generally run by volunteers with zero budget, money- and time-wise
It's rarely been the citizens that have been the problem, but the governments and companies that seek to use the network connection for their overwhelming benefit.
Re (above):
> Not on topic, but wow the internet has very quickly devolved into: click -> "making sure you're not a bot", click -> "making sure you're a human", click -> "COOKIES COOKIES COOKIES", click -> "cloudflare something something"
That being said, so many of the plebs suck. Like 2% will ruin everything for everyone.
it is incredibly annoying but what can you do? AI scrapers ruined the web.
Then I press the X to close the all-caps banner commanding me to install the app, upon which I get sent to the app store. Users of the website refer to it as an app.
Wow, this gitlab instance looked so much cleaner/simpler and less clunky than my past experiences! It also loaded really fast, both on first page load and on subsequent actions.
https://www.deb-multimedia.org/dists/unstable/main/binary-am...
... it says "fast and small AV1 video stream decoder"
... should probably be "AV2" ?
Happy that AV2 decoding is already here.
:)
>look inside
>it's C
Since dav2d is newer it has a higher fraction of C, but not enough for it to be the main language in the codebase :)
There's literally a DSL designed for this purpose (Wuffs), so it would be interesting to hear why they didn't use it.
https://www.youtube.com/@Dave2D
https://en.wikipedia.org/wiki/Dangerous_Dave_in_the_Haunted_...
I wonder, if Rust had an effects system, whether a Jasmin MIR transform (i.e. like SPIR-V is for shaders) would be useful?
https://github.com/jasmin-lang/jasmin
However, the container/extractor should absolutely be in a memory-safe language, and that's where a lot of the exploits/crashes are, too, since metadata is fuzzier.
As a practical example of this, see something like CrabbyAVIF: all the parser code is Rust, but it delegates to dav1d for the actual codec portion.
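The split described here can be sketched as: safe code validates the container, and only the compressed payload crosses the FFI boundary into the C/asm codec. Below is a toy parser for the 32-byte IVF header (the simple container commonly used for raw AV1 streams) in safe Rust; the `decode_frame` stub is hypothetical and stands in for where the unsafe dav1d FFI call would go:

```rust
/// Minimal, safe parser for the 32-byte IVF file header.
#[derive(Debug, PartialEq)]
struct IvfHeader {
    fourcc: [u8; 4], // e.g. b"AV01"
    width: u16,
    height: u16,
    frame_count: u32,
}

fn parse_ivf_header(data: &[u8]) -> Result<IvfHeader, &'static str> {
    if data.len() < 32 {
        return Err("truncated header");
    }
    if &data[0..4] != b"DKIF" {
        return Err("bad magic");
    }
    let u16le = |i: usize| u16::from_le_bytes([data[i], data[i + 1]]);
    let u32le = |i: usize| u32::from_le_bytes([data[i], data[i + 1], data[i + 2], data[i + 3]]);
    Ok(IvfHeader {
        fourcc: [data[8], data[9], data[10], data[11]],
        width: u16le(12),
        height: u16le(14),
        frame_count: u32le(24),
    })
}

/// Hypothetical stand-in for the unsafe FFI call into the native
/// decoder; in the CrabbyAVIF model only this step touches C code.
fn decode_frame(_payload: &[u8]) { /* unsafe { dav1d_... } would go here */ }

fn main() {
    // Build a valid header in memory for demonstration.
    let mut file = Vec::new();
    file.extend_from_slice(b"DKIF");               // magic
    file.extend_from_slice(&0u16.to_le_bytes());   // version
    file.extend_from_slice(&32u16.to_le_bytes());  // header length
    file.extend_from_slice(b"AV01");               // codec fourcc
    file.extend_from_slice(&1920u16.to_le_bytes()); // width
    file.extend_from_slice(&1080u16.to_le_bytes()); // height
    file.extend_from_slice(&30u32.to_le_bytes());  // timebase denominator
    file.extend_from_slice(&1u32.to_le_bytes());   // timebase numerator
    file.extend_from_slice(&1u32.to_le_bytes());   // frame count
    file.extend_from_slice(&0u32.to_le_bytes());   // unused
    let hdr = parse_ivf_header(&file).unwrap();
    println!("{}x{}, {} frame(s)", hdr.width, hdr.height, hdr.frame_count);
    decode_frame(&[]); // frame payloads would follow the header
}
```

All bounds checks and endianness handling stay in safe code, so a malformed file can at worst return an `Err`, never corrupt memory; only the already-validated payload reaches the native decoder.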
Really? How many codecs have your neighbors contributed money for the development of, just curious.