
Discussion (3 Comments)

oliverio • about 2 hours ago
Very interesting, thanks for sharing. This has a lot of nods to Turbopuffer's architecture [0]. My impression is they've spent a lot of time optimizing at the hardware/firmware layer to achieve extremely fast query results.

Inarticulately - how ~close is OpenData Vector to Turbopuffer in terms of performance today and where are the major gaps + mountains to scale?

Really excited to keep an eye on the repos, great read!

[0] https://turbopuffer.com/blog/turbopuffer

rohanpdes • about 2 hours ago
Yep! Vector provides a lot of the same benefits, just as an OSS project. They were definitely a major inspiration. Vector's performance is similar to their published benchmarks.

The biggest gap is (unsurprisingly) for larger (e.g. 100s of M - 1B+) datasets. We talk about it in the post, but the main improvement there is adding quantization to reduce the overhead of loading large posting lists. There's also a bunch of storage and caching layer work to be done. That's on our roadmap along with some cool features like full-text search and better support for multi-tenancy.

apurvamehta • about 2 hours ago
Thanks! opendata contributor here.

We're heavily inspired by Turbopuffer. I'd say we are comparable to them when they launched in terms of perf and scale. But they've obviously invested heavily since then, so we're not going to match them on raw perf at scale right now. Our goal is to be a pretty competitive OSS offering over the long term though.

The next biggest lift for us to get much closer is quantization. If we squeeze more signal into fewer bits, we will improve performance end to end.
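The quantization lift described in these comments can be illustrated with a minimal per-dimension scalar quantization sketch (a hypothetical example, not OpenData Vector's or Turbopuffer's actual implementation): mapping float32 vector components to uint8 shrinks the bytes that must be loaded per posting list by 4x, at the cost of a small, bounded reconstruction error.

```python
import numpy as np

def scalar_quantize(vectors: np.ndarray):
    """Quantize float32 vectors to uint8 per dimension.

    Keeps the per-dimension offset (lo) and scale so approximate
    vectors can be reconstructed at query time.
    """
    lo = vectors.min(axis=0)
    hi = vectors.max(axis=0)
    scale = (hi - lo) / 255.0
    scale[scale == 0] = 1.0  # avoid divide-by-zero on constant dimensions
    codes = np.round((vectors - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize(codes: np.ndarray, lo: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Reconstruct approximate float32 vectors from uint8 codes."""
    return codes.astype(np.float32) * scale + lo

rng = np.random.default_rng(0)
vecs = rng.normal(size=(1000, 128)).astype(np.float32)

codes, lo, scale = scalar_quantize(vecs)
approx = dequantize(codes, lo, scale)

print(vecs.nbytes // codes.nbytes)  # → 4 (4x fewer bytes to load)
print(np.abs(vecs - approx).max())  # small per-component error (≤ scale/2)
```

The 4x reduction comes directly from storing one byte instead of four per component; production systems typically go further (e.g. product quantization or binary codes) and re-rank top candidates with full-precision vectors to recover accuracy.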