Discussion (72 Comments)
Even imagining a QUERY with a large JSON filtering structure, or say an image input as request body, it feels extremely odd to include the request body as part of the cache key. It also implies an unbounded and user-controlled cache key, with the only really meaningful general caching strategy being bitwise compare of the request body (or a hash), which in a hostile scenario implies cache busting would be trivial.
This invokes multiple semantic oddities in one go with obvious difficulties for a very niche use case. If I'm writing a service that needs complex filtering or complex input like an image, any form of caching (e.g. individual data columns of a join, or embeddings keyed by perceptual hashes of a decoded image input) is going to be far away from the HTTP layer and certainly unrelated to the exact bit representation of the request on the wire.
Why even bother trying to capture this in a generic way?
I would be far more inclined to try and capture this caching semantic as a new header for POST. Something like "Vary: request-body" or similar. Perfectly backwards compatible and perfectly ignorable for all but the 0.1% of CDN use cases where the behaviour might turn out useful.
The query part of GET's URI is also barely bounded in practice and user-controlled, and is indeed used as part of the cache key (because it's a part of URI), so I am not sure why you raise this objection at all.
It feels rather pointless, and there is no drawback to just using POST.
I've found some sites that tack on a session ID, and if you try to tamper with the URL in any way, they send you back to "Page 1". It really annoys me, lol. At that point, let me skip to any page with your web UI.
Realistically, systems for the public internet will use a secure hash as the cache key so it'll always be the same size. The cache key already includes a URL that can be very long, and an arbitrary set of header values.
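To illustrate the fixed-size key described above, here is a minimal sketch in Python. The function name `cache_key`, the choice of SHA-256, and the length-prefixed field layout are all assumptions for illustration, not something any particular cache or CDN is known to do.

```python
import hashlib


def cache_key(method: str, url: str, vary_headers: dict[str, str], body: bytes = b"") -> str:
    """Hypothetical helper: hash method, URL, selected headers and body into one key."""
    h = hashlib.sha256()

    def feed(part: bytes) -> None:
        # Length-prefix every field so that shifting bytes between
        # adjacent fields cannot produce the same hash input.
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)

    feed(method.upper().encode())
    feed(url.encode())
    # Header names are case-insensitive, so normalize and sort them.
    for name in sorted(vary_headers, key=str.lower):
        feed(name.lower().encode())
        feed(vary_headers[name].encode())
    feed(body)
    # The digest is always 64 hex characters, however large the body is.
    return h.hexdigest()
```

The key has the same size for a 20-byte filter and a multi-megabyte image. The cost of hashing still grows with the body, so this bounds the size of the key, not the work an attacker can force.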
While the concern is valid, caching is entirely optional at query level, therefore it is totally valid to cache only certain "filters".
I guess it's about resolving the odd semantics of using POST, which is not idempotent, and thus allowing easier control flow for caches and retries.
Your perspective is 100% correct if you think at the application layer, but with a dedicated method, you can have that behaviour out-of-the-box from your HTTP infrastructure (whether it's at your hyperscaler's router or your apache/nginx/browser, whatever) and stop implementing the POST-as-a-query edge case yourself.
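The point about retries can be sketched as a method-based policy that generic HTTP infrastructure could apply without knowing anything about the application. This is a minimal sketch in Python; `may_retry_automatically` is a hypothetical name, and including QUERY in the set assumes, as this thread does, that QUERY is defined as safe and idempotent.

```python
# Methods defined as idempotent by RFC 9110, plus QUERY on the
# assumption (taken from this discussion) that it is idempotent too.
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE", "QUERY"}


def may_retry_automatically(method: str) -> bool:
    """Hypothetical policy: retry after a connection failure only if the method is idempotent."""
    return method.upper() in IDEMPOTENT_METHODS
```

With POST-as-a-query, the same decision needs out-of-band knowledge that a particular endpoint only reads data.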
/?hash=123456789
If method=QUERY were added, there would be a new variety of this weirdness.
The team will have to wait for a new header and textarea specs to fix the rest of the jank.
This site is so awful lol. Why don’t they update it?
Comment submission isn't safe, so QUERY can't be used there. And it doesn't suffer from the problem anyways, since HN returns a 3XX on successful submission, so refreshing doesn't show a warning.
https://www.rfc-editor.org/rfc/rfc10008.txt
I've been sending a request body along with the GET method for years now.
When it's your client talking to your server you can obviously do whatever you want - it doesn't cause problems until you want to involve third-party code, such as a reverse proxy (such as nginx) or a CDN. This includes proxies your customers may be using.
Generally not a great idea. With some HTTP implementations this is not even possible (for example, fetch).
https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API/U...
> You cannot include a body with GET requests
And transparent caching might result in weird issues.
I like that we now have a way to not be forced to define Resources when we want to query. It always felt like I was missing something that there could be an infinite, defined-on-the-fly number of Resources for a "part" of a given Resource. Do I really want to define "all cats that sleep more than 20 hours a day and like sunbeams and want to eat breakfast at 3 am" as a Resource? (ok, we all know that is actually the full set of cats). I'm ok that you want to define that as a Resource but in my system, it makes more sense that Cats is the Resource and I just need some accepted way to query.
I like the implementation (again, as just a guy that programs). I don't see how it could have been done better or simpler, which probably hides the complexity of getting there.
I also especially appreciate how the spec is written. Opening a spec, I wonder how far I'll get before I don't know what the heck they're talking about (and, again, as just a guy that programs). I don't think it's easy to write a spec that is complete and approachable like this. Really appreciate that.
https://manifold.markets/CollectedOverSpread/when-will-rfc-1...
So if 10008 is the first one after 10000, that date is the one to bet on.
https://mailarchive.ietf.org/arch/msg/tools-discuss/EpoQcVt_...
RFC numbers are issued some time before publication, so they can be published out of order. I would expect 9999, 10001, etc. to show up eventually.
Just wow. People seem to have too much money if they're betting on when RFCs are going to get released.
This isn't even one of the worst offenders on prediction markets, or even comparable to them, but I am just amazed (in a negative way; surprised? it's just strange) by the depth of what people actually bet on in these markets.
I’ve enjoyed the combination with Range headers for paging, despite this tidbit:
> It is expected that these built-in features will be used instead of HTTP Range Requests
Using the QUERY request as the definition of a set, and Range to retrieve subsets seems very natural.
I think the name is confusing because the term 'query' is already used to refer to HTTP requests in general.
Just the title of the RFC confused me.
There is one interesting variant though, which uses state: The client sends a QUERY containing the full query, and the server returns a URL usable with GET with which this query can be triggered in the future. Similar to prepared statements in SQL databases.
Using QUERY for GraphQL queries (not mutations) would be a good match. These only read data, but are sometimes bigger than the URL length limit.
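The stateful variant can be sketched as a small server-side registry. This is a minimal illustration in Python, not taken from the RFC: the class name `StoredQueries`, the `/queries/` prefix, and deriving the path from a SHA-256 of the body are all assumptions.

```python
import hashlib
from typing import Optional


class StoredQueries:
    """Hypothetical registry mapping a QUERY body to a URL path that GET can replay."""

    def __init__(self, prefix: str = "/queries/") -> None:
        self._prefix = prefix
        self._bodies: dict[str, bytes] = {}

    def register(self, body: bytes) -> str:
        # Called when a QUERY arrives: remember the body, return a path
        # the server could hand back to the client for later GETs.
        digest = hashlib.sha256(body).hexdigest()
        self._bodies[digest] = body
        return self._prefix + digest

    def lookup(self, path: str) -> Optional[bytes]:
        # Called when a GET arrives: recover the original query, if known.
        if not path.startswith(self._prefix):
            return None
        return self._bodies.get(path[len(self._prefix):])
```

Because the path is derived from the body, registering the same query twice yields the same URL, which keeps the GET side cacheable by ordinary URL-keyed caches.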
I still don’t get how idempotency can typically be ensured without state. It very much depends on data model and application design. Even side effects like using a user’s lookup quota need to be handled at a higher layer than HTTP (I think?).
If it's not actually idempotent but you're telling the browser it is, of course you may cause bugs. Same as GET.
But what the QUERY method really targets are things like a GraphQL query that can be multiple kB for a single query, but only reads data. Sure, it might count against rate limits, trigger logs, etc. But at a conceptual level, resubmitting the same query should give the same result (if the data didn't change). And since you are only reading data, resubmitting is safe.
Well, how is "GET /index.html HTTP/1.1" made idempotent in practice without (additional) state?
> A QUERY request from user agents implementing Cross-Origin Resource Sharing (CORS) will require a "preflight" request, as QUERY does not belong to the set of CORS-safelisted methods (see [FETCH]).
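The rule in the quoted passage reduces to a set-membership check. This is a minimal sketch in Python: the safelisted set (GET, HEAD, POST) is the one defined by the Fetch standard, while the function name `method_forces_preflight` is a hypothetical choice. It only looks at the method; non-safelisted request headers or content types can also trigger a preflight even for a safelisted method.

```python
# CORS-safelisted methods as defined by the Fetch standard.
CORS_SAFELISTED_METHODS = {"GET", "HEAD", "POST"}


def method_forces_preflight(method: str) -> bool:
    """Hypothetical check: does the method alone force a CORS preflight?"""
    return method.upper() not in CORS_SAFELISTED_METHODS
```

So a cross-origin QUERY from a browser costs an extra OPTIONS round trip, whereas the equivalent POST may not, depending on its headers.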