I Will Not Add Query Strings to Your URLs

susam • about 3 hours ago • 25 comments

Discussion (25 Comments)

legitster•3 minutes ago

Query strings are awesome. Especially for one-page applications.

I build a lot of internal applications, and one of my golden UI rules is that a user should be able to share their URL and other users should be able to see exactly what the sender did.

So if you have a dashboard or visualization where the user can add filters or configurations, I have all of their settings saved automatically in the URL. It's visible, it's obvious, it's easy, it's convenient.
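A minimal sketch of the "settings live in the URL" pattern described above, using Python's standard `urllib.parse`. The dashboard URL, the filter names, and the helpers `state_to_url` and `url_to_state` are hypothetical and only illustrate the idea; they are not taken from any of the applications mentioned.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def state_to_url(base_url: str, state: dict[str, list[str]]) -> str:
    """Serialize dashboard state into the query string of base_url."""
    parts = urlsplit(base_url)
    # doseq=True emits one key=value pair per list item;
    # sorting the keys keeps the shared URL stable.
    query = urlencode(sorted(state.items()), doseq=True)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


def url_to_state(url: str) -> dict[str, list[str]]:
    """Recover the dashboard state from a shared URL."""
    return parse_qs(urlsplit(url).query)


# Hypothetical filters chosen by the sender.
state = {"region": ["emea", "apac"], "year": ["2024"]}
shared = state_to_url("https://dashboard.example/sales", state)
print(shared)  # https://dashboard.example/sales?region=emea&region=apac&year=2024
```

Anyone who opens `shared` can call `url_to_state` on it and get back the same filters the sender had applied.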

>There is also a moral question here about whether it is okay to modify a given URL on behalf of the user in order to insert a referral query string into it. I think it isn't.

These dogmatic technical screeds are all so weird to me. They usually reveal more about the author's lack of experience or imagination than provide a useful truism.

ChrisMarshallNY•4 minutes ago

> It is a small, decentralised, self-hosted web console that lets visitors to your website explore interesting websites and pages recommended by a community of independent personal website owners.

Back in the Stone Age, we called these “Webrings,” but they weren’t as fancy.

One of the issues that I faced while developing an open-source application framework was that hosting that used FastCGI would not honor Auth headers, so I was forced to pass the tokens in the query. It sucked, because that makes copy/paste of the Web address a real problem: it would often contain tokens. I guess maybe this has been fixed?

In the backends that I control, and aren’t required to make available to any and all, I use headers.

jedimastert•40 minutes ago

You know, I was actually really curious about this, so I went back to the HTML and URL standards (WHATWG), and surprisingly they don't actually define any format for the query string other than it being percent-encoded. One might conflate query strings with "form-urlencoded"[0] query strings, which is one potential interoperability format, but in general a query string is just any percent-encoded string following a "?" in a URL[1], and just another property on the "URL" object that can be used in generating a response. There is additionally a URLSearchParams object, which is the result of parsing the query string with the form-urlencoded parser, but this is simply an interoperability layer for JavaScript.

I'm going to be honest: I was pretty geared up to have a contrarian opinion until I looked at the standards, but they're actually pretty clear. A 404 could be a proper response to an unexpected query string. The query string is as much a part of the URL API as the path is, and I think pretty much everyone can acknowledge that just tacking random stuff onto the path would be ill-advised and undefined behavior.

[0]: https://url.spec.whatwg.org/#application/x-www-form-urlencod...

[1]: https://url.spec.whatwg.org/#url-class
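To illustrate the distinction drawn above between the raw query string and its form-urlencoded interpretation, here is a small sketch. It uses Python's `urllib.parse` rather than the browser `URL` and `URLSearchParams` objects, so details may differ from the WHATWG parser, and the URL is hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit

url = "https://fonts.example/collection?serif"  # hypothetical URL

# The query is just the string after "?"; it need not contain key=value pairs.
raw_query = urlsplit(url).query

# Interpreting it as form-urlencoded data is a separate, optional step.
pairs_default = parse_qsl(raw_query)
pairs_keep_blank = parse_qsl(raw_query, keep_blank_values=True)

print(repr(raw_query))   # 'serif'
print(pairs_default)     # []
print(pairs_keep_blank)  # [('serif', '')]
```

A server that reads only the raw string sees `serif`; one that reads only the parsed pairs may see nothing at all, which is one reason an appended parameter can change what a site returns.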

humodz•19 minutes ago

The tone of this and Chris's post gives me the impression that it's harmful to include these query parameters, but I don't understand how. Could someone enlighten me? I understand it can mangle some URLs, and that's a good enough reason not to do it, but even then it seems like a minor inconvenience.

phoronixrly•12 minutes ago

Oh, I have a couple: the users did not agree to being tracked (these query params are tracking information), and the site administrator does not want incoming traffic to be tracked. I know the latter can be hard to understand, but I, for example, sure as hell do not want to have any info in my logs that can be used to harm my users.

On a more personal note, I hate it when I go to copy a link to send via a message, and the tracking code glued onto it is twice as long as the original URL... I either have to fiddle around with it to clean it up or leave the person I sent it to wondering wtf I am on about with a screenful of random characters...

So it's violating users' privacy, it's shit UX, and on top of that, nobody asked for it...

1shooner•about 1 hour ago

>So I’ve decided to try a blanket ban for this site: no unauthorised query strings.

His site returns (I think incorrectly) a 414 if a request includes a query string. If this protest is meant to advocate for the user, who presumably wasn't able to manage that string in the first place, why would you penalize them for it being there?

Why not just use it as a cue to tell users how they can make this decision themselves (e.g. through browser tools)?

jampekka•about 1 hour ago

"You could argue that I’m abusing 414 URI Too Long. I respond that it’s funnier this way. Other options I considered were:

    400 Bad Request, the generic client error code, which is correct but boring;

    402 Payment Required, and honestly if you want to pay me to make a particular URL with query string work, I’m open to it;

    404 Not Found, but it’s too likely to have side effects, and it doesn’t convey the idea that the request was malformed, which is what I’m going for; and

    303 See Other with no Location header, which is extremely uncommon these days but legitimate. Or at least it was in RFC 2616 (“The different URI SHOULD be given by the Location field in the response”), but it was reworded in 7231 and 9110 in a way that assumes the presence of a Location header (“… as indicated by a URI in the Location header field”), while 301, 302, 307 and 308 say “the server SHOULD generate a Location header field”. Well, I reckon See Other with no Location header is fair enough. But URI Too Long was funnier."

https://chrismorgan.info/no-query-strings?foo
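For readers wondering what such a blanket ban looks like in practice, here is a minimal WSGI sketch that answers 414 to any request carrying a non-empty query string. It is an assumption about how one might do this, not the configuration the quoted site actually uses; the name `app` and the response bodies are made up.

```python
def app(environ, start_response):
    """WSGI app that rejects any request carrying a non-empty query string."""
    if environ.get("QUERY_STRING"):
        status = "414 URI Too Long"
        body = b"No query strings, please.\n"
    else:
        status = "200 OK"
        body = b"Hello.\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

It can be served locally with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`. Swapping the status line for `400 Bad Request` or `404 Not Found` gives the other options the quote weighs.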

1shooner•about 1 hour ago

Also from the 414 page:

>Complain to whoever gave you the bad link, and ask them to stop modifying URLs, because it’s bad manners.

It's ironic that an error response so blatantly violating the robustness principle is throwing shade about bad manners.

wizzwizz4•34 minutes ago

The robustness principle is itself bad manners, in plenty of contexts. If I deliver packages by throwing them at the customer, I really want a customer to tell me "hey, don't throw packages at me!" before I attempt to lob something fragile and breakable, or something heavy at someone fragile and breakable. Otherwise, how am I supposed to learn that I'm doing anything wrong?

bryanrasmussen•about 1 hour ago

It's been years but I seem to remember there was a version of PLSQL server pages that would return 500 if you tried to pass in an unknown query string.

gtowey•about 1 hour ago

"wander console" sounds like they're just web rings re-invented. In the era of forced feeds by giant corporations which consist of the things they want you to see, I've wondered if this old idea would make a comeback. Human curated content from trusted people seems like the only way forward.

SoftTalker•about 1 hour ago

FTA: It is also a bit like web rings except that the community network is not restricted to being a cycle; it is a graph and it is flexible.

arjie•about 1 hour ago

A referrer policy of strict-origin-when-cross-origin alone gives the destination a host-level Referer (sic) header in most mainstream browsers, unless the user has configured otherwise, right? That’s usually enough for web authors to know what audience they’re appealing to, and privacy-maximizers can turn off sending that header.
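A rough model of what that policy sends, as a sketch only: real browsers follow the Referrer Policy specification, which also handles details ignored here (default ports, credentials in the URL, non-HTTP schemes). The function names and URLs are hypothetical.

```python
from urllib.parse import urlsplit


def origin(url):
    """Simplified origin: (scheme, host, port); default ports are not normalized."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def referrer_for(source, target):
    """Approximate the Referer value under strict-origin-when-cross-origin."""
    src, dst = urlsplit(source), urlsplit(target)
    if origin(source) == origin(target):
        # Same origin: the full URL is sent, minus the fragment.
        query = f"?{src.query}" if src.query else ""
        return f"{src.scheme}://{src.netloc}{src.path}{query}"
    if src.scheme == "https" and dst.scheme != "https":
        # Downgrade from HTTPS to plain HTTP: nothing is sent.
        return None
    # Cross-origin: only the origin is sent.
    return f"{src.scheme}://{src.netloc}/"


print(referrer_for("https://a.example/post?id=1", "https://b.example/"))
# https://a.example/
```

Under this model the destination learns which site sent the visitor, but not which page or query string the visitor came from.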

sigseg1v•about 1 hour ago

Adding query strings is one of those things that I think a lot of sites could get away with more easily if they were reasonable about it.

A link that is "https:// web.site" is fine.

A link that is "https:// web.site?via=another.site" is fine.

A link that is "https:// web.site?fbm=avddjur5rdcbbdehy63edjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63edaaaddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednzzddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63ednddjur5rdcbbdehy63edn"

is annoying as shit and I need to literally apologize to people after sending it if I forget to manually redact the query string. Don't abuse this.

culi•about 1 hour ago

There are addons to remove unnecessary params from the worst offending sites:

https://www.google.com/search?q=clearurls+addon
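As a sketch of what such tools do, the following strips a few well-known tracking parameters from a URL. The parameter list is a small illustrative sample, and this is not how any particular add-on is implemented.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative sample only; real tools ship much longer, per-site rule sets.
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = {"fbclid", "gclid"}


def strip_tracking(url):
    """Return url with known tracking parameters removed from its query string."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_NAMES and not key.startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_tracking("https://example.org/post?id=7&utm_source=news&fbclid=abc"))
# https://example.org/post?id=7
```

Note that re-encoding the remaining pairs can itself alter a URL (a bare `?serif` becomes `?serif=`), so a careful tool would edit the original string rather than rebuild it.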

franciscop•about 1 hour ago

Thanks for removing the rest on that google link, the one I get after switching to "images" and back to "web" is this monstrosity:

https://www.google.com/search?newwindow=1&sca_esv=8061bd9cb1...

Edit: which luckily and sensibly Hacker News cuts short since it's 463 characters

gwern•about 1 hour ago

Query strings break unpredictably, and that alone is enough to ban them by third parties, especially for something as minor as referral tracking.

Example: The Browser is a well known link aggregation paid periodical. I subscribe, and every 1 in 10 or 20 links I clicked, it'd just break outright and I'd have to tediously edit the URL to fix it (assuming the website didn't do a silent ninja URL edit and make it impossible for me to remember what URL I opened possibly days or weeks ago in a tab and potentially fix it). This was annoying enough to bother me regularly, but not enough to figure out a workaround.

Why? ...Because TB was injecting a '?referrer=The_Browser' or something, and the receiving website server got confused by an invalid query and errored out. 'Wow, how careless of The Browser! Are they really so incompetent as to not even check their URLs before mailing an issue out to paying subscribers?'

I wondered the same thing, and I eventually complained to them. It turns out, they did check all their URLs carefully before emailing them out... emphasis on 'before', which meant that they were checking the query-string-free versions, which of course worked fine. (This is a good example of a testing failure due to not testing end-to-end or integration testing: they should have been testing draft emails sent to a testing account, to check for all possible issues like MIME mangling, not just query string shenanigans.)

After that they fixed it by making sure they injected the query string before they checked the URLs. (I suggested not injecting it at all, but they said that for business reasons, it was too valuable to show receiving websites exactly how much traffic TB was driving to them on net, because referrers are typically stripped from emails and reshares and just in general - this, BTW, is why the OP suggestion of 'just set an HTTP referrer header!' is naive and limited to very narrow niches where you can be sure that you can, in fact, just set the referrer header.)
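The ordering fix described above (inject first, then check) can be sketched as follows. Everything here is hypothetical: the parameter `ref=newsletter`, the helper names, and the injected `fetch_status` callable, which stands in for whatever HTTP client a real pipeline would use.

```python
from urllib.parse import urlsplit, urlunsplit


def add_referral(url, param="ref=newsletter"):
    """Append a referral parameter, preserving any existing query string."""
    parts = urlsplit(url)
    query = f"{parts.query}&{param}" if parts.query else param
    return urlunsplit(parts._replace(query=query))


def broken_links(urls, fetch_status):
    """Check the URLs exactly as readers will receive them, after injection."""
    final_urls = [add_referral(url) for url in urls]
    return [url for url in final_urls if not 200 <= fetch_status(url) < 400]


def fake_status(url):
    """Stand-in for a real HTTP request: one strict site rejects any query string."""
    strict = url.startswith("https://strict.example/")
    return 404 if strict and "?" in url else 200


links = ["https://blog.example/a", "https://strict.example/page"]
print(broken_links(links, fake_status))
# ['https://strict.example/page?ref=newsletter']
```

Checking `links` before injection would report no problems, which is the testing gap the anecdote describes.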

But this error was affecting them for god knows how long and how many readers and how many clicks, and they didn't know. Because why would they? The most important thing any programmer or web dev should know about users is that "they may never tell you": https://pointersgonewild.com/2019/11/02/they-might-never-tel... (excerpts & more examples: https://gwern.net/ref/chevalier-boisvert-2019 ). No matter how badly broken a feature or service or URL may be, the odds are good that no user will ever tell you that. Laziness, public goods, learned helplessness / low standards, I don't know what it is, but never assume that you are aware of severe breakage (or vice-versa, as a user, never assume the creator is aware of even the most extreme problem or error).

Even the biggest businesses.... I was watching a friend the other day try to set up a bank account in Central America, and clicking on one of the few banks' websites to download the forms on their main web page. None of the form PDF download links worked. "That's not a good sign", they said. No, but also not as surprising as you might think - the bank might have no idea that some server config tweak broke their form links. After all, at least while I was watching, my friend didn't tell them about their problem either!

julianlam•about 1 hour ago

> After I implemented that feature, a page from one of my favourite websites refused to load in the console... the third URL returns an HTTP 404 error page. The website uses the query string to determine which one of its several font collections to show.

Yes, let's unilaterally decide that query strings are bad because one website (ab)uses query strings to load different fonts.

It's the query strings that are the problem, not the website!

jfc.

Look, I'm against utm fragments as much as the next guy, but let's not throw away a perfectly good thing because tracking is evil.

ergonaught•about 1 hour ago

Adding your own garbage to someone else's URLs is in fact the problem. Could they handle your garbage better? Sure. Is your garbage still a problem? Yes.

SoftTalker•about 1 hour ago

Postel's law worked OK when people operated in good faith. But today the internet is full of abusers. Rejecting requests that aren't exactly what they should be is probably the best policy now.

wtallis•16 minutes ago

Postel's law is typically stated as "be conservative in what you do, be liberal in what you accept from others". It's unfortunately common for people to ignore the first half and hallucinate a third clause demanding that the recipient stay silent about the errors they receive.

InsideOutSanta•about 1 hour ago

That website is not abusing query strings, though; its usage of query strings is perfectly cromulent. And TFA is not saying not to use query strings, but not to append random garbage to other people's URLs.

jorams•about 1 hour ago

The website uses the feature for its intended purpose. Adding random trash to the query string of another website assuming it'll ignore it is in fact a bad idea, always, even if you can usually get away with it.

LocalH•about 1 hour ago

The problem is adding query strings to the URLs of others. It's peak entitlement to think that's proper.

jedimastert•38 minutes ago

> one website (ab)uses query strings

Really not abusing query strings from a standards perspective; a 404 is not an improper response to an unexpected query string.