FR version is available. Content is displayed in original English for accuracy.
Advertisement
Advertisement
⚡ Community Insights
Discussion Sentiment
57% Positive
Analyzed from 2013 words in the discussion.
Trending Topics
#xor#trick#https#register#swap#more#compiler#used#org#using

Discussion (74 Comments)Read Original on HackerNews
We didn't get into the deeper question of benchmarking it vs. a three-register swap, because I suspect the latter would be handled entirely by register renaming and end up being faster due to not requiring allocation of an ALU unit. Difficult to benchmark that because in order for it to make a difference, you'd need to surround it with other arithmetic instructions.
A meta question is why this persists. It has the right qualities for a "party trick": slightly esoteric piece of knowledge, not actually that hard to understand when you do know about it, but unintuitive enough that most people don't spontaneously reinvent it.
See also: https://en.wikipedia.org/wiki/Fast_inverse_square_root , which requires a bit more maths.
The other classic use of XOR - cursor overdrawing - has also long since gone away. It used to be possible to easily draw a cursor on a monochrome display by XORing it in, then to move it you simply XOR it again, restoring the original image.
About a month later an engineer decided to turn on all warnings for gcc...and behold a message stating something to the effect of "WARNING: execution will halt upon reaching this statement." The compiled code basically just halted and never returned from the function call (can't remember the specifics).
And that is how we learned not to hide compiler messages.
clang: https://godbolt.org/z/5PEMTrxba
gcc: https://godbolt.org/z/7j8h4zo4v
Yes, error correction.
You have some packets of data a, b, c. Add one additional packet z that is computed as z = a ^ b ^ c. Now whenever one of a, b or c gets corrupted or lost, it can be reconstructed by computing the XOR of all the others.
So if b is lost: b = a ^ c ^ z. This works for any packet, but only one. If multiple are lost, this will fail.
There are way better error correction algorithms, but I like the simplicity of this one.
a = the bits of some song or movie
b = pure noise
Store c = a^b.
Give b to a friend. Throw away a.
Now both you and your friend have a bit vector of pure noise. Together you can produce the copyrighted work. But nobody is liable.
> https://ansuz.sooke.bc.ca/entry/23
> https://ansuz.sooke.bc.ca/entry/24
As these article outline, the legal situation is much more complicated (and unintuitive to people who are used to "computer science thinking").
There is no trace back from pure noise to the original work.
Colour of bits is just magical thinking.
Law is more akin to philosophy than computer science.
(unless you're an AI company, in which case you can copy the whole internet just fine)
A few months ago, I had a rare occasion of trying to explain them to a relative who had just bought a fancy NAS and wanted help setting it up.
a^=b^=a^=b;
Which allegedly saves you 0.5 seconds of typing in competitive programming competitions from 20 years ago and is known to work reliably (on MinGW under Windows XP).
Bonus authenticity: use `a^=a` to zero a register in a single x86 instruction (and makes a real difference for compiler toolchains 30+ years old).
For real now, a very useful application of XOR is its relation to the Nim game [0], which comes in very handy if you need to save your village from an ancient disgruntled Chinese emperor.
[0] https://en.wikipedia.org/wiki/Nim
> a^=b^=a^=b;
I believe this is defined in C++ since C++17, but still undefined in C.
Modern compilers will still use xor for zeroing registers on their own.
For instance: https://godbolt.org/z/n5n35xjqx
Variable a(register esi) is first initialized to 42 with mov and then cleared to zero using xor.
For me it falls in the obfuscated-C quadrant for code. Performance implications aside, it's just not the kind of "self-documenting" code I like lying around in my sources. (And I'll take clarity of purpose over performance every day.)
When you released to the menu button on the mouse, it did a similar swap to restore the screen contents. In version 2.0 I optimized it to do a simple copy of the offscreen rectangle back onto the screen because before it was wasting time in order to preserve the menu pixels that we’re going to be thrown away immediately.
One extension that I ran into, and which I think forms a nice problem is the following:
Just like the XOR swap trick can be used to swap to variables (and let's just say that they're bools), it can be extended to implement any permutation of the variables: suppose that the permutation is written as a composition of n transpositions (i.e., swaps of pairs), and that is the minimal number of transpositions that let's you do that. Each transposition can be implemented by 3 XORs, by the XOR swap trick for pairs, and so the full permutation can be implemented by 3n XORs. Now here's the question: Is it possible to come up with a way of doing it with less than 3n, or can we find a permutation that has a shortcut through XOR-land (not allowing any other kinds of instructions)? In other words, is XOR-swapping XOR-optimal?
I'm not going to spoil it, but only last year a paper was published in the quantum information literature that contains an answer [0]. I ended up making a little game where you get to play around with XOR-optimizing not only permutations, but general linear reversible circuits. [1]
[0] https://link.springer.com/article/10.1007/s11128-025-04831-5
[1] https://swapple.fuglede.dk
I'm not sure I'd call it a "trick", but since A ^ 0 = A, and B ^ B = 0, then ((A ^ B) ^ B) = A. i.e. XOR-ing any number by the same number twice gets you back the original number.
This used to be used back in the day for cheap and nasty computer graphics, since it means that if you draw to the screen by XOR-ing with the pixels already on the screen then you can undo it, restoring the background, by doing it a second time. The "nasty" part is that XORing with what's already on the screen isn't going to look great, but for something like a rotating wire-frame figure it might be OK.
Things like "add with saturation" and the special AES instructions.
It'll break eventually. If it matters, write the simd yourself. It'll probably be 2-50x better than the compiler anyways.
Yes there are false positives, and the false negative of all-zero/all-equal, but the test can be useful in a "bloom filter" type case.
Have used it in dynamic firewalling rules ... one can do something pretty close to a JA3/JA4 TLS fingerprint in eBPF with that (to match the cipher lists).
The article makes the same point as well at the end:
It is the kind of technique which might have been occasionally useful in the 1980s, but now is only useful for cute interview questions and as a curiosity.
(See also the wacky way in which ARM "load immediate" works)
https://lemire.me/blog/2022/01/21/swar-explained-parsing-eig...
For some reason this reminds me of the Fourier transform. I wonder if it can be performed with XOR tricks and no complicated arithmetic?
Solution: xor is just addition mod 2. Write the numbers in base n and do digit-wise addition mod n (ie without carry). Very intuitive way to see the xor trick.