FR version is available. Content is displayed in original English for accuracy.
Advertisement
Advertisement
⚡ Community Insights
Discussion Sentiment
67% Positive
Analyzed from 634 words in the discussion.
Trending Topics
#code#jit#binary#static#binaries#translation#elevator#still#self#modifying

Discussion (16 Comments)Read Original on HackerNews
I am not sure what QEMU's JIT is doing (in its userspace wrapper), but I think it has a lot of room to improve.
In 2013 I wrote a x86-64 to aarch64 JIT engine that was able to run what was then Fedora beta aarch64 binaries and rebuild almost the entire aarch64 port of Fedora on a x86_64 Linux. I also made a reverse aarch64 to x86-64 JIT that worked in the same way, and for fun I also showed the two JITs managing to run each other in a loop back fashion: x86-64 -> aarch64 -> x86_64 in the same process.
The JIT I devised did a 1-to-many instruction and CPU state mapping with overhead that was somewhat 2x to 5x slower than what would be expected to native recompiled code. I later compared this with QEMU's JIT which seemed more in the range of 10x to 50x slower.
Unfortunately this was not under a open source license settings, so no code release to prove it.. :(
I mostly work on stuff from the 90s, but disassemblers make a lot of assumptions about where code starts and ends, but occasionally a binary blob is not discoverable unless you have some prior knowledge (pointer at a fixed location to an entry point).
I would think after a few passes you could refine the binary into areas that are definitely code.
Why only x86_64? It has more sense to convert 32-bit programs, like many old games.
> Self Modifying and JIT-Compiled Code. Elevator, like all fully static binary rewriters, does not support self modifying or just-in-time-compiled code.
Maybe try an emulator? There's also this project I found: https://github.com/andirsun/Slacky
still, pretty cool for cooperative binaries
Edit I found this in the paper
> Elevator sidesteps the code-versus-data determination altogether through an application of superset disassembly [6]: we simultaneously interpret every executable byte offset in the original binary as (i) data and (ii) the start of a potential instruction sequence beginning at that offset, and we build the superset control flow graph from every one of the resulting candidate decodes. Every potential target of indirect jumps, callbacks, or other runtime dispatch mechanisms that cannot be statically analyzed therefore has a corresponding landing point in the rewritten binary. These targets are resolved at runtime through a lookup table from original instruction addresses to translated code addresses that we embed in the final binary.
executable stacks are still common (incl on windows with some settings), and sometimes they are required (eg for gcc nested functions)