Memory Safe Context Switching (longjmp, setjmp) in Fil-C

mmodeless•about 3 hours ago, 19 comments, article on fil-c.org

Discussion (19 Comments)

matheusmoreira•about 1 hour ago

This is an article I wish I could have read many months ago.

> Hence, the most basic safety issue with setjmp is that if we call it and then return from the function that had called it, the context saved by setjmp is not valid to longjmp to.

> longjmp is only safe if it's called at a time when the stack frame used by setjmp could not have possibly been overwritten, since that is the only way to guarantee that the register state restored by longjmp matches the stack frame that the stack pointer points to.

That limitation could be lifted by simply copying the stack frames somewhere else prior to long jumping, and then spilling that entire thing on top of the current stack instead of just restoring the registers from the jump buffer. This is how delimited continuations work! What ruins this for C is the existence of pointers. Stacks aren't freely relocatable since pointers into the stack could exist. Other languages don't have this problem.

So much fun stuff in this article! The "fibers with ucontext", essentially swapping stack pointers back and forth, are how I implemented generators! I too reached for musl source code in order to understand setjmp, but for a different reason: its ability to spill the registers onto the stack was instrumental for my garbage collector.

Blogged about all of these things too, in case anyone is curious:

https://www.matheusmoreira.com/articles/delimited-continuati...

https://www.matheusmoreira.com/articles/generators-in-lone-l...

https://www.matheusmoreira.com/articles/babys-second-garbage...
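For readers who have not seen the "fibers with ucontext" idea mentioned in this comment, here is a minimal sketch of a generator built on `getcontext`, `makecontext`, and `swapcontext`. It is not the commenter's implementation and not Fil-C's; the names `struct generator`, `gen_init`, `gen_next`, `gen_yield`, and `count_up` are hypothetical, and it assumes a libc that provides `<ucontext.h>` (glibc does).

```c
#include <assert.h>
#include <stddef.h>
#include <ucontext.h>

/* Hypothetical generator type, for illustration only. */
enum { GEN_STACK_SIZE = 64 * 1024 };

struct generator {
    ucontext_t caller;          /* context of whoever called gen_next */
    ucontext_t body;            /* context of the generator function  */
    int value;                  /* most recently yielded value        */
    int limit;                  /* how many values to produce         */
    int done;                   /* set once the body has finished     */
    char stack[GEN_STACK_SIZE]; /* private stack for the body         */
};

/* The generator being resumed. makecontext only passes int arguments
 * portably, so this sketch hands the pointer over through a global. */
static struct generator *current;

/* Suspend the body and hand a value back to the caller. */
static void gen_yield(int v) {
    struct generator *g = current;
    assert(g != NULL);
    g->value = v;
    swapcontext(&g->body, &g->caller);
}

/* Generator body: yields 0, 1, ..., limit - 1. */
static void count_up(void) {
    struct generator *g = current;
    for (int i = 0; i < g->limit; i++)
        gen_yield(i);
    g->done = 1;
    /* Returning resumes uc_link, which is the caller context. */
}

/* Initialise in place; the struct must not be copied afterwards,
 * because a ucontext_t may contain pointers into itself. */
int gen_init(struct generator *g, int limit) {
    if (getcontext(&g->body) != 0)
        return -1;
    g->body.uc_stack.ss_sp = g->stack;
    g->body.uc_stack.ss_size = sizeof g->stack;
    g->body.uc_link = &g->caller;
    g->limit = limit;
    g->done = 0;
    makecontext(&g->body, count_up, 0);
    return 0;
}

/* Returns 1 and stores the next value, or 0 once exhausted. */
int gen_next(struct generator *g, int *out) {
    if (g->done)
        return 0;
    current = g;
    if (swapcontext(&g->caller, &g->body) != 0)
        return 0;
    if (g->done)
        return 0;
    *out = g->value;
    return 1;
}
```

Each `gen_next` call switches onto the generator's private stack and each `gen_yield` switches back, so the loop state in `count_up` survives between calls without any stack copying.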

pizlonator•about 1 hour ago

I’ve used the copy-stack trick before! It’s really great!

You can work around the pointer relocation issue by always copying the stack back onto the main stack. So you’re always running on the same range of stack in memory, and saved stacks are always elsewhere.

matheusmoreira•28 minutes ago

Is this the technique you are describing?

https://langdev.stackexchange.com/a/4242

https://www.microsoft.com/en-us/research/wp-content/uploads/...

> How does he do it? By simply always restoring the continuation to exactly the place it was captured from, of course!

Pretty awesome. Gets around the problem by not relocating at all. I haven't read the full paper, to be honest. I just assumed it'd require defensive copying in case the stacks overlapped in memory.

Alexis King does outline the safety constraints that Fil-C would care about:

> references to stack-allocated data must not be shared across continuation chunk boundaries, as both capture and restore may relocate portions of the stack, making all references to stack-allocated memory in the relocated portions temporarily dangling

Onavo•about 1 hour ago

> What ruins this for C is the existence of pointers. Stacks aren't freely relocatable since pointers into the stack could exist. Other languages don't have this problem

What about languages with pass by reference?

matheusmoreira•37 minutes ago

I suppose it would depend on the implementation. References could be implemented with absolute pointers, or with base pointers plus offsets.

Base pointers are trivially relocatable. Just set the base pointer and it's done. This is how my language supports stack expansion: just reallocate the memory, overwrite base pointers, elements always dereference base plus offset, done. The heap is also implemented this way. All objects are indexes into a massive object array, and the base pointer is implicit.

https://www.matheusmoreira.com/articles/lone-lisp-heap
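The base-plus-offset scheme described above can be sketched in a few lines of C. This is only an illustration under assumed names (`struct value_stack`, `stack_ref`, `stack_push`, `stack_deref` are hypothetical and not taken from the commenter's language): a reference is an index, so growing the stack with `realloc` only has to update the single base pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical value stack: elements are addressed by index. */
struct value_stack {
    long *base;      /* may change whenever the stack grows */
    size_t count;    /* number of live elements             */
    size_t capacity; /* allocated element slots             */
};

/* A reference is an offset from the base, never an absolute address. */
typedef size_t stack_ref;

/* Push a value, growing the storage if needed.
 * Returns 0 on success and -1 if allocation fails. */
int stack_push(struct value_stack *s, long v, stack_ref *out) {
    if (s->count == s->capacity) {
        size_t cap = s->capacity ? s->capacity * 2 : 4;
        long *moved = realloc(s->base, cap * sizeof *moved);
        if (moved == NULL)
            return -1;
        s->base = moved;   /* the only pointer that needs rewriting */
        s->capacity = cap;
    }
    s->base[s->count] = v;
    *out = s->count++;
    return 0;
}

/* Resolve a reference against the current base. The returned pointer
 * is only valid until the next push, which may relocate the storage. */
long *stack_deref(struct value_stack *s, stack_ref r) {
    assert(r < s->count);
    return &s->base[r];
}
```

A `stack_ref` obtained before a thousand further pushes still resolves to the same element afterwards, even though the underlying memory has probably moved. An absolute `long *` taken at the same moment would be dangling.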

Absolute pointer references are not easily relocatable. You'd need to rewrite every single pointer. I've never seen anyone actually do this in C since it'd hit the same problem conservative garbage collectors do: you might rewrite an integer variable that just happens to look like a pointer.

If you're audacious enough you can try to somehow remap the stacks in the exact same spot they used to be at in order to not invalidate the pointers to begin with:

https://langdev.stackexchange.com/a/4242

https://www.microsoft.com/en-us/research/wp-content/uploads/...

Some languages like Go manage to do it because they just know where all the pointers are in most circumstances. They tend to choke only on foreign code they can't introspect into.

gruntled-worker•about 2 hours ago

No complaints about this in particular, but code that uses setjmp/longjmp often has a risk profile that's way bigger than memory safety alone. If you're stuck with them then by all means, mitigate all you can.

pizlonator•about 1 hour ago

What misuse are you imagining that isn’t a memory safety problem?

You might find that Fil-C prevents those too. It’s pretty strict. You can only use longjmp to pop the stack like an exception would.
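As a minimal sketch of that "pop the stack like an exception would" pattern in standard C: the function that called `setjmp` is still active when a callee calls `longjmp`. The names `recover_point`, `parse_or_bail`, and `try_parse` are hypothetical and come from neither Fil-C nor the article.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical names, for illustration only. */
static jmp_buf recover_point;

/* A callee that bails out by jumping back to its still-active caller. */
static void parse_or_bail(int input) {
    if (input < 0)
        longjmp(recover_point, 1); /* pops this frame, like a throw */
    assert(input >= 0);            /* only reached for valid input  */
    printf("parsed %d\n", input);
}

/* Returns 0 on success, 1 if the callee bailed out via longjmp. */
int try_parse(int input) {
    /* setjmp returns 0 when called directly and the longjmp value
     * (here 1) when control arrives through longjmp. */
    if (setjmp(recover_point) != 0)
        return 1;
    parse_or_bail(input); /* this frame stays live during the call */
    return 0;
}
```

`try_parse(5)` returns 0 and `try_parse(-1)` returns 1. Calling `longjmp(recover_point, 1)` after `try_parse` has already returned would be the invalid case quoted from the article, because the frame that `setjmp` saved no longer exists.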

gruntled-worker•27 minutes ago

Resource leaks, crossing non-exception-safe library/system code, CPU-specific quirks like accidentally unrestored FP/vector/control state, etc. Granted it's always been highly system-specific stuff, but that's the worst kind.

pizlonator•24 minutes ago

Gotcha, that’s a good list.

It’s true that Fil-C doesn’t try to protect you from those bugs. I just don’t think of those as the worst things that can happen when you misuse these APIs.

anitil•about 2 hours ago

How interesting! I thought that setjmp and longjmp were probably incompatible with Fil-C. And I'd somehow never heard of ucontext at all.

I suppose managing the stack is still managing memory after all, even if we typically don't think of it that way, so Fil-C has something to add here.

It's really worth reading the section here about the complexity of setjmp/longjmp and how they interact with register allocation and stack spilling. I knew they're tricky, but going into the specifics is delicious.

nanolith•about 1 hour ago

> For example, Boost uses ucontext as part of its fiber implementation.

Maybe for the incredibly slow fallback, it does. Boost context and Boost fiber have ABI support for *nix / macOS / Windows for x86_64 and ARM/ARM64. The overhead for a fiber switch using this support is about as heavy as a virtual function call. In comparison, ucontext is very heavy.

I wrote my own fiber library for C. I got the idea from an old implementation I saw that used setjmp and longjmp, which took me down the rabbit hole of figuring out how to do this more efficiently and with an improved margin of safety. I chose to follow Boost's example, and in fact, used some of their fiber switch assembler with attribution in my library.

pizlonator•about 1 hour ago

> In comparison, ucontext is very heavy

It's heavy because it switches the signal masks.

Indeed, Fil-C's ucontext logic does this today, because I'm relying on glibc, and that's what glibc does.

But it would be straightforward to teach the internal Fil-C zfiber_context API to not save the sigmasks. It would just mean using some other backend for setcontext/swapcontext. Considering that there are multiple open source projects (including Boost!) that have code that does this, it would be easy to set that up.

But I'm taking baby steps here. And the first step is just to provide a memory safe wrapper around these quite dangerous APIs. Probably the next step is to just write a lot more tests to try to break it. Then, later, I can worry about adding alternative backends to expose the sigmask-free version of this that Boost (and most others) want.

nanolith•about 1 hour ago

Fair enough. I use my fiber library for cooperative multitasking, as an alternative to async I/O. It's still non-blocking, but as far as user code knows, it behaves as if it is blocking.

To do this, I disable signals on threads that are fiber threads, and instead rely on a signal thread to intercept signals and alert the appropriate fibers.

brcmthrowaway•about 2 hours ago

Is Fil-C now using Claude for dev?

pizlonator•about 1 hour ago

Claude only ever wrote some tests.

I have used Kimi K2.6-code and GLM 5.2, but only for things that are easy to verify. I did not use any LLMs for the longjmp/ucontext work.

bitbasher•about 2 hours ago

Claude.md was added 8 months ago.

https://github.com/pizlonator/fil-c/blob/deluge/CLAUDE.md

lstodd•about 2 hours ago

longjmp, setjmp, setcontext, getcontext, makecontext, swapcontext, and the like have no bearing on safety, memory or otherwise. What you have to deal with first is what sigaction(2) represents, and only much later whatever you use to drive the context switch, be it I/O or preemption.

pizlonator•about 1 hour ago

These functions can easily be misused to corrupt memory, so they very much have something to do with safety. Fil-C goes to great lengths to prevent your use of those functions leading to memory corruption or any violation of the capability model.

Fil-C also makes sigaction memory safe. That protection does allow signal handlers to longjmp, setcontext, or swapcontext.

anitil•about 2 hours ago

The article mentions that you typically have to longjmp within the same function as setjmp (or a descendant function), otherwise your stack gets cleared and you longjmp to a garbage stack. I believe this counts as memory safety? Though I don't quite understand your comment about sigaction, so maybe there's some context I'm missing.

Edit: The extra context: https://usenix.org/legacy/publications/library/proceedings/u...