Discussion (12 Comments)

freestanding about 2 hours ago
That is graphomania. Syscalls are easy and don't require so much bloat. Besides, there's its lefty GNUnix license.
quotemstr about 3 hours ago
Linux is unusual among OS kernels in that direct system calls from arbitrary userspace code are supported and ABI-stable. This model has always been a terrible idea. It robs the system of an ability to intercept system calls in userspace before doing an expensive privilege-mode transition.

If, instead, as on OpenBSD, the kernel enforced the rule that all system calls had to go through libc (or perhaps a big ntdll.dll-like VDSO), then the whole problem the linked article tries in vain to solve would disappear. If you wanted to hook a system call, you'd just change the libc/VDSO dispatch. No need to rewrite any instructions.

If I were Linus, I'd make a new rule: starting today, all new system calls must go through VDSO. No exceptions. SYSCALL from anywhere else? SIGKILL.

This way, you can just LD_PRELOAD in front of the VDSO and system call interception in userspace Just Works.
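
To make the distinction concrete, here is a minimal sketch (not from the thread) of the two paths being argued about: the named libc wrapper, which is an ordinary dynamic symbol that a preloaded library could in principle interpose, and a request made purely by syscall number. Python cannot issue a raw SYSCALL instruction, so libc's generic `syscall()` function stands in for one here; that stand-in is an assumption of the sketch, as are the helper names and the small per-architecture number table.

```python
import ctypes
import platform
import sys

# Syscall numbers for getpid(2). They differ per architecture; only two are
# listed here, which is a limitation of this sketch.
SYS_GETPID = {"x86_64": 39, "aarch64": 172}

# Handle to the libraries already loaded in this process, libc included.
libc = ctypes.CDLL(None, use_errno=True)


def getpid_via_wrapper():
    """Call the named libc wrapper, resolved like any other dynamic symbol."""
    return libc.getpid()


def getpid_by_number():
    """Request the same service by number, skipping the named wrapper.

    libc's generic syscall() stands in for a raw SYSCALL instruction.
    Returns None on platforms this sketch has no syscall number for.
    """
    if not sys.platform.startswith("linux"):
        return None
    nr = SYS_GETPID.get(platform.machine())
    if nr is None:
        return None
    return libc.syscall(ctypes.c_long(nr))


if __name__ == "__main__":
    print("via wrapper:", getpid_via_wrapper())
    print("by number:  ", getpid_by_number())
```

A hook on the `getpid` symbol would see only the first call. Code that embeds the SYSCALL instruction itself goes one step further than the second call and never touches libc at all, which is why interception tools on Linux end up rewriting instructions.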

razighter777 26 minutes ago
Direct system calls are an amazing idea. The ntdll and BSD models are worse. The whole libc becomes a security boundary without the protection of kernel space. So much Windows malware and process tampering happens because you have a library (ntdll) living entirely in userspace that is given special privileges, which makes it a huge attack surface. Then you have to deal with breakage between the built-in libc versions and the kernel.

The syscall overhead isn't as large as you suppose. For workloads where it actually makes a difference, there are robust low-syscall paths for I/O- and latency-sensitive operations, with DPDK, io_uring, and futex being a few examples.

And there are robust, performant methods on Linux for syscall interception and tracing: see seccomp unotify, BPF tracepoints, and ftrace.

yjftsjthsd-h about 3 hours ago
> This model has always been a terrible idea. It robs the system of an ability to intercept system calls in userspace before doing an expensive privilege-mode transition.

This model has always been a trade-off. It has downsides, but it also has upsides, including an immense boost in flexibility; decoupling from any particular userspace is useful.

> This way, you can just LD_PRELOAD in front of the VDSO and system call interception in userspace Just Works.

Can you LD_PRELOAD in front of the vDSO? I was under the (possibly mistaken) impression that the kernel injects it directly.

mananaysiempre 38 minutes ago
> Can you LD_PRELOAD in front of the vDSO? I was under the (possibly mistaken) impression that the kernel injects it directly.

The kernel puts the vDSO in memory and tells ld.so where it is, but where, if anywhere, ld.so places it in the search order it implements is ld.so's own concern. (TBH I don’t know whether ld.so will actually allow LD_PRELOAD to override the vDSO, but there’s no reason for it not to, except I guess for the syscalls that are needed to perform the dynamic linking itself.)
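
For readers who want to see the hand-off described above, the following is a rough sketch under a few assumptions: a Linux system, the auxiliary vector exposed at `/proc/self/auxv`, and the `AT_SYSINFO_EHDR` entry (value 33 in `<elf.h>`) carrying the vDSO's base address. The function names are made up for illustration. It only shows where the kernel placed the vDSO; it says nothing about how ld.so orders it against LD_PRELOAD libraries.

```python
import struct

AT_NULL = 0           # terminates the auxiliary vector
AT_SYSINFO_EHDR = 33  # address of the vDSO's ELF header, per <elf.h>


def parse_auxv(raw, word_size=8):
    """Decode raw auxv bytes into a {type: value} dict (native byte order)."""
    fmt = "=QQ" if word_size == 8 else "=II"
    pair = 2 * word_size
    usable = raw[: len(raw) // pair * pair]  # drop any trailing partial pair
    entries = {}
    for key, value in struct.iter_unpack(fmt, usable):
        if key == AT_NULL:
            break
        entries[key] = value
    return entries


def find_vdso_mapping(maps_text):
    """Return (start, end) of the [vdso] line in /proc/<pid>/maps text, or None."""
    for line in maps_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "[vdso]":
            start, end = (int(part, 16) for part in fields[0].split("-"))
            return start, end
    return None


def vdso_of_current_process():
    """Read both views for this process (Linux only; assumes a 64-bit process)."""
    with open("/proc/self/auxv", "rb") as f:
        base = parse_auxv(f.read()).get(AT_SYSINFO_EHDR)
    with open("/proc/self/maps") as f:
        mapping = find_vdso_mapping(f.read())
    return base, mapping
```

On a typical 64-bit Linux system the address from the auxiliary vector should fall at the start of the `[vdso]` mapping, which is the sense in which the kernel "tells ld.so where it is".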

throwaway7356 about 2 hours ago
> all system calls had to go through libc (or perhaps a big ntdll.dll-like

Which makes containers crap on Windows and *BSD, as they have to run the correct libc or equivalent. Thus you need to build a different container per OS version, which sucks compared to Linux.

Joker_vD about 1 hour ago
Windows doesn't even have its own libc.
yjftsjthsd-h about 1 hour ago
They said "or equivalent", so ntdll.
freestanding about 2 hours ago
That's why OpenBSD is inconvenient for development: it binds you to libc bloatware.
razighter777 21 minutes ago
Yep, and it forces every application to deal with the C FFI. It's beautiful on Linux that I can access the full kernel API from an int 0x80/syscall instruction plus a few register loads, without having to link against crap. I can write a simple cat utility in a dozen or so lines of assembly.
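
As a rough illustration of how little a cat needs from the kernel, here is a sketch using Python's thin `os` wrappers, which map closely onto open, read, write, and close. It is not the assembly version the comment has in mind, and it still goes through libc, so it shows only the syscall sequence, not the linking point. The function name and chunk size are arbitrary choices.

```python
import os


def cat(path, out_fd=1, chunk=65536):
    """Copy a file to out_fd using only open/read/write/close."""
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            buf = os.read(fd, chunk)
            if not buf:  # empty read means end of file
                break
            while buf:   # write may accept fewer bytes than offered
                written = os.write(out_fd, buf)
                buf = buf[written:]
    finally:
        os.close(fd)
```

An assembly version would load the same arguments into registers and issue the same four system calls in a loop.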
Gualdrapo about 3 hours ago
> If I were Linus, I'd make a new rule

Or, you know, just propose your idea to him

yjftsjthsd-h about 2 hours ago
Based on https://www.phoronix.com/news/Linus-Torvalds-No-Random-vDSO , I had been under the impression that he wasn't fond of adding more use of vDSO. On rereading, I can't tell if that's a vDSO thing or a preference against fast randomness being provided by the kernel.