C library system-call wrappers, or the lack thereof

(lwn.net)

112 points | by Tomte 1962 days ago

11 comments

  • joshumax 1961 days ago
    I ran into this problem a little while back trying to get my Linux implementation of the BSD unveil() system call merged into mainline. Some of the responses to the RFC told me that it shouldn't be added because glibc likely won't add a syscall wrapper for it. However, a response from glibc states that they won't consider adding it until it's been successfully merged into mainline, creating a sort of catch-22 situation.
    • kjeetgill 1961 days ago
      Thank you for your efforts! Can you just cc them both on an email and get them to agree together?
      • joshumax 1961 days ago
        Hopefully! I plan on doing that for my next RFC!
    • emilfihlman 1961 days ago
      This is an idiotic stance from maintainers. Linux already has a lot of things that most currently used libc implementations don't support, like getrandom.
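      As an aside for readers: before glibc finally added a getrandom(3) wrapper (in 2.25), calling it meant going through the generic syscall(2) escape hatch. A minimal sketch, assuming a Linux 3.17+ kernel where SYS_getrandom exists:

      ```c
      #define _GNU_SOURCE
      #include <stdio.h>
      #include <unistd.h>
      #include <sys/syscall.h>

      int main(void) {
          unsigned char buf[16];

          /* No libc wrapper needed: invoke the raw system call number
             via the generic syscall(2) interface. */
          long n = syscall(SYS_getrandom, buf, sizeof buf, 0);
          if (n < 0) {
              perror("getrandom");
              return 1;
          }
          printf("got %ld random bytes\n", n);
          return 0;
      }
      ```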
  • int_19h 1961 days ago
    This is one area where I feel that BSD approach (where the same team maintains both the kernel and the libc, and they're shipped in sync as part of the same release) makes a lot more sense.

    In fact, come to think of it, Linux is the only OS where syscalls are the official public userspace API, is it not? On all other platforms, they're an implementation detail behind the system libraries.

    • sanxiyn 1961 days ago
      syscall-as-API does have some advantages, in that you can avoid libc if you want to. One example is Go's Linux target; on Linux, the Go standard library calls syscalls directly without any help from libc.
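      For readers wondering what "calls syscalls directly" looks like in practice, here is a sketch in C (not Go's actual runtime code) of invoking write(2) through the raw syscall instruction, bypassing the libc wrapper entirely. It assumes the x86-64 Linux syscall ABI: number in rax, arguments in rdi/rsi/rdx, with rcx and r11 clobbered.

      ```c
      /* Minimal raw syscall stub for up to three arguments (x86-64 only). */
      static long raw_syscall3(long nr, long a1, long a2, long a3) {
          long ret;
          __asm__ volatile ("syscall"
                            : "=a"(ret)
                            : "a"(nr), "D"(a1), "S"(a2), "d"(a3)
                            : "rcx", "r11", "memory");
          return ret;
      }

      int main(void) {
          static const char msg[] = "hello from a raw syscall\n";
          /* 1 is __NR_write on x86-64; fd 1 is stdout. */
          raw_syscall3(1, 1, (long)msg, sizeof msg - 1);
          return 0;
      }
      ```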
      • kayamon 1961 days ago
        You can achieve that on other OSes too though, still without requiring libc.

        For instance, on Windows the syscall API is hidden and changes from release to release, but KERNEL32.DLL persists as the stable system API that you're supposed to call.

        Linux desperately needs to get past the notion that libc is the only gateway to the kernel, and instead start supplying a standardized "system/kernel" user-level wrapper library. One which wraps the syscalls nicely but doesn't try to add additional language-specific functionality like printf or memcpy.

        • vetinari 1961 days ago
          > Linux desperately needs to get past the notion that libc is the only gateway to the kernel,

          Are you sure about that? Linux doesn't have that notion: it keeps the syscall ABI stable, and anyone can use it directly, like Go does. There's no need to go through a blessed syscall library, unlike on other systems where such libraries are the only gateway to the kernel.

          • masklinn 1961 days ago
            The biggest advantage is that it lets other OSes easily impersonate Linux, e.g. SmartOS or Windows can provide a kernel-level "Linux persona" and have an unmodified Linux userland run on top.
            • int0x80 1961 days ago
              They can do that right now. Both the API and the ABI are very rarely broken in a backwards-incompatible way (read: never, except for very problematic bugs or similar).
              • masklinn 1961 days ago
                That's my point. It's only possible with Linux because Linux provides a stable kernel ABI. You can't impersonate Windows or OSX the same way.
                • int_19h 1960 days ago
                  But you can impersonate Windows by providing your own implementation of the Kernel32 API. What's the essential difference between that, and emulating syscall API?
                • int0x80 1961 days ago
                  Ahh sorry, I understood the other way around :)
        • int_19h 1961 days ago
          On Windows, kernel32.dll is basically the Windows equivalent of libc - it's still a userspace library that wraps syscalls, and you're not supposed to use them directly.

          But I don't see why it's a problem, either. What's the actual benefit of Go invoking syscalls directly on Linux? That it doesn't depend on glibc? But that is only an advantage because glibc is not guaranteed to be there on Linux, the way e.g. kernel32 is on Windows. If it were, there'd be no reason to not use it.

          • vetinari 1961 days ago
            Specifically Go has very particular ideas about memory and stack layout, which do not go well with C libraries. In order to call a C library, Go has to go through thunking shims.

            When using syscalls directly, it does not matter. The kernel doesn't care how you laid out your userspace stack.

            • cesarb 1961 days ago
              > The kernel doesn't care how you laid out your userspace stack.

              Except when it does: https://github.com/golang/go/issues/20427

              • zaarn 1961 days ago
                IIRC this was because of a rather amusing behaviour of a Gentoo toolchain with some hardening patches, in this case doing a bit of fun on the stack to make sure it's safe.

                A lot of the hardening patches come with disclaimers that some software might break, and as shown by this bug, for good reason.

                I'd rather blame the compiler patches for this silly behaviour (and maybe the kernel for not documenting and limiting how much stack the vDSO can use).

                • ben0x539 1948 days ago
                  It's not really obvious to me that the original issue, mentioning os.Exec calls and Ubuntu versions, has the same cause as the crashes from the commenter mentioning patched Gentoo toolchains.
          • bregma 1961 days ago
            "libc" is the C-language runtime library. The Go language is not the C language. Different things are different. There is no reason for programs written in the Go language to use the C-language runtime in any operating environment, any more than there is a requirement for all programming languages to use curly braces to delimit lexical scope.
          • IshKebab 1961 days ago
            > That it doesn't depend on glibc? But that is only an advantage because glibc is not guaranteed to be there on Linux, the way e.g. kernel32 is on Windows. If it were, there'd be no reason to not use it.

            Yes - but it's an advantage because glibc is a compatibility nightmare. One of Go's really nice features is that you can compile a binary and it will run pretty much anywhere. That would be pretty much impossible if it linked with glibc (even if you ignore the fact that glibc might not be present).

            • 4ad 1961 days ago
              Not true. On Solaris (and Windows), and more recently on macOS as well, we link with libc (or equivalent), and we can still cross-compile from any system to these systems. It's not a free lunch of course; doing raw system calls would be preferable, for many reasons.

              I implemented support for this, for ELF, when I wrote the Solaris port. Other people have done it for PE and Mach-O. It is a fallacy to think that you need access to a shared library in order to link with it. That's only true for C toolchains that don't know better. For Go we have our own toolchain and it doesn't have such a restriction.

              This is one of the points lost to the "why didn't you just use LLVM?" crowds. Our own toolchain allows us flexibility that simply doesn't exist with traditional toolchains.

              • matthewbauer 1961 days ago
                How do you handle changes to the kernel? Can your binaries run on any macOS version or only the very latest? This sounds really cool but also super dangerous. Apple is not known for providing a stable ABI :)
                • 4ad 1961 days ago
                  You got it backwards, Go used to do raw system calls on macOS, and binaries were occasionally broken by kernel updates. Now Go uses libc on macOS, and binaries are forward compatible with future macOS versions just like any other C/C++/ObjC/swift program. OS X 10.10 (Yosemite) is the current minimum supported version.
                  • int_19h 1960 days ago
                    Unfortunately, Go still insists on using syscalls on the BSDs, despite them not being a stable (between major releases) API there either.

                    I hope they're planning to fix that...

                  • matthewbauer 1961 days ago
                    Ok nice! In many ways it would be cool if Apple had a stable interface for XNU like we do in Linux.
            • swiley 1961 days ago
              This used to be true. Unfortunately, it's not anymore.

              I found this out when trying to copy a Go program I had compiled on Alpine to an Ubuntu machine and got the "file not found" error from the linker. :(

              Try it yourself: run ldd on a recent go binary.

              • zaarn 1961 days ago
                Go uses glibc for DNS resolution by default; if you set CGO_ENABLED=0 when compiling, it should result in a static binary.
      • btbuilder 1961 days ago
        As the article eludes to, it is not without its challenges, however. As an example, see this issue: https://github.com/golang/go/issues/1435 which goes into some of the details of why implementing a call for setuid() is not straightforward without the knowledge built into glibc.

        Also the trickiness of having efficient process fork/exec based on vfork: http://ewontfix.com/7/ and the considerations going into Go: https://go-review.googlesource.com/c/go/+/46173/

        • pm215 1961 days ago
          I think for setuid in particular the right fix is for the kernel to provide a new syscall that operates on the whole process. The current hoops that glibc has to jump through with signals are complicated (and maybe racy? it's been a while since I looked at the code) and tie up a signal for libc's private use; better support at the kernel layer would allow that to all be eventually dropped.
        • carapace 1961 days ago
          (allude; elude means to escape)
    • masklinn 1961 days ago
      > In fact, come to think of it, Linux is the only OS where syscalls are the official public userspace API, is it not?

      Although I never checked, so I could be completely wrong, I would expect folks shipping just kernels, like seL4, to ship a stable kernel ABI rather than a libc.

    • Annatar 1961 days ago
      > On all other platforms, they're an implementation detail behind the system libraries.

      Yes and this is by design, out of necessity, taught by experience. Commercial customers back in the day paid lots and lots of money, so solutions had to be found and they had to work.

    • saagarjha 1961 days ago
      macOS lists them all in /usr/include/sys/syscall.h, so I guess you can consider that public API?
      • eridius 1961 days ago
        macOS syscalls are explicitly not stable and may change from release to release. The only supported interface to them on macOS is libc.
      • akvadrako 1961 days ago
        That file doesn't even exist on my system, but in any case public means official, not in some header.
        • saagarjha 1961 days ago
          If you have the command line tools installed, it should also be under /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/sys/syscall.h. macOS, by default, does not ship with these headers at all, and as of macOS Mojave they no longer provide them under /usr/include unless you install a certain package.

          With that being said, I disagree with your characterization of “some header”–anything that’s in Apple’s headers is public API, whether it has a fancy page on developer.apple.com or not. Apple has a very clear definition of what they consider to be “private”, and anything in /usr/include isn’t it.

          • akvadrako 1961 days ago
            Actually the path is:

              /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/sys/syscall.h 
            and you can clearly see that the file is wrapped with:

              #ifdef __APPLE_API_PRIVATE
              ...
              #endif /* __APPLE_API_PRIVATE */
            • saagarjha 1960 days ago
              Again, if you have the Command Line Tools installed, it'll be under /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/. Xcode also has a copy, but I think fewer people have that installed.
            • amaccuish 1961 days ago
              Nope, on my system, without xcode, the previous poster's path is correct: /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/sys/syscall.h
      • int_19h 1961 days ago
        They need to be public and stable in order to be a usable userspace API for anything other than libc. I don't think the latter requirement is satisfied:

        https://github.com/golang/go/commit/02118ad0a0cacb00c1d834b3...

        • masklinn 1961 days ago
          Nope, in fact it's so absolutely not stable (or supported) that libSystem can't be statically linked to.
  • jws 1961 days ago
    I’m sure it would give people discomfort, but I wonder how it would work if the kernel presented a pseudo file system with an “include” and a “src” directory to provide C interfaces to the unassimilated syscalls. Just enough syntax to keep people from defining their own types and having to use the syscall() interface.

    Maybe make it a module so space constrained systems can leave it out.

    The kernel patch process could keep everything nicely in sync, and native build processes would easily find the right source code. Cross compiling would require you to find and copy those, though.

    • kayamon 1961 days ago
      It actually already has this (kinda) -- the VDSO shared library. Although they only tend to implement a couple of syscalls in there.

      http://man7.org/linux/man-pages/man7/vdso.7.html

      • klodolph 1961 days ago
        VDSO does not have syscalls in it, that's actually the entire reason it exists… so you can communicate between the kernel and userspace without syscalls.
    • tacostakohashi 1961 days ago
      One issue with that idea is that the source code for implementing a system call wrapper is necessarily compiler-specific; it is not standard C. So the issue would become: which compiler (version) should the source code of the include/ and src/ target? GCC? Clang?
      • olliej 1961 days ago
        A lot of the problems occur in things like struct layout of syscall arguments, which would be cross platform.

        But then you run into the problem of how willing you are to commit to ABI stability in a non-POSIX API. I assume there’s some commitment to that at the moment, but how much also depends on *libc abstracting them? Libcs always seem (to me) to be approximately kernel-version-specific.

        • Aic1kuir 1961 days ago
          Ancient glibc versions run fine on current kernels; you're just missing out on some newer features. Otherwise statically linked binaries would cease to work.
          • zaarn 1961 days ago
            There is still a limit: very old software doesn't run anymore, largely because something in the kernel broke it. But it has been years and nobody complained, so nobody will revert this or fix it.

            Linus has stated as such and I think a few other maintainers agree there. If you don't find the problem until years later, chances are, too few people care.

  • userbinator 1961 days ago
    I'm curious what led to Linux (and it seems the other Unices) adopting the "double indirection" strategy of having a separate wrapper/stub function for system calls vs. the approach common in the MS-DOS world where the compiler would directly embed e.g. INT 21H instructions and generate the code to put parameters into registers itself. It's a small inefficiency, but still seems a bit wasteful nonetheless.
    • skissane 1961 days ago
      When you write C code on MS-DOS, you usually don't directly call INT 21h. Instead you call fopen(), printf(), etc., which are library calls that invoke INT 21h in their implementation. As well as standard C functions, there are also C functions for MS-DOS-specific services, e.g. chdrive(), findfirst(), etc.

      Even when you want to call some INT 21h service which doesn't have a C function, it is more common to use interrupt functions like int86() or intdos() than to use inline assembly. (Or something like __dpmi_int from 32-bit code, such as DJGPP)

      So, I don't think MS-DOS is as different from Linux/Unix as you think. High-level language code on MS-DOS (whether in C or Pascal or BASIC or whatever) usually doesn't directly invoke INT 21h, it goes through higher-level libraries / wrapper functions. Only software written in assembly tends to invoke INT 21h directly.

    • pcwalton 1961 days ago
      For one, sometimes you want to transparently migrate kernel functionality to userland. Moving gettimeofday from a syscall to the vDSO comes to mind.
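      A rough way to see this for yourself: compare the libc gettimeofday() (normally dispatched to the vDSO, with no kernel entry) against forcing the real system call via syscall(2). A benchmark sketch; the absolute numbers vary by machine:

      ```c
      #define _GNU_SOURCE
      #include <stdio.h>
      #include <sys/time.h>
      #include <sys/syscall.h>
      #include <time.h>
      #include <unistd.h>

      /* Time `iters` calls, either through libc (vDSO fast path) or by
         forcing a real kernel entry with the raw syscall number. */
      static double bench(int raw, int iters) {
          struct timeval tv;
          struct timespec t0, t1;
          clock_gettime(CLOCK_MONOTONIC, &t0);
          for (int i = 0; i < iters; i++) {
              if (raw)
                  syscall(SYS_gettimeofday, &tv, NULL);
              else
                  gettimeofday(&tv, NULL);
          }
          clock_gettime(CLOCK_MONOTONIC, &t1);
          return (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
      }

      int main(void) {
          int n = 1000000;
          printf("libc (vDSO): %.0f ns/call\n", bench(0, n) / n);
          printf("raw syscall: %.0f ns/call\n", bench(1, n) / n);
          return 0;
      }
      ```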
    • tacostakohashi 1961 days ago
      There's nothing stopping the wrapper function from being inlined (as it commonly is for some compiler + libc combinations), with all the usual tradeoffs for code size and ease of debugging.
      • megous 1961 days ago
        It doesn't happen unless you compile the C library with LTO. I have yet to succeed in doing that for my musl C mipsel target.
      • JdeBP 1961 days ago
        Inlining isn't the answer here. userbinator and you are missing out on more than a decade of IBM/Microsoft operating system evolution, with the invention of high-level-language-callable kernel APIs in the likes of DOSCALLS.DLL and NTDLL.DLL in the 1980s.
  • xenadu02 1961 days ago
    The comments on the article have some funny ideas about versioning.

    Apple platforms manage to support the idea of a “deployment target” and binary compatibility; the new symbols are weak-linked. Broken old behavior is preserved with linked on-or-after checks.

    Not sure what makes it so difficult for glibc.

    • matthewbauer 1961 days ago
      You need quite a bit of coordination between your toolchain, your kernel, and the standard c library for that to work. Linux/GCC/Glibc has never had that.
  • jancsika 1961 days ago
    Two noob questions:

    What are the technical reasons that glibc cannot adhere to the Linux dogma, "Don't break userspace?"

    Since glibc does not adhere to that dogma, why the decades-long reluctance to add certain syscall wrappers? If they screw up and make a bad interface just modify it and bump the version number.

    I just waded through the lwn cross-purpose-writing-festival comments and did not see them answered.

    • cesarb 1961 days ago
      > What are the technical reasons that glibc cannot adhere to the Linux dogma, "Don't break userspace?"

      I don't know if they have something like that officially, but in practice, they do follow it. Programs linked to an older version of glibc continue working with a newer version of glibc, in a large part thanks to symbol versioning, which allows them to keep the old versions of an interface available to old binaries, while new binaries get the new functionality.

      > If they screw up and make a bad interface just modify it and bump the version number.

      Bumping the glibc version number would mean recompiling everything (a program can't have two versions of glibc at the same time, so all libraries a program links to would have to be recompiled); we had that in the libc5 to libc6 transition last century. And since they won't bump the version number, it means they will have to keep the bad interface forever, even if it's just visible to binaries compiled against an older glibc.

      For a recent example in which they actually went ahead and removed a bad interface: https://lwn.net/Articles/673724/ and https://sourceware.org/bugzilla/show_bug.cgi?id=19473 -- and according to the latter, they did it in a way which still kept existing binaries working.

      • jancsika 1961 days ago
        > I don't know if they have something like that officially, but in practice, they do follow it.

        If that's true then I don't understand ldarby's comment on the article:

        > The common problem that I suspect Cyberax is actually moaning about is if software uses other calls like memcpy() which on centos 7 gets a version of GLIBC_2.14:

        >     readelf -a foo | grep memcpy
        >     000000601020  000300000007 R_X86_64_JUMP_SLO 0000000000000000 memcpy@GLIBC_2.14 + 0
        >      3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND memcpy@GLIBC_2.14 (3)
        >     55: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND memcpy@@GLIBC_2.14

        > and this doesn't work on centos 6:

        >     ldd ./foo
        >     ./foo: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by ./foo)

        Just to be clear, my original question is why glibc technically cannot follow the same exact development model of the Linux kernel for retaining backward compatibility.

        • cesarb 1961 days ago
          It's backwards compatible: software compiled on centos 6 will be using the older memcpy@GLIBC_2.2.5 symbol, which still exists on newer glibc together with the current memcpy@GLIBC_2.14 symbol, so it will work. What doesn't work is compiling with a newer glibc and expecting it to work on an older system.

          The example above is actually a great example of bending over backwards to keep compatibility with broken userspace. Some programs incorrectly called memcpy with overlapping inputs, and an optimized version of memcpy started breaking these programs. Instead of just letting them break, the older symbol was kept with a slower implementation which accepts overlapping inputs, while new programs get the faster implementation at the memcpy@GLIBC_2.14 symbol.
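          The two coexisting versioned symbols can actually be observed at runtime with the GNU dlvsym() extension. A sketch assuming an x86-64 glibc system ("GLIBC_2.2.5" is the x86-64 baseline version string; other architectures use different ones); link with -ldl on older glibc:

          ```c
          #define _GNU_SOURCE
          #include <dlfcn.h>
          #include <stdio.h>

          int main(void) {
              /* Look up both versioned memcpy symbols in whatever libc
                 follows us in the lookup order. */
              void *old = dlvsym(RTLD_NEXT, "memcpy", "GLIBC_2.2.5");
              void *cur = dlvsym(RTLD_NEXT, "memcpy", "GLIBC_2.14");
              printf("memcpy@GLIBC_2.2.5: %s\n", old ? "present" : "absent");
              printf("memcpy@GLIBC_2.14: %s\n", cur ? "present" : "absent");
              return 0;
          }
          ```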

        • rkeene2 1961 days ago
          glibc could follow the same model, but without symbol versioning it would be wasteful for them to do so. With symbol versioning, they DO. The memcpy(3) example shows why. An old implementation of memcpy(3) supplied in older versions of glibc happened not to care whether the memory regions overlapped. The interface for memcpy(3) said that they may not overlap (otherwise the behaviour was undefined).

          On some platforms, new instructions came out and allowed the glibc maintainers to write a faster version of memcpy(3) that still satisfied its documented interface, however it did not retain the undocumented behavior of allowing overlapping memory ranges.

          So without symbol versioning we are given two options: 1. all be stuck with a slow memcpy(3) forever; or 2. break glibc users' code.

          Neither of those options were great. So the glibc maintainers decided to write a new function, let's call it "memcpy_fast()"[0]. But how do you get everyone to use it ? Symbol versioning is the answer here. At compile-time linking, there is a directive that tells the compiler that the current implementation of memcpy(3) is "memcpy_fast()", and that's the symbol that gets embedded into the executable's symbol table. If your code wasn't compiled against a glibc where this was available, you'd be using an older implementation. This gets you the best of both worlds: 1. Existing binaries continue to work (without the upgraded code path); 2. Newly produced binaries use the upgraded code path, and in theory are tested to ensure that they are working.

          This does prevent executables from being compiled against a newer glibc than the one they are intended to be executed against... but so what? Linux ALSO doesn't guarantee that newer call semantics are available on older systems. The solution here is to either specifically indicate that you want the unversioned symbol, or compile against the lowest version of everything you wish to support. glibc is far from your biggest problem here, given the ABI stability of many libraries.

          Non-solutions: 1. Use macros: Side-effects 2. Just expose a new, unversioned symbol: Nobody will use it, you'll have to document it, it'll be a platform-specific call. If people do use it, then their binaries can't be used on older platforms (just like symbol versioning)

          [0] The symbol is referred to as memcpy@@GLIBC_2.14

    • nwmcsween 1961 days ago
      That's not how symbols work sans symbol versioning: C doesn't mangle symbols, so function foo in version 1 has the same symbol as function foo in version 2, even with a completely different signature.
    • Annatar 1961 days ago
      There are no technical reasons. The issue is that the kernel and glibc are not one coherent whole because they were always developed by two unrelated groups. glibc wasn't even designed for Linux, but for the GNU kernel.

      The end result is that the users of GNU/Linux will always draw the short end of the stick.

      • jancsika 1961 days ago
        > The issue is that the kernel and glibc are not one coherent whole because they were always developed by two unrelated groups.

        But AFAICT the glibc dogma is based on the premise that it would be impossible for a large, complex project to have backward compatibility without making regular changes to the extant interfaces that it provides. Given that premise glibc devs seem to have some process for figuring out what "correctness" means for time=now and then noodle around with their interface to reflect that correctness in the next version of the lib. Thus symbol versioning is employed.

        At the same time, Linux is a large, complex project with backward compatibility which does significantly less noodling around with the extant interface. AFAICT the process consists mainly of a) devs breaking the extant interface for correctness, b) a user submitting a bug, and c) the lead dev surrounding the declarative sentence "We don't break userspace" with imperative sentences containing curses and then rejecting the change.

        I've read where Linus and others have tried to defend their choice and argued that the glibc dev process is worse. Regardless of the persuasiveness of that argument, I've read it and am familiar with it.

        I am not familiar with the glibc argument as to why they require regular interface changes, nor an acknowledgement that a closely connected large complex project gets by without that. I don't see anything on the glibc FAQ about it-- only a question about symbol versioning where the answer assumes that the interface must change.

        • Annatar 1957 days ago
          > But AFAICT the glibc dogma is based on the premise that it would be impossible for a large, complex project to have backward compatibility without making regular changes to the extant interfaces that it provides.

          Yeah, well, tell that to the engineers of HP-UX, IRIX, and Solaris, because all of those managed to produce libcs which were backward compatible. Sun Microsystems even legally warranted Solaris and therefore libc; they were that paranoid about backwards compatibility.

          That's not the issue. The issue is that glibc is developed by people who are not and never were system engineers and instead of learning from the masters, asking them how to do it correctly, or sticking with BSD when its situation was dire, they just decided to re-invent the wheel.

          One does not simply re-invent glibc from first principles, especially so if one does not have the requisite insights and experience, which they didn't and they still don't, and most likely if they haven't by now, never will. GNU developers are a lost cause. Just look at how long it took them to "discover" versioned interfaces with linker map files, something Solaris system engineers have been using since the early '90's of the past century, and everything becomes crystal clear, if one knows the Red flags. That's one Red flag right there, "late in phase and unlikely to ever catch up".

        • cesarb 1961 days ago
          > At the same time, Linux is a large, complex project with backward compatibility which does significantly less noodling around with the extant interface.

          Take a look at the system call table:

              #define __NR_oldstat 18
              #define __NR_oldfstat 28
              #define __NR_oldolduname 59
              #define __NR_oldlstat 84
              #define __NR_olduname 109
              [...]
              #define __NR_dup3 330
              #define __NR_pipe2 331
              #define __NR_preadv2 378
              #define __NR_pwritev2 379
          
          (And that's before the impending Y2038 changes to the API.)

          The main difference is that, instead of defining a new symbol, a new system call number is defined. The effect is similar: a program using the new "stat" system call (106) won't work on an older kernel which doesn't have it, while on the opposite direction it still works (new kernels still understand the old system call).

          One thing the kernel developers do nowadays to reduce the API churn is to add a flags argument to every new system call (for instance, the "dup3" above is the same as "dup2", but with a flags argument). Even then, if you try to use a flag which the current kernel doesn't know, it won't work (the kernel developers learned the hard way that you can't ignore unknown flags, since programs will pass them and then break on newer kernels).

          And that's without considering the "escape hatches" of ioctl() and fcntl(), or the virtual filesystems like /proc and /sys, which are also part of the Linux kernel API. So yes, the Linux kernel does see regular interface changes.
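          The flags-argument convention mentioned above can be sketched in a few lines: dup3 is dup2 plus flags, and unknown flag bits fail with EINVAL instead of being ignored (Linux-specific, needs _GNU_SOURCE):

          ```c
          #define _GNU_SOURCE
          #include <errno.h>
          #include <fcntl.h>
          #include <stdio.h>
          #include <unistd.h>

          int main(void) {
              /* dup3 is dup2 plus a flags argument (here O_CLOEXEC). */
              if (dup3(STDOUT_FILENO, 10, O_CLOEXEC) == 10)
                  printf("dup3 with O_CLOEXEC: ok\n");

              /* Unknown flag bits are rejected rather than silently
                 ignored, so programs can't come to depend on them. */
              if (dup3(STDOUT_FILENO, 11, ~O_CLOEXEC) == -1 && errno == EINVAL)
                  printf("unknown flags rejected with EINVAL\n");
              return 0;
          }
          ```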

          • jancsika 1960 days ago
            Still trying to get my bearings, so bear with me...

            Here are two different types of backward compatibility:

            1. will old binary work with the new version?

            2. will old code build and run correctly with the new version?

            So when I talk about extant interface changes, I'm speculating that old code that leverages the public Linux interface is more likely to work and work correctly vs. old code that leverages the public glibc interface.

            For example: suppose foodev built a Linux driver for a very popular piece of hardware in 2003, abandoned it, and in 2018 there are problems getting it to run correctly. Are those problems more likely due to Linux public interface churn or glibc public interface churn?

            • cesarb 1960 days ago
              For a driver, there would be problems getting it to run correctly in 2004 already, since the internal Linux kernel API used by drivers is not stable and changes very rapidly. Unless the driver has been upstreamed, because developers updating the kernel internal APIs also update all the in-tree drivers at the same time.

              About leveraging one interface or the other: when you use the glibc interface, you are also using the Linux interface behind it, so a change in either can affect your program. On the other hand, if you are using the Linux interface directly instead of going through the C library, chances are you are doing something unusual, which increases the risk of it breaking by accident. And there are some things which exist only on the glibc interface, like nameserver lookups (getaddrinfo), user database lookups (getpwnam/getpwuid), and many more.

              • Annatar 1960 days ago
                "For a driver, there would be problems getting it to run correctly in 2004 already,"

                This is unthinkable on Solaris / illumos kernels because of DDI / DDK interfaces: I can take a driver from 1993 for Solaris 2.5.1 and modload(1M) it into the latest nightly illumos build and I'm guaranteed that it will work.

  • molticrystal 1961 days ago
    Arbitrage issues wherever glibc and the syscall differ seem to be where the bugs and security issues lie, or at least a good place to look.

    Whenever a person has to roll their own handler, it will almost always undergo less testing and auditing. The article points out gettid, which for at least 10 years required your own wrapper to use, and the comment section for the article points out that the getpid call had caching that was bugged for a long time.

    Having no glibc implementation of a syscall affects its usage and the total number of people knowledgeable about that function, so it would be a perfect place to look for bugs and security issues. In the nearly opposite case, a poor glibc wrapper implementation might do something that an attacker could take advantage of. The same applies where the glibc and syscall functionality differ: an aspect exposed only by the raw syscall might be undertested.

  • pmoriarty 1961 days ago
    How did Plan 9 and Inferno handle this?
    • sebcat 1961 days ago
      Similar to the BSDs: by providing wrappers for syscalls in their own libc.

      Plan 9 has very few syscalls compared to Linux and the BSDs.

    • 4ad 1961 days ago
      In Plan 9 all the state is in the kernel, so it doesn't have this problem.

      Inferno doesn't have system calls.

  • Annatar 1961 days ago
    This is exactly why BSD and illumos based operating systems ship libc, the kernel and userland (/usr) as one coherent whole. Perhaps now, reading the LWN article, people who are comfortable with GNU/Linux will start to realize it's high time to outgrow it and move on to one of the BSD or illumos based systems. The longer you wait, the harder the transition will be and besides, it's good to go out of one's comfort zone.
    • majewsky 1961 days ago
      Sure, I'll just write up a proposal for my employer to move multiple tens of thousands of Linux servers, VMs and containers over to BSD. I have a good feeling about this. /s
      • Annatar 1955 days ago
        Here is something to consider: how did your employer get to tens of thousands of Linux servers from whatever they were running on before?

        And: do you really want to spend the rest of your professional career wrangling with a shoddy product, or do you want to actually do professional, cutting edge IT?

        I can't speak for you, but I did not graduate computer science at the top of my class so that I could spend the next several decades working with / on the shittiest, amateur knock-off copy of UNIX when I could run the real thing for free & cheap. That's not what I studied at a university and got a degree for. How about you, what's it gonna be, shitty Linux for the next 20-30 years or real computer science with SmartOS or FreeBSD?

    • rkeene2 1961 days ago
      This is the whole point of a Linux distribution -- it takes the disparate parts and makes them into one coherent whole.
      • Annatar 1955 days ago
        But it's not coherent, far from it: it is often broken by the various parties involved, because there is no overall architecture and there are heterogeneous interests at play. Compared to a FreeBSD or a SmartOS system, architecturally, it barely works, and it's rickety and shoddy.
  • bogomipz 1961 days ago
    The post states:

    >"In such cases, user-space developers must fall back on syscall() to access that functionality, an approach that is both non-portable and error-prone."

    I understand about portability but can someone elaborate on why using syscall() is inherently error-prone?

  • en4bz 1961 days ago
    Lack of `gettid` and `futex` have always annoyed me.
    • monocasa 1961 days ago
      There's this sense that they're not for the consumption of mere mortals.

      That being said, you can still at least always call syscall(2).
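      A minimal sketch of that fallback, assuming Linux and a glibc older than 2.30 (which finally added a gettid wrapper):

      ```c
      /* Before glibc 2.30 there was no gettid() wrapper, so callers had
         to invoke the raw system call themselves via syscall(2). */
      #define _GNU_SOURCE
      #include <unistd.h>
      #include <sys/syscall.h>
      #include <stdio.h>

      int main(void)
      {
          pid_t tid = (pid_t)syscall(SYS_gettid);
          /* In a single-threaded process, the thread ID equals the PID. */
          printf("tid=%ld pid=%ld\n", (long)tid, (long)getpid());
          return 0;
      }
      ```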

      • glandium 1961 days ago
        One problem is that not all system calls take the same kinds of arguments on all platforms. Example: SYS_mmap takes a pointer to a struct containing all the arguments on s390. Even better, on Alpha, glibc's syscall cannot call system calls with 6 or more arguments (although maybe that was fixed in the past 8 years?).
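        For illustration, here is a raw 6-argument call (mmap) that works on x86-64 but, per the above, is not portable to an architecture like s390 where the kernel expects the arguments packed in a struct:

        ```c
        /* mmap is a 6-argument system call. On x86-64 the arguments are
           passed in registers, so syscall(2) works directly; on s390 the
           kernel instead takes a pointer to a struct of arguments, so
           this exact call would be wrong there. */
        #define _GNU_SOURCE
        #include <unistd.h>
        #include <sys/syscall.h>
        #include <sys/mman.h>
        #include <assert.h>

        int main(void)
        {
            void *p = (void *)syscall(SYS_mmap, NULL, 4096,
                                      PROT_READ | PROT_WRITE,
                                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            assert(p != MAP_FAILED);
            munmap(p, 4096);
            return 0;
        }
        ```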
        • murderfs 1961 days ago
          > Even better, on Alpha, glibc's syscall cannot call system calls with 6 or more arguments

          That's just generic kernel ABI: syscalls can have at most 6 arguments: https://elixir.bootlin.com/linux/latest/source/include/asm-g...

          • glandium 1961 days ago
            Skip the "or more" part, then, but that doesn't make it less true: it wasn't possible to make a 6-argument system call with syscall on alpha 8 years ago. I don't know whether that's been fixed or not.
            • pm215 1961 days ago
              The syscall(2) manpage documents the Alpha syscall ABI as passing arguments in a0,a1,a2,a3,a4,a5, which would suggest so. (MIPS o32 is the only listed one that is a bit oddball: you can only pass 4 args in registers and then use the stack for 5 and 6.)