Closing the gap: cross-language LTO between Rust and C/C++

(blog.llvm.org)

267 points | by pedrow 1680 days ago

6 comments

  • pornel 1680 days ago
    It's nice to see such a post on the LLVM blog (as opposed to the typical Rust-only outlets). It feels like recognition that Rust is a serious and important LLVM user.
    • masklinn 1680 days ago
      Now if only LLVM could feel it was important enough to make noalias work reliably.
      • fluffything 1680 days ago
        LLVM devs don't really care about Rust. They haven't fixed noalias in years, there isn't still a freeze intrinsic in the IR, the LLVM-IR semantics are often not documented enough for the Rust compiler to know whether it is generating IR that has UB or not...

        I see a couple of Rust developers working on LLVM almost full-time (like nikic), but there should be more. The Rust language needs to become a more important stakeholder, and for that it needs more paid full-time LLVM developers. Infinite loops are UB in LLVM-IR...

        • ncmncm 1680 days ago
          LLVM developers are focused on the needs of (literally!) millions of C and C++ programmers. Until you can point to a few hundred-thousand production Rust coders, or get somebody with deep pockets to depend on it, Rust is just not important enough. The solution is to fund your own LLVM developers. If you can't raise the money to pay them, who should?

          People with deep pockets are typically advised not to depend on unsupported infrastructure. It is hard and, often, unwise to argue with such advice.

          It is still early days. Give it ten years: Rust will either have taken off or sunk, by then. Maybe somewhere in those years there will be some new hotness to jump on instead.

          • Hello71 1680 days ago
            Rust has in fact had almost 10 years.
            • Jweb_Guru 1680 days ago
              The Rust of ten years ago bears almost no resemblance to the Rust of today. Its history is interesting to people involved in the project, but it wasn't really used seriously until 1.0 was released.
              • ChrisSD 1680 days ago
                I make it 9 years since the first release and 4 years since the first stable version (aka 1.0).

                Either way it's a short time in which to gain acceptance anywhere near the level of C++ so it's not surprising it hasn't. Although companies (including Microsoft and some Google teams) are just starting to take it seriously.

                • imtringued 1679 days ago
                  It's not a short time. C++ is only 34 years old and back then there were fewer developers than today.
                  • hyperman1 1679 days ago
                    C with classes, which would be comparable to pre 1.0 rust, started in 1979, so 40 years ago. The first commercial C++ and the book were in 1985 and 34 years ago. I'd guess that's a good 1.0 milestone, comparable with rust's 4 years.
            • ncmncm 1680 days ago
              Ten more years. It takes a long, long time for a language to gain maturity, and the competition (except C :P) is improving the whole time.
            • leshow 1679 days ago
              It's not really fair to count pre-1.0, it's only been 4-5 years
        • wyldfire 1679 days ago
          IMO a lot of LLVM bug fixes are driven by particular devs motivated to fix issues in the area that they contributed to or that they're willing to take the time to investigate.

          If your bug happens to hit a use case important to flagships like XCode or Android -- or any of the internal projects at Apple or Google, you have a much better chance of seeing it fixed.

          Unless your bug shows a clear regression and is interesting enough to be a release blocker, it probably won't get prioritized. OTOH if you supply a patch with your bug report, it probably has a decent chance of getting accepted.

          It's not that LLVM doesn't care about Rust. It's just not a pet project of a heavy hitter yet. So Rust devs dig down and find LLVM bugs. A best-case scenario would be an "LLVM developer" working for Google/Apple/Mozilla who will prioritize features and fixes impacting Rust. But that's not necessarily better than another "Rust dev" willing to dig down and find LLVM bugs.

          I'm not knocking LLVM, I think it's a great community. Just strikes me as more of a BYO-bugfix kinda gang.

        • sanxiyn 1679 days ago
          Another: Rust is the de facto QA service for LLVM's support for overflow intrinsics. Unlike noalias they continue to work, but in practice they continue to work only due to fixes from the Rust project.
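
          For anyone unsure what "overflow intrinsics" covers on the Rust side, here is a minimal sketch. As far as I know, rustc typically lowers methods like these to LLVM's llvm.sadd.with.overflow family for signed integers, but the exact lowering is an implementation detail, and the function names below are made up for illustration.

          ```rust
          // Illustrative wrappers around overflow-aware arithmetic from the
          // Rust standard library. The wrapper names are hypothetical.

          // Returns None instead of wrapping when the addition overflows.
          fn add_checked(a: i32, b: i32) -> Option<i32> {
              a.checked_add(b)
          }

          // Returns the wrapped result plus a flag saying whether overflow occurred.
          fn add_overflowing(a: i32, b: i32) -> (i32, bool) {
              a.overflowing_add(b)
          }

          fn main() {
              println!("{:?}", add_checked(1, 2)); // Some(3)
              println!("{:?}", add_checked(i32::MAX, 1)); // None
              println!("{:?}", add_overflowing(i32::MAX, 1)); // (-2147483648, true)
          }
          ```

          Every such call site needs the backend to get the "result plus overflow flag" pattern right on every target, which is where the bug reports tend to come from.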
          • ridiculous_fish 1679 days ago
            Why doesn't Swift play that role?
            • sanxiyn 1679 days ago
              In fact I am not sure why... I am just reporting the experience. Maybe because of architectural coverage, because while bugs also happen in architecture-independent optimizations, more bugs happen in, say, PowerPC backend.

              Edit: Another possibility is difference between how often Rust and Swift update to LLVM trunk. Both have LLVM fork pinned to specific version updated from time to time and I guess the first to update gets all the bugs.

            • masklinn 1679 days ago
              Possibly because Swift being more of an application language, Swift codebases in general have very little use for overflowing operations?
              • sanxiyn 1679 days ago
                No, it's not that. All Swift arithmetic compiles to overflow intrinsics.
        • ndesaulniers 1679 days ago
          > LLVM devs don't really care about Rust.

          [citation needed] (I have thousands of lines in both)

          > They haven't fixed noalias in years,

          I think you fail to understand the complexities of TBAA, which people are and have been working on for a long time. It's a lot of work, frankly.

          > there isn't still a freeze intrinsic in the IR, the LLVM-IR semantics are often not documented enough for the Rust compiler to know whether it is generating IR that has UB or not...

          It's important to have more languages target LLVM to flush these out. Things aren't intentionally undocumented; more like no one has really thought that hard about edge cases and interactions. And if there's only a few front ends generating similar IR, who will expose those dark corners?

    • Mathnerd314 1680 days ago
      I looked through the archives and only GHC/Haskell and LLILC have made it: http://blog.llvm.org/2010/05/glasgow-haskell-compiler-and-ll... http://blog.llvm.org/2015/04/llilc-llvm-based-compiler-for-d...

      But I think the limitation is on developers willing to write a blog post rather than "serious" development. Excluding "LLVM Weekly" which has since moved to http://llvmweekly.org/, there's been less than 1 post per month, even though LLVM conferences have grown significantly in size.

  • angrygoat 1680 days ago
    This is awesome, especially with the gains for Firefox, but this bit seemed odd to me:

    > We quickly learned, however, that the LLVM linker plugin crashes with a segmentation fault when trying to perform another round of ThinLTO on a module that had already gone through the process.

    It sounds like they worked around this, rather than fixing the segfault and putting some error handling in place? Might make it easier for the next bunch of people working in this part of clang.

    • xiphias2 1680 days ago
      You're right, but it still looks like a big improvement. It means that Firefox devs can write every new functionality in Rust, no matter how small it is.
      • phkahler 1680 days ago
        >> It means that Firefox devs can write every new functionality in Rust, no matter how small it is.

        True, but they should still focus on oxidizing whole modules and subsystems in their entirety whenever possible.

  • azakai 1680 days ago
    The need to use "compatible" versions of LLVM between the C++ and Rust compilers is scary. Anything aside from the exact same LLVM revision could in theory lead to bad results, including bugs or security vulnerabilities (if LLVM changes the meaning of something in its IR).

    This isn't Rust or Clang's fault, of course, it's just a consequence of using LLVM IR as the data for LTO, that LLVM IR has no backwards compatibility guarantees, and that Rust is out-of-tree for LLVM.

    In theory using a stable format for LTO would avoid issues like this. Wasm is one option that has working cross-language inlining already today, but on the other hand it has less rich an IR, and the optimizers are less powerful than LLVM.

    • bla3 1680 days ago
      LLVM bitcode is backwards compatible. It is however not forward compatible, so the linker needs to understand the newer bitcode format that clang and rustc use.
      • azakai 1679 days ago
        The issue isn't being able to load the bitcode (which LLVM has gotten pretty good at supporting in a backwards compatible way). It's that the meaning of things might change, undefined behavior may be handled differently, and so forth.

        In other words a newer optimizer running on older IR may emit broken code.

        • bla3 1679 days ago
          I thought bitcode backwards compat included that too. If it didn't, Apple's collecting bitcode for watch apps for transparent 64-bit support wouldn't work.
          • azakai 1679 days ago
            It's possible to try to support that, but you can never be sure.

            For example, imagine that LLVM has a known bug with some pass, and it has a workaround somewhere else that disables generating IR that would hit that bug. A different version of LLVM may fix that bug, and remove the workaround - but then optimizing bitcode from another version could be vulnerable.

            Another example is undefined behavior in LLVM IR. It may be handled differently in different versions, and it's hard to know what might happen from mixing them.

            In general, LLVM is heavily tested - on each revision. I'm not aware of any large-scale project that tests all LLVM versions on LLVM IR from all other versions. That's untested. I'd be afraid to rely on that.

            I don't know what Apple does with user-supplied bitcode, but if I were them I'd be recompiling old bitcode with the old LLVM that matches it, or something else (like subset the bitcode to remove undefined behavior, etc.).

            • glandium 1679 days ago
              I found a LLVM IR incompatibility once, and it was already fixed. It seems some checks run for compatibility with released versions. I'm not sure whether they're entirely automatic and systematic, but they do happen.

              However, the way I found it is that it affected the random version of LLVM trunk stable rustc was using at the time... That's one of the reasons why stable rust should stay away from LLVM trunk.

          • pjmlp 1679 days ago
            Apple has their own toolchain; don't assume the FOSS variant of clang reflects how XCode's clang actually works. Only cherry-picked features get upstreamed.
  • bla3 1680 days ago
    > No problem, we thought and instructed the Rust compiler to disable its own ThinLTO pass when compiling for the cross-language case and indeed everything was fine -- until the segmentation faults mysteriously returned a few weeks later even though ThinLTO was still disabled. [...] Since then ThinLTO is turned off for libstd by default.

    Instead of fixing the crash, they landed a workaround.

    > We learned that all LLVM versions involved really have to be a close match in order for things to work out. The Rust compiler's documentation now offers a compatibility table for the various versions of Rust and Clang.

    It's cool they got it working, but it sounds like this is currently proof-of-concept quality and not very productionized yet. To me, the overall tone of the article sounds like they ran into a bunch of issues and opted for duct tape instead of systemic fixes. Which is fine to get things off the ground, of course! But I hope they take the time to go back and fix the underlying issues they ran into too.

    • pcwalton 1680 days ago
      This isn't anything new. Rust has had to land workarounds for lots of LLVM issues in its history. For example, Rust had to stop using noalias on function parameters because LLVM miscompiled too many functions with it, as Rust can use it way more than C/C++ do and therefore it didn't receive much upstream test coverage.
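
      A minimal sketch of why Rust can apply noalias so much more often than C/C++ (the function names are hypothetical, and whether a given rustc actually emits the attribute depends on the version): a `&mut` parameter is guaranteed not to overlap any other reference parameter, so the optimizer may read `*src` once. With raw pointers, as in C without `restrict`, the two may overlap and the second read has to observe the first write.

      ```rust
      // With references, `dst` and `src` cannot alias, so loading `*src`
      // a single time and reusing it would be a valid optimization.
      fn add_twice(dst: &mut i32, src: &i32) {
          *dst += *src;
          *dst += *src;
      }

      // With raw pointers the caller may pass overlapping pointers, so
      // `*src` must be re-read after the first store.
      unsafe fn add_twice_raw(dst: *mut i32, src: *const i32) {
          unsafe {
              *dst += *src;
              *dst += *src;
          }
      }

      fn main() {
          let mut x = 1;
          let y = 10;
          add_twice(&mut x, &y);
          println!("{x}"); // 21

          let mut z = 1;
          let p: *mut i32 = &mut z;
          // Same pointer passed twice: 1 + 1 = 2, then 2 + 2 = 4.
          unsafe { add_twice_raw(p, p) };
          println!("{z}"); // 4
      }
      ```

      If an optimizer wrongly assumed no aliasing in the raw-pointer version, it could produce 3 instead of 4, which is the flavor of miscompile at stake here.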
      • wumpus 1680 days ago
        Too bad LLVM doesn't have a first-class Fortran, then noalias would actually work.
      • bla3 1680 days ago
        Rust could fix upstream issues it runs into, no?
        • bzbarsky 1680 days ago
          They could, and they do.

          That said, they don't have infinite time, and if, as in this case, the upstream fix would: (1) be pretty involved and (2) be very likely to get regressed because upstream doesn't have the capability to run tests that would prevent that (e.g. because upstream only runs C++ compilation tests and there is no way to exercise the relevant bugs via C++ code), then investing in fixing upstream may not be the right tradeoff.

          In theory, one could first change upstream's test harness to allow Rust code, but that involves upstream tests depending on the Rust compiler frontend, which apart from being a technical problem is probably a political one.

          Maybe it would have been possible to do upstream tests via bitcode source instead of Rust or C++; I don't know enough about LLVM to say offhand. But in either case this is not as easy as just "fix a simple upstream bug"...

          • pcwalton 1680 days ago
            Upstream tests are generally done at the LLVM IR level actually. It's mostly just a question of (1) time; (2) worries about ongoing maintenance work upstream; (3) a general feeling that perhaps such optimizations are best done on MIR anyway, because they'll be more effective there than they would be in LLVM.
            • comex 1679 days ago
              You're suggesting that rustc should do noalias optimizations on MIR? I'm skeptical of that idea... A lot of duplicate loads that would benefit from being coalesced are only visible after LLVM inlining.
              • pcwalton 1679 days ago
                Obviously MIR inlining needs to happen first (and I think it does happen already?) But to me it's clearly the right solution going forward. LLVM's noalias semantics are much weaker than what we can have on MIR, with full knowledge of transitive immutability/mutability properties.
                • comex 1679 days ago
                  'Classic' LLVM noalias as a function parameter attribute is weak, but the metadata version is much more flexible. I looked into it in the past and IIRC it's not a perfect match for Rust semantics, but close enough that rustc could still use it to emit much more fine-grained aliasing info; it just doesn't. But there was also a plan on LLVM's end to replace it with yet a third approach, as part of the grand noalias overhaul that's also supposed to fix the bug. Not sure if there's been any progress on that.

                  As far as I can tell, MIR inlining currently happens with -Z mir-opt-level=2 or higher, and that is not implied by -O. But I have no idea what future plans exist in that area.

                  I admit I have a bias here: it feels to me like everyone (not just Rust) is running away from doing things at LLVM IR level, and the resulting duplication seems inelegant. But on reflection, part of the reason I have that feeling is that I've recently been spending time fixing up Clang's control-flow graph functionality... which itself is 12 years old, so it's not a new trend at all!

            • kzrdude 1680 days ago
              Wouldn't MIR be more portable to the future as well? Building on Rust's own equity and all that, because future Rust will probably still use MIR but could replace LLVM(?)
              • monocasa 1679 days ago
                They wouldn't want to do MIR without upstreaming the Rust frontend, which I don't see happening anytime soon.
                • kzrdude 1679 days ago
                  MIR should stay with Rustc, and that's the point — to work on optimizations that happen in Rustc, and not later when the code is turned over to llvm, or other backend.
        • fluffything 1680 days ago
          That's often done, but Rust is shipped on all major Linux distros using the system LLVM, which is often at least 6 months old, or years old, so it needs the workarounds anyways to be able to work with those. The LLVM fixes take a while to percolate back, and Rust supports up to LLVM versions that are ~2 years old (LTS linux distros). The workarounds can only be removed once versions without the fix are no longer supported.
        • throwupaway123 1680 days ago
          They actually do sometimes AFAIK
      • Piezoid 1680 days ago
        Do you think it's planned to bring back noalias in rustc?

        This, along with const generics and simd, would make rust the perfect language for me.

        • adrianN 1680 days ago
          • pcwalton 1680 days ago
            The upstream fix seems to be blocked on DannyBee in fact…
            • DannyBee 1679 days ago
              ???????

              I stopped working on LLVM about 2 years ago (give or take), as i now have way too many reports to be able to do any effective patch or design review, or honestly, keep up with the mailing list. I'm also too far divorced from work being done.

              (I unsubscribed late last year)

              I specifically reviewed and approved the llvm.noalias patches before i stopped, which is why they are marked as accepted by hal, back in 2016.

              More than that, i was one of the people who basically showed that !noalias/etc is fundamentally broken and can't be fixed.

              Nothing should be blocked on me at this point, and my reviews account is deliberately disabled so that people can't assign/add me to things.

              If something is blocked on me, hal certainly hasn't let me know :)

              • pcwalton 1678 days ago
                Understood. Sorry, it was not clear to me what was going on. In any case, the fix seems to be blocked on something, as it hasn't landed yet, and it's unclear what.
    • aseipp 1680 days ago
      It's just how it works out in practice, in my experience. LLVM is a large, fast moving target that's incredibly complex to understand, because it has a complex job. It resolves many issues for you when you're developing a compiler, but you are gifted issues in return. One of those is that understanding, diagnosing, and properly fixing problems can take a very large amount of work. It has bugs! I mean, LTO has historically been fragile for a single source language when enough code gets thrown at it, much less two languages!

      Another is that compiler developers often don't have infinite amounts of time to sort out shenanigans like this when they come up. Users generally prefer the compiler to work, even if suboptimally, when compared to "not working", so there is some tension between things and how long they take. So landing workarounds in various ways -- sending patches upstream, using custom builds with bespoke patches, code generation workarounds -- all have to happen on a case by case basis. Many LLVM clients do this to varying degrees, and a lot of features like LTO start off fragile due to these kinds of things. Over time things will get tightened up, hopefully.

      When I worked on GHC (admittedly several years ago now), the #1 class of problems for the LLVM backend were toolchain compatibility issues above all else, because we relied on the users to provision it. At the time it was nothing short of a nightmare -- various incompatible versions between various platforms causing compilation failures or outright miscompilation, requiring custom patches or backports at times (complicated by distros backporting -- or not backporting -- their own fixes), some platforms needed version X while others were better served by Y, flat-out code generation bugs in GHC and LLVM, etc. It's all much better these days, and many features/fixes got landed upstream to make it all happen, but that's just how it works. Rust made several design choices with their packaging/LLVM use that we didn't that I think were the right ones, but I'm not surprised they've had a host of challenges of their own to address. TINSTAAFL.

    • pornel 1680 days ago
      The long-term plan for Rust is to treat its standard library (almost) like any other crate, so it could be recompiled with your custom compiler settings if necessary.

      However, libstd is by necessity tied quite closely to the compiler, it's one of the oldest and most fundamental parts of the stack, and there are tons of little complications around making it "just" a crate, so changes to libstd/rustc/Cargo necessary to make that possible will take longer.

      In the meantime, changing one problematic compiler flag seems like a sensible solution.

    • dblohm7 1680 days ago
      The low-level tools team consistently reports bugs against the upstream projects. I have no doubt that they did so during the course of this project.
  • zelly 1679 days ago
    Rust needs easy interop with C++'s ABI. Easy as in I should be able to "import Boost" and have it all mapped to Rust structures without doing anything.

    A big reason C++ took off was backward compatibility with C. Network effects. Today C++ has the role that C had in the 80s.

    No one uses any other compiler but LLVM for Rust anyway, so who cares about compatibility with MSVC and others. This will also force adoption of LLVM, which can be a good incentive for LLVM to support it.

    • pornel 1679 days ago
      The thorny issue is C++ templates, which are instantiated in a complicated way that depends on C++ syntax. You can't "just" compile and link Boost in (as you'd do with a C library). You have to compile a fragment of C++ source code for every line of Rust code that uses a Boost template, and you have to translate Rust code used in template arguments to valid C++ code, so that template substitutions will be valid.

      This seems doable for simple cases, but libraries like Boost push templates to their limits, so they're not the easy cases.

      • zelly 1679 days ago
        Yes I realized porting a template library to Rust would basically require implementing a whole new compiler (easily more difficult to implement than all of Rust). Like you said, the only workable way is to link to C++ source code (fully translated units/obj files) compiled by clang. Rust would have to learn how to use C++'s name mangling, data alignment, and calling conventions to be able to invoke C++ functions. See my other post in this thread.

        Basically it would be an "extern C++". The FFI to C++ would hopefully have as low overhead as "extern C" while making it easier for C++ codebases to gradually adopt Rust.

        • pornel 1679 days ago
          > Rust would have to learn how to use C++'s name mangling, data alignment, and calling conventions to be able to invoke C++ functions.

          This is already supported (via Bindgen).

          • zelly 1679 days ago
            Thanks for pointing me to this. Wow. Looks like I can finish learning Rust.
    • comex 1679 days ago
      I think this is both possible and desirable. Basically, create a bridge between rustc and Clang, kind of like Swift's C importer, but far more complex in order to be able to do things like instantiate C++ templates on demand (perhaps even with Rust types as arguments!).

      However, it would also be extremely hard; I don't think any programming language has ever created such a tight bridge to C++. The existing approaches I've seen are:

      - Swig, rust-bindgen, etc.: Pass through auto-generated C ABI wrappers; support for generics is limited and requires declaring up front what types you want to instantiate them with.

      - D: You can bind directly to C++ if you rewrite your C++ header in D, generics and all... including the implementations of all inline functions.

      Both very limited, especially in template-happy modern C++.

      • pjmlp 1679 days ago
        You forgot about COM/UWP, which has taken the role originally envisioned for .NET on Longhorn.

        The whole point of UWP was to improve COM to make it even better for language interop, increasing the kind of language features that can get exposed as COM libraries.

        It's also the number one reason why, if Rust wants to succeed as a systems language on Windows, it needs to have first-class support for COM/UWP.

      • Jaxan 1679 days ago
        There was Objective-C++, which combined Objective-C and C++. You could literally write both languages mixed together. Honestly, it was a Frankenstein monster. The semantics were very vague (especially since C++ uses RAII and Obj-C uses reference counting in a garbage pool).
      • zelly 1679 days ago
        I'm no compiler expert, but I don't see why it has to be so difficult.

        Just make Rust use the C++ ABI.

        C++ code is compiled in separate translation units as C++ by the C++ compiler. The linker statically links calls to C++ code from object files created by the C++ compiler. That's it. Rust doesn't need to know how to deal with templates because templates are all instantiated by the time this happens. This means you have to write a lot of your fancy C++ logic in separate C++ source files (not headers), but that's the whole point, to use already written code.

        The only thing Rust has to do is pass pointers to static code and follow the calling and name mangling convention. Zero calling overhead.

    • khuey 1679 days ago
      All FFI in Rust requires `unsafe` so it's impossible to do this "without doing anything".
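
      A rough sketch of what that looks like in practice, assuming a hypothetical foreign function `foreign_strlen`. It is written in Rust here only so the example runs without a C toolchain; a real binding would declare it in an `extern` block (or have bindgen generate it). The usual pattern is to confine the `unsafe` call to a small safe wrapper.

      ```rust
      use std::ffi::{c_char, CStr, CString};

      // Hypothetical stand-in for a function implemented in C or C++.
      // Safety contract: `s` must point to a valid NUL-terminated string.
      unsafe extern "C" fn foreign_strlen(s: *const c_char) -> usize {
          let cstr = unsafe { CStr::from_ptr(s) };
          cstr.to_bytes().len()
      }

      // Safe wrapper: upholds the contract above so callers need no `unsafe`.
      fn safe_len(s: &str) -> Option<usize> {
          // CString::new fails if `s` contains an interior NUL byte.
          let c = CString::new(s).ok()?;
          Some(unsafe { foreign_strlen(c.as_ptr()) })
      }

      fn main() {
          println!("{:?}", safe_len("Boost")); // Some(5)
          println!("{:?}", safe_len("a\0b")); // None
      }
      ```

      So somebody still has to write (or review) the wrapper that states what the foreign side expects; the compiler cannot check that part.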
  • jokoon 1680 days ago
    I'm not a huge fan of rust, but that would make rust much more attractive and simple to use.

    Keeping C++ software while making sure important parts are bug free sounds awesome...

    • mlindner 1679 days ago
      You don't have to love Rust to still use it in specific narrow areas.