I really can't emphasize enough how much I love using Bazel. The ability to tell a less technical user "just run `bazel run //amazing/server`" regardless of language and know that everything will magically work (toolchain installation, future toolchain upgrades, incremental rebuilds) is really freeing. The actions graph with rules and aspects is quite powerful, so you can do things like add Java nullability checks or Python type checking remarkably easily. Recently I put together a simple build rule that strips external dependencies, archives the rest, and uploads it to a cache. Then we can easily run that archive against a pre-built container (which contains the external dependencies) in our cluster, enabling a very fast ML iteration loop on beefy cluster machines. I've also done a lot of work to enable middle-ground environments, so my users can run Python scripts like they're used to (`python script.py`) while inside of a Bazel environment, which makes it easy for them to develop quickly and then create a BUILD file when they're ready.
The major downside I've experienced is that any time you're trying to do something in a less-than-Bazel way (for example relying on binaries built outside of Bazel) things can get really hairy. My containers often need various things from apt repositories, so I had to give up on rules_docker and made my own rules for Podman. I think you need someone who understands aspects and rules before adopting it, or else the sharp edges of Bazel will keep cutting you until you drop it.
> Recently I put together a simple build rule that strips external dependencies, archives the rest, and uploads it to a cache. Then we can easily run that archive against a pre-built container (which contains the external dependencies) in our cluster, enabling a very fast ML iteration loop on beefy cluster machines.
Can you tell us more about this? We're using bazel at $day_job and it's about as pleasant as gouging my eyes out with a rusty spoon. Building docker images using bazel takes forever.
Sure! So the overall goal is to prebuild a runfiles tree containing all the external dependencies into a Docker container; then, when the user wants to run something, we build a runfiles tree with all the non-external code. In the cluster, we extract the user's runfiles tree on top of the prebuilt one and execute the user's code.
* I have an archive Starlark function that I use for both this and containers. It sets up a folder structure similar to <target>.runfiles, with everything symlinked to its actual location, then tars the whole thing following symlinks. A parameter controls whether files under external/ are included (see the sketch after this list).
* This archive function is used by my Bazel container rules, so I simply made a runner.py target that depends on every possible external Python dependency and made a Docker image with it.
* I then made a Bazel rule that uses the archive function to archive a given executable without external/ and uploads it to a shared location.
* At runtime runner.py is given the location as an argument, downloads it, extracts it, and then execv's it.
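To make that concrete, here's a rough sketch of the archiving rule. All the names here are mine and the details are simplified (the real version preserves the <target>.runfiles layout), but it shows the shape of the approach:

```
# archive.bzl -- sketch, not the original code: tar a target's runfiles,
# dropping everything under external/ (the prebuilt image already has it).
def _runfiles_archive_impl(ctx):
    files = ctx.attr.target[DefaultInfo].default_runfiles.files
    out = ctx.actions.declare_file(ctx.label.name + ".tar")
    args = ctx.actions.args()
    args.add(out)
    args.add_all(files)
    ctx.actions.run_shell(
        inputs = files,
        outputs = [out],
        arguments = [args],
        # -h follows symlinks so the archive is self-contained.
        command = """
out="$1"; shift
printf '%s\\n' "$@" | grep -v '^external/' > manifest.txt || true
tar -h -cf "$out" -T manifest.txt
""",
    )
    return [DefaultInfo(files = depset([out]))]

runfiles_archive = rule(
    implementation = _runfiles_archive_impl,
    attrs = {"target": attr.label(mandatory = True)},
)
```

And the runner side, equally sketched (the URL handling and the tree path are made up):

```
# runner.py -- sketch: fetch the user's archive, overlay it on the
# prebuilt runfiles tree baked into the image, then exec the entry point.
import os
import subprocess
import sys

def main():
    archive_url, entry_point = sys.argv[1], sys.argv[2]
    tree = "/opt/runfiles"  # hypothetical path to the prebuilt tree
    subprocess.check_call(["curl", "-fsSL", "-o", "/tmp/user.tar", archive_url])
    subprocess.check_call(["tar", "-xf", "/tmp/user.tar", "-C", tree])
    os.chdir(tree)
    # Replace this process with the user's code (the execv step).
    os.execv(os.path.join(tree, entry_point), [entry_point] + sys.argv[3:])

if __name__ == "__main__":
    main()
```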
My impression of Bazel has been the same. It makes working with a polyglot tech stack a breeze.
Do you have an example of the rule that strips and archives your dependencies? Or an example of being able to invoke the python interpreter in a bazel context? I haven't seen anything like that, and want to try it out in my own project.
Unfortunately I can't share code at this time but I just described the archiving in more detail in a sibling comment.
The Python interpreter is also quite simple. There are several ways you can do it, but the simplest version to imagine is a launcher.py script that just invokes Bash as a foreground subprocess. The pstree is kind of funky (bash -> python -> bash), but inside that shell PYTHONPATH will be set approximately correctly. There are reasons to prefer an approach that works with sourcing (e.g. so you can set PS1), but it's a little harder to describe. You can do some acrobatics to make runfiles (mostly) work, and my recollection is that PATH mostly works, though that may require some more work. We do the same thing for JupyterLab and IPython.
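A minimal sketch of that launcher idea (the names are mine, and the real environment setup is more careful about runfiles and PATH):

```
# launcher.py -- sketch: run via `bazel run`; drops you into an interactive
# bash whose PYTHONPATH points at the runfiles tree, so plain
# `python script.py` resolves roughly the same deps a py_binary would.
import os
import subprocess
import sys

def main():
    # Under `bazel run`, the stub usually exports RUNFILES_DIR.
    runfiles = os.environ.get("RUNFILES_DIR", os.getcwd())
    env = dict(os.environ)
    env["PYTHONPATH"] = runfiles + os.pathsep + env.get("PYTHONPATH", "")
    # Foreground interactive shell; the pstree is bash -> python -> bash.
    sys.exit(subprocess.call(["/bin/bash", "-i"], env=env))

if __name__ == "__main__":
    main()
```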
As a nixpkgs maintainer, I can't tell you how painful Bazel is for packagers. The difficulty of substituting dependencies. Unstable hashes of fetched dependency sources. The infeasibility of building Bazel itself fully from source...
Ah, the upstream "we know everything and we will control everything" provider. Who doesn't maintain stable branches with security backports for their bundled dependencies, includes mystery binary dependencies, doesn't care about anything but the two main architectures, expects users to be fine with having seventeen copies of ffmpeg on their systems, and will not let their tool work on a musl-based system...
I found it the opposite: it completely put me out of work on NixOS, as it started pulling in a separate JVM, its internal Python was missing libraries, and it then pulled unpatchelf'd binaries from around the net. The Python-like config language is off-putting too.
I tried Bazel once about 5 years ago for a Python project and it definitely wasn't up to the task then, and I kind of wrote it off for a while (it was a bad experience). I do like the idea of tools like Bazel and Nix, and I've since moved toward Go and Rust, which I think are more of a happy path for Bazel. I wouldn't mind giving it another try, but I'm not eager to bite off a big learning curve (limited spare time, other hobbies, etc.). If anyone has any recommendations for gentle introductions to Bazel (ideally for Go), I would appreciate them.
I have a few annoyances with Bazel, but the biggest is quite fundamental: I really dislike how all rules share the same namespace for outputs. You can't define two rules that output to the same path, and there is very little reason for this restriction. Buck does not have this issue.
I've mostly associated Tweag with Nix-related content, so this is an interesting change.
That said, it's interesting to me that there hasn't been a better Nix-Bazel bridge. Right now the story for packaging Bazel projects in Nix is really awful. You basically have Bazel run its "deps" phase, and all that stuff gets stored as a singular, gargantuan fixed-output derivation, and if you ever try to change the deps, you just have to know to twiddle this magic hash or it'll happily go on with the cached deps. But then, of course, every build is from scratch.
A "real" Bazel story for Nix would integrate the two binary caches together, so that Nix would be aware of what Bazel was building, and the individual cached elements of it could go into the Nix store as separate entries— then you'd be able to actually get incremental builds. But I assume this would require Bazel's binary cache implementation to be pluggable.
I have some adjacent experience. Not exactly with that, but I have migrated large C++ build systems, migrated systems to Bazel, etc. I've also written a bunch of Bazel build scripts for various open-source C++ libraries so I can better integrate them into projects that use Bazel.
Bazel is opinionated. The tradeoff here is that if you can make your project match Bazel's opinions, you get a very good experience--but you can have a bad experience if you disagree with Bazel. If you have Bazel experience, you can look at a project and get a quick sense of the distance between how the project is built and how Bazel "wants" to build the project.
The payoff is that once you get your Bazel build system, everything seems a lot more trustworthy. No more "make clean". When you run a Bazel command, it just gives you the correct output, very fast, without worrying about what state your build tree is in. You can make any change to your build scripts and just "bazel build" and get the correct result immediately, as long as you aren't trying to bypass how Bazel works. I never have to do anything like run "make" twice. This is something I've never gotten with systems like Make or CMake, which put more of the onus on individual developers to get things correct.
So I can kind of shut off my brain when using Bazel.
Depending on the particulars of your project, the most straightforward migration path to Bazel will not be obvious. You may need to make certain choices about how much you modify your project to fit Bazel's expectations, versus how much you adapt Bazel to fit your existing project. One example is include paths... do you modify all of your '#include' directives to match how Bazel expects you to write them? Or do you adapt Bazel to do things your way?
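For the "adapt Bazel" route, cc_library's include-path attributes do most of the work. A small sketch (the target and paths are made up; the attributes are real):

```
# BUILD -- keep existing #include directives resolving without moving files.
cc_library(
    name = "bar",
    srcs = ["src/bar.cc"],
    hdrs = glob(["include/**/*.h"]),
    # Add a legacy include root so existing #includes keep working...
    includes = ["include"],
    # ...or remap instead:
    # strip_include_prefix = "include",
    # include_prefix = "bar",
)
```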
The difficulty and payoff is highly variable. My experience has generally been positive, but I also have a lot of familiarity with Bazel. It's easy enough to find an example of a project where I'd just never bother migrating to Bazel, or to find examples of projects (even large ones) where migrating is super easy.
It delivers on its premise of always-correct incremental builds, but it is extremely opinionated. I don't blame it for that; maybe having truly hermetic and reproducible builds requires that level of structure. It is almost magical changing a linker flag in some Bazel config and seeing it _only_ relink affected targets.
If you need to do cross-compilation, then I feel like it is extremely overengineered, with the whole platform/toolchain concepts, and after _years_ the docs are still incredibly lacking on this aspect. I almost prefer the previous approach with the semi-documented protobuf-as-JSON CROSSTOOL file.
If you need the safety guarantees or the reproducibility, there's no other build system out there. If you don't, then you will be inclined to hate it because you are not extracting value from it.
Yeah the cross-compilation thing is definitely a rough spot. I have one project that's able to work around it via extensive hacks with macros, but at some point I'll need to do it "the right way."
Honestly, if the docs had a canonical example of e.g. using unix_cc_toolchain_config + Bootlin to compile for aarch64, it'd probably go a long way toward making things understandable. Because say what you will about the old CROSSTOOL approach, at least there was a nice tutorial for it.
My normal workflow to bootstrap cross-compilation with bazel is to create a dummy project with some dummy C/C++ file and build it. Then go into whatever bazel-X internal folder and extract the autogenerated bzl for the local system’s compiler. Then update it with my toolchain and strip it down (I hate the “features” feature) until it is somewhat understandable.
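For what it's worth, the declarative half is small; it's the toolchain definition you extract and trim that carries the weight. A minimal sketch of the platform side, using the standard @platforms constraints:

```
# BUILD -- the target platform definition.
platform(
    name = "linux_aarch64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:aarch64",
    ],
)
```

You'd then build with `bazel build --platforms=//:linux_aarch64 //your:target`, assuming a matching cc_toolchain is registered.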
> It is almost magical changing a linker flag in some bazel config and see it _only_ relink affected targets.
I think this should be expected from any modern build system. Now, if you make a whitespace change in your source file and the build system recognizes this and skips recompiling it, that could pass for magic (build2 does this for C/C++ sources).
I migrated a mid-size polyglot project from Makefiles to Bazel and C++ was a large component of the project.
1. Building with Qt5 MOC & UI files. There is a great library for it, but it has hardcoded paths to the Qt binaries and header files, assuming a system-wide installation. I had to patch the rule to point to our Qt location. Then it worked fine.
2. There is no rule to build a fully static library. Since we were shipping a static library to clients via our Makefile system, that was somewhat annoying.
3. We were using symlinks like `$PROJECT_ROOT/links/GCC/vX.Y.Z/ -> /opt/gcc/...` to point to all the build tools, but these didn't work in Bazel, I think because it requires absolute paths for any binaries it calls. We ended up putting them in a .bazelrc, but we would need a different one for Windows and Linux (see the .bazelrc sketch after this list).
4. Poor integration with IDEs.
5. (edit) The Bazel toolchain system is confusing, and I couldn't understand it after reading all its docs.
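On #3, one way to avoid hand-maintaining separate rc files is Bazel's platform-specific configs: with the flag below, a `build:linux` or `build:windows` section is applied automatically based on the host OS. A sketch (the tool paths are made up):

```
# .bazelrc -- sketch; paths are made up.
build --enable_platform_specific_config
build:linux --action_env=CC=/opt/gcc/vX.Y.Z/bin/gcc
build:windows --action_env=CC=C:/mingw/bin/gcc.exe
```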
Ultimately we did not keep using Bazel because we were building Python binaries and py_binary was too slow on Windows. And we didn't have enough time to write a PyInstaller rule.
Regarding #3: my approach to solving problems like this is to make a custom repository rule which creates the desired symlinks. The repository rule can invoke external programs or examine the environment as necessary to figure out what should be created.
Basically, you create a repository rule that symlinks your $PROJECT_ROOT/links/GCC/vX.Y.Z/ to $repo/... somewhere, and then generates a BUILD file for the repository.
Writing your own repository rule is not especially difficult, and they have a lot of power not available to ordinary rules. The repository_ctx API, available within repository rules, lets you run arbitrary programs, create files and symlinks, download files, etc.
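A minimal sketch of the idea (the rule name and layout are hypothetical):

```
# repo_rules.bzl -- sketch: mirror a local toolchain directory into an
# external repository and give it a BUILD file.
def _gcc_links_impl(repository_ctx):
    # Symlink the tool directory into the repo; you could also call
    # repository_ctx.execute(...) here to probe the environment first.
    repository_ctx.symlink(repository_ctx.attr.path, "gcc")
    repository_ctx.file("BUILD", """\
filegroup(
    name = "all",
    srcs = glob(["gcc/**"]),
    visibility = ["//visibility:public"],
)
""")

gcc_links = repository_rule(
    implementation = _gcc_links_impl,
    attrs = {"path": attr.string(mandatory = True)},
)
```

The WORKSPACE then calls `gcc_links(name = "local_gcc", path = "/opt/gcc/vX.Y.Z")`, and targets can reference `@local_gcc//:all`.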
#2: To be fair, it is reasonably easy to make a cc_static_library_binary-ish rule which merges all transitive .a files (just generate an ar script and call the archiver). But I have to admit that I spent non-trivial time maintaining our "CROSSTOOL in skylark" (forgot the term) for 20+ target platforms before, and it helped a lot in understanding the (still incomplete) C++ sandwich.
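A sketch of that approach (my names; a production rule would take the archiver from the C++ toolchain rather than assuming GNU ar on PATH):

```
# static_lib.bzl -- sketch: merge all transitive .a files into one archive
# by generating an ar "MRI script" and piping it to ar -M.
def _merged_static_library_impl(ctx):
    # Collect transitive static libraries from the deps' CcInfo providers.
    libs = []
    for dep in ctx.attr.deps:
        for linker_input in dep[CcInfo].linking_context.linker_inputs.to_list():
            for lib in linker_input.libraries:
                if lib.static_library:
                    libs.append(lib.static_library)

    out = ctx.actions.declare_file(ctx.label.name + ".a")
    mri = ctx.actions.declare_file(ctx.label.name + ".mri")
    lines = ["create " + out.path]
    lines += ["addlib " + lib.path for lib in libs]
    lines += ["save", "end"]
    ctx.actions.write(mri, "\n".join(lines) + "\n")
    ctx.actions.run_shell(
        inputs = libs + [mri],
        outputs = [out],
        command = "ar -M < " + mri.path,  # assumes GNU ar on PATH
    )
    return [DefaultInfo(files = depset([out]))]

merged_static_library = rule(
    implementation = _merged_static_library_impl,
    attrs = {"deps": attr.label_list(providers = [CcInfo])},
)
```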
I’ve not got specifically what you’re describing but I’ve encountered Bazel twice in my career - both in very large Java/Scala/Go code bases (distinct repo for Go but a lot of code).
Bazel is extremely underwhelming. I've worked with crusty, ancient systems that built huge codebases, and Bazel is just the most clown-shoes build tool in comparison.
Something didn’t work? Try typing the same command repeatedly and hope that this time it sticks. Multiple commands to achieve a seemingly straightforward task? Why isn’t there a single one that will get us there.
FWIW - I suspect like most tools that Google open sources, the tool makes far more sense in the context of Google’s systems and architecture. If you’re adhering to that, it’s probably coherent.
Full admission - I don’t own or have responsibility for how the build system I work with is setup.
I’m on parental leave so details are fuzzy but roughly:
- run Bazel build
- expect dependencies to be built
- they weren’t built
- run Bazel build again
- something else decides to be built
- repeat ad nauseam
This is within the same project directory, etc. It's entirely possible that the project is set up in some pathological way, but I've encountered this enough times at two companies that it's stuck in my head.
That sounds like someone hacked an existing project into Bazel without resolving the opinionated differences. I've worked on Bazel projects at multiple companies, and I've seen things go off the rails like that a couple of times before someone who actually understands the tool rewrites the problematic build process. It's usually some nasty stuff where someone tried to work outside of Bazel because they didn't understand it and created a bunch of impedance mismatches. New people doing this instead of the "right way" is the most valid criticism of Bazel, IMO.
Sounds like you're working in the worst of both worlds right now.
This sounds highly broken somehow. Bazel's raison d'être is rebuilding only the necessary dependencies. Maybe this is a case of someone trying to force-fit a process that doesn't match Bazel's opinions, with some custom rules that break it at the core?
Pretty much all my bazel/blaze experience has been at Google or on side projects built with Bazel from the start, but I have never encountered anything like that. The only complaints I've had are that building Python slows the scripting loop, and that for side projects without Google's build infrastructure the rebuild-the-world-at-head-from-source method can get very expensive. But this sounds really unfortunate and nothing like the tool I've used :/
Whoa, that's tough. Sounds like you have indeed hit a case where
> the tool makes far more sense in the context of Google’s systems and architecture
Bazel should not behave like what you described (at least in my experience) for in-repo sources and build rules. Except that the world does not work this way, so they added a piece of duct tape called "using workspace rules to fetch and potentially build external dependencies", which is as fragile as a ./build.sh pulling in all your dependencies.
And Google? They did vendor everything they use at //third_party in their monorepo, so (╯‵□′)╯︵┻━┻
Having gone through this twice, I'd say it is not that difficult, but it can take a reasonably huge effort. On par with turning your crufty CMakeLists into so-called "Modern CMake" (whatever that means).
But why? Bazel is very opinionated about how you lay out your C++ source code in the repository, and that's something which cannot be retrofitted easily.
A little better when TH is involved, but incremental builds aren't as good as ghc/cabal yet:
> When doing incremental builds, though, both stack and cabal-install can use the recompilation checker, and for changes deep in the dependency graph with little propagation, haskell_module is not able to beat them yet. For changes near the build targets, or which force more recompilation, haskell_module would be more competitive.