I’ve worked with a few developers that have been adamant about developing inside of containers. What I’ve noticed:
Terrible performance. One of my engineers recently thought that 150ms was terrific for a HTTP request. Break out of the container and it was <10ms. YMMV.
Fragile everything: Because one expects a “pristine” environment, often any slight change causes the entire stack to fall apart. This doesn’t happen at the start, but creeps in over time, until you can’t even update base images. I’ve seen it a lot. It ends up only adding an additional layer of complication.
There are definitely reasons to do this... But when a pedantic developer that needs everything to be “just right” does it, it often becomes a disaster, leading to shortcuts and a lack of adaptability.
There’s also the developer that has no idea WTF is going on. They use a standard Rails/PHP/NodeJS/etc container and don’t understand how it works. Sometimes, they don’t even know that their system can run their stack natively. I’ve been on teams that have said “Let’s just use Docker because X doesn’t know how to install Y.”
Docker is fantastic for many things, but let’s stop throwing it at everything.
Maybe you're talking about non-native containers (i.e. not Linux), but there's no technical merit to the idea that a container by itself could introduce 15x latency on a Linux host for something like a web request, unless something like network namespaces, tc, etc was being used very improperly.
You also point to a lot of problems that are container-independent and lay them at the feet of docker, which is unfair.
Upgrading the OS is always hard unless you have some awesome, declarative config and you managed to depend on zero of the features that have changed. It doesn't matter if you're in a container or not: switching from iptables in CentOS 7 to nftables in CentOS 8 is going to introduce some pain.
And somehow we get mad at people for not knowing how to install things, but the complexity of installing them is itself a problem. More steps means more inconsistency, which means it's more likely that "it works on my machine, but breaks on yours."
> often any slight change causes the entire stack to fall apart
Yes, but this is true generally; it's not specific to containers. Any dev environment naturally tends toward disorder, with unsynchronized versions, implicit dependencies, platform-specific quirks, etc. It takes effort to keep chaos at bay.
At least with containers you have a chance of fully capturing the complete list of dev dependencies & installed software. I'm interested in how CodeSpaces/Coder.com solves these issues.
Counter-argument(?): I've seen a product where production ran in Docker, but development was a mix of Mac, Windows and every popular Linux distribution, on laptops, on-prem servers and in the cloud, as per each dev's preference. Components could be run separately. The product could run anywhere.
Then standardization crept into development. Two years later, it was essentially impossible to run it outside Docker built by Bamboo, deployed by Jenkins in on-prem OpenStack, components were tightly coupled (database wasn't configurable anymore, filesystem had to look a certain way, etc.), and it required very specific library versions, which largely haven't been updated ever again, and cannot be updated easily anymore by now. No individual team had an overview of everything inside the container anymore (we ended up with 3 Redis, 1 Mongo and 1 Postgres in that container. The project to split it apart again was cancelled after a while). Production and development were the same container images, but in completely different environments.
If you want code paths to work, you need to exercise them regularly through tests. Likewise, if you want a flexible codebase, you need to use that flexibility constantly. Control what goes into production, but be flexible during development.
The same mistake can be made outside of containers though. Any software needs to be maintained and its dependencies kept up to date. Containers might give the feeling that this is no longer necessary, since they let you spin up an environment in one command, but in the end those dependencies are still there.
My experience is the opposite. I once started a job with totally outdated software that couldn't be run anywhere other than the old server it was running on, and that had never been touched since 2008. In the end we were able to bring everything back up to date and create containers that:
- are easy to update
- allow devs to work on their favourite OS (Windows, Linux or macOS)
- do not require someone to help devs fix their dev environment regularly
Sure, keeping a stack clean is always difficult. But I think OP's point was that programming in a container encourages a more fragile setup.
On a native setup, you get a feel for the fact that X config file might be in different places, or that Y lib is more robust and more widely available than lib Z. You end up with a more robust application because you have been "testing" it on a wide range of systems from day one.
When developing inside docker, you are fooled into thinking that various things about your environment are constants. When it comes time to update your base image, all these constants change, and your application breaks.
> When developing inside docker, you are fooled into thinking that various things about your environment are constants.
No, you really aren't. You're just using a self-contained environment. That's it. If you somehow fool yourself into assuming that the ad-hoc changes you made to your dev environment will be present in your prod environment, even though you did nothing to ensure they exist, then the problem lies with you and your broken deployment process, not the tools you chose to adopt.
> Yes, but this is true generally; it's not specific to containers.
Always using containers makes it harder for you to tell when you're making your setup brittle. If your environment is always exactly the same, how will you notice when you introduce dependencies on particular quirks of that environment? If your developers use different operating systems, different compilers, etc., you have a better shot at noticing undesirable coupling between the system and its environment.
The most obvious and critical reason is because of security. You don't want your app to be stuck on Ubuntu 12.04 forever, but that's exactly what can happen. If you're not incrementally updating and fixing your stuff, you end up facing 5+ years of accumulated problems, at which point many people will take door #2: keep using the broken environment until somebody forces you not to; or door #3: start from scratch.
The upgrade treadmill is exactly that, a treadmill--it's exercise. The alternative to not exercising is poor health and an early death.
You're arguing "containers give you a chance of keeping everything pristine" but the claim was "you end up with a more robust system if you don't think 'everything should be pristine' should be a precondition."
I'm not sure I agree with the original poster though. I both dislike doing dev inside a container and dislike complicated manual dev environment setups. Containers for deps like dbs are more reasonable. This is faster perf-wise, more friendly for fancy tooling/debuggers and such, and it introduces just enough heterogeneity that you may catch weird quirks that could bite you on update in the future.
But you should be able to spin up/down new deploys easily, without having to do manual provisioning and such, which means the env on your servers should be container-like, even if it's not directly a container. Pristine and freshly-initialized. And then if you regularly upgrade the dependency versions, from Linux version to third-party lib versions to runtime versions, you will still avoid the brittleness.
> Terrible performance. One of my engineers recently thought that 150ms was terrific for a HTTP request. Break out of the container and it was <10ms. YMMV.
If you're not using Linux (presumably you're using macOS), your "containers" are actually running inside a VM, so it's unsurprising that the performance suffers somewhat (not to mention that file accesses are especially slow with the Docker-on-Mac setup). The performance impact of being inside a container on Linux is as close to zero as you can get.
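Rather than trading anecdotes, it's easy enough to measure. Here is a minimal sketch that times repeated GET requests against an endpoint; run it once natively and once from inside the container and compare. The `PingHandler` stand-in server and the function names are my own placeholders, not anything from the posts above; point `measure_latency_ms` at your real app instead.

```python
import statistics
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class PingHandler(BaseHTTPRequestHandler):
    """Tiny stand-in endpoint (hypothetical); use your real app's URL instead."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the timing loop quiet


def measure_latency_ms(url, samples=50):
    """Return (median, worst) wall-clock latency in milliseconds for GET url."""
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        with opener.open(url) as response:
            response.read()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings), max(timings)


def demo(samples=20):
    """Start the stand-in server on a free local port and time requests to it."""
    server = HTTPServer(("127.0.0.1", 0), PingHandler)  # port 0: OS picks one
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_address[1]}/"
        return measure_latency_ms(url, samples=samples)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    median, worst = demo()
    print(f"median {median:.2f} ms, worst {worst:.2f} ms")
```

This only measures client-side wall-clock time, so it won't tell you why one setup is slower, just whether it is.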
> Terrible performance. One of my engineers recently thought that 150ms was terrific for a HTTP request. Break out of the container and it was <10ms. YMMV.
It's not that bad for everyone.
For example, on my Windows dev box, I have HTTP endpoints in volume-mounted Flask and Phoenix applications that respond in microseconds (i.e. less than 1 millisecond). This is on 6-year-old hardware and the source code isn't even mounted from an SSD (although Docker Desktop is installed on an SSD).
On Linux, I have not noticed any runtime differences in speed, except that starting a container with Docker takes quite a bit longer than starting the same process without Docker. Apparently there's a regression: https://github.com/moby/moby/issues/38077
This feels like very much a YMMV situation. I think my own personal thinking is mostly the same as yours.
But for the OP it may be perfect. On the blog he indicates that he's a CS professor. I could imagine that in a research environment maybe he gets better mileage out of this than someone coding in a for-money work environment.
Docker is fantastic at precisely this use case: capturing tool and build dependencies in a reproducible way. I am not sure what performance issues you are complaining about. We run high speed trading services with single digit microsecond latency on docker just fine.
LXC (not LXD) is an utter delight compared to the competition. If you have IPv6 it works without any hacks and it’s like having a whole datacenter all controllable at the command line.
Anything that inserts itself into iptables feels like a no no. That’s meant I’ve not really put much effort into LXD or Docker beyond discovering they are kind of heavy.
The latter had poor IPv6 support the last time I tried to use it (9 months ago). It’s there, but it felt like a second-class citizen.
LXD just feels like Ubuntu to LXC’s Debian, so I also didn’t play with it beyond the initial few hours.
LXC itself is a joy. I run Alpine, Debian, and Ubuntu depending on my needs. Everything is disposable, with actual data either in GitHub or on a filer. I don’t even bother changing the hostname on the VPSs I provision to use it. Boot, install a firewall, create containers, and forget the original host OS even exists.
I’d really recommend getting to grips with the low-level (non-LXD) LXC stuff, especially when you are one “vagrant up” away from having LXC tools on your non-Linux OS!
I totally agree. People have it stuck in their heads that container == cgroup, and they don't realize how much cruft e.g. docker layers on top. Anything that requires you to do networking on your CPU absolutely kills performance.
> Anything that inserts itself into iptables feels like a no no.
I have a setup which works great with VirtualBox. I have pfSense installed in a guest VM and the host machine routes through it -- without that guest VM running the host machine can't connect anywhere. It's really handy to have a consistent firewall interface despite every host OS having a different idea of what a firewall should do or look like.
Docker works with the VirtualBox setup too.
I tried to do the same thing with libvirt for four weekends and eventually gave up. I couldn't get libvirt to play nice with iptables at all.
LXC is for containers, of course, instead of VMs. But since Docker works with the VM-guest-as-a-router setup, perhaps I'll really give LXC a try too.
LXC has been a life changer for me. Making stuff compatible with multiple distros, making sure the dependency list is complete, and experimenting with something very quickly without risking my main system are solved problems now. Just install the distro with the most recent kernel on the host (you already do if you use Arch) and spawn your Debians, Ubuntus, CentOSes in seconds…
A bit of a funny thing to say, given that effectively the same group of developers work on LXD and LXC -- and most are employed by Canonical. LXD was never meant to replace LXC, it solves its own set of problems.
I personally like using LXD because it reduces the need for me to write my own scripts to do trivial container management (and it can manage images and containers on a ZFS pool by itself), but if you're more comfortable with LXC then you do you. But I disagree that LXC is significantly more "low level" than LXD -- it just requires more manual work and the configuration format is more transparent about the container setup, but you ultimately have the same capabilities in LXD.
Well, that's because the "lxc" command-line tool manages LXD -- you can't use the "lxc" commands without using LXD.
Arguably it should've been called "lxd-client" or "lxdc" but they probably just felt it was too wordy -- most admins that used LXC and wanted to switch to LXD felt more comfortable typing "lxc <command>". And those who used LXC had no need for a top-level "lxc" command because they used the individual programs directly. Again, since the people working on the projects are the same there's nothing wrong with borrowing the name from the other project. :D
Anecdotally, I develop inside a container using Docker for Mac, and don't have any performance issues. I think it depends on what exactly you're developing, in particular filesystem access patterns.
VMs can provide native performance. Docker sucks because Docker sucks. MacOS' Hypervisor.framework doesn't help, either.
Linux' VM hosting subsystem, KVM, as well as its VM guest drivers, support all the features needed for zero-overhead VM environments out-of-the-box, including PCI passthrough if you want completely native disk and network I/O.
The problem with containers as a performance and behavior testing environment is that everything is using the same kernel. Kernel behavior is a significant factor, and sometimes a huge factor, in the performance of various applications.
Then in a project, simply `ctrl+shift+p` -> Add container configuration files. Then `ctrl+shift+p` -> Rebuild Container
Couldn't be easier.
The only downside is the lack of a GUI: in Python it's often nice to just do e.g. plot.show() rather than export to a file and view that file. But for most of my non-visual programming work, Docker is amazing. Once it works for me, I can guarantee it will work for all other developers, and more importantly, my dev environment is really similar to my prod environment.
It's handy to be able to do this with things you already know you want to be able to do inside your container. The downside for me though, has been having to throw away all the careful tweaks I have made to my development environment over the years. The muscle memory of my long list of .bash_aliases. My custom git config with shortcuts. My scripts in ~/bin. Hell even installed software that doesn't fit in with the purpose of the project, think things like `jq`.
Also worth noting that the "permissions issues" he mentions are handled automatically by Docker for Mac. I had the exact problem of files created inside the container being owned by root on Linux, and all my mac-using colleagues just stared at me like I had horns on my head. "It works on my machine" still exists, even with Docker.
You could probably do fine with just chroot. Docker is overkill for most use cases, but one advantage is that it works on Windows and Mac too. Using the same uid for your user on both the host and the container will likely fix the weird user issue.
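To make the "same uid on both sides" suggestion concrete, here is a rough sketch that builds a `docker run` command line using the caller's own uid and gid, so files written to the mounted directory are not owned by root. The helper name, the `/work` mount point and the `debian:stable` image are placeholders of mine. As noted elsewhere in this thread, this may not be enough on SELinux hosts, and it assumes a Unix host (`os.getuid` doesn't exist on Windows).

```python
import os


def docker_run_as_host_user(image, command, workdir=None):
    """Build a `docker run` argv that runs `command` as the caller's uid:gid.

    The host directory `workdir` (default: current directory) is mounted
    at /work inside the container; /work is an arbitrary choice.
    """
    workdir = workdir or os.getcwd()
    return [
        "docker", "run", "--rm",
        "--user", f"{os.getuid()}:{os.getgid()}",  # match host ownership
        "--volume", f"{workdir}:/work",            # share the project dir
        "--workdir", "/work",
        image,
        *command,
    ]


if __name__ == "__main__":
    # Only prints the command; pass the list to subprocess.run() to execute it.
    print(" ".join(docker_run_as_host_user("debian:stable", ["touch", "out.txt"])))
```

One caveat with this approach: the uid usually has no matching entry in the container's /etc/passwd, so tools that look up the user name or home directory can get confused.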
One thing I miss when using Docker is the pipe command in unix like shells.
And if you need a step up from chroot, systemd-nspawn is a good next step. Just as light, but supports many more isolation features, including all kinds of networking. No need for the Go runtime as with Docker, and the same convenience of use via the machinectl command.
Regarding the fiddly permissions between host and container, at least on Linux podman seems to be the solution.
It is command-line compatible with docker (and uses the same image format), but instead of launching containers through a daemon, it launches containers directly as the calling user (needs user namespaces, Linux 3.8+).
Now you can share volumes between host and container, and don't run into those pesky permission problems that come from different UIDs on the inside and outside.
Very useful in CI contexts, for development environment and so on.
We do this very thing at my work. When you have a million and one guidelines around what can and can't be installed on a device, especially one that has access to the internet, it is easy to say "here is this container that has no access to the internet." That seems to make infosec happy. It also makes local testing difficult if you need access to a web-based API.
The other issue you do have is the issue of configuring them for each application, but luckily most teams have had at least one or two people take up the role of maintaining the images and the startup scripts.
On one of my Windows work laptops, the restrictions placed on devs are ridiculous; they won't even grant admin rights. Instead they have some bug-ridden software installed that lets you elevate specific apps. You need to request access to those apps, and your request goes through a weeks- or even months-long process before you get a response. If granted, you need to enter your username and password for each elevation, every single time.
This machine has hyper-v installed, so I created a Windows 10 VM and worked in that fullscreen, permanently.
I am the opposite. I develop on Linux via VMWare Player running on a Windows host. I don't want Linux as the main OS on my personal desktop and I can keep work related stuff on its own VM. I've never noticed any non-bare metal speed issues.
As for distro versioning... surely you only get the user-land package versions, and the kernel is still the version of your main OS?
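For Linux containers that matches my understanding: the image supplies the userland, the host supplies the kernel. A small sketch to see it for yourself is below; run it on the host and then inside a container of a different distro, and the userland line should change while the kernel line stays the same. The function names are mine, and the os-release parsing is deliberately simplified (no escape handling).

```python
import platform
from pathlib import Path


def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments and malformed lines
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def describe_environment(os_release_path="/etc/os-release"):
    """Report the distro userland and the running kernel release."""
    path = Path(os_release_path)
    userland = parse_os_release(path.read_text()) if path.is_file() else {}
    return {
        "userland": userland.get("PRETTY_NAME", "unknown"),
        "kernel": platform.release(),  # the kernel actually running
    }


if __name__ == "__main__":
    info = describe_environment()
    print(f"userland: {info['userland']}")
    print(f"kernel:   {info['kernel']}")
```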
When I left my previous job, I was starting to think it might be a good idea to start using docker containers for dev. We were using chef kitchen to maintain virtual machines, but we had just enough production systems, that depending on what you were working on, you might need 3 virtual machines running at once. I didn't care on my desktop with 16GB of ram, but we had a number of devs using a laptop with only 8GB, and 3 virtual machines was just enough to cause headaches.
It was the first use case for docker that I thought might make sense. The second was the headache we were likely to start running into on our CI server when dealing with different versions of Node/NPM for different codebases, a drift that is inevitable as older codebases get less support.
The script and Dockerfile I use are pretty long, but they set up everything needed to map in the most common stuff like ssh and aws creds, ptrace, sudo support, uid/gid runtime mapping, etc.
Once you start developing your shared repos from a container, you realize that it's much easier to automate running things from it than to develop inside of it, so it's actually easier to get people to build CI/CD pipelines now where before they'd wait for someone else to do it for them.
And the only problem there is there's no easy + good + free self-hosted CI/CD software out there. Yes, you probably use X just fine, but X probably doesn't scale, or isn't enterprise compatible, etc. The biggest barrier to automation is both a business problem and a technical problem.
> These glitches come from the strange way in which Docker deals with permissions and security. Contrary to what you may read, it is not a simple matter of setting user and group identifiers: it may be sufficient on some systems but not on systems supporting Security-Enhanced Linux which require additional care.
Does anyone have any general advice on the best way to deal with Docker file permissions issues on recent Ubuntu LTS distros? As a new Docker user this has proven surprisingly intractable. As mentioned in the quoted text, simply setting UID / GID to the same doesn’t seem to do the trick most of the time. I see podman has been mentioned in this thread but is there a native, no-dependency, no-new-tool way to handle this? I feel like I must be missing something simple. Grateful to hear anyone’s experience.
I probably edit ~/.bash_profile every couple of days, but reimaging my machine is a big deal. Code can live in a git remote, but what happens to my homedir and notes and ad hoc analyses if I recreate the container that often?
Nothing. None of those things would be kept in the container. The only thing you would keep in your container would be the tools and libraries you need to build your project. You mount your home file system inside the container and just use it to build/debug. Everything else you keep outside.
> The idea of a container approach is to always start from a pristine state. So you define the configuration that your database server needs to have, and you launch it, in this precise state each time. This makes your infrastructure predictable.
I’ve tried this before, but there’s still a lot of overhead in maintaining the pristine state. For example, when troubleshooting why a Python package won’t run, you end up installing and upgrading a lot of other packages. You’re not sure if that helped or if it was something else. Now what? You’ll spend time wondering whether you want to carry your changes over or deal with the drift.
And now we've gone full circle. One of the proposed use cases for containers when Solaris Containers came about (and probably also FreeBSD Jails before it) was that you could easily give a software developer a full OS instance to play about and break stuff or to do whatever they desire, instead of restricting them to the confinement of their home directory.
I'm developing a Java app in a devcontainer on Linux using VSCode, and I absolutely love it. I need to use an SDK that does ugly things to your local Maven installation, so I put everything in a container. VSCode starts it automatically and the embedded terminal opens inside that container. The only difference I've noticed is that VSCode is not IDEA.
I have done this from time to time when debugging applications that run inside Docker. I'd bash into the container as root and then download and install my development tools and check out the repo where the code lives. Once the code is functionally correct, then I can worry about performance optimizations given Docker...
I like the sandboxing effect of using Docker. Having node_modules (or any kind of dependencies) not being able to access all files on the host is a big relief - especially since there were attacks on popular dependencies already, stealing wallets or similar.
After coming to a shuddering halt trying to get a build env setup for a trivial python script (poetry couldn't install the requirements because pip.... SSL... Blah blah blah) I just used our image used for the drone step to run and test it.
1. Different packages being installed on Windows vs Linux.
2. Different packages being installed by pip vs (some other package manager).
3. Simple user mistakes in the requirements file (not strict version).
Going to docker fixed #1 and #2, and the constant rebuilding of the environment meant we quickly identified issues in requirement files. Working in docker is the equivalent of fail-fast programming imo; it's just applied to the environment.
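The "not strict version" mistake can also be caught before a rebuild does it for you. Below is a minimal sketch of a checker that lists requirement lines that are not pinned with `==`. It is my own simplification, not a full parser for pip's requirements format: it ignores option lines such as `-r`, strips comments and environment markers, and treats wildcard versions like `==2.*` as unpinned.

```python
import re

# "name==version", optionally with extras, e.g. "pkg[extra]==1.2.3".
# The version part may not contain "*" so wildcard pins count as loose.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s*]+$")


def unpinned_requirements(text):
    """Return the requirement specifiers in `text` that lack an exact pin."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):
            continue  # skip blanks and pip options like "-r other.txt"
        requirement = line.split(";", 1)[0].strip()  # drop environment markers
        if not PINNED.match(requirement):
            loose.append(requirement)
    return loose


if __name__ == "__main__":
    import sys

    # Usage: python check_pins.py requirements.txt  (script name is arbitrary)
    with open(sys.argv[1]) as handle:
        for requirement in unpinned_requirements(handle.read()):
            print(f"not pinned: {requirement}")
```

Run in CI, something like this fails fast in the same spirit as the rebuilds, just earlier.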
I think the challenge with this is that most of our workstations are either Macs or Windows machines, in which case your containers are running in a VM anyway (so, sluggish). You only really get native performance if you're running on Linux, which is fine but... the year of the Linux desktop has not quite arrived yet.