Unfortunately, this reads like a 100 foot marketing document for Sysdig, not actual container security best practices.
If you want to look at actual container security best practices, check out CIS, DISA, and the NSA, with some theory at NIST, as well as the documentation from your preferred cloud vendor, be it AWS, Azure, GCP, or another, and their specific container security guidance.
(disclaimer: I know the company and some of the early founders)
I wish all "marketing documents" were this detailed. In other words, I disagree with you. I've read the blog post and it doesn't seem too high level. The resources you indicate are nice, but a 60-page Kubernetes hardening guide by the US Government is perhaps one level deeper than a blog post on the internet.
Clearly sounds like a marketing document. Cites a survey from "Cloud Native Computing Foundation" and claims "92 percent of companies are using containers in production" + "Thus, Kubernetes, Openshift, and other container technologies are present everywhere" while ignoring the fact that the survey is heavily biased towards companies that run containers, of course.
Their own services and blog posts are also referenced in almost every section of the post, even when better external resources exist. Zero competitors are listed in any section. Doesn't sound very neutral to me.
In this sense, yes, I agree with you. But a "100 foot marketing document" offers a certain negative connotation that reads like "no content, just fluff"; the content is there, and yes, it is biased, and yes, no competitors are mentioned.
I also agree with you on the fact that a "smarter" kind of content marketing would go beyond these limitations; it would mention competitors, or alternatives; and it wouldn't highlight its company's own services too much.
If someone from Sysdig is reading, these are suggestions for you, guys.
Container security should start with image security. Before reaching for runtime security tooling, you can statically analyze images before they are running somewhere and find what known vulnerabilities might exist in them. This is also easier to scale.
One of the hardest things to get any dev organization to start taking seriously is supply chain security. That first scan which lights up like a Christmas Tree is always such a daunting obstacle to get over. It's a shame because it is probably the highest value SDLC practice that many are not doing.
Yet, the base Debian image _does_ light up like a Christmas tree when you run a Snyk scan. Mostly with incorrect issues (the version number triggers a flag, but the fix was backported) or ones considered low priority and thus WONTFIX by upstream.
If you’re writing software against, say, dotnet3 (which has a docker image based on Debian) then you’re basically noised out.
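For what it's worth, most scanners can cut that noise down considerably. A sketch using Trivy (the image tag is just an example; Snyk has comparable options):

```
# --ignore-unfixed drops findings upstream has no fix for (the WONTFIX flood),
# --severity filters out the low-priority pile
trivy image --ignore-unfixed --severity HIGH,CRITICAL mcr.microsoft.com/dotnet/aspnet:6.0
```

It won't catch the backported-fix false positives on its own, but it turns "Christmas tree" into a reviewable list.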
It's funny that you use the term "actual" to describe the guidance from the US government. They don't really know what they are talking about. Their release process for guidance takes so long that by the time it's released, it's out of date. This is absolutely true for k8s guidance. Last I checked, they were still suggesting everyone use "Docker Enterprise" in their guidance long after it no longer existed (are vendors supposed to magically know Mirantis is now an option?)
I'm scoping my statement to container security & orchestration best practices, not their competency as a whole. I know the specifics of their guidance due to the industry I work in, so I feel comfortable speaking generally about specific guidance in regards to specific technology.
Yep, I've always had read-only root filesystems down as a good control, and one that's often not too tough to implement.
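In Kubernetes it's a one-line securityContext setting, plus an emptyDir for whatever scratch space the app still needs (names and image below are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app            # hypothetical pod name
spec:
  containers:
    - name: app
      image: example.com/app:1.0   # hypothetical image
      securityContext:
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
        runAsNonRoot: true
      volumeMounts:
        - name: tmp
          mountPath: /tmp      # writable scratch space apps often still need
  volumes:
    - name: tmp
      emptyDir: {}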
Another favourite of mine would be using multi-stage builds and minimal base images in production (FROM scratch, where possible). Having limited or no tooling in the running container makes an attacker's life trickier for sure.
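A minimal sketch of the multi-stage pattern for a static Go binary (paths and names are illustrative):

```dockerfile
# Build stage: full toolchain, never shipped
FROM golang:1.21 AS build
WORKDIR /src
COPY . .
# Static binary so it can run on scratch (no libc)
RUN CGO_ENABLED=0 go build -o /app ./cmd/server

# Runtime stage: no shell, no package manager, nothing but the binary
FROM scratch
COPY --from=build /app /app
USER 65534                     # unprivileged numeric uid (no /etc/passwd on scratch)
ENTRYPOINT ["/app"]
```

An attacker who lands in the runtime container has no shell, no curl, no package manager to pivot with.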
My home k8s cluster is now "locked down" using micro-VMs (kata-containers), pod-level firewalling (Cilium), permission-limited container users, mostly immutable environments, and distroless base images (not even a shell is inside!). Given how quickly I rolled this out, the tools to enhance cluster environment security seem more accessible now than in my previous research a few years ago.
I know it's not exactly a production setup, but I really do feel that it's at least the most secure runtime environment I've ever had accessible at home. Probably more so than my desktops, which you could argue undermines most of my effort, but I like to think I'm pretty careful.
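For reference, opting pods into the micro-VM runtime is just a RuntimeClass once kata-containers is installed on the nodes (the handler name depends on how your containerd/CRI-O is configured; names below are illustrative):

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata                  # must match the runtime configured in containerd/CRI-O
---
apiVersion: v1
kind: Pod
metadata:
  name: sandboxed-app          # hypothetical
spec:
  runtimeClassName: kata       # this pod now gets its own micro-VM
  containers:
    - name: app
      image: example.com/app:1.0   # hypothetical image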
In the beginning I was very skeptical, but being able to just build a docker/OCI image and then manage its relationships with other services with "one pane of glass" that I can commit to git is so much simpler to me than my previous workflows. My previous setup involved messing with a bunch of tools like packer, cloud-init, terraform, ansible, libvirt, whatever firewall frontend was on the OS, and occasionally sshing in for anything not covered. And now I can feel even more comfortable than when I was running a traditional VM+VLAN per exposed service.
Using a sidecar is also an option for debugging stuff involving shared storage, yes. The distroless project also ships aptly named "debug" containers that have BusyBox if you want a minimal shell for debugging something in the container filesystem itself. I've also made use of self-made "debug" containers with go-delve or the JVM in their respective over-the-network debugging modes and a kubectl port-forward, for anything written by me.
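On recent Kubernetes versions, `kubectl debug` automates most of this with ephemeral containers (pod and container names below are placeholders):

```
# Attach a throwaway BusyBox shell to a running distroless pod;
# --target shares the process namespace with the named container
kubectl debug -it my-pod --image=busybox:1.36 --target=app
```

The debug container disappears when the pod does, so nothing extra ships in the image.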
For network observability I'm using Cilium's Hubble, which I will soon figure out how to get into a Graylog setup or something. For container image vulnerability interrogation I'm running Harbor with Trivy enabled; the initial motivation was to have an effective pull-through cache for multiple registries because I got rate limited by AWS ECR (due to a misconfigured CI pipeline, oops), but it ended up killing two birds with one stone.
Next on my list is writing an admission controller to modify supported registry targets to match my pull through cache configuration.
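The core of that webhook is really just string rewriting on the image reference. A minimal sketch of the logic (the Harbor hostname and project layout are assumptions from my own setup, not anything standard):

```python
# Sketch of the image-rewriting logic for a mutating admission webhook.
# Maps supported upstream registries to projects in a Harbor pull-through
# cache; images from unknown registries are left untouched.
MIRRORS = {
    "docker.io": "harbor.internal/dockerhub",   # hypothetical Harbor host/projects
    "quay.io": "harbor.internal/quay",
}

def rewrite_image(image: str) -> str:
    parts = image.split("/", 1)
    # The first path component is a registry only if it looks like a
    # hostname (contains "." or a ":port", or is "localhost").
    if len(parts) == 2 and ("." in parts[0] or ":" in parts[0] or parts[0] == "localhost"):
        registry, rest = parts
    else:
        # No explicit registry means Docker Hub; bare names live under library/
        registry, rest = "docker.io", image
        if "/" not in rest:
            rest = "library/" + rest
    mirror = MIRRORS.get(registry)
    return f"{mirror}/{rest}" if mirror else image
```

The real webhook then patches `spec.containers[*].image` (and initContainers) in the AdmissionReview response with the rewritten reference.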
Inside the cluster my containers are Linux only. I don't believe kata-containers supports Windows containers, as I don't think rust-vmm (which is used by Cloud Hypervisor) or the kata internal execution agent supports it.
If I wanted to run Windows in the cluster I'd probably have to look at KubeVirt. KubeVirt is oriented towards getting traditional VM workloads (ones you'd run in QEMU, Hyper-V, etc.) functioning in a Kubernetes environment, while kata-containers is oriented towards giving container-runtime-based workloads (images that run on Docker, containerd, CRI-O) the protection of virtualization, with minimal friction.
Previously external to the cluster I had some Windows VMs hosted on QEMU/KVM + libvirt for experimentation with Linux and Active Directory integration, but they've since been deleted. The only remaining traditional VMs I have are 2 DNS servers and one OpenBSD server for serving up update images to my routers.
For network infra I have a number of VyOS firewalls both at the edge and between VLANs, and Mikrotik devices for switching.
The thing that kills me about all of this is how hard it is to do it right. I wish there were a dumbed down version of containers and orchestrators for people trying to do basic multi-tenant compute in a SaaS and don't care a ton about the best performance.
Would I be generally ok if I use gvisor to give a shell environment to customers and just keep the host up to date?
Or is using containers just relatively pointless for multitenant compute in a SaaS compared to giving customers virtual machines?
If you can't imagine the kind of SaaS I'm talking about, think something along the lines of GitHub's new online IDE, Codespaces.
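For context, the gVisor setup I'm imagining is just another RuntimeClass (`runsc` is gVisor's containerd handler; the pod names are hypothetical):

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc                 # gVisor's containerd shim
---
apiVersion: v1
kind: Pod
metadata:
  name: tenant-shell           # hypothetical per-customer pod
spec:
  runtimeClassName: gvisor     # syscalls hit gVisor's user-space kernel, not the host
  containers:
    - name: shell
      image: ubuntu:22.04
      command: ["sleep", "infinity"]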
Multitenancy is difficult with containerization and not something I would recommend. It isn't what the technology is intended for. The ultimate example of multitenancy is actual platform and infrastructure providers and they all do it by giving you VMs because type I hypervisors are actually designed to do this kind of thing. Breakouts are always still possible when two processes are on the same physical server, but it's never as trivial as figuring out how to mount the kernel virtual filesystems.
I say this as a Kubernetes consultant. If you want "multitenancy" in the sense of distinct product or application teams all employed by the same parent company or organization, it's fine. But if you're talking truly different organizations with no implied trust between them, don't put them on a shared cluster.
I'm kind of curious how GitHub does this, because you can still get very minimalistic with VMs. Make the startup script for your application something that also mounts the filesystems it needs, name it /sbin/init, and you've just made yourself a poor man's unikernel.
That's very true, although I think there's a difference in attack surface size between the three isolation options (process based, sandbox based, hypervisor based).
I think the challenge for process-isolation container stacks (as I'm sure you know :) ) is that there are multiple components/groups involved in security, and then there's co-ordination with the underlying Linux kernel as well, which makes things tricky: Linux kernel devs will have potentially differing goals from the container people (e.g. the challenges of handling the interaction of new syscalls and seccomp filters).
If you compare that to something like gVisor, where there's essentially a single group responsible for creating/maintaining the sandbox, it's an easier task for them.
I think "dumbed down" and "multi-tenant compute" aren't compatible. No company needs to do multi-tenant compute by default. If you do, you are in the cloud hosting/infrastructure business (whether you like it or not) and should be expected to have the knowledge necessary to run such an operation.
Calling your guide the ‘ultimate guide’ is disingenuous marketing. No single guide can cover all security concepts in all contexts. Every time I see that sorta wording I just assume the writer doesn’t actually know what they’re talking about
Continued: and given the writer seems to be all about tools, the article fails to highlight that static (and automated dynamic) tools are limited in their ability to detect some classes of vulnerabilities and need to be backed by experienced manual testing. This almost feels like it's been written by a devops engineer who has a vague understanding of containerisation but doesn't have a clue about real and practical mechanisms to secure applications and services hosted inside containers.
I’m not saying the article is totally bad, but calling it an ‘Ultimate Guide’ makes the author a charlatan.
I'm always a bit confused about the CPU limit (for the pod): some guides (and tools) advise always setting one, but this one doesn't.
Ops people I've worked with almost always want to lower that limit, and I have to insist on raising it (no way they'd disable it).
Is there an ultimate best practice for that?
CPU limits are harmful if they strand resources that could have been applied. I usually skip them for batch deployments, use them for latency-sensitive services. Doesn’t seem like a security topic though.
Interesting. It's my impression too. I understand that a CPU limit will artificially throttle the CPU when not necessarily needed, wasting CPU cycles I could use.
(Java programs in my case but I imagine it's comparable to Go ones)
Do you recommend to disable CPU limit? In the general case.
I think this is backwards. How are you planning on “sticking to it” when you’re serving unpredictable user traffic? If requests are set appropriately everywhere, then it won't really starve batch jobs, as the kernel would just scale everything to their respective cpu.shares when the CPU is fully saturated. This would allow you to weather spiky load with minimum latency impact and minimize spend.
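Concretely, the pattern being described is "set requests, skip the CPU limit" (values below are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: latency-sensitive-svc  # hypothetical
spec:
  containers:
    - name: app
      image: example.com/app:1.0   # hypothetical image
      resources:
        requests:
          cpu: "500m"          # drives cpu.shares, i.e. this pod's weight under contention
          memory: "512Mi"
        limits:
          memory: "512Mi"      # memory limit kept: memory isn't work-conserving like CPU
          # no cpu limit: the pod can burst into idle CPU instead of being throttled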
Microsoft's guidance (last I looked) was that Windows containers (e.g. the non Hyper-V ones) were not a security boundary, only Hyper-V based Windows containers should be considered to provide isolation.
It just changes the complexity. Compare a container on bare metal, where the target is an adjacent application (or container image), with a container inside a VM, where the target is an adjacent application on the host (or inside another VM/VM+container): in the latter case the attack chain includes a container breakout *and* a hypervisor breakout, which is harder to do, but probably not beyond highly sophisticated threat actors.
Virtualization-backed container technologies are a definite security improvement over traditional containers (including Hyper-V), but most of the measures in this article are still important. Remember: defense in depth. Virtualization mainly protects against zero-day kernel exploits, limiting the "blast radius" to a single container. You still need to monitor dependencies, isolation, signing, scanning, and have a vulnerability management program, among other things.
Production host root fs should be mounted ro. Check out Linux IMA and how to allow only specific executables by hash. Centrally forward container logs. Use a VCS for container/workload templates and routinely audit for misconfigurations. Sysdig/Falco and related tools are nice, but containers and their prod hosts are easier to harden.