Nice, exactly what I needed today, after firefighting a latency spike caused by a compute node, which showed up after a separate ingress node was restarted.
I would make the taxonomy a bit more precise: failures in running software on the platform and failures in running the platform itself (which of course affect the software running over it: high latency, network packets dropped, DNS not resolved, ...)
In general I find that when the platform works as expected, it is not that easy to make software run on it fail. That is, it is harder than without what Kubernetes provides (granted: you can have it without Kubernetes, but many of us didn't bother to have capable admins setting things up the right way).
What I find extremely fragile is the platform itself. It works, but you have to start and stop things in the right order, navigate a sea of self-signed TLS certificates that can expire, iptables rules and services and logs.
All have failure modes you need to learn: it takes a dedicated team. And once you have that, you'll need another team and cluster to perform an upgrade.
But hey, when it works, it is really cool to deploy software on it.
I feel like there needs to be a sanity check. Realistically what does kubernetes do to "not make software fail"? Health checks? Autorestarting containers that crash? Enforce various timeouts? Relatively quick distributed configuration deployments? These can be hard problems for sure, but do you need all of kubernetes to get these benefits in a production application?
Is it worth going with those? I'm torn between deploying stuff on Kubernetes (on a managed control plane, of course, which you can get for free nowadays with providers as long as you use their own machines) and rolling my own with the Hashistack.
Depends on your use-case, really. IMO, Hashistack is fine if you have well-defined requirements. You can pick and choose the components you need (including non Hashi stuff) and integrate them with relative ease, keeping resource footprint and operational complexity at a minimum. I’ve found that it’s quite easy to pinpoint problems when they arise.
Kubernetes is more of an everything-and-the-kitchen-sink approach that you can’t really outgrow, but because of its monolithic design, debugging can be quite challenging. It does, however, come with a much larger and much more active community that you can go to for help.
Load balancing should be a solved problem already. Swarm and Kubernetes should be using dead simple off-the-shelf software for ingress and load balancing. Any competitor should be able to use the same solutions. To put it another way, this shouldn't be a differentiator.
The problem is that the functionality in tools like nginx is still tied to static network architectures that evolve slowly, and doesn't take advantage of things like diurnal variability in workloads.
Kubernetes does use dead simple off-the-shelf software for ingress and load balancing. That software though, unfortunately, has a lot of knobs, and what "Ingress" and "Service" resources do is make sure those knobs are turned to the right settings.
The nginx ingress controller for example, under the hood, just generates and applies a big ol' nginx config! You can extract it and view it yourself, or extend it with a Lua script if you want to be fancy and do some FFI inline with processing the request, etc.
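To illustrate the "generates a big ol' nginx config" idea, here is a toy sketch in Python. It is not the ingress controller's actual template or code; the function name, the `upstream_N` naming and the input shape (hostname mapped to a list of pod endpoints) are all assumptions made up for the example. It only shows the general pattern of turning declared routes into plain nginx directives.

```python
def render_nginx_config(routes):
    """Render a minimal nginx config from {hostname: ["ip:port", ...]}.

    Hypothetical helper for illustration only; a real controller emits far
    more settings (TLS, timeouts, buffers, and so on).
    """
    lines = []
    for index, (host, endpoints) in enumerate(sorted(routes.items())):
        upstream = f"upstream_{index}"
        # One upstream block per hostname, listing every backend endpoint.
        lines.append(f"upstream {upstream} {{")
        for endpoint in endpoints:
            lines.append(f"    server {endpoint};")
        lines.append("}")
        # One server block that proxies the hostname to that upstream.
        lines.append("server {")
        lines.append("    listen 80;")
        lines.append(f"    server_name {host};")
        lines.append("    location / {")
        lines.append(f"        proxy_pass http://{upstream};")
        lines.append("    }")
        lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_nginx_config({"app.example.com": ["10.0.0.1:8080", "10.0.0.2:8080"]}))
```

Every time the set of endpoints changes, a generator like this would be re-run and nginx reloaded, which is roughly the "turning the knobs for you" part.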
The easy way to do this is with NodePorts, wherein you configure your LB with all the nodes in your cluster being app servers on a certain port for a certain app. However you will lose some performance as there's some iptables magic under the hood.
Beyond that there's a sea of options for more native integrations that will depend on whether your LB vendor has a K8s integration, how friendly your networking team is, and how much code you're willing to write.
> navigate a sea of self-signed TLS certificates that can expire
When certificates become a logistics category of their own, you really do need some monitoring software to warn you when certificates are winding down.
Years ago I worked on a complex code signing system and one set of users were adamant we have alerts starting at 90 days out. It took some convincing to get me to agree it was a priority, but some of my coworkers were never convinced.
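For what a 90-days-out alert boils down to, here is a minimal sketch in Python. It assumes you already have each certificate's notAfter value in the text format OpenSSL prints (for example "Jan  1 00:00:00 2030 GMT"); the function names, the certificate labels and the default threshold are illustrative and not taken from any particular monitoring product.

```python
import ssl
import time

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after, now=None):
    """Return whole days until notAfter (negative once the cert has expired).

    not_after uses the OpenSSL text format, e.g. "Jan  1 00:00:00 2030 GMT".
    now is a Unix timestamp; it defaults to the current time.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return int((expires_at - now) // SECONDS_PER_DAY)


def certs_needing_attention(certs, warn_days=90, now=None):
    """Given {label: notAfter}, return labels expiring within warn_days, soonest first."""
    due = []
    for label, not_after in certs.items():
        days = days_until_expiry(not_after, now)
        if days <= warn_days:
            due.append((days, label))
    return [label for _, label in sorted(due)]
```

Feeding this from a cron job that walks the cluster's certificate directories would be one way to use it; where the certificates actually live depends on how the cluster was installed.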
> a sea of self-signed TLS certificates that can expire
I would like to know more about what's going on here. Is this just a sloppy description and in fact Kubernetes uses a private PKI, so that the certs you're using aren't in fact self-signed but signed by a private CA?
Kubernetes and cloud native software make a lot of use of TLS for mutual auth.
A standard Kubeadm cluster (very vanilla k8s) has 3 distinct Certificate authorities, all with the CA root keys online.
On top of that things like Helm, FluentD and Istio will make use of their own distinct TLS certs.
One of the most "fun" pieces is that k8s does not support certificate revocation, so if you use client certs for AuthN, then a user leaving/changing job/losing their laptop can lead to a full re-roll of that certificate authority :)
I've seen short-lived certs as a suggested workaround for the lack of revocation and as a user that might be the best option.
That said the distributions I've seen that make use of client certs, don't do that (typical lifetime for a client cert is 1 year), so I'm guessing a load of people using k8s will have these certs floating about...
Using a similar pattern (tickets I guess) doesn’t mean you’re reinventing something.
You might equally say we’re reinventing “authenticated sessions”. More in common with jwt cookies tbh.
We don’t want to run krb infrastructure so we don’t do that.
The runtime dependency on ldap or NIS, plus keeping krb itself HA, fed and happy plus OS dependent PAM setup make krb fairly undesirable in a production cloud environment if ssh certs and kube certs would suffice.
> so that the certs you're using aren't in fact self-signed but signed by a private CA?
I find that most people confuse or combine "self-signed" with "signed by a private CA". For a lot of uses, the configuration pains are the same to the user: "I have to load this cert into the CA root trust store". They don't realize how much better a private CA really is.
And of course, PKI would be so much more useful with "name constraints" so you don't have to trust a private CA for all domains just the one domain you care about.
As a technology, properly-implemented self-signed certs are totally fine. The problem is that k8s does not have the features necessary to use self-signed certs securely. Instead, it expects you to create your own CA (or CAs: you can use separate ones for different kinds of communication if you want extra bulkheads) and then to share out your private CA's cert to all the k8s components. This achieves your goal of cutting out MITM attacks via unscrupulous commercial CAs while also making it possible to trust families of certs for a given purpose, rather than having to whitelist every single consumer's private key.
Think of it as a cost and effort threshold. Prevents the dragnet / fishing methods from eavesdropping. It's trivial to force $Company to let you in with a letter. The effort to break encryption is not trivial. You have to be doing something wrong to get specific attention.
You know what's worse than Kubernetes complexity? A shallow abstraction of it, built by some mediocre team to get a promotion, that leaves everyone else struggling with something you can't get help with from anywhere else. This happened at the past two unicorns that I worked for. They templated the templates and wrapped `kubectl` with some leaky abstraction.
A long time back, I had gone down a rabbit hole to get Kubernetes to manage my MongoDB. Thankfully, I got bitch slapped into reality within a few days and gave up because that rabbit hole goes deep. Now my MongoDB stands outside of K8s and I have no issues with it that can't be solved with a rebuild of a replica node.
Honestly I'd only think of running Mongo on Kubernetes if my organization is thoroughly bored with how well they've managed Mongo upgrades, scaling, outages, migrations, and tuning over the last year on a traditional VM setup.
Even so, if everyone's bored with the stability, why change it?
I'm an advocate for databases in dev/test environments on K8s due to ease of deployment, but there's too many moving pieces around storage for it to be a great idea for production out the gate.
Well. The benefits (monitoring, telemetry/observability) are nice enough that I might think about running it on k8s in prod. But only after sufficient number of dev/staging clusters have been sacrificed.
Don't use anything that doesn't come with a very heavily battle-tested Operator (one that does periodic backups and can actually restore from them). Simple StatefulSets are ... cute, but are very far from a reliable solution. Because when the shit hits the fan you won't be able to efficiently fiddle with containers, persistent volumes, secrets and configs (tons of YAML and so on) to try to somehow salvage the setup.
At one of my former companies, we had a newbie coder who was just getting fascinated by Kubernetes. Unfortunately, every project this person touched would be left half-way by him and he'd move on to another project.
This company invested around $1 million into a very new SaaS product that had a lot of potential. Now, I'm an old school guy. I like systems where I don't touch devops. The cost for such a PaaS is usually high, but it saves a lot of my time, which I value more than money (and it is valuable, in many cases). I advised the management to use something like AppEngine which is freakin' awesome for this kind of large scale project.
Unfortunately, this newbie coder was management's pet. They bit the bullet and went on with his advice to use Kubernetes instead. The system randomly failed, they spent tons of time doing devops and microservices on what should have been a simplified monolith. The development time elongated to almost 2 years. By this time, there were competing offerings in the market for much cheaper.
This cost an entire team's morale, which led to missed deadlines and product launches. This led to many missed opportunities. For instance, there was a HUGE client we were demo'ing this SaaS product to. It failed spectacularly on a relatively light load (I've hosted Rails apps on AppEngine that can handle FAR better) and we lost that product and the client.
It was still not too late and I insisted that management identify the root cause and switch to a managed PaaS stack given how resource constrained we were. But this newbie coder turned it into a political situation ('mine vs yours'). I tried various ways to resolve this, but it didn't work out.
As a result, I said 'fuck this, I'm out of here' and I quit. In less than 3 months of my departure, about 10 people quit including one of the senior managers. In about 6 months, the company shrank from a double-digit to a single-digit headcount. The company almost went into bankruptcy.
The company lost its $1 million investment. Everyone left. The product development was put on indefinite hold. Finally, this newbie coder left as well. The founders had to rebuild their companies from scratch.
Coincidentally, I went on to become an AppEngine consultant. I'm able to run unbelievably large monoliths in production, with almost zero downtime for 3 years in a row simply because I chose to avoid devops, and most importantly infrastructure complexity. I pay more, but it easily is worth my ROI. Most of the time, microservices aren't required and we can get away with a simplified monolith. And if you can get away with monoliths, you probably don't need Kubernetes.
> if you can get away with monoliths, you probably don't need Kubernetes
Right, especially if you're doing microservices the "right way" which means, each service has not only its own app setup but its own data store. I am responsible for (among other things) maintaining and scaling my company's infrastructure, and while we do have some vague concept of microservices (app, api (monolith), queuing system, various ML apis, headless browser, etc) it is always always ALWAYS "what's the simplest way to do this?" Microservices are a ton of moving parts, and adding in their own data stores is another can of worms.
We have a need for a scheduler. We are using Nomad internally. I chose this over Kubernetes. Why? I know that Kubernetes has more muscle behind it, can do more "things" we might need in the future, etc. I chose it because of its operational simplicity. It strikes a perfect balance of "scale these three things up" without needing a dedicated ops team to run it.
So far it has performed great, and I really think it comes down to limiting the number of moving pieces. Don't get me wrong, microservices are great if you have an ops team and a 50+ dev team all working on different things. But I often see microservices being pushed without warning people of the inherent costs involved.
I'm sure kubernetes is great, and I'm continuously re-evaluating where we are and what we need, but for now simpler is always better. If we do go with Kube, it will likely be a managed version so I don't have to touch the internals...not because I don't want to learn them (I definitely would want to know all the grimy details) but because I just don't have the fucking time to deal with it.
I think programming as a profession provides three main services:
1. Implementing user/client features (meeting demand)
2. Designing and maintaining an architecture flexible enough to continue adding features at a reasonable pace
3. Reducing complexity, which primarily involves picking the right primitives and reducing the number of variables as much as possible
2 and 3 are really just in service of 1, but the fact is that if you don't invest in flexibility/leverage/optionality and proactively keep yourself from getting swamped by an overwhelming number of "moving parts," your productivity will creep to a halt and formerly cost-effective features will no longer be economical, destroying your capability to provide customer value.
How do you use Nomad? Do you run Docker containers with it? I don't need the scaling, I need the part where new versions of code are deployed automatically, services are connected together, and basically things update without having to run ops checklists every time.
Kubernetes feels too heavyweight and deploying machines feels too snowflakey...
We use nomad a few different ways. One is system jobs which run DNS and Fabio (the networking fabric layer) on the host machines. Then, yeah, for all the apps and services we use Docker containers which I've found runs pretty well...with one caveat: make sure if you're spinning jobs up/down a lot you get a fast disk. For instance, we were running on 64gb ebs gpio volumes and Docker was starting to grind to a halt on them (we spin workers for our queue up/down quite a lot). We started using instances with ephemeral SSDs instead which has a bit more operational complexity on init, but overall works really nicely.
We also use Nomad for deployments as well. I wrote some deploy scripts by hand that create the Docker containers and load the templated Nomad jobs ie
I'd say as far as "take this container and run N instances of it and load balance requests at this hostname to them" Nomad has been pretty great. There is definitely some work involved getting the comm fabric all set up exactly how you want (Fabio does make this easier but it's still work). Consul now has Connect (https://www.consul.io/docs/connect/index.html) which I haven't looked at yet which might alleviate a lot of this. I think some of our complexity also came from the fact that we do have TCP services we needed to load balance and most of the service fabric stuff forces HTTP onto everything.
Overall my experience with Nomad has been great. It's capable and really not too difficult to operate for one person who also has tons of other stuff going as well =].
Not sure, honestly. I've never used Kube, just taken a preliminary look at the docs and been scared away by how much abstraction there is. While providers may manage it for us, I'm not sure to what extent they manage it. We're on AWS and I haven't been super happy with the responses/response times of their support, so when dealing with unknowns I'd rather not rely on someone else.
That said, Nomad hasn't been without problems. It's just that the problems seem to be easier for one person to solve. I set all this up almost a year and a half ago and haven't touched it much since, so it's possible Nomad, Kube, and managed services have all come a long way and now is a good time to re-evaluate.
I can't see how Kubernetes belongs to this problem. It's just the tool for getting specific things done, not some kind of silver bullet. You can build your application layer around Kubernetes, leveraging its primitives and integrating your services tightly, or you can leave it as is (wow).
I currently maintain a few clusters with a managed control plane for hosting some monolith apps. Just a couple of fat worker nodes for failover, ingress with nice metrics and benefits of containerization... and that's it. No need to push for fully decoupled microservice architecture with service mesh that eats more CPU than your app (sarcasm, but not so far from the reality). Just use your tool right.
I got enough of random 503’s, network errors, slowdowns on mid-to-high traffic and dropped TCP connections yesterday. I set up dedicated VPS’s for our main app servers using Ansible and good old-fashioned Capistrano. It feels almost old-school, but so much more stable.
I'm torn on this list. It is important to learn from the mistakes from others, so I like it (postmortems are great for this reason), BUT it feels like folks are using these examples as reasons to stay away from Kubernetes. There could be a significantly larger list of system failures where K8s is not involved. Similarly there could probably be a list of "Encryption Failure Stories" but that doesn't mean we shouldn't encrypt things.
As an industry, one of the things we do pretty well is identify the most viable patterns to solve a problem and then develop and adopt the best primitives of those patterns. This is what Kubernetes is for creating reliable, scalable, distributed systems.
> k8s will only redeploy containers when a node fail.
That's not true at all. Kubernetes monitors containers' health and automatically scales deployments according to the demand. Thus quite obviously kubernetes does help make your system distributable, scalable and reliable. In fact, that was the design goal of kubernetes.
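On the "scales deployments according to the demand" part: the rule documented for the Horizontal Pod Autoscaler is desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). A minimal Python sketch of just that arithmetic follows; the function name and the min/max arguments are illustrative, and the real controller also applies a tolerance band and stabilization behaviour that are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Apply the HPA scaling ratio, then clamp to the configured bounds.

    Simplified sketch: ignores tolerance, missing metrics and pod readiness.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))
```

For example, 4 replicas averaging 90% CPU against a 60% target would ask for 6 replicas, while 2 replicas at 10% against a 100% target would fall back to the minimum.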
There are distributed primitives in Kubernetes (such as leader lock), so you won't have to roll your own "architecture". For example, look at Patroni, it can be run natively on Kube, using its resources for all distributed and redundancy goodness.
In my opinion many of these issues with Kubernetes arise because it is not the right abstraction for application developers. You are forced to think about too many low level details to get a stable application running.
Many companies forget that they are in charge of getting way more intricate details of the application runtime right when they shift their application to Kubernetes.
In some of my recent experience I have seen a company shift an application from app engine to Kubernetes. However they forgot that appengine is using an optimized jvm to run your application. When they deployed to Kubernetes their app ground to a halt. They forgot about all the observability that comes out of the box in app engine, thus resulting in an unobservable badly performing application.
Hopefully more domain specific runtimes built on top of Kubernetes can help application developers deploy to a kubernetes cluster in a sane way. I am putting some of my money on knative, however it is still quite hard to convince clients to invest in this area.
Kubernetes does not prevent you from using an optimized JVM to run your application. In fact it doesn't even involve itself at that level rather it focuses just on managing containers which can have anything inside them.
And Kubernetes actually has far more observability than most distributed applications since you can get plugins which trace/monitor every connection between containers as well as monitor the health of each container.
I think what you're after is a PaaS on top of Kubernetes ? Well there are many of those around as well.
Generally, things that answer questions along the lines of:
* is my app running
* how long for?
* if there are associated jobs/worker processes, did they start/complete successfully and how long did they run for
* what’s the state of the pod(s) + host instances my application is running on.
* are all the necessary sub components up and running and communicating properly (and are they staying up).
It’s not quite monitoring/logging, although they are related.
These are all very informative and even amusing. But this is to be expected. Kubernetes is an enormous system (or ecosystem of systems actually). It's really hard to understand all the pieces and components, even the built-ins.
I still struggle (after having set up 2 clusters from scratch) with understanding networking, especially Ingresses and exposing things to the public internet.
As the owner of the linked GitHub repo (also rendered on https://k8s.af --- thanks to Joe Beda), I highly encourage everyone to contribute their failure stories (I'm still looking for the first production service mesh failure story..).
Also be aware of availability bias: Kubernetes enables us to collect failure stories in a (more or less) consistent way, this was previously not easily possible (think about on-premise failures, other fragmented orchestration frameworks, etc) --- I'm pretty sure there are many more failure stories in total about other things (like enterprise software), but we will never hear about them as they are buried inside orgs..
After a couple of years of massive frustration with the entire direction of the "devops" segment, I think I'm resolved to get out of it and either move further up back to ordinary application development or further down into the actual kernel, preferably working on FreeBSD or some other OS that's more sane and focused than Linux.
Kubernetes represents the complete "operationalization" of the devops space. As companies have built out "devops" teams, they've mostly re-used their existing ops people, plus some stragglers from the dev side. These are the people you hear talking about how great Kubernetes is, because for them, they see it as "run a Helm chart and all done!". Which makes sense, since they were, not too long ago, the same guys fired up about all the super-neato buttons to click in the Control Panel. 90% of "devops" people at non-FAANG companies are operations people who just think of it as a new name for their old job.
Among this set, there's no recognition of the massive needless complexity that permeates all the way through Kubernetes, no recognition of the tried-and-tested toolkit thrown away and left behind, no recognition of the fact that we're working so hard to get things that we've had as built-in pieces of any decent server OS for decades. No recognition that Kubernetes exists so it can serve as Google's wedge in the Rent-Your-Server-From-Me Wars, and no awareness that just in general, there's no reason it should be this hard.
Of course, to them, it's not hard. They have an interface with buttons, they can run `helm install`, they get pretty graphs via their "service mesh". That's what I mean by "operationalized"; Kubernetes is meant to be consumed, not configured. You don't ask how or why. You run the Minikube VM image locally and you rent GKE or EKS and go on your merry way. The intricacies are for the geniuses who've blursed us with this death trap to worry about! Worst-case, you use something like kops. Start asking questions or putting pieces together beyond this, and you're starting to sound like you don't have very much of that much-coveted "devops mindset" anymore.
"What happens if there's a security issue?" Oh, silly, security issues are a thing of the past in the day and age where we blindly `FROM crazy-joes-used-mongodb-emporium:latest-built-2-years-ago`. Containers don't need updates, you goose. They're beautiful, blubber-powered magic, and the Great Googly Kubernetes in the sky is managing "all that" for us. Right on.
I'm picking on Kubernetes specifically here because it's the epitome of all this, but really everything in "devops" world has become this way, and combined with the head-over-heels "omg get us into the cloud right now" mentality that's overtaking virtually every company, it's a bad scene.
Systems have gotten so much more convoluted and so much dumber over the last 5 years. The industry has a lot to be embarrassed about right now.
> After a couple of years of massive frustration with the entire direction of the "devops" segment,...
> Kubernetes represents the complete "operationalization" of the devops space.
It's funny that I have all the exact same concerns and misgivings as you, but the exact opposite feeling (ie reactionary bias) of where it has come from :)
To me in my less charitable curmudgeonly cynical moments (usually when tracking down a Docker or Kubernetes problem), it feels like "ops" was overrun by talented but inexperienced devs adding too many layers of abstraction too quickly CADT style funded by large valley tech companies and VC money. And constantly changing them as fashions come and go. It's like the React ecosystem having a go at Go.
For all the faults of the culture of slower moving traditional systems software written in C etc, we didn't run into so many issues with the level of bugginess or with tooling that gets abandoned/replaced before we've even finished evaluating it. It's like Go has done for systems software quality what PHP did for web software.
To be fair Docker is maturing now, and Kubernetes itself has too now, but the gap between the now relatively stable low level functionality of core kubernetes and what app developers need at a higher level is a gulf of churning crap.
Yes, there's certainly a landgrab occurring in this space, and Google is certainly pouring a massive heaping of G-Juice on k8s in order to get in front of the user. In software, controlling the user interface is controlling everything.
Ultimately, it's to their credit, because in some small measure, it counteracts an Amazon-controlled dystopia headed by Galactic Emperor Bezos, so perhaps we should be glad for it. It's just sad that so many systems have to be the collateral damage.
> ..people at non-FAANG companies are operations people who just think of it as a new name for their old job. Among this set, there's no recognition of the massive needless complexity that permeates all the way through Kubernetes.
Kubernetes was born out of ideas from Google's own internal systems. I think this discounts the complexity of operational platforms at the large companies. Companies where they build operational APIs straight into their services. It may not be the right tool for every job, but being so overly dismissive of complex operations platforms comes off as extremely pretentious.
The implicit message is that there are about 5 companies in the world who need Kubernetes' (and friends) inherent complexity. If your company is not named "Amazon", "Google", "Facebook", or "Netflix", it's probably not one of them.
I am not convinced any of them need k8s or even containers.
I am convinced many people use these technologies because it is simple and abstracts away the required skill sets to do complex operational tasks but only covers about 85% of those, leaving operation teams with limited skill sets exposed when that other 15% happens.
Same here. I have done devops for 10 years (it used to be called systems engineering) and have implemented containers once and Kubernetes zero times. I successfully made companies avoid Kubernetes several times. Most of the time tool focused people think that the next thing is a miracle that makes every problem go away, which is certainly not true. If you define the problems first and try to find solutions then you get much better results. Choosing a tool and trying to find problems it solves is idiotic. Just like using Kubernetes instead of something much simpler.
You are implying that we need container migration. I give you one example of how you do not need that. Let's set up an autoscaling group with node count = 10. On node error you terminate the node in trouble. Autoscaling detects that you have less capacity than required. Creates a new instance. You are good. This is a much simpler solution to the problem than trying to think where your container fits, avoiding cascading problems and trying to track the state of your cluster.
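The terminate-and-replace routine described above can be sketched as a tiny in-memory simulation in Python. Nothing here talks to a real cloud API; the class, the node naming and the `is_healthy` callback are all hypothetical stand-ins, meant only to show how little state the approach needs: a desired count and the current set of nodes.

```python
import itertools


class AutoscalingGroup:
    """Toy model of a fixed-size group that replaces whatever is missing."""

    def __init__(self, desired_count):
        self.desired_count = desired_count
        self._ids = itertools.count(1)
        self.nodes = set()
        self.reconcile()

    def terminate(self, node_id):
        # Removing a node simply leaves the group under capacity.
        self.nodes.discard(node_id)

    def reconcile(self):
        # Launch new nodes until capacity matches the desired count.
        launched = []
        while len(self.nodes) < self.desired_count:
            node_id = f"node-{next(self._ids)}"
            self.nodes.add(node_id)
            launched.append(node_id)
        return launched


def replace_unhealthy(group, is_healthy):
    """Terminate every node failing the health check, then refill the group."""
    for node_id in sorted(group.nodes):
        if not is_healthy(node_id):
            group.terminate(node_id)
    return group.reconcile()
```

With a group of 10, marking one node unhealthy and calling `replace_unhealthy` terminates it and launches one fresh node, with no placement decisions involved.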
And to be clear, "container migration" is nothing more advanced than this "take the failed system out of service and spawn it somewhere else" routine that we've been doing since time immemorial.
Things like VMware's vMotion or KVM's live migration can detect trouble and transparently move the running VM -- network connections, memory state, and all -- onto a different underlying host, leaving end users none the wiser.
That's the kind of thing noobs throw out the window when they come in fired up about how k8s has health checks, because they've never heard of health checks before.
People are trying to recreate container-level mobility with things like CRIU, which is very cool, very cutting edge stuff, but right now it's so unpredictable even the k8s crowd is afraid to touch it. ;)
Once these things eventually pan out and the functionality gets tangled up into some contrived k8s-ified YAML abomination (aka "blessed by the Great Googly Appendage"), when the k8s fanboys start hailing it as the savior of all computerdom, I'm sure I'll be making posts like "Uh, hi, this is awesome tech, but live migration anyone? We've had this for a long time..."
CRIU has the potential to open many doors, and while true live container migration may be one of them, it's not really the exciting one, primarily because we've already had fatter-but-functionally-similar support for server workloads through VM migration/snapshots. CRIU's promise is in all the things we can do with process-level "save states" that aren't just "moving 300MB of RAM at a time instead of 8G".
I’m sure the team I’m in (10 total, of which none of us are “devops”, and only half are devs) could probably do our stuff without containers and Kubernetes, but honestly, I can make a commit to a repo, and in a couple of minutes the latest version of my code and dependencies is up and running, with things like SSL certs, DNS records and domain names sorted. All logging is sorted (I don’t have to worry about logging libs and connections to logging services, I just write to the console and it’s all collected and indexed in ElasticSearch); if my service goes down Kubernetes brings it right back up in a matter of moments; I don’t have to provision anything; underlying instances are automatically replaced if they fail; and resource requirements are automatically handled, as is horizontal scaling.
So, so, so much is done (near) out of the box that we don’t need to worry about, it’s amazing. I’m sure there’s a pre-Kubernetes way of doing it, but I don’t imagine it’s nearly as low friction.
When that 15% occurrence happens, community forums exist; failing that, we pay for great support from our cloud provider, and if absolute worst comes to worst, we can just blow away the whole thing and redeploy because our whole environment is set up so that you can go from blank slate to apps running in production again in just under an hour.
Ideally your automation and everything else is done so you have a reproducible environment. If that is the case and you recreate the environment when you run into that 15% problem you should be recreating the problem.
Yes, there were methods of doing the same thing pre-containers. Smoother and simpler, IMO.
Simplest was to ship everything in a tarball using embedded deps so everything was inclusive. Production was pointed to the latest release via a link. A new release was done and you simply changed the link. Rollback was as simple as changing the link back to the previous version.
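The link-switching scheme described above can be sketched roughly as follows. This is only an illustration, not anyone's actual deploy tooling: the helper name `activate_release`, the `current` link name, and the one-directory-per-release layout are all assumptions made for the example. It creates the new symlink under a temporary name and renames it over the old one, which on POSIX filesystems makes the switch atomic.

```python
import os
import tempfile


def activate_release(releases_dir: str, version: str, link_name: str = "current") -> str:
    """Point `link_name` at releases_dir/version and return the link path.

    Hypothetical helper: the layout (one directory per unpacked tarball,
    plus a `current` symlink next to them) is assumed for illustration.
    """
    target = os.path.join(releases_dir, version)
    if not os.path.isdir(target):
        raise FileNotFoundError(f"release not unpacked: {target}")

    link_path = os.path.join(releases_dir, link_name)
    tmp_link = link_path + ".tmp"

    # Clean up a stale temporary link left behind by an interrupted run.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)

    # Create the new (relative) link beside the old one, then rename it into
    # place. Renaming over an existing symlink replaces the link itself.
    os.symlink(version, tmp_link)
    os.replace(tmp_link, link_path)
    return link_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for v in ("v1", "v2"):
            os.mkdir(os.path.join(root, v))
        activate_release(root, "v1")
        link = activate_release(root, "v2")  # deploy the new release
        print("serving:", os.readlink(link))
        link = activate_release(root, "v1")  # roll back
        print("serving:", os.readlink(link))
```

Deploy and rollback are the same operation with a different version argument, which is most of the appeal of the approach.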
> So, so, so much is done (near) out of the box that we don’t need to worry about, it’s amazing. I’m sure there’s a pre-Kubernetes way of doing it, but I don’t imagine it’s nearly as low friction.
My suggestion would be that you not rely on your imagination, but actually look into some non-k8s options. NixOps is a good place to start. You'll be pleasantly surprised how much simpler it can be, and may even get an improved emotional connection with the greybeards out of it.
I'm not sure I communicated that clearly -- I meant that people NOT at Google just upgraded the job title. There's no question that some extremely qualified people work on the Kubernetes platform itself.
The issue is that they've thrust it upon the rest of us lowly mortals as a general toolkit, when it's only potentially-appropriate for companies at Google scale, in terms of both traffic and talent.
I don't think Kubernetes is necessarily overly complex. I use it for a side project, and knowing the config primitives, it's been pretty easy to set up a web app with postgres, redis and a load balancer on a single node hosted on DigitalOcean. Since I'm already familiar with k8s from work, I find the maintenance of the mini cluster to be pretty hands-off.
> - Straightforward upgrades of the environment to incorporate security patches
How do you ensure that your exposed containers have all the relevant security patches, especially if the images aren't uniform? Are you using something like Watchtower to monitor for vulnerable packages and automatically rebuild and redeploy the containers when e.g. the underlying Ubuntu or Alpine image uses a vulnerable library?
Lots of people have the mistaken impression that containerization inherently protects their application from running vulnerable code. If you already have this built in to your pipeline, I'll be impressed!
The funny thing being that all of these things are applicable in a k8s-backed environment anyway. k8s is a container orchestrator and scheduler. You still need a platform (VMware, AWS, GCloud) and configuration management separately. So, I guess the answer is "yes"?
>How are we dealing with node failures
First, in an ordinary system, "node failures" are rare. This thing that k8s encourages where the first response to a problem is "just kill the pod and hope it comes back OK on its own" is distasteful in many ways, and only further cements why MS-centric ops people are gung-ho. In my day, when there's a problem with a system, you divert traffic, take a snapshot, analyze to figure out the reason it failed, and then prevent failures. Even in k8s-world, failures cost something, and it's bad practice to allow them to occur regularly in production (note that making "just reboot^H^H^H^Hschedule it" a routine is separate from being able to tolerate failures transparently).
Second, you use the same systems that k8s ingresses use internally: load balancers with health checks like HAProxy, nginx, Envoy, etc.
> How are you making good usage of your compute, VMs?
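As a toy model of what "a load balancer with health checks" does, here is a minimal sketch. It is not how HAProxy, nginx, or Envoy are implemented (those run active checks on timers with rise/fall thresholds); the class name `HealthCheckedPool` and the injected `is_healthy` callable are assumptions for illustration. In practice the check would be something like an HTTP GET against a status endpoint.

```python
import itertools
from typing import Callable, Iterable, Optional


class HealthCheckedPool:
    """Round-robin over backends, skipping any that fail their health check."""

    def __init__(self, backends: Iterable[str], is_healthy: Callable[[str], bool]):
        self._backends = list(backends)
        self._is_healthy = is_healthy  # hypothetical check, supplied by the caller
        self._cycle = itertools.cycle(self._backends)

    def pick(self) -> Optional[str]:
        # Try each backend at most once per call; None means everything is down.
        for _ in range(len(self._backends)):
            candidate = next(self._cycle)
            if self._is_healthy(candidate):
                return candidate
        return None


if __name__ == "__main__":
    down = {"10.0.0.2"}  # pretend this node has failed its check
    pool = HealthCheckedPool(
        ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
        lambda backend: backend not in down,
    )
    print([pool.pick() for _ in range(4)])
```

Traffic simply flows around the failed node; nothing about this model requires a container scheduler.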
First, not dedicating a bunch of it to redundant and unnecessary distributed systems mechanisms. k8s is not cheap or easy to run.
Second, there are robust toolkits and monitoring options for managing all kinds of workloads, including controlling scale. Depending on your platform and use case, there are a plethora of options. These problems are not new.
The crux of the issue is that k8s is misunderstood as a generalized toolkit for every common-but-not-trivial operational task, because that's how Google promotes it to maximize adoption and thus wedge themselves into a position of more control v. AWS. k8s is a complex multi-node container orchestrator and scheduler. If you didn't need one of those before you knew what k8s was, you probably don't need one now.
You didn't really answer the question and honestly I don't think you entirely understand the proposition, but it doesn't really matter...
Honestly, I disagree with you. Putting k8s on tin is a fairly nice system that gives you much of the power of IaaS providers that you potentially don't need anymore. It makes it easy to create logical environments, allows packing of compute, and gives you standard APIs to develop against, allowing you to move providers, share platforms, etc.
Now I do agree, you shouldn't generally have services that restart constantly, but some of the biggest headaches I've come across are sysadmins who assumed their systems would just work, and now they don't and it's a clusterfuck.
With regards to cheap, it doesn't cost any extra to get k8s as a managed service from Google. It's fairly cheap to use on DO, and if you need to run it on tin, it's a damn sight easier than bootstrapping the underlying IaaS in my opinion, so frankly I don't think your rant represents reality.
There are a lot of hackers in the industry who are new to running systems. I'd rather they use a framework than build bespoke, but that's me. And thus I ask again, what's better?
I mean, if you're asking me to prescribe a one-size-fits-all infrastructure solution for any problem, that's not a thing. The answer is "it depends". What's better is to understand the mechanisms at work and use what you need to get a robust, reliable solution.
If I were to prescribe a general infrastructure platform for your generic web app, the high-level would be
a) use something for infrastructure-as-code to define the resources needed, so they can be rebuilt/spawned in new environments on demand;
b) use something for config management and image construction, ideally something like NixOS that has reproducibility and dependency encapsulation as a fundamental part of the OS;
c) use a production-grade load balancer like haproxy or Envoy and configure it properly for the infrastructure that it sits on top of;
d) use a production-grade web server to serve requests, which is probably either Apache or nginx.
Note that just saying "boom k8s" doesn't really resolve these concerns. k8s schedules, routes, and executes arbitrary containers, which have usually been built by developers who don't know what they're doing, and which are likely to contain tons of random outdated junk as a result of cascading FROMs in the Dockerfile and stray files in the developer and/or CI pipeline's build context (which are frequently sensitive btw). Container images should be just like any other image running in prod: constructed by competent admins and controlled with an appropriate configuration and patch management mechanism. The fact that applying those changes executes an entirely new image instead of just restarting a service is really an implementation detail, it's not a solution to anything.
Your chosen k8s ingress is probably using nginx, haproxy, or Envoy under the covers, and you have to tune that either way, whether you call that a livenessProbe or a health check. There's nothing fundamentally better about the k8s YAML for this than the actual configuration file (and indeed, in the early days of k8s, I was hacking through the alpha-stage ingress plugin and editing haproxy configs by hand anyway), though I suppose if you have a use case where half of your applications need one load balancer and the other half need another, you may get some benefit here. That's pretty rare, though.
If you ever do hit appreciable load, you may find that your container's `npm start` server has some inadequacies for which "omg pay $CLOUDPROVIDER more money and spin more pods" may not be an adequate solution.
And lastly, k8s doesn't enter the picture until you have something to run it on so it doesn't do anything for your infrastructure-as-code problem, even if the answer to that is just "rent a Kube cluster from Google" -- something still has to actually go rent that, and it should be scripted.
So I mean, yeah, if you like Kubernetes because Helm charts are convenient, by all means go ahead and run Kubernetes. Just don't pretend like that magically solves the problems involved in a robust architecture, especially if you ever expect to tune or profile your systems.
You don't really have a solution to: Persistent Storage, Pod Security Policies, Admission Controllers (such as checking your deps have been security checked), Resource Utilization or Creating environments (you mentioned it but never specified it).
You don't actually understand how k8s does healthProbes.
Most of the problems you're leveling at k8s still exist with your technology choices, or any really. An incompetent admin is incompetent whatever the tool.
Containers lacking reproducible builds are a bit of a problem, but I doubt it's a problem solved by many/any tools in its entirety.
For the record, I've actually used k8s to run critical national infrastructure for major government services with considerable load, that categorically should not go down.
I've also used it as a shared platform for CI/CD in major orgs with lots of separate delivery teams. It's literally bliss compared to trying to create that with OpenStack, VMware or an IaaS provider.
You should really try the tool in anger before you dismiss it, because most of your points aren't legit pointing at what k8s gives you.
Also, as someone who is a competent admin, who has also been a software developer, having a sysadmin who can barely code trying to debug someone else's code at 4am is also a broken model. Having developers involved in support and building of things is so they can actually suffer the pain of code that isn't reliable, doesn't log or event, doesn't tell you it's started, doesn't have health checks, isn't easy to deploy, etc, etc.
> You don't really have a solution to: Persistent Storage, Pod Security Policies, Admission Controllers (such as checking your deps have been security checked), Resource Utilization or Creating environments (you mentioned it but never specified it).
Again, I'm not going to draft out a complete architecture for some hypothetical application. The point is that k8s leaves the fundamental questions unsolved, just like non-k8s. This is noteworthy because many people are apparently operating under the belief that k8s will "reduce complexity" by handling these core infrastructure problems transparently and intrinsically, and it doesn't.
> Most of the problems you're leveling at k8s still exist with your technology choices, or any really.
Right, that's exactly the point. Most people say k8s is worthwhile because they think it's a magic bullet that has self-contained and automatic remediations for core infrastructure concerns. That's because there's no way that the complexity is worthwhile if it doesn't. If you're left with the same basic set of problems regardless, what are you getting by putting k8s in there after all?
k8s is probably not the wrong choice 100% of the time. It's just that most people who are jumping on that bandwagon are simply jumping on a bandwagon, and flailing and screaming to everyone else that the bandwagon is a magical land of fairy tales and unicorns. If there is legitimate, real, well-vetted engineering rationale for selecting k8s for a particular use case, it should be selected, of course. Vague statements alluding to its mystic "literally bliss"-inducing powers do not comprise this, despite popular opinion to the contrary.
> You should really try the tool in anger before you dismiss it, because most of your points aren't legit pointing at what k8s gives you.
I have used it, repeatedly. Granted that the last cluster I ran in prod was a couple of years ago, and I'm sure things have improved in that time. But it doesn't change the fundamental equation of the cost/benefit tradeoff.
> having a sysadmin who can barely code trying to debug someone else's code at 4am is also a broken model.
Agreed, obviously. What does that have to do with Kubernetes?
You didn’t explain how node failures can be handled. In k8s a new pod is started (with a storage migration story).
Without that (and obviously people have been doing HA before k8s) you have to write your own fragile, buggy scripts or use software that’s more host-centric than container-centric. So, and this was the question: what do you do?
"Node failures" have been handled much more elegantly than k8s's simple "kill and rebuild" approach via live migration for at least a dozen years. Wikipedia lists 15 hypervisors that support it.
If you're interested, Red Hat has a very thorough guide on how to achieve this with free and open-source software. While you have to run VMs rather than containers, it's much more robust. There are proprietary options too.
And then, there's also the good old fallback method that k8s uses: just divert traffic to healthy nodes and fire up a replacement. There are many frameworks for that simple model, and there's no reason to pretend that it's done exclusively with handcrafted "buggy shell scripts", nor to pretend that "buggy shell scripts" are inherently worse than "incorrect YAML configs that confused k8s and killed everything in our cluster" (see OP for a compendium of such incidents).
I would really like to hear what's your proposed alternative for managing software. The world is digitising at a staggering pace and we have to deal with ever increasing density and complexity in software deployments. How would this look in your ideal world?
In my ideal world, it'd look a lot like SmartOS + NixOS, but that's an ideal. There is a massive middle ground between k8s and some hypothetical ideal, and k8s is completely on the "horrifying monstrosity that you shouldn't touch with a ten-foot pole unless you really have no other options" side of things.
Most server-grade operating systems include facilities that are robust, mature, compact, performant, and reasonably well-integrated, and for the things that aren't part of the OS, there is a long and glorious lineage of applications that can lay claim to those same virtues. Kubernetes makes use of many of them to do its work.
Those of us who've configured a router, load balancer, or application server independently are just perplexed when someone acts like k8s is the only way to handle these very common concerns. We're left asking "Yeah, but... why all this, when I could've just configured [nginx/Apache/haproxy/iptables/fstab]?"
The naive admin will say "because then you just have to configure Kubernetes!", but unfortunately, stacking more moving parts on top of a complex system typically hurts more than it helps. You'll still need to learn the underlying systems to understand or troubleshoot what your cluster is doing -- but then, I think part of what Google et al are going for here is that instead of that, you'll just rent a newer and bigger cluster. And I guess there's no better way to ensure that happens than to keep the skillset out of the hoi polloi's hands.
I assume that many "devops" people are coming from non-*nix backgrounds, and therefore take k8s as a default because they're new to the space and it's a hot ticket that Google has lavishly sworn will make you blessed with G-Glory. But systems have been running high-traffic production workloads for a very long time. Load balancing, failover, and host colocation had all been occurring at most shops for +/- 20 years before k8s was released in 2014. These aren't new problems.
Alan Kay has called compsci "half a field" because we're just continually ignoring everything that's been done before and reinventing the wheel rather than studying and iterating upon our legacy and history. If anything is the embodiment of that, it's Kubernetes.
I appreciate the lengthy reply and I sympathise with your concern regarding the cargo culting of technology trends. It's not the first time it has happened, nor will it be the last. I disagree in general with your view and I think that the past few years have brought tremendous innovation in the space: software defined networking, storage, compute; all available to the "hoi polloi", as open source high quality projects that are interoperable with each other and ready to deploy at the push of a button. And you know why this has happened? It's because Kubernetes, with all its complexity, has become the de facto standard in workload orchestration, and has brought all the large players to the same table, scrambling to compete on creating the best tools for the ecosystem. I am not so naive as to think Google didn't strategise on this outcome, but the result is a net positive for infrastructure management.
I also sense a very machine centric view in your message, and there is a certain beauty in well designed systems like SmartOS and NixOS. But you are missing the point. The container orchestration ecosystem, for all its faulty underpinnings (Linux, Docker, Kubernetes), is moving to an application centric view that allows the application layer to more intelligently interact with and manipulate the infrastructure it is running on. Taking into consideration the Cambrian explosion in software and the exponential usage of this software (tell me an industry that is not digitising?), this transition is not surprising at all.
Regarding the complexity of Kubernetes, some of it is unavoidable, especially considering everything it does and the move from machine centric management to cluster centric management. There are other tools that are operationally simpler (Docker Swarm, Nomad), but they definitely don't offer the same set of features out of the box. By the time you customise Nomad or Swarm to feature parity with Kubernetes you will end up with a similar-looking system, perhaps a bit better suited for your use case. The good part is that once an abstraction becomes a standard, the layers underneath can be simplified and improved. Just take a look at excellent projects like k3s, Cilium, LinuxKit and you will see that the operational complexity can be reduced, while the platform interface is maintained.
To summarise, I am very happy that Kubernetes is becoming a standard, and I am convinced that 30-50 years from now we will look at it as we look now at the modernisation of the supply chain which was triggered by the creation of the shipping container.
First, thanks for your response, it's been a good discussion.
I agree that there's a great deal more technology being made publicly available and that said technology is beneficial in the public eye. I don't necessarily agree that that technology is needed by most people, despite the Cambrian explosion you reference.
On machine centrism, if anything, containers have increased the importance of systems concerns, because now every application ships an entire userland into deployment with its codebase. As long as containers come wrapped in their own little distribution, each code deployment now needs to be aware of its own OS-level concerns. This is anything but application centric. If you want application-centric deployment, just deploy the application! True application-centric deployments are something like CGI.
A better understanding of the hardware+software stack and a return to fundamentals, instead of piling frameworks sky-high on top of each other, which makes it very hard or impossible to debug which layer is buggy when things go sideways.
> It is designed to solve a very real and very hard problem.
Completely agree. The thing is that the problem it's designed to solve has been badly misrepresented.
If your problem is "At Google, we have fleets of thousands of machines that need to cooperate to run the world's busiest web properties, and we need to allow our teams of thousands of world-class computer scientists and luminaries to deploy any payload to the system on-demand and have it run on the network", then something like Kubernetes might be a reasonable amount of complexity to introduce.
If your problem is "I need to expose my node.js app to the internet and serve our 500 customers", it's really, really not.
I actually agree with most of this. However, one must recognize that Google is not solely responsible for Kubernetes at this point. Under the banner of the CNCF, K8s is worked on by literally hundreds of companies and thousands of developers whose interests are not necessarily all aligned with the "Rent-Your-Server-From-Me" lock-in doctrine espoused by the large cloud providers. Many companies have an interest in making K8s easier to use from an operational perspective, which gives me hope for the platform's future.
I gotta say, I feel like I was having a much better time of it when Continuous Delivery was a logical outgrowth of Continuous Integration.
You just keep building farther and farther out from compilation through testing and packaging and installation and deployment. Everyone on the Dev team can keep up with the changes and understands where their code goes and when. That's what I thought DevOps was about.
Now it's a separate division that treats me like a mushroom: kept in the dark and fed bullshit.
In my experience I haven't had that much difficulty finding which problems k8s solves and which problems it doesn't, for me. I eased into k8s using Rancher and grokked stuff as I went along; a little Ansible and you can go a very long way if you want to use your own hardware.
Not everything has to be run like dedicated servers in a closet named after Lion King characters that you SSH into every day "just to check", and keeping software up to date is the same as it's ever been: pay attention and apply the patch, lol.
There seem to be a lot of people who are building large scale, well behaved applications on NoSQL systems and finding out "you know... for a lot of these applications you don't need 'MVCC/ACID or nothing.'"
Like I said, there's a place for those systems. Redis in particular is close to my heart. But many devs hear "NoSQL" and think "Great, I love JSON! Who cares about SQL, SQL is for fogies!"
I have little doubt that the staff at places like Google and Facebook are qualified to weigh the concerns and choose the system that works best for their use case. The concern is about the rest of us, and the rapid proliferation of experimental systems developed for internal projects at MegaCos whose demands make any off-the-shelf system untenable.
The fact that Google or Facebook release a project essentially as an academic exercise doesn't mean that it should come anywhere near mainstream use. People see "Facebook" or "Google" on the front matter and want to be like the cool kids on the block, hardly realizing the quagmire they're plunging into.
MongoDB gained prominence by claiming massively improved write performance and encouraged devs to switch off real databases for a system of record and then, woops, it turns out that "write performance" is sans-`fsync`. That's sort of fixed now, many years after the controversy, but they still built the entire NoSQL story on the back of that deceit.
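For anyone unfamiliar with why skipping `fsync` makes writes look fast, the difference between an acknowledged write and a durable write can be sketched at the plain-file level. To be clear, this is not MongoDB's code or API; `append_record` and its `durable` flag are hypothetical names for illustration. Without the `fsync` call, the write returns as soon as the data is in the OS page cache, and a crash or power loss before the kernel writes it back can lose it.

```python
import os
import tempfile


def append_record(path: str, record: bytes, durable: bool) -> None:
    """Append one record; if `durable`, wait until the OS reports it flushed."""
    with open(path, "ab") as f:
        f.write(record + b"\n")
        f.flush()  # push Python's userspace buffer down to the OS
        if durable:
            # Ask the kernel to flush this file's data to the storage device.
            # Skipping this is faster, but the record may exist only in memory.
            os.fsync(f.fileno())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        journal = os.path.join(d, "journal.log")
        append_record(journal, b"fast but maybe lost on power failure", durable=False)
        append_record(journal, b"slower but on disk", durable=True)
        with open(journal, "rb") as f:
            print(f.read().decode())
```

Both calls succeed and both records read back fine on a healthy machine, which is exactly why the difference is easy to miss in a benchmark.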
As an autodidact who detests formal schooling, I hate to say it, but sooner or later, there is going to need to be some type of vetting/licensing at play here, both to stop the mindless proliferation of "grab anything Googly" and to hold those who engage in predatory behavior accountable.
Note that "fall over" in the context of a mature RDBMS means that the database got slow because it wasn't properly tuned or administered, whereas in the context of NoSQL datastores, it may well mean "didn't write any data for the last five minutes, try to rebuild transactions from customer support tickets and your card processor's logs I guess????" or "woops, I just lost about $20 million in bitcoin". I think the potential tradeoff there is clear.
Not every application will need MVCC/ACID eventually. As another commenter pointed out, you can always assert that RDBMS-like features will always be needed and that NoSQL is considered "shiny, that's why people choose it", but some people actually know what they're getting into and have made a reasonable, informed decision on what kind of persistence their application will need and how to scale it.