Why Always Docker?
I love Docker. I’ve recently spent a lot of time learning about both Docker & Kubernetes. Combined with stateless containers they provide fantastic scalability, service discovery and near-instant deploy times (excluding initial image build!).
There is a trend, however, of using Docker containers for everything, and this makes no sense to me.
Let’s look at an example - running a Docker Registry (v2). I want to:
- Run a single instance of the Go binary
- On a box with huge disk space & bandwidth
- And relatively low CPU/memory
I don’t want such a box in my Kubernetes cluster (it’s a one-off), and I need none of Docker’s scaling properties, so I’ll run it directly on hardware.
Well, guess what? There are no install instructions for that. In fact, the “official” way is to use the Docker image. Luckily the Dockerfile isn’t much more than a limited shell script, so following the trail of docker/distribution -> Registry Image -> Dockerfile I was able to recover manual install instructions (all two of them).
While we’re discussing the Dockerfile, let’s look at some other services better suited off-Docker: datastores. Say you want to run an Elasticsearch or Galera cluster - Docker containers might offer a ridiculously quick setup and look awfully tempting.
But wait - how do we configure these services for multiple environments (test/prod clusters)? They don’t read our ENV vars, nor do they know of our internal service discovery tools. These kinds of systems have their own config files, and the Dockerfile format is completely fucking useless at this kind of thing.
Unfortunately it would appear the popular solution is simply to install other utilities within your image, and have them “bootstrap” the configuration before running the service. That’s mental, and a massive middle finger to the idea that a container should ship only the software it needs in production. Tools like pyinfra and Ansible are much more suitable for this kind of work (and don’t install useless crap to generate a config file).
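To make the alternative concrete, here’s a rough sketch of generating the config outside the image, per environment, using nothing but the Python standard library. In real life pyinfra or Ansible would do this job (templating plus pushing the file to the box); this only shows the shape of it. The environment names, hostnames and cluster names are made up, and the setting names are illustrative, so check them against whatever version of Elasticsearch you actually run.

```python
from string import Template

# Hypothetical per-environment settings: every hostname and cluster
# name here is invented for illustration.
ENVIRONMENTS = {
    "test": {
        "cluster_name": "search-test",
        "seed_hosts": ["es1.test.internal"],
    },
    "prod": {
        "cluster_name": "search-prod",
        "seed_hosts": ["es1.prod.internal", "es2.prod.internal"],
    },
}

# Illustrative setting names: verify them against your Elasticsearch version.
CONFIG_TEMPLATE = Template(
    "cluster.name: $cluster_name\n"
    "discovery.seed_hosts: [$seed_hosts]\n"
)


def render_config(env: str) -> str:
    """Render the config text for one environment (raises KeyError if unknown)."""
    settings = ENVIRONMENTS[env]
    return CONFIG_TEMPLATE.substitute(
        cluster_name=settings["cluster_name"],
        seed_hosts=", ".join(settings["seed_hosts"]),
    )


if __name__ == "__main__":
    # Print the prod config; a deploy tool would write this to the target host.
    print(render_config("prod"))
```

The point is that the config gets rendered on the machine doing the deploy, so nothing extra has to be installed alongside the service just to write one file.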
On the flipside - having readily available instances of Elasticsearch/Galera/etc is incredibly useful in the development stages of a product. The ability to rapidly bring up a single Elasticsearch instance attached to some branch of an app is a huge time saver. And Docker is by far the best way to deploy stateless apps one has control over. Just don’t bother building clusters of complex third party software with Docker containers.