One of the most common discussion points I hear on a day-to-day basis about Docker is the lack of networking. It comes from virtually all corners of the tech ranks: extolling the virtues of a dedicated IP address for every container, keeping “consistent” with the currently deployed infrastructure, and continuing to use well-known ports. That is just to name a few. OK.
Sure, there may be good reasons to have dedicated IPs and ports for each of your containers. But most people asking for software-defined networking overlays with Docker (heck, demanding it) are usually attempting to shoehorn monolithic applications into a container. So for that use case I say, sure. But even with your IP and known port combo you will be running into other problems…so you may be better off just using a VM like you are today, and using an image as your new code deployment artifact - instead of the RPMs you are used to.
By now you are thinking this guy has gone off the deep end. No dedicated IPs? How do we use Docker in production?
There is a way: Get yourself some service discovery and DNS. That’s how.
Using Docker has many benefits well beyond just moving code around in a somewhat more efficient way. What it allows us to do is build a system which can be highly automated. One of the key attributes of a distributed system is a well-documented API. With that API in place, a few tools have been developed to register containers automatically and publish the container data to a single authoritative repository. Once it is there, we can do all sorts of magic. Win.
In this post let’s start with a high-level description of the tools which make up this distributed system. In later posts we will cover the ins and outs of setting all this up and securing communications.
Docker Engine Runtime
This is a pretty straightforward tool: the standard Docker Engine, installed on a host, which takes our images and instantiates them as containers.
Swarm is a little less known, and still in beta. It is a clustering tool which takes multiple Docker Engines and turns them into a single Docker Engine to run workloads. With Swarm, we get the benefit of running the Docker Engine on multiple nodes and managing them from a single host, which adheres to the standard Docker API.
The service registry is one of the key software tools which brings everything together. It is THE single authoritative repository for the state of system components. It provides a fast, highly scalable, and fault-tolerant key-value (KV) store, along with a well-known API for reading and writing KV pairs and data. There are a few registries available, all with different features and benefits. Etcd, Zookeeper, and Consul are the most popular, and each has its own pros and cons. I tend to stick with Consul, because of the simplicity of its design as well as HashiCorp’s contribution to other projects which go towards solving distributed application challenges. Oh, and it provides an integrated DNS service too :)
The registration tool is much less known outside of the Docker renegades using Docker in a production capacity. It does one super simple and critical thing: it dynamically adds data, in an event-driven fashion, into another tool. In our case, when we instantiate a container from an image or stop a container, we want to read some of the information you can find with docker inspect and drop it into the service registry, where it can be consumed by another tool. There is one tool in particular that I am quite fond of - Registrator. It is a lightweight container which performs this function. There are certainly more being developed (keep an eye on Interlock, it combines our #4 and #5) but today Registrator is the bee's knees.
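To make the registration step a bit more concrete, here is a minimal Python sketch of the idea, not Registrator's actual implementation. It takes a dictionary shaped like docker inspect output and builds one service-registration payload per published port, using the field names Consul's agent API accepts for service registration (ID, Name, Address, Port, Tags). The function name, the ID scheme, the sample container, and the node IP are all made up for illustration.

```python
import json


def registration_from_inspect(inspect: dict, node_ip: str) -> list[dict]:
    """Build one registration payload per published port (hypothetical helper)."""
    # docker inspect reports the container name with a leading slash.
    name = inspect["Name"].lstrip("/")
    ports = inspect.get("NetworkSettings", {}).get("Ports") or {}
    payloads = []
    for container_port, bindings in sorted(ports.items()):
        # bindings is None when a port is exposed but not published on the host.
        for binding in bindings or []:
            host_port = int(binding["HostPort"])
            payloads.append({
                "ID": f"{node_ip}:{name}:{host_port}",  # assumed ID scheme
                "Name": name,
                "Address": node_ip,        # the node's IP, not a per-container IP
                "Port": host_port,         # the host port Docker picked
                "Tags": [container_port],  # e.g. "80/tcp"
            })
    return payloads


# Trimmed-down sample of what docker inspect returns for one container.
sample = {
    "Name": "/web",
    "NetworkSettings": {
        "Ports": {
            "80/tcp": [{"HostIp": "0.0.0.0", "HostPort": "32768"}],
            "443/tcp": None,
        }
    },
}

payloads = registration_from_inspect(sample, "10.0.0.5")
print(json.dumps(payloads, indent=2))
```

A real registration tool would listen for container start and stop events and send each payload to the registry (with Consul, an HTTP PUT to the local agent), then deregister the same ID when the container stops. The sketch leaves the network calls out so it runs anywhere.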
So now that we have a container instantiated on our Docker cluster, with at least a couple of KV data points (the IP of the node on which the container resides and the service port) stored safely in our service registry, we are technically good to go, and can access our containers by name by querying the DNS service of the service registry (Consul). Cool, right? Well, let's take it one step further. By using consul-template, we can automatically add entries to an nginx or HAProxy configuration file and restart the service whenever there is a change, both inserting and removing entries. Because consul-template only adds entries once the registered service has proven healthy (at least from an accessibility perspective), no baseline health check is required by a proxy or load balancer.
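As a rough illustration of what the templating step does, the Python sketch below renders an HAProxy backend stanza from a list of healthy service entries. This is not consul-template itself (which uses its own template files); the entries are hand-written samples shaped like the results of Consul's health endpoint with the passing filter, and the function name, service IDs, and addresses are assumptions.

```python
def render_backend(service: str, entries: list[dict]) -> str:
    """Render an HAProxy backend stanza from healthy service entries."""
    lines = [f"backend {service}"]
    for entry in entries:
        svc = entry["Service"]
        # Fall back to the node's address when the service did not register one.
        address = svc.get("Address") or entry["Node"]["Address"]
        lines.append(f"    server {svc['ID']} {address}:{svc['Port']}")
    return "\n".join(lines) + "\n"


# Sample entries: only services currently passing their checks are listed.
entries = [
    {"Node": {"Address": "10.0.0.5"},
     "Service": {"ID": "web-1", "Address": "", "Port": 32768}},
    {"Node": {"Address": "10.0.0.6"},
     "Service": {"ID": "web-2", "Address": "10.0.0.6", "Port": 32770}},
]

config = render_backend("web", entries)
print(config)

# When a container stops, its entry disappears and the rendered text changes;
# that difference is the signal to rewrite the file and reload the proxy.
changed = render_backend("web", entries[:1]) != config
```

The point of the design is that the proxy configuration is always derived from the registry, never edited by hand, so adding or removing a container is just another render.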
We’re down to the last two items needed to make this all work. Second to last is a proxy service. Two dominate in the microservice arena: Nginx and HAProxy. Both are very solid performers and well documented. They can be deployed separately, or they can be combined to provide both front-end proxy services (i.e., nginx) and back-end load balancing (HAProxy). In smaller deployments, they can be run outside of the containerized infrastructure as a service, and set up in a redundant fashion (in the case of HAProxy) to ensure access to our services deployed on the Swarm cluster.
Our final component to make all of this work, without per-container IPs and known ports, is DNS. We’re going to attack this from a couple of vectors. The first is making sure that whatever “standard” authoritative DNS is in place for a domain has a TTL set to zero (0). That’s right. Why? Because we are going to put all of our proxy nodes as A records in DNS, and when a client asks for an application it will get one of the proxy servers. We’ll balance the requests across proxies by round robin. With a TTL of zero, answers are not cached, so every request gets a fresh lookup. The other DNS vector is providing a DNS service to the container nodes (and as such, their applications) which is aware of the services that have been instantiated in the service registry. This is accomplished through Consul and a second DNS layer provided by dnsmasq. There are other options, like the combo of SkyDNS and Etcd.
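To see why round robin plus a zero TTL spreads clients across the proxies, here is a toy Python model of a DNS server rotating its A records. Real DNS servers handle the rotation themselves; the class name and the addresses (taken from a documentation range) are purely illustrative.

```python
from collections import deque


class RoundRobinRecords:
    """Toy model of a DNS name backed by several A records."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        # Return every record, then rotate so the next query sees a new order.
        response = list(self._addresses)
        self._addresses.rotate(-1)
        return response


proxies = RoundRobinRecords(["192.0.2.10", "192.0.2.11", "192.0.2.12"])

# With a TTL of zero nothing is cached, so each request triggers a new query.
# Assume each client simply connects to the first address in the answer.
first_choices = [proxies.answer()[0] for _ in range(4)]
print(first_choices)
# ['192.0.2.10', '192.0.2.11', '192.0.2.12', '192.0.2.10']
```

If answers were cached for a long TTL instead, a client would keep hitting whichever proxy it saw first, which is exactly what the zero TTL is meant to avoid.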
Once you have all of these components running and communicating, any image launched as a container will be registered, with its name and port available to DNS, and automatically added to your proxy services. Done and done. You have now overcome one of the largest “issues” with using Docker containers, without going to an overlay network. For most cases, this will work just fine without burning IP addresses to have a port 80 open on every container. :)