In the previous post, I went over some basics of how Kubernetes networking works from a fundamental standpoint. The requirements are simple: every pod needs to have connectivity to every other pod. The only difference between the many options is how that is achieved.
In this post, I’m going to cover some of the fundamentals of how Calico works. As I mentioned in the previous post, I really don’t like the idea that with these Kubernetes deployments, you simply grab a YAML file and deploy it, sometimes with little to no explanation of what’s actually happening. Hopefully, this post will help you better understand what’s going on.
As before, I’m not by any means a networking expert, so if you spot any mistakes, please send a pull request!
What is Calico?
Calico is a container networking solution created by MetaSwitch. While solutions like Flannel operate over layer 2, Calico makes use of layer 3 to route packets to pods. The way it does this is relatively simple in practice. Calico can also provide network policy for Kubernetes. We’ll ignore this for the time being, and focus purely on how it provides container networking.
Your average Calico setup has four components: etcd, BIRD, confd and calico-felix.
Etcd is the backend data store for all the information Calico needs. If you’ve deployed Kubernetes already, you already have an etcd deployment, but it’s usually suggested to deploy a separate etcd for production systems, or at the very least deploy it outside of your kubernetes cluster.
You can examine the information that calico provides by using etcdctl. The default location for the calico keys is
The next key component in the Calico stack is BIRD. BIRD is a BGP routing daemon which runs on every host. Calico makes use of BGP to propagate routes between hosts. BGP (if you’re not aware) is widely used to propagate routes over the internet. It’s suggested you make yourself familiar with some of the concepts if you’re using Calico.
BIRD runs on every host in the Kubernetes cluster, usually as a DaemonSet. It’s included in the calico/node container.
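To make the idea of route propagation a little more concrete, here is a toy sketch in Python. It is not BGP and says nothing about how BIRD works internally; it only models the end result, where each host learns which peer owns which pod subnet. The node1 values are the ones used later in this post, while node2 and its addresses are made up for illustration.

```python
# Toy model of route propagation: NOT real BGP, only the end result.
# node2, its IP and its block are hypothetical values for illustration.
advertised = {
    "node1": {"ip": "172.29.141.96", "block": "192.168.228.192/26"},
    "node2": {"ip": "172.29.141.97", "block": "192.168.100.0/26"},
}


def learned_routes(me, peers):
    """Return {pod_block: next_hop_ip} for every block owned by another host."""
    return {
        info["block"]: info["ip"]
        for name, info in peers.items()
        if name != me
    }


print(learned_routes("node2", advertised))
```

Each host ends up with one route per remote block, pointing at the host that owns it.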
Confd is a simple configuration management tool. It reads values from etcd and writes them to files on disk. If you take a look inside the calico/node container (where it usually runs) you can get an idea of what it’s doing:
As you can see, it’s connecting to the etcd nodes and reading from there, and it has a confd directory passed to it. The source of that confd directory can be found in the calicoctl github repository.
If you examine the repo, you’ll notice three directories.
Firstly, there’s a conf.d directory. This directory contains a bunch of toml configuration files. Let’s examine one of them:
This is pretty simple in reality. It names a source file, and then where the rendered file should be written to. Then, there are some etcd keys that confd should read information from. Essentially, confd is what writes the BIRD configuration for Calico. If you examine the keys there, you’ll see the kind of thing it reads:
So in this case, it’s getting the pod cidr we’ve assigned. I’ll cover this in more detail later.
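The source, destination and keys idea can be sketched in a few lines of Python. This is only a rough model of the flow, not confd itself: confd uses Go templates and talks to a real etcd, whereas this sketch uses Python’s string.Template and a plain dict. The file names, the key and the resource dictionary are all hypothetical.

```python
import string
import tempfile
from pathlib import Path


def apply_resource(resource, store):
    """Render resource['src'] with values from store and write it to resource['dest']."""
    # Collect the watched keys; the template refers to each by its last path segment.
    values = {key.rsplit("/", 1)[-1]: store[key] for key in resource["keys"]}
    template = string.Template(Path(resource["src"]).read_text())
    Path(resource["dest"]).write_text(template.substitute(values))
    return Path(resource["dest"]).read_text()


workdir = Path(tempfile.mkdtemp())
(workdir / "example.template").write_text("pod network: $cidr\n")
resource = {
    "src": str(workdir / "example.template"),  # hypothetical template file
    "dest": str(workdir / "example.cfg"),      # hypothetical output file
    "keys": ["/example/cidr"],                 # hypothetical key to read
}
store = {"/example/cidr": "192.168.0.0/16"}    # a dict standing in for etcd
print(apply_resource(resource, store), end="")
```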
In order to understand what it does with that key, you need to take a look at the src template confd is using.
Now, this at first glance looks a little complicated, but it’s not. It’s a file written in the Go templating language that confd understands. The result is a standard BIRD configuration file, populated with keys from etcd. Take this for example:
This is essentially:
- Looping through all the pools under the key /v1/ipam/v4/pool - in our case we only have one: 192.168.0.0-16
- Assigning the data in the pools key to a var, $data
- Then grabbing a value from the JSON that’s been loaded into $data - in this case the cidr key.
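The same three steps can be written out in Python, which may be easier to read than the template syntax. This is a sketch of the logic only: the dict stands in for etcd, and the JSON value is assumed to carry at least a cidr field.

```python
import json

# Stand-in for etcd: one pool, keyed the way this post shows it.
# The JSON value is an assumption; only the "cidr" field matters here.
etcd = {"/v1/ipam/v4/pool/192.168.0.0-16": '{"cidr": "192.168.0.0/16"}'}


def pool_cidrs(store, prefix="/v1/ipam/v4/pool/"):
    cidrs = []
    for key in sorted(store):          # loop through all the pools under the key
        if not key.startswith(prefix):
            continue
        data = json.loads(store[key])  # assign the decoded JSON to a variable
        cidrs.append(data["cidr"])     # grab the cidr value from it
    return cidrs


print(pool_cidrs(etcd))
```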
This makes more sense if you look at the values in the etcd key:
So it’s grabbed the cidr value and written it to the file. The end result of the file in the calico/node container brings this all together:
Pretty simple really!
The final component in the calico stack is the calico-felix daemon. This is the tool that performs all the magic in the calico stack. It has multiple responsibilities:
- it writes the routing table of the operating system. You’ll see this in action later.
- it manipulates iptables on the host. Again, you’ll see this in action later.
It does all this by connecting to etcd and reading information from there. It runs inside the calico/node DaemonSet alongside confd and BIRD.
Calico in Action
In order to get started, it’s recommended that you deploy Calico using the installation instructions. Ensure that:
- you’ve got a calico/node container running on every kubernetes host
- You can see in the calico/node logs that there are no errors or issues. Use kubectl logs on the calico/node pods on a few hosts to ensure it’s working as expected
At this stage, you’ll want to deploy something so that Calico can work its magic. I recommend deploying the guestbook to see all this in action.
Once you’ve deployed Calico and your guestbook, get the pod IP of the guestbook using
If everything has worked correctly, you should be able to ping every pod from any host. Test this now:
If you have fping installed, you can verify all pods in one go:
The real question is, how did this actually work? How come I can ping these endpoints? The answer becomes obvious if you print the routing table:
A lot has happened here, so let’s break it down in sections.
Each host that has calico/node running on it has its own /26 subnet. You can verify this by looking in etcd:
So in this case, the host node1 has been allocated the subnet 192.168.228.192-26 (that is, 192.168.228.192/26; the etcd key uses a dash in place of the slash). Any new host that starts up, connects to Kubernetes and has a calico/node container running on it will get one of those subnets. This is a fairly standard model in Kubernetes networking.
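If the /26 arithmetic isn’t obvious, Python’s standard ipaddress module can do it for you. This short example uses the pool and the node1 block from this post and shows how big a block is and how many of them fit in the pool.

```python
import ipaddress

pool = ipaddress.ip_network("192.168.0.0/16")
block = ipaddress.ip_network("192.168.228.192/26")

print(block.num_addresses)                     # addresses in one /26 block
print(block.subnet_of(pool))                   # is the block inside the pool?
print(len(list(pool.subnets(new_prefix=26))))  # how many /26 blocks the pool holds
print(block[0], block[-1])                     # first and last address of the block
```

So each host gets 64 addresses at a time, and a /16 pool has room for 1024 such blocks.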
What differs here is how Calico handles it. Let’s go back to our routing table and look at the entry for that subnet:
What’s happened here is that calico-felix has read etcd, and determined that the IP address of node1 is 172.29.141.96. Calico now knows the IP address of the host, and also the pod subnet assigned to it. With this information, it programs routes on every node in the Kubernetes cluster. It says “if you want to hit something in this subnet, go via the IP address x over the tunl0 interface”, where x is the address of the node that owns the subnet.
The tunl0 interface may not be present on your host. It exists here because I’ve enabled IPIP encapsulation in Calico for the sake of testing.
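To see how the operating system picks that route for a given pod IP, here is a small sketch of a longest-prefix-match lookup in Python. It is a simplification of what the kernel does, and only the first route comes from this post: the default route, its gateway and the pod IP are made up for illustration.

```python
import ipaddress

# A tiny routing table. The first entry is the route discussed above; the
# default route and its gateway are hypothetical.
routes = [
    ("192.168.228.192/26", "172.29.141.96", "tunl0"),
    ("0.0.0.0/0", "172.29.141.1", "eth0"),
]


def lookup(dst):
    """Return (gateway, device) of the most specific route containing dst."""
    addr = ipaddress.ip_address(dst)
    matches = [
        (ipaddress.ip_network(net), gw, dev)
        for net, gw, dev in routes
        if addr in ipaddress.ip_network(net)
    ]
    # The route with the longest prefix is the most specific one.
    net, gw, dev = max(matches, key=lambda m: m[0].prefixlen)
    return gw, dev


print(lookup("192.168.228.200"))  # a hypothetical pod IP in node1's block
print(lookup("8.8.8.8"))          # anything else falls through to the default
```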
Now, the packets know their destination. They have a route defined and they know they should head directly via the interface of the node. What happens then, when they arrive there?
The answer again, is in the routing table. On the host the pod has been scheduled on, print the routing table again:
There’s an extra route! You can see the pod IP as the destination, and it’s telling the OS to route it via a specific device.
Let’s have a look at the interfaces:
There’s an interface for our pod! When the container spun up, calico (via CNI) created an interface for us and assigned it to the pod. How did it do that?
The answer lies in the setup of Calico. If you examine the yaml you installed when you installed Calico, you’ll see a setup task which runs on every container. That uses a configmap, which looks like this
This manifests itself in the /etc/cni/net.d directory on every host:
So essentially, when a new pod starts up, Calico will:
- query the kubernetes API to determine the pod exists and that it’s on this node
- assign the pod an IP address from within its IPAM
- create an interface on the host so that the container can get an address
- tell the kubernetes API about this new IP
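The steps above can be sketched as a toy model in Python. None of this is Calico’s actual code: the BlockIPAM class, the on_pod_start function and the interface naming are all hypothetical, and the Kubernetes API calls are reduced to a returned dictionary.

```python
import ipaddress


class BlockIPAM:
    """Toy allocator handing out addresses from one host's block, in order."""

    def __init__(self, cidr):
        self.free = iter(ipaddress.ip_network(cidr))  # every address in the block
        self.assigned = {}

    def assign(self, pod):
        if pod not in self.assigned:
            self.assigned[pod] = str(next(self.free))
        return self.assigned[pod]


def on_pod_start(pod, ipam, host_routes):
    ip = ipam.assign(pod)     # take an address from this host's block
    iface = f"veth-{pod}"     # hypothetical interface name
    host_routes[ip] = iface   # per-pod route pointing at that interface
    return {"pod": pod, "ip": ip, "interface": iface}  # what gets reported back


ipam = BlockIPAM("192.168.228.192/26")
host_routes = {}
print(on_pod_start("guestbook", ipam, host_routes))
print(host_routes)
```

The per-pod entry in host_routes corresponds to the extra route that appeared in the routing table earlier.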
The final piece of the puzzle here is some iptables magic. As mentioned earlier, Calico has support for network policy. Even if you’re not actively using the policy components, it still exists, and you need some default policy for connectivity to work. If you look at the output of iptables -L you’ll see a familiar string:
The iptables chain here has the same string as the Calico interface. This iptables rule is vital for Calico to pass the packets on to the container. It grabs the packet destined for the container, determines if it should be allowed and sends it on its way if it is.
If this chain doesn’t exist, the packet gets captured by the default policy and will be dropped. It’s calico-felix that programs these rules.
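As a rough mental model of that behaviour, here is a toy sketch in Python. It is not how iptables is implemented and the names are hypothetical; it only shows the idea of one chain per pod interface, with a drop when no chain matches.

```python
# Toy model of per-interface policy chains. The interface name and the
# chain contents are made up; real iptables rules are far richer than this.
chains = {
    "pod-iface-1": lambda packet: True,  # hypothetical chain that allows everything
}


def deliver(iface, packet):
    """Return True if the packet is passed on to the pod behind iface."""
    chain = chains.get(iface)
    if chain is None:
        return False  # no chain for this interface: the default policy drops it
    return chain(packet)


print(deliver("pod-iface-1", {"dst": "192.168.228.200"}))
print(deliver("pod-iface-2", {"dst": "192.168.228.201"}))
```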
Hopefully, you now have a better knowledge of how exactly Calico gets the job done. At its core, it’s actually relatively simple: just IP routes on each host. What it does is take the difficulty of managing those routes away from you, giving you a simple, easy solution to container networking.