I’ve been building a Kubernetes based platform at $work now for almost a year, and I’ve become a bit of a Kubernetes apologist. It’s true, I think the technology is fantastic. I am however under no illusions about how difficult it is to operate and maintain. I read posts like this one earlier in the year and found myself nodding along to certain aspects of the opinion. If I was in a smaller company, with 10/15 engineers, I’d be horrified if someone suggested managing and maintaining a fleet of Kubernetes clusters. The operational overhead is just too high.

Despite my love for all things Kubernetes at this point, I do remain curious about the notion that “serverless” computing will kill the ops engineer. The main source of intrigue here is the desire to stay gainfully employed in the future - if we aren’t going to need ops engineers in our glorious future, I’d like to see what all the fuss is about. I’ve done some experimentation in Lambda and Google Cloud Functions and been impressed by what I saw, but I still firmly believe that serverless solutions only solve a percentage of the problem.

I’ve had my eye on AWS Fargate for some time now and it’s something that developers at $work have been gleefully pointing at as “serverless computing” - mainly because with Fargate, you can run your Docker container without having to manage the underlying nodes. I wanted to see what that actually meant - so I set about trying to get an app running on Fargate from scratch. I defined the success criteria here as something close-ish to a “production ready” application, so I wanted to have the following:

  • A running container on Fargate
  • With configuration pushed down in the form of environment variables
  • “Secrets” should not be in plaintext
  • Behind a loadbalancer
  • TLS enabled with a valid SSL certificate

I approached this whole task from an infrastructure as code mentality, and instead of following the default AWS console wizards, I used terraform to define the infrastructure. It’s very possible this overcomplicated things, but I wanted to make sure any deployment was repeatable and discoverable to anyone else wanting to follow along.

All of the above criteria are generally achievable with a Kubernetes based platform using a few external add-ons and plugins, so I’m admittedly approaching this whole task with a comparative mentality - because I’m comparing it with my common workflow. My main goal was to see how easy this was with Fargate, especially when compared with Kubernetes. I was pretty surprised with the outcome.

AWS has overhead

I had a clean AWS account and was determined to go from zero to a deployed webapp. Like any other infrastructure in AWS, I had to get the baseline infrastructure working - so I first had to define a VPC.

I wanted to follow the best practices, so I carved the VPC up into subnets across availability zones, with a public and a private subnet. It occurred to me at this point that as long as this need was always there, I’d probably be able to find a job of some description. The notion that AWS is operationally “free” is something that has irked me for quite some time now. Many people in the developer community take for granted how much work and effort there is in setting up and defining a well designed AWS account and infrastructure. This is before we even start talking about a multi-account architecture - I’m still in a single account here and I’m already having to define infrastructure and traditional network items.

It’s also worth remembering here, I’ve done this quite a few times now, so I knew exactly what to do. I could have used the default VPC in my account, and the pre-provided subnets, which I expect many people who are getting started might do. This took me about half an hour to get running, but I couldn’t help but think here that even if I want to run lambda functions, I still need some kind of connectivity and networking. Defining NAT gateways and routing in a VPC doesn’t feel very serverless at all, but it has to be done to get things moving.
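For reference, here’s a minimal sketch of the kind of VPC layout I mean, using the terraform-aws-modules/vpc/aws module - the CIDRs, availability zones and NAT settings are illustrative rather than my exact code:

module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "${var.name}"
  cidr = "10.0.0.0/16"

  azs             = ["us-east-1a", "us-east-1b", "us-east-1c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  # NAT gateways so workloads in the private subnets can reach the internet
  enable_nat_gateway = true
  single_nat_gateway = true

  tags = "${local.tags}"
}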

Run my damn container

Once I had the base infrastructure up and running, I now wanted to get my docker container running. I started examining the Fargate docs and browsed through the Getting Started docs and something immediately popped out at me:

Hold on a minute, there’s at least THREE steps here just to get my container up and running? This isn’t quite how this whole thing was sold to me, but let’s get started.

Task Definitions

A task definition defines the actual container you want to run. The problem I ran into immediately here is that this thing is insanely complicated. Lots of the options here are very straightforward, like specifying the docker image and memory limits, but I also had to define a networking model and a variety of other options that I wasn’t really familiar with. Really? If I had come into this process with absolutely no AWS knowledge I’d be incredibly overwhelmed at this stage. A full list of the parameters can be found on the AWS page, and the list is long. I knew my container needed to have some environment variables, and it needed to expose a port. So I defined that first, with the help of a fantastic terraform module which really made this easier. If I didn’t have this, I’d be hand writing JSON to define my container definition.

First, I defined some environment variables:

container_environment_variables = [
    {
      name  = "USER"
      value = "${var.user}"
    },
    {
      name  = "PASSWORD"
      value = "${var.password}"
    }
]

Then I compiled the task definition using the module I mentioned above:

module "container_definition_app" {
  source  = "cloudposse/ecs-container-definition/aws"
  version = "v0.7.0"

  container_name  = "${var.name}"
  container_image = "${var.image}"

  container_cpu                = "${var.ecs_task_cpu}"
  container_memory             = "${var.ecs_task_memory}"
  container_memory_reservation = "${var.container_memory_reservation}"

  port_mappings = [
    {
      containerPort = "${var.app_port}"
      hostPort      = "${var.app_port}"
      protocol      = "tcp"
    },
  ]

  environment = "${local.container_environment_variables}"

}

I was pretty confused at this point - I need to define a lot of configuration here to get this running and I’ve barely even started, but it made a little sense - anything running a docker container needs to have some idea of the configuration values of the docker container. I’ve previously written about the problems with Kubernetes and configuration management and the same problem seemed to be rearing its ugly head again here.

Next, I defined the task definition from the module above (which thankfully abstracted the required JSON away from me - if I had to hand write JSON at this point I’d have probably given up).

I realised immediately I was missing something as I was defining the module parameters. I need an IAM role as well! Okay, let me define that:

resource "aws_iam_role" "ecs_task_execution" {
  name = "${var.name}-ecs_task_execution"

  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "ecs-tasks.amazonaws.com"
      },
      "Effect": "Allow"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy_attachment" "ecs_task_execution" {
  count = "${length(var.policies_arn)}"

  role       = "${aws_iam_role.ecs_task_execution.id}"
  policy_arn = "${element(var.policies_arn, count.index)}"
}

That makes sense, I’d need to define an RBAC policy in Kubernetes, so I’m still not exactly losing or gaining anything here. I am starting to think at this point that this feels very familiar from a Kubernetes perspective.

resource "aws_ecs_task_definition" "app" {
  family                   = "${var.name}"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = "${var.ecs_task_cpu}"
  memory                   = "${var.ecs_task_memory}"
  execution_role_arn       = "${aws_iam_role.ecs_task_execution.arn}"
  task_role_arn            = "${aws_iam_role.ecs_task_execution.arn}"

  container_definitions    = "${module.container_definition_app.json}"

}

At this point, I’ve written quite a few lines of code to get this running, read a lot of ECS documentation and all I’ve done is define a task definition. I still haven’t got this thing running yet. I’m really confused at this point what the value add is here over a Kubernetes based platform, but I continued onwards.

Services

A service is partly how to expose the container to the world, and partly how you define how many replicas it has. My first thought was “Ah! This is like a Kubernetes service!” and I set about mapping the ports and so on. Here was my first run at the terraform:

resource "aws_ecs_service" "app" {
  name                               = "${var.name}"
  cluster                            = "${module.ecs.this_ecs_cluster_id}"
  task_definition                    = "${data.aws_ecs_task_definition.app.family}:${max(aws_ecs_task_definition.app.revision, data.aws_ecs_task_definition.app.revision)}"
  desired_count                      = "${var.ecs_service_desired_count}"
  launch_type                        = "FARGATE"
  deployment_maximum_percent         = "${var.ecs_service_deployment_maximum_percent}"
  deployment_minimum_healthy_percent = "${var.ecs_service_deployment_minimum_healthy_percent}"

  network_configuration {
    subnets          = ["${values(local.private_subnets)}"]
    security_groups  = ["${module.app.this_security_group_id}"]
  }

}

I again got frustrated when I had to define the security group for this that allowed access to the ports needed, but I did so and plugged that into the network configuration. It ended up looking something like this (a sketch - in reality the ingress should be locked down to the loadbalancer’s security group rather than a wide-open CIDR):
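resource "aws_security_group" "app" {
  name   = "${var.name}-app"
  vpc_id = "${local.vpc_id}"

  # allow traffic in on the application port
  ingress {
    from_port   = "${var.app_port}"
    to_port     = "${var.app_port}"
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # allow all outbound traffic
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

Then I got a smack in the face.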

I need to define my own loadbalancer?

What?

Surely not?

LoadBalancers Never Go Away

I was honestly kind of floored by this, I’m not even sure why. I’ve gotten so used to Kubernetes services and ingress objects that I completely took for granted how easy it is to get my application on the web with Kubernetes. Of course, we’ve spent months building a platform to make this easier at $work. I’m a heavy user of external-dns and cert-manager to automate populating DNS entries on ingress objects and automating TLS certificates and I am very aware of the work needed to get these set up, but I honestly thought it would be easier to do this on Fargate. I recognise that Fargate isn’t claiming to be the be-all and end-all of how to run applications - it’s just abstracting away the node management - but I have been consistently told this is easier than Kubernetes. I really was surprised. Defining a LoadBalancer (even if you don’t want to use Ingresses and Ingress controllers) is part and parcel of deploying a service to Kubernetes, and I had to do the same thing again here. It just all felt so familiar.

I now realised I needed:

  • A loadbalancer
  • A TLS certificate
  • A DNS entry

So I set about making those. I made use of some popular terraform modules, and came up with this:

# Define a wildcard cert for my app
module "acm" {
  source  = "terraform-aws-modules/acm/aws"
  version = "v1.1.0"

  create_certificate = true

  domain_name = "${var.route53_zone_name}"
  zone_id     = "${data.aws_route53_zone.this.id}"

  subject_alternative_names = [
    "*.${var.route53_zone_name}",
  ]


  tags = "${local.tags}"

}
# Define my loadbalancer
resource "aws_lb" "main" {
  name            = "${var.name}"
  subnets         = [ "${values(local.public_subnets)}" ]
  security_groups = ["${module.alb_https_sg.this_security_group_id}", "${module.alb_http_sg.this_security_group_id}"]
}

resource "aws_lb_target_group" "main" {
  name        = "${var.name}"
  port        = "${var.app_port}"
  protocol    = "HTTP"
  vpc_id      = "${local.vpc_id}"
  target_type = "ip"
  depends_on  = [ "aws_lb.main" ]
}

# Redirect all traffic from the ALB to the target group
resource "aws_lb_listener" "main" {
  load_balancer_arn = "${aws_lb.main.id}"
  port              = "80"
  protocol          = "HTTP"

  default_action {
    target_group_arn = "${aws_lb_target_group.main.id}"
    type             = "forward"
  }
}

resource "aws_lb_listener" "main-tls" {
  load_balancer_arn = "${aws_lb.main.id}"
  port              = "443"
  protocol          = "HTTPS"
  certificate_arn   = "${module.acm.this_acm_certificate_arn}"

  default_action {
    target_group_arn = "${aws_lb_target_group.main.id}"
    type             = "forward"
  }
}

I’ll be completely honest here - I screwed this up several times. I had to fish around in the AWS console to figure out what I’d done wrong. It certainly wasn’t an “easy” process - and I’ve done this before - many times. Honestly, at this point, Kubernetes looked positively enticing to me, but I realised it was because I was very familiar with it. If I was lucky enough to be using a managed Kubernetes platform (with external-dns and cert-manager preinstalled) I’d really wonder what value add I was missing from Fargate. It just really didn’t feel that easy.

After a bit of back and forth, I now had a working ECS service. The final definition, including the service, looked a bit like this:

data "aws_ecs_task_definition" "app" {
  task_definition = "${var.name}"
  depends_on      = ["aws_ecs_task_definition.app"]
}

resource "aws_ecs_service" "app" {
  name                               = "${var.name}"
  cluster                            = "${module.ecs.this_ecs_cluster_id}"
  task_definition                    = "${data.aws_ecs_task_definition.app.family}:${max(aws_ecs_task_definition.app.revision, data.aws_ecs_task_definition.app.revision)}"
  desired_count                      = "${var.ecs_service_desired_count}"
  launch_type                        = "FARGATE"
  deployment_maximum_percent         = "${var.ecs_service_deployment_maximum_percent}"
  deployment_minimum_healthy_percent = "${var.ecs_service_deployment_minimum_healthy_percent}"

  network_configuration {
    subnets          = ["${values(local.private_subnets)}"]
    security_groups  = ["${module.app_sg.this_security_group_id}"]
  }

  load_balancer {
    target_group_arn = "${aws_lb_target_group.main.id}"
    container_name   = "app"
    container_port   = "${var.app_port}"
  }

  depends_on = [
    "aws_lb_listener.main",
  ]

}

I felt like it was close at this point, but then I remembered I’d only done 2 of the required 3 steps from the original “Getting Started” document - I still needed to define the ECS cluster.

Clusters

Thanks to a very well defined module, defining the cluster to run all this on was actually very easy.

module "ecs" {
  source  = "terraform-aws-modules/ecs/aws"
  version = "v1.1.0"

  name = "${var.name}"
}

What surprised me the most here is why I had to define a cluster at all. As someone reasonably familiar with ECS it makes some sense you’d need a cluster, but I tried to consider this from the point of view of someone having to go through this process as a complete newcomer - it seems surprising to me that Fargate is billed as “serverless” but you still need to define a cluster. It’s a small detail, but it really stuck in my mind.

Tell me your secrets

At this stage of the process, I was fairly happy I managed to get something running. There was however something missing from my original criteria. If we go all the way back to the task definition, you’ll remember my app has an environment variable for the password:

container_environment_variables = [
    {
      name  = "USER"
      value = "${var.user}"
    },
    {
      name  = "PASSWORD"
      value = "${var.password}"
    }
]

If I looked at my task definition in the AWS console, my password was there, staring at me in plaintext. I wanted this to end, so I set about trying to move this into something else, similar to Kubernetes secrets.

AWS SSM

The way Fargate/ECS does the secret management portion is to use AWS SSM (the full name for this service is AWS Systems Manager Parameter Store, but I refuse to use that name because quite frankly it’s stupid).

The AWS documentation covers this fairly well, so I set about converting this to terraform.

Specifying the Secret

First, you have to define a parameter and give it a name. In terraform, it looks like this:

resource "aws_ssm_parameter" "app_password" {
  name  = "${var.app_password_param_name}" # The name of the value in AWS SSM
  type  = "SecureString"
  value = "${var.app_password}" # The actual value of the password, like correct-horse-battery-staple
}

Obviously the key component here is the “SecureString” type. This uses the default AWS KMS key to encrypt the data, something that was not immediately obvious to me. This has a huge advantage over Kubernetes secrets, which aren’t encrypted in etcd by default.
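As an aside, if you’d rather not use the account’s default key, the parameter can reference your own KMS key via the key_id argument - a sketch, assuming a hypothetical aws_kms_key.app resource:

resource "aws_kms_key" "app" {
  description = "Key for ${var.name} SSM parameters"
}

resource "aws_ssm_parameter" "app_password" {
  name   = "${var.app_password_param_name}"
  type   = "SecureString"
  key_id = "${aws_kms_key.app.key_id}" # hypothetical customer-managed key
  value  = "${var.app_password}"
}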

Then I specified another local value map for ECS, and passed that as a secret parameter:

container_secrets = [
    {
      name      = "PASSWORD"
      valueFrom = "${var.app_password_param_name}"
    },
]

module "container_definition_app" {
  source  = "cloudposse/ecs-container-definition/aws"
  version = "v0.7.0"

  container_name  = "${var.name}"
  container_image = "${var.image}"

  container_cpu                = "${var.ecs_task_cpu}"
  container_memory             = "${var.ecs_task_memory}"
  container_memory_reservation = "${var.container_memory_reservation}"

  port_mappings = [
    {
      containerPort = "${var.app_port}"
      hostPort      = "${var.app_port}"
      protocol      = "tcp"
    },
  ]

  environment = "${local.container_environment_variables}"
  secrets     = "${local.container_secrets}"
}

A problem arises

At this point, I redeployed my task definition, and was very confused. Why isn’t the task rolling out properly? I kept seeing in the console that the running app was still using the previous task definition (version 7) when the new task definition (version 8) was available. This took me way longer than it should have to figure out, but in the events screen on the console, I noticed an IAM error. I had missed a step, and the container couldn’t read the secret from AWS SSM, because it didn’t have the correct IAM permissions. This was the first time I got genuinely frustrated with this whole thing. The feedback here was terrible from a user experience perspective. If I hadn’t known any better, I would have figured everything was fine, because there was still a task running, and my app was still available via the correct URL - I was just getting the old config.
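For anyone hitting the same wall: the missing piece was a policy on the task execution role allowing it to read the parameter. A minimal sketch looks something like this (the wildcard resource is lazy - scope it to the parameter’s ARN in real life):

resource "aws_iam_role_policy" "ecs_task_ssm" {
  name = "${var.name}-ssm-read"
  role = "${aws_iam_role.ecs_task_execution.id}"

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssm:GetParameters",
        "kms:Decrypt"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}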

In a Kubernetes world, I would have clearly seen an error in the pod definition. It’s absolutely fantastic that Fargate makes sure my app doesn’t go down, but as an operator I need some actual feedback as to what’s happening. This really wasn’t good enough. I genuinely hope someone from the Fargate team reads this and tries to improve this experience.

That’s a wrap?

This was the end of the road - my app was running and I’d met all my criteria. I did realise that I had some improvements to make, which included:

  • Defining a cloudwatch log group, so I could write logs correctly (sketched below)
  • Adding a route53 hosted zone, to make the whole thing a little easier to automate from a DNS perspective
  • Fixing and rescoping the IAM permissions, which were very broad at this point
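For the first of those, the log group itself is tiny - something like the following, with the container definition then pointed at it via the awslogs log driver (the retention value is just an example):

resource "aws_cloudwatch_log_group" "app" {
  name              = "/ecs/${var.name}"
  retention_in_days = 30
}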

But honestly at this point, I wanted to reflect on the experience. I threw out a twitter thread about my experience and then spent the rest of the time thinking about what I really felt here.

Table Stakes

What I realised, after an evening of reflection, was that this process is largely the same whether you’re using Fargate or Kubernetes. What surprised me the most was that despite the regular claims I’ve heard that Fargate is “easier” I really just couldn’t see any benefits over a Kubernetes based platform. Now, if you’re in a world where you’re building Kubernetes clusters I can absolutely see the value here - managing nodes and the control plane is just overhead you don’t really need. The problem is - most consumers of a Kubernetes based platform don’t have to do this. If you’re lucky enough to be using GKE, you barely even need to think about the management of the cluster, you can run a cluster with a single gcloud command nowadays. I regularly use Digital Ocean’s managed Kubernetes service and I can safely say that it was as easy as spinning up a Fargate cluster - in fact in some ways it was easier.
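To illustrate just how little there is to it these days, creating a GKE cluster really is one command (the cluster name and zone here are placeholders):

gcloud container clusters create my-cluster --zone us-central1-a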

Having to define some infrastructure to run your container is table stakes at this point. Google may have just changed the game this week with their Google Cloud Run product, but they’re massively ahead of everyone else in this field.

What I think can be safely said from this whole experience though is this: Running containers at scale is still hard. It requires thought, it requires domain knowledge, it requires collaboration between Operations and Developers. It also requires a foundation to build on - any AWS based operation is going to need to have some fundamental infrastructure defined and running. I’m very intrigued by the “NoOps” concept that some companies seem to aspire for. I guess if you’re running a stateless application, and you can put it all inside a lambda function and an API gateway you’re probably in a good position, but are we really close to this in any kind of enterprise environment? I really don’t think so.

Fair Comparisons

Another realisation that struck me is that often the comparisons between technology A and technology B sometimes aren’t really fair, and I see this very often with AWS. The reality of the situation is often very different from the Jeff Barr blogpost. If you’re a small enough company that you can deploy your application in AWS using the AWS console and select all of the defaults, this absolutely is easier. However, I didn’t want to use the defaults, because the defaults are almost always not production ready. Once you start to peel back the layers of cloud provider services, you begin to realise that at the end of the day - you’re still running software. It still needs to be designed well, deployed well and operated well. I believe that the value add of AWS and Kubernetes and all the other cloud providers is it makes it much, much easier to run, design and operate things well, but it is definitely not free.

Arguing for Kubernetes

My final takeaway here is this: if you view Kubernetes purely as a container orchestration tool, you’re probably going to love Fargate. However, as I’ve become more familiar with Kubernetes, I’ve come to appreciate just how important it is as a technology - not just because it’s a great container orchestration tool but also because of its design patterns - it’s a declarative, API-driven platform. A simple thought that occurred to me during all of this Fargate process was that if I deleted any of this stuff, Fargate isn’t necessarily going to recreate it for me. Autoscaling is nice, not having to manage servers and patching and OS updates is awesome, but I felt I’d lost so much by not being able to use Kubernetes’ self-healing and API-driven model. Sure, Kubernetes has a learning curve - but from this experience, so does Fargate.

Summary

Despite my confusion during some of this process, I really did enjoy the experience. I still believe Fargate is a fantastic technology, and what the AWS team has done with ECS/Fargate really is nothing short of remarkable. My perspective however is that this is definitely not “easier” than Kubernetes, it’s just.. different.

The problems that arise when running containers in production are largely the same. If you take anything away from this post it should be this: whichever way you choose is going to have operational overhead. Don’t fall into the trap of believing that you can just pick something and your world is going to be easier. My personal opinion is this: If you have an operations team and your company is going to be deploying containers across multiple app teams - pick a technology and build processes and tooling around it to make it easier.

I’m certainly going to take the claims from people that certain technology is easier with a grain of salt from now on. At this stage, when it comes to Fargate, this sums up my feelings:


I was at cfgmgmtcamp 2019 in Ghent, and did a talk which I think was well received about the need for some Kubernetes configuration management as well as the solution we built for it at $work, kr8.

I made a statement during the talk which ignited some fairly fierce discussion both online, and at the conference:

To put this into my own words:

At some point, we decided it was okay for us to template yaml. When did this happen? How is this acceptable?

After some conversation, I figured it was probably best to back up my claims in some way. This blog post is going to try to do that.

The configuration problem

Once the applications and infrastructure you’re going to manage grows past a certain size, you inevitably end up in some form of configuration complexity hell. If you’re only deploying 1 or maybe 2 things, you can write a yaml configuration file and be done with it. However once you grow beyond that, you need to figure out how to manage this complexity. It’s incredibly likely that the reason you have multiple configuration files is because the $thing that uses that config is slightly different from its companions. Examples of this include:

  • Applications deployed in different environments, like dev, stg and prod
  • Applications deployed in different regions, like Europe or North America

Obviously, not all the configuration is different here, but it’s likely the configuration differs enough that you want to be able to differentiate between the two.

This configuration complexity has been well known to Operators (System Administrators, DevOps engineers, whatever you want to call them) for some years now. An entire discipline grew up around this in Configuration Management, and each tool solved this problem in its own way, but ultimately, they used YAML to get the job done.

My favourite method has always been hiera, which comes bundled with Puppet. Having the ability to hierarchically look up the variables for specific config needs is incredibly powerful and flexible, and has generally meant you don’t actually need to do any templating of yaml at all, except perhaps for embedding Puppet facts into the yaml.
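For anyone who hasn’t seen it, a hiera hierarchy is just a small piece of yaml telling Puppet where to look values up, in order - the paths here are illustrative:

# hiera.yaml
version: 5
hierarchy:
  - name: "Per-node data"
    path: "nodes/%{trusted.certname}.yaml"
  - name: "Per-environment data"
    path: "environments/%{server_facts.environment}.yaml"
  - name: "Common defaults"
    path: "common.yaml"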

Did we go backwards?

Then, as our industry’s needs moved above the operating system and into cloud computing, we had a whole new data plane to configure. The tooling to configure this changed, and tools like CloudFormation and Helm appeared. These tools are excellent configuration tools, but I firmly believe we (as an industry) got something really, really wrong when we designed them. To examine that, let’s take a look at an example of a Helm chart taking a custom parameter.

Helm Charts

Helm charts can take external parameters defined in a values.yaml file which you specify when rendering the chart.

Let’s say my external parameter is simple - it’s a string. It’d look a bit like this:

image: "{{ .Values.image }}"

That’s not so bad right? You just specify a value for image in your values.yaml and you’re on your way.

The real problem starts to get highlighted when you want to do more complicated and complex things. In this particular example, you’re doing okay because you know you have to specify an image for a Kubernetes deployment. However, what if you’re working with something like an optional field? Well, then it gets a little more unwieldy:

{{- with .resourceGroup  }}
    resourceGroup: {{ .  }}
{{- end }}

Optional values just make things ugly in templating languages, and you can’t just leave the value blank, so you have to resort to ugly loops and conditionals that are probably going to bite you later.

Let’s say you need to go a step further, and you need to push an array or map into the config. With helm, you’d do something like this.

{{- with .Values.podAnnotations  }}
      annotations:
{{ toYaml . | indent 8  }}
{{- end  }}

Firstly, let’s ignore the madness of having a templating function toYaml to convert yaml to yaml and focus more on the whitespace issue here.

YAML has strict requirements and whitespace implementation rules. The following, for example, is not valid or complete yaml:

something: nothing
  hello: goodbye

Generally, if you’re handwriting something, this isn’t necessarily a problem because you just hit backspace twice and it’s fixed. However, if you’re generating YAML using a templating system, you can’t do that - and if you’re operating above 5 or 10 configuration files, you probably want to be generating your config rather than writing it.

So, in the above example, you want to embed the values of .Values.podAnnotations under the annotations field, which is indented already. So you’re having to not only indent your values, but indent them correctly.

What makes this even more confusing is that the Go template parser doesn’t actually know anything about YAML at all, so if you try to keep the syntax clean and indent the templates like this:

{{- with .Values.podAnnotations }}
      annotations:
      {{ toYaml . | indent 6 }}
{{- end  }}

You actually can’t do that, because the templating system gets confused. This is a singular example of the complexity and difficulty you end up facing when generating config data in YAML, but when you really start to do more complex work, it really starts to become obvious that this isn’t the way to go.

Needless to say, this isn’t what I want to spend my time doing. If fiddling around with whitespace requirements in a templating system doing something it’s not really designed for is what suits you, then I’m not going to stop you. I also don’t want to spend my time writing configuration in JSON without comments and accidentally missing commas all over the shop. We (as an industry) decided a long time ago that shit wasn’t going to work and that’s why YAML exists.

So what should we do instead? That’s where jsonnet comes in.

JSON, Jsonnet & YAML

Before we actually talk about Jsonnet, it’s worth reminding people of a very important (but oft-forgotten) point: YAML is a superset of JSON, and converting between the two is trivial. Many applications and programming languages will parse JSON and YAML natively, and many can convert between the two very simply. For example, in Python:

python -c 'import json, sys, yaml ; y=yaml.safe_load(sys.stdin.read()) ; print(json.dumps(y))'

So with that in mind, let’s talk about Jsonnet.

Welcome to the church of Jsonnet

Jsonnet is a relatively new, little known (outside the Kubernetes community?) language that calls itself a data templating language. It’s definitely a good exercise to read and consume the Jsonnet design rationale page to get an idea why it exists, but if I was going to define in a nutshell what its purpose is - it’s to generate JSON config.

So, how does it help, exactly?

Well, let’s take our earlier example - we want to generate some JSON config specifying a parameter (ie, the image string). We can do that very very easily with Jsonnet using external variables.

Firstly, let’s define some Jsonnet:

{

  image: std.extVar('image'),

}

Then, we can generate it using the Jsonnet command line tool, passing in the external variable as we need to:

jsonnet image.jsonnet -V image="my-image"
{
   "image": "my-image"
}

Easy!

Optional fields

Before, I noted that if you wanted to define an optional field, with YAML templating you had to define if statements for everything. With Jsonnet, you’re just defining code!

// define a variable - yes, jsonnet also has comments
local rg = null;
{


  image: std.extVar('image'),
  // if the variable is null, this will be blank
  [if rg != null then 'resourceGroup']: rg,

}

The output here, because our variable is null, never actually populates resourceGroup:

jsonnet image.jsonnet -V image="my-image" 
{
   "image": "my-image"
}
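If you set the variable to a value instead (local rg = 'my-resource-group';), the field appears in the output:

jsonnet image.jsonnet -V image="my-image"
{
   "image": "my-image",
   "resourceGroup": "my-resource-group"
}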

Maps and parameters

Okay, now let’s look at our previous annotation example. We want to define some pod annotations, which takes a YAML map as its input. You want this map to be configurable by specifying external data, and obviously doing that on the command line sucks (you’d be very unlikely to specify this with Helm on the command line, for example) so generally you’d use Jsonnet imports to do this. I’m going to specify this config as a variable and then load that variable into the annotation:

local annotations = {
  'nginx.ingress.kubernetes.io/app-root': '/',
  'nginx.ingress.kubernetes.io/enable-cors': true,
};

{
  metadata: { // annotations are nested under the metadata of a pod
    annotations: annotations,
  },

}

This might just be my bias towards Jsonnet talking, but this is so dramatically easier than faffing about with indentation that I can’t even begin to describe it.
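As an aside, the import-based approach I mentioned above is just as straightforward - pull the map out into its own file (the file name here is arbitrary) and import it:

// annotations.libsonnet
{
  'nginx.ingress.kubernetes.io/app-root': '/',
  'nginx.ingress.kubernetes.io/enable-cors': true,
}

// main.jsonnet
local annotations = import 'annotations.libsonnet';

{
  metadata: {
    annotations: annotations,
  },
}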

Additional goodies

The final thing I wanted to quickly explore, which is something that I feel can’t really be done with Helm and other yaml templating tools, is the concept of manipulating existing objects in config.

Let’s take our example above with the annotations, and look at the result file:

{
   "metadata": {
      "annotations": {
         "nginx.ingress.kubernetes.io/app-root": "/",
         "nginx.ingress.kubernetes.io/enable-cors": true
      }
   }
}

Now, let’s say for example I wanted to append a set of annotations to this annotations map. In any templating system, I’d probably have to rewrite the whole map.

Jsonnet makes this trivial. I can simply use the + operator to add something to this. Here’s a (poor) example:

local annotations = {
  'nginx.ingress.kubernetes.io/app-root': '/',
  'nginx.ingress.kubernetes.io/enable-cors': true,
};

{
  metadata: {
    annotations: annotations,
  },
} + { // this adds another JSON object
  metadata+: { // I'm using the + operator, so we'll append to the existing metadata
    annotations+: { // same as above
      something: 'nothing',
    },
  },
}

The end result is this:

{
   "metadata": {
      "annotations": {
         "nginx.ingress.kubernetes.io/app-root": "/",
         "nginx.ingress.kubernetes.io/enable-cors": true,
         "something": "nothing"
      }
   }
}

Obviously, in this case, it’s more code to do this, but as your examples get more complex, it becomes extremely useful to be able to manipulate objects this way.

Kr8

We use all of these methods in kr8 to make creating and manipulating configuration for multiple Kubernetes clusters easy and simple. I highly recommend you check it out if any of the concepts here have you nodding your head.


TL;DR: - go here

I often spend time in my day job wishing I could implement $newtech. I’m lucky enough to be working on projects right now that many people would find exciting, interesting and challenging, however it’s often the case that I see something I’d like to try, but deploying it at $dayjob requires me to design for large scale and with security and compliance in mind.

When this happens, I generally try it out in my “homelab”. This might mean trying it in a cloud account (I’m particularly fond of DigitalOcean for this) but I also recently reinvested (I moved to another country last year, and had to sell my previous homelab equipment) in a very small homelab consisting of 3 mini PCs and a Dell T30 server, along with some UniFi networking gear.

My original intention was to blog about the journey, but I realised this might end up being more time consuming than I’d like, so with that in mind I decided that perhaps the best way to contribute knowledge back to the community was via Github.

I’ve created a new Github Org, lbrlabs to hold all this configuration. Alongside this, I’ve created project boards which will detail my journey as I build out the software in my homelab.

Currently, the Org consists of 3 repos:

  • tf-kubernetes-clusters - a repo containing simple terraform code for Kubernetes clusters for a wide variety of cloud providers. The intention here is to make launching a cluster easy and straightforward for testing purposes
  • puppet-homelab - a Puppet control repo containing roles and profiles for my homelab. This could be used as a starting point for anyone wishing to build out a homelab, I’d encourage forking this and tailoring to your needs
  • kr8-cluster-config - a repo containing configuration for kr8 which allows me to quickly and easily install components inside the Kubernetes clusters I build. As an example I have components like metallb which allow me to have Kubernetes LoadBalancers.

Some of the other tooling I’ve implemented includes:

In the near future, I plan on implementing other tech like:

  • Vault for secret management
  • Prometheus
  • eyaml encryption in Puppet

My hope is that doing this in the open can help other homelabbers learn about enterprise software, specifically DevOps related projects.

I encourage people to open issues in the repos, asking questions about how to implement things. Hopefully this can be my way to give back to the community.


Previous visitors to this blog will remember I wrote about configuration management for Kubernetes clusters, and how the space was lacking. For those not familiar, the problem statement is this: it’s really hard to maintain and manage configuration for components of multiple Kubernetes clusters. As the number of clusters you have starts to scale, it becomes difficult to keep the things you need to run in them (such as ingress controllers) configured and in sync, as well as to manage the subtle differences that exist across accounts and regions.

With that in mind, it’s my pleasure to announce that at my employer, Apptio, we have tried to solve this problem with kr8. kr8 is an opinionated Kubernetes cluster configuration management tool, designed to be simple and flexible, and to use off-the-shelf tools where possible. This blog post details some of the design goals of kr8, as well as some of the benefits and a few examples.

Design Goals

The intention when making kr8 was to allow us to generate manifests for a variety of Kubernetes clusters, and give us the ability to template and override yaml parameters where possible. We took inspiration from a variety of different tools such as Kustomize, Kasane, Ksonnet and many others on our journey to creating a configuration management framework that is relatively simple to use, and follows some of the practices we’re used to as Puppet administrators.

Other design goals included:

  • No templating engine
  • Flexibility
  • Compatibility with existing deployment tools, like Helm
  • Small binaries, with off the shelf tools used where needed

The end goal was to be able to take existing helm charts, or yaml installation manifests, then manipulate them to our needs. We chose to use jsonnet as the language of kr8 due to its data templating capabilities and ability to work with both JSON and YAML.

Terminology & Tools

kr8

kr8 itself is the only component of the kr8 framework that we wrote at Apptio. Its purposes are:

You can see the result of these purposes using a few of the tools in the kr8 binary, for example, listing clusters:

kr8 cluster list
+--------------------+--------------------------------------------------------------------+
|        NAME        |                                PATH                                |
+--------------------+--------------------------------------------------------------------+
| do                 | /Users/Lee/github/cluster_config/clusters/do                       |
| gcloud             | /Users/Lee/github/cluster_config/clusters/gcloud                   |
| kops               | /Users/Lee/github/cluster_config/clusters/kops                     |
| docker-for-desktop | /Users/Lee/github/cluster_config/clusters/local/docker-for-desktop |
| minikube           | /Users/Lee/github/cluster_config/clusters/local/minikube           |
+--------------------+--------------------------------------------------------------------+

However, using the kr8 binary alone is probably not what you want to do. We bundle and use a variety of other tools with kr8 to achieve the ability to generate manifests for multiple clusters and deploy them.

Task

Task does a lot of the heavy lifting for kr8. It is a task runner, much like Make, but with a more flexible DSL (yep, it’s yaml/json!) and the ability to run tasks in parallel. We use a Taskfile for each component to build that component’s config. This gives us the flexibility to use whichever rendering option makes sense for each component, whether that’s pulling in a Helm chart or plain yaml. We can then take that yaml into kr8 and manipulate it with jsonnet code to add to, or modify, the resulting Kubernetes manifest. Alongside this, we use a Taskfile to generate deployment tasks and to generate all of a cluster’s components. This gives us the ability to execute lots of manifest generation jobs in a relatively short period of time.
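kr8’s real Taskfiles are a little more involved, but a hypothetical component Taskfile gives a feel for the shape - here rendering a helm chart into plain yaml that can then be manipulated (the chart, paths and task name are illustrative):

# Taskfile.yml
version: '2'

tasks:
  generate:
    desc: render the nginx-ingress chart to plain yaml
    cmds:
      - helm fetch --untar --untardir ./vendor stable/nginx-ingress
      - helm template ./vendor/nginx-ingress -f values.yaml > generated.yaml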

Kubecfg

We use Kubecfg to do the actual deployment of these manifests. Kubecfg gives us the ability to validate, diff and iteratively deploy Kubernetes manifests which kubectl does not. The kubecfg jobs are generally inside Taskfiles at the cluster level.
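In practice that looks something like this (the manifest path is illustrative):

# check the rendered manifests are valid against the cluster's API
kubecfg validate generated/nginx-ingress.yaml

# show what would change in the cluster
kubecfg diff generated/nginx-ingress.yaml

# apply the changes
kubecfg update generated/nginx-ingress.yaml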

Components

Components are very similar to helm charts. They are installable resource collections for something you’d like to deploy to your Kubernetes clusters. A component has 3 minimal requirements:

  • params.jsonnet: contains configurable params for the component
  • Taskfile.yml: Instructions to render the component
  • An installation source: This can be anything from a pure jsonnet file to a helm input values file. Ultimately, this needs to be able to generate some yaml

I intend to write many more kr8 blog posts and docs, detailing how kr8 can work, examples of different components and tutorials. Until then, take a look at these resources:

Thanks

I want to thank Colin Spargo, who came up with the original concept of kr8 and how it might work, as well as contributing large amounts of the kr8 code. I also want to thank Sanyu Melwani, who had valuable input into the concept, as well as writing many kr8 components.

Finally, a thank you to our employer, Apptio, who has allowed us to spend time creating this tool to ease our Kubernetes deployment frustrations. If you’re interested in working on fun projects like this, we are hiring for remote team members.


Serverless computing is all the rage at the moment, and why wouldn’t it be? The idea of deploying code without having to worry about anything like servers, or that pesky infrastructure everyone complains about, seems pretty appealing. If you’ve ever used AWS Lambda or one of its related cousins, you’ll be able to see the freedom that triggering functions on events brings you.

The increase in excitement around serverless frameworks means that naturally, there’s been an increase in providers in the Kubernetes world. A quick look at the CNCF Landscape page shows just how many options are available to Kubernetes cluster operators.

In this post I wanted to look at Kubeless, a serverless framework written by the awesome people at Bitnami.

Kubeless appealed to me specifically for a few reasons:

  • Native Kubernetes resources (CRDs) for functions, meaning that standard Kubernetes deployment constructs can be used
  • No external dependencies to get started
  • Support for PubSub functions without having to manually bind to message queues etc
  • Lots of language support with the runtimes

Before you get started with this, you’ll probably want to follow the Quick Start as well as get an understanding of how the PubSub functions work.

To follow along here you’ll need:

  • A working kubeless deployment, including the kubeless cli
  • A working NATS cluster, perhaps using the NATS Operator.
  • You’ll also need the Kubeless NATS trigger installed in your cluster.

In this walkthrough, I wanted to show you how easy it is to get Kubernetes events (in this case, pod creations) and then use kubeless to perform actions on them (like post to a slack channel).

I’m aware there are tools out there that already fulfill this function (ie events to slack) but I figured it was a good showcase of what can be done!

Publishing Kubernetes Events

Before you can trigger kubeless functions, you first need to have events from Kubernetes published to your NATS cluster.

To do this, I used the excellent kubernetes python library.

An easy way to do this is to simply connect to the API (locally via your kubeconfig, or using the in_cluster capabilities when running inside the cluster) and then watch all the pods, like so:

from kubernetes import client, config, watch

def main():
  # use your machines kubeconfig
  config.load_kube_config()

  v1 = client.CoreV1Api()

  w = watch.Watch()
  for event in w.stream(v1.list_pod_for_all_namespaces):
    print("Event: %s %s %s" % (event['type'], event['object'].kind, event['object'].metadata.name))


main()

This simple script will log all the information for pods in all namespaces to stdout. It can be run on your local machine, give it a try!

The problem with this is that it’s just spitting information to stdout to test locally, so we need to publish these events to NATS. In order to do this, we’ll use the python asyncio-nats library.

Now, your script has gotten much more complicated:

import asyncio
import json
import os

from kubernetes import client, config, watch

from nats.aio.client import Client as NATS
from nats.aio.errors import ErrConnectionClosed, ErrTimeout, ErrNoServers

# connect to the cluster defined in $KUBECONFIG
config.load_kube_config()

# set up the kubernetes core API
v1 = client.CoreV1Api()

# created a method that runs async
async def run(loop):
    # set up NATS connection
    nc = NATS()
    try:
        # connect to a NATS cluster on localhost
        await nc.connect('localhost:4222', loop=loop)
    except Exception as e:
        exit(e)

    # method to get pod events async
    async def get_pod_events():
        w = watch.Watch()
        for event in w.stream(v1.list_pod_for_all_namespaces):
            print("Event: %s %s %s" % (event['type'], event['object'].kind, event['object'].metadata.name))
            msg = {'type':event['type'],'object':event['raw_object']}

            # publish the events to a NATS topic k8s_events
            await nc.publish("k8s_events", json.dumps(msg).encode('utf-8'))
            await asyncio.sleep(0.1)

    # run the method
    await get_pod_events()
    await nc.close()

if __name__ == '__main__':

    # set up the async loop
    loop = asyncio.get_event_loop()
    loop.create_task(run(loop))
    try:
        # run until killed
        loop.run_forever()
        print("Event: %s %s %s" % (event['type'], event['object'].kind, event['object'].metadata.name))    
    # if killed by keyboard shutdown, clean up nicely
    except KeyboardInterrupt:
        print('keyboard shutdown')
        tasks = asyncio.gather(*asyncio.Task.all_tasks(loop=loop), loop=loop, return_exceptions=True)
        tasks.add_done_callback(lambda t: loop.stop())
        tasks.cancel()

        # Keep the event loop running until it is either destroyed or all
        # tasks have really terminated
        while not tasks.done() and not loop.is_closed():
            loop.run_forever()
    finally:
        print('closing event loop')
        loop.close()

Okay, so now we have events being pushed to NATS. We need to fancy this up a bit, to allow for running both in and out of cluster, as well as building a Docker image. The final script can be found here. The changes include a logger module, and argparse to allow running in or out of the cluster and to make some options configurable.

You should now deploy this to your cluster using the provided deployment manifests, which also include the (rather permissive!) RBAC configuration needed for the deployment to be able to read pod information from the API.

kubectl apply -f https://raw.githubusercontent.com/jaxxstorm/kubeless-events-example/master/manifests/events.yaml

This will install the built docker container to publish events to the NATS cluster configured earlier. Modify the NATS_CLUSTER environment variable if you deployed your NATS cluster to another address.

Consuming Events with Kubeless functions

So now the events are being published, we need to actually do something with them. Let’s first make sure the events are coming in.

You should have the kubeless cli downloaded by now, so let’s create a quick example function to make sure the events are being posted.

def dump(event, context):
    print(event)
    return(event)

As you can probably tell, this function just dumps any event sent to it and returns. So let’s try it out. With kubeless, let’s deploy it:

kubeless function deploy test --runtime python3.6 --handler test.dump --from-file functions/test.py -n kubeless

What’s happening here, exactly?

  • runtime: specify a runtime for the function, in this case, python 3.6
  • from-file: path to your file containing your function
  • handler: this is the important part. A handler is the kubeless function to call when an event is received. It’s in the format <filename>.<functionname>. So in our case, our file was called test.py and our function was called dump, so we specify test.dump.
  • namespace: make sure you specify the namespace you deployed kubeless to!

So you should now have a function deployed:

kubeless function ls -n kubeless
NAME	NAMESPACE	HANDLER  	RUNTIME  	DEPENDENCIES	STATUS
test	kubeless 	test.dump	python3.6	            	1/1 READY

So now, we need to have this function be triggered by the NATS messages. To do that, we add a trigger:

kubeless trigger nats create test --function-selector created-by=kubeless,function=test --trigger-topic k8s_events -n kubeless

What’s going on here?

  • We create a trigger with name test
  • function-selector: use labels created by kubeless function to select the function to run
  • trigger-topic: specify a trigger topic of k8s_events (which is specified in the event publisher from earlier)
  • Same namespace!

Okay, so now, let’s cycle the event publisher and test things out!

# delete the current nats event pods
# the deployment will recreate them, republishing their events
kubectl delete po -l app=kubeless-nats-events -n kubeless

# read the pod logs from the kubeless function we created
kubectl logs -l function=test -n kubeless

You should see something like this as an output log:

{'data': {'type': 'MODIFIED', 'object': {'kind': 'Pod', 'apiVersion': 'v1', 'metadata': {'name': 'nats-trigger-controller-66df75894b-mxppc', 'generateName': 'nats-trigger-controller-66df75894b-', 'namespace': 'kubeless', 'selfLink': '/api/v1/namespaces/kubeless/pods/nats-trigger-controller-66df75894b-mxppc', 'uid': '01ead80a-d0ef-11e8-8e5f-42010aa80fc2', 'resourceVersion': '16046597', 'creationTimestamp': '2018-10-16T02:55:58Z', 'labels': {'kubeless': 'nats-trigger-controller', 'pod-template-hash': '2289314506'}, 'ownerReferences': [{'apiVersion': 'extensions/v1beta1', 'kind': 'ReplicaSet', 'name': 'nats-trigger-controller-66df75894b', 'uid': '01e65ec2-d0ef-11e8-8e5f-42010aa80fc2', 'controller': True, 'blockOwnerDeletion': True}]}, 'spec': {'volumes': [{'name': 'controller-acct-token-dwfj5', 'secret': {'secretName': 'controller-acct-token-dwfj5', 'defaultMode': 420}}], 'containers': [{'name': 'nats-trigger-controller', 'image': 'bitnami/nats-trigger-controller:v1.0.0-alpha.9', 'env': [{'name': 'KUBELESS_NAMESPACE', 'valueFrom': {'fieldRef': {'apiVersion': 'v1', 'fieldPath': 'metadata.namespace'}}}, {'name': 'KUBELESS_CONFIG', 'value': 'kubeless-config'}, {'name': 'NATS_URL', 'value': 'nats://nats-cluster.nats-io.svc.cluster.local:4222'}], 'resources': {}, 'volumeMounts': [{'name': 'controller-acct-token-dwfj5', 'readOnly': True, 'mountPath': '/var/run/secrets/kubernetes.io/serviceaccount'}], 'terminationMessagePath': '/dev/termination-log', 'terminationMessagePolicy': 'File', 'imagePullPolicy': 'IfNotPresent'}], 'restartPolicy': 'Always', 'terminationGracePeriodSeconds': 30, 'dnsPolicy': 'ClusterFirst', 'serviceAccountName': 'controller-acct', 'serviceAccount': 'controller-acct', 'nodeName': 'gke-lbr-default-pool-3071fe55-kfm6', 'securityContext': {}, 'schedulerName': 'default-scheduler', 'tolerations': [{'key': 'node.kubernetes.io/not-ready', 'operator': 'Exists', 'effect': 'NoExecute', 'tolerationSeconds': 300}, {'key': 'node.kubernetes.io/unreachable', 'operator': 'Exists', 'effect': 'NoExecute', 'tolerationSeconds': 300}]}, 'status': {'phase': 'Running', 'conditions': [{'type': 'Initialized', 'status': 'True', 'lastProbeTime': None, 'lastTransitionTime': '2018-10-16T02:55:58Z'}, {'type': 'Ready', 'status': 'True', 'lastProbeTime': None, 'lastTransitionTime': '2018-10-16T02:56:00Z'}, {'type': 'PodScheduled', 'status': 'True', 'lastProbeTime': None, 'lastTransitionTime': '2018-10-16T02:55:58Z'}], 'hostIP': '10.168.0.5', 'podIP': '10.36.14.29', 'startTime': '2018-10-16T02:55:58Z', 'containerStatuses': [{'name': 'nats-trigger-controller', 'state': {'running': {'startedAt': '2018-10-16T02:55:59Z'}}, 'lastState': {}, 'ready': True, 'restartCount': 0, 'image': 'bitnami/nats-trigger-controller:v1.0.0-alpha.9', 'imageID': 'docker-pullable://bitnami/[email protected]:d04c8141d12838732bdb33b2890eb00e5639c243e33d4ebdad34d2e065c54357', 'containerID': 'docker://d9ff52fa7e5d821c12fd910e85776e2a422098cde37a7f2c8054e7457f41aa1e'}], 'qosClass': 'BestEffort'}}}, 'event-id': 'YMAok3se6ZQyMTM', 'event-type': 'application/json', 'event-time': '2018-10-16 02:56:00.738120124 +0000 UTC', 'event-namespace': 'natstriggers.kubeless.io', 'extensions': {'request': <LocalRequest: POST http://test.kubeless.svc.cluster.local:8080/>}}

This is the modification event for the pod you just cycled. Awesome!

Publish the event to slack

Okay, so now you’ve got some events being shipped, it’s time to get a little bit more creative. Let’s publish some of these events to slack.

You can create a slack.py with your function in, like so:

#!/usr/bin/env python
from slackclient import SlackClient
import os

def slack_message(event, context):
    try:
        # Grab the slack token and channel from the environment vars
        slack_token = os.environ.get('SLACK_TOKEN', None)
        slack_channel = os.environ.get('SLACK_CHANNEL', None)
        if not slack_token:
            msg = 'No slack token set, can\'t do anything'
            print(msg)
            return msg
        else:
            # post a slack message!
            sc = SlackClient(slack_token)
            sc.api_call(
                "chat.postMessage",
                channel=slack_channel,
                attachments=[ { "title": event['data']['type']+" Pod event", "text": str(event)  } ]
            )

            print(event)

    except Exception as inst:
        print(type(inst))
        print(inst)
        print(event.keys())
        print(type(event))
        print(str(event))

You’ll need to deploy your function using the kubeless binary:

export SLACK_TOKEN=<my_slack_token>
export SLACK_CHANNEL=<my_slack_channel>

kubeless function deploy slack --runtime python3.6 --handler slack.slack_message --from-file slack.py -n kubeless --env SLACK_TOKEN=$SLACK_TOKEN --env SLACK_CHANNEL=$SLACK_CHANNEL --dependencies requirements.txt

The only thing you might be confused about here is the --dependencies file. Kubeless uses this to determine which dependencies you need to install for the function runtime. In the python case, it’s a requirements.txt. You can find a working one in the related github repo linked to this post. This example better formats the slack responses into nice slack output, so it’s worth taking a look at.
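If you’re building this up yourself rather than cloning the repo, the requirements.txt for this function likely only needs the slack client library:

slackclient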

You’ll obviously need a slack org to try this out, and you’ll need to generate a slack token to get API access. However, once you cycle the events pod again (or run another pod, of course!) you’ll see these events pushed to slack!

slack-events

Wrap up

Obviously this is a trivial example of using these functions, but the power of the event pipeline with kubeless is there to be seen. Anything you might need to happen when certain events occur in your Kubernetes cluster can be automated using this Kubeless event pipeline.

You can check out all the code, deployment and manifests for this post in the github repo that accompanies this post. Pull requests and feedback on my awful Python code are also welcome!