<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
>
	<channel>
		<title></title>
		<description>Engineering, DevOps &amp; Cloud Computing</description>
		<sy:updatePeriod>daily</sy:updatePeriod>
		<sy:updateFrequency>1</sy:updateFrequency>
		<link>https://leebriggs.co.uk/</link>
		<atom:link href="https://leebriggs.co.uk/feed.xml" rel="self" type="application/rss+xml" />
		<lastBuildDate>Fri, 06 Feb 2026 00:00:00 +0000</lastBuildDate>
		
		
            
                <item>
                    <title>Landlord: a tenancy controller and experiment in AI driven product building</title>
                    
                        <description>&lt;div class=&quot;alert alert-info&quot; role=&quot;alert&quot;&gt;&lt;i class=&quot;fa fa-info-circle&quot;&gt;&lt;/i&gt; &lt;b&gt;Note:&lt;/b&gt; The TL;DR of this post is I built a thing with AI, you can check it out &lt;a href=&quot;https://github.com/jaxxstorm/landlord&quot;&gt;here&lt;/a&gt;&lt;/div&gt;

&lt;p&gt;Back in 2012 I joined a growing startup that sold SaaS software to its customers. For those of you who don’t know how the infrastructure world looked back then: AWS, Google Cloud and Azure were all relatively niche offerings at the time, Docker hadn’t been invented yet and most of the technologies you wanted to use to manage infrastructure were written in Ruby.&lt;/p&gt;

&lt;p&gt;I had never built any SaaS software, but I was still surprised at how things worked behind the scenes when I joined. Provisioning a new customer involved my “TechOps” team running some shell scripts, data was stored in an NFS mount shared between 2 or 3 racks of whitebox servers, and if one customer’s usage got out of control, we would have to rsync the data to another server to avoid any noisy neighbour problems.&lt;/p&gt;

&lt;p&gt;Obviously none of this was tenable in the long term, but it worked for the time we were in. There was an acknowledgement amongst our team of 5 or 6 system administrators (that’s DevOps or Platform engineers to you young ‘uns) that we couldn’t scale the business this way, and one of my more senior colleagues rolled up his sleeves and started automating some of the tasks we had in front of us.&lt;/p&gt;

&lt;p&gt;What emerged over the course of several weeks was a project we dubbed “selfserve”. Its functionality started simple: anything we could automate with a shell script was now executed by a “runner” which was dispatched from a centralized control plane. It would SSH into a box, execute the script and return the results to the control plane, which would then be rendered in PHP to whoever ran it. Our ability to scale our processes accelerated massively. In parallel, I worked with another colleague to automate the art of bringing servers online (with &lt;a href=&quot;https://www.puppet.com/&quot;&gt;Puppet&lt;/a&gt; and &lt;a href=&quot;https://cobbler.github.io/&quot;&gt;Cobbler&lt;/a&gt;, two technologies that seem to be on the path to extinction) and another colleague worked on streamlining our ability to get datacenters off the ground. Within the space of 2 years, we grew from 2 racks in RDU to 7 datacenters around the world with thousands of customers and millions of dollars in revenue.&lt;/p&gt;

&lt;p&gt;Another issue we had to solve was &lt;em&gt;tenancy&lt;/em&gt; - how should we segment our customers’ data so we meet compliance goals and scale effectively? Remember, this is well before Docker is considered the de facto standard, and this wasn’t a problem any of us had expertise in solving.&lt;/p&gt;

&lt;p&gt;Tenancy as a model is relatively common nowadays, and there are many ways to solve it. At the time, the &lt;em&gt;right&lt;/em&gt; way seemed to be to have distinct, segregated compute per tenant - and it scaled &lt;em&gt;remarkably&lt;/em&gt; well. Back in 2012 to 2015, we felt like it was the right approach, but as the company grew, we all held some regrets that we hadn’t been more ambitious about solving this problem.&lt;/p&gt;

&lt;p&gt;My career has changed since then, and I work in a more consultative role as a solutions engineer with companies trying to solve technology problems with &lt;a href=&quot;https://tailscale.com/&quot;&gt;Tailscale&lt;/a&gt;. What has consistently surprised me over this time is how many companies and orgs are solving the tenancy model in the same way! When I think about it now, wearing the scars of being paged at 2am for most of my career, the “compute-per-tenant” model actually makes a remarkable amount of sense. Sure, it’s &lt;em&gt;inefficient&lt;/em&gt; but it’s safe, easy and reduces the blast radius of outages. As with all technology decisions, the tradeoffs have to be considered, but I speak to a remarkable number of customers who have solved the problem the same way.&lt;/p&gt;

&lt;p&gt;Why am I telling you this story in a post with AI in the title? Well, I’ve had in the back of my mind that this problem space could be commoditized because of how common and ubiquitous it is. The problem was always how &lt;em&gt;hard&lt;/em&gt; it seemed to be to try and solve it on my own as a spare time project. If you haven’t solved this problem before, you’d be forgiven for thinking it’s quite easy to solve now: Just have a provisioning system hit the Kubernetes API! It’ll handle all this stuff for you! I don’t blame anyone for thinking that’s the right way to go about it, but it fails to understand that managing the automation itself then becomes the battle. If you’re a SaaS service, you probably want to have an automated onboarding flow that provisions the tenant for you, but you now need to build a durable process that involves lots of distinct steps and assumes the entire process is &lt;em&gt;reliable&lt;/em&gt;. What happens if you run out of compute in your AWS account? What if your Kubernetes API responds slowly? What if the Docker image you’re pulling pulls too slowly and the job times out? What about if the Docker image is wrong in the request, and now you’re in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ImagePullBackOff&lt;/code&gt; and you have to nurse it through automatically?&lt;/p&gt;

&lt;p&gt;You’re essentially in a &lt;a href=&quot;https://www.youtube.com/watch?v=pMh3v23smOk&quot;&gt;reliability maths problem&lt;/a&gt;, and what seemed like a simple problem becomes a complex web of inter-dependencies that all have to have 99.99999% uptime to work and be successful.&lt;/p&gt;
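
&lt;p&gt;To make that concrete, here’s a rough back-of-the-envelope sketch in Python. It assumes the steps run in series and fail independently, which real systems rarely do, and the step names and success rates are invented for illustration rather than measured from anything.&lt;/p&gt;

```python
import math


def chain_success(step_probabilities):
    """Probability that every step in a serial chain succeeds, assuming independence."""
    return math.prod(step_probabilities)


# Hypothetical provisioning flow: these steps and numbers are illustrative only.
steps = {
    "reserve compute": 0.999,
    "call cluster API": 0.999,
    "pull image": 0.99,
    "start container": 0.999,
    "register tenant": 0.999,
}

overall = chain_success(steps.values())
print(f"end-to-end success rate: {overall:.4f}")
print(f"expected failures per 1000 runs: {(1 - overall) * 1000:.1f}")
```

&lt;p&gt;With these made-up numbers, five steps that each look very reliable on their own still leave more than one in a hundred provisioning runs failing somewhere along the way.&lt;/p&gt;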

&lt;h2 id=&quot;landlord&quot;&gt;Landlord&lt;/h2&gt;

&lt;p&gt;As I mentioned earlier, I’ve seen this problem often enough that I wanted to build something to solve it for a modern tech stack, but it seemed incredibly daunting. If I was lucky enough to not have to work and could spend my time writing code all day, I figured I could probably get this done to a reasonable standard. I’m not what you’d consider the most talented software developer (after all, I started writing shell scripts and PHP!) but I had enough information to be dangerous, and most importantly I feel like I know relatively well &lt;em&gt;how&lt;/em&gt; to design a good system.&lt;/p&gt;

&lt;p&gt;Then the world was hit by the meteor now called agentic coding, and I had a little crack at building it. It was an incredibly frustrating experience because the LLM couldn’t hold enough context to build such a complicated system, and I felt like all I did was fight with the model.&lt;/p&gt;

&lt;p&gt;Then a few weeks ago, I started experimenting with &lt;a href=&quot;https://martinfowler.com/articles/exploring-gen-ai/sdd-3-tools.html&quot;&gt;spec-driven development&lt;/a&gt;. And things changed overnight.&lt;/p&gt;

&lt;p&gt;What has emerged is &lt;a href=&quot;https://github.com/jaxxstorm/landlord&quot;&gt;landlord&lt;/a&gt;, an experimental, pluggable compute manager that provisions &lt;em&gt;tenants&lt;/em&gt; via a series of different compute providers. It leverages pluggable &lt;em&gt;workflow&lt;/em&gt; providers to handle the durable executions, and looks very similar in design to how the first versions of &lt;em&gt;selfserve&lt;/em&gt; looked back in 2012/2013. Written in Go, it consists of a single API managed control plane that can handle most of the work you need to do to provision tenants in a way that’s reliable and effective.&lt;/p&gt;

&lt;h2 id=&quot;walkthrough&quot;&gt;Walkthrough&lt;/h2&gt;

&lt;p&gt;Landlord currently supports one workflow provider, &lt;a href=&quot;https://restate.dev/&quot;&gt;Restate&lt;/a&gt;, which handles the durable execution of the compute. It supports two databases, SQLite and Postgres, to store the state of the execution. It currently supports one compute provider, Docker.&lt;/p&gt;
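
&lt;p&gt;As a loose illustration of what “pluggable” means here, the Python sketch below shows one way a provider registry could look. Landlord itself is written in Go and its real interfaces aren’t shown in this post, so &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ComputeProvider&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;InMemoryProvider&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;get_provider&lt;/code&gt; are all hypothetical names, and the in-memory provider is a fake that stands in for something like Docker.&lt;/p&gt;

```python
from abc import ABC, abstractmethod


class ComputeProvider(ABC):
    """Hypothetical interface: anything that can run a tenant from a config dict."""

    @abstractmethod
    def provision(self, tenant_id, config):
        raise NotImplementedError


class InMemoryProvider(ComputeProvider):
    """Fake provider for illustration; a real one would talk to Docker, ECS and so on."""

    def __init__(self):
        self.tenants = {}

    def provision(self, tenant_id, config):
        # Store a copy of the config and return a name for the "running" compute.
        self.tenants[tenant_id] = dict(config)
        return f"landlord-tenant-{tenant_id}"


# Registry mapping a provider name to the class that implements it.
PROVIDERS = {"memory": InMemoryProvider}


def get_provider(name):
    """Look up a provider by name, failing loudly on unknown names."""
    if name not in PROVIDERS:
        raise ValueError(f"unknown compute provider: {name}")
    return PROVIDERS[name]()


provider = get_provider("memory")
print(provider.provision("0b79eb9b-2c4c-43fb-85f0-aeb0ed63de78", {"image": "nginx:1.25"}))
```

&lt;p&gt;The appeal of this shape is that the control plane only ever talks to the interface, so adding a new compute backend means registering one more class rather than touching the provisioning flow.&lt;/p&gt;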

&lt;p&gt;We start by provisioning a tenant. Landlord accepts config for the tenant in the format of the compute provider you’re provisioning.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;go run ./cmd/cli create &lt;span class=&quot;nt&quot;&gt;--tenant-name&lt;/span&gt; lbr &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--config&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;'{
    &quot;image&quot;: &quot;nginx:1.25&quot;,
    &quot;env&quot;: {
      &quot;FOO&quot;: &quot;bar&quot;
    },
    &quot;ports&quot;: [
      {
        &quot;container_port&quot;: 80,
        &quot;host_port&quot;: 8888,
        &quot;protocol&quot;: &quot;tcp&quot;
      }
    ]
  }'&lt;/span&gt;
Tenant created
ID: 0b79eb9b-2c4c-43fb-85f0-aeb0ed63de78
Name: lbr
Status: requested
Config: &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;env&quot;&lt;/span&gt;:&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;FOO&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;bar&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;image&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;nginx:1.25&quot;&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;ports&quot;&lt;/span&gt;:[&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;container_port&quot;&lt;/span&gt;:80,&lt;span class=&quot;s2&quot;&gt;&quot;host_port&quot;&lt;/span&gt;:8888,&lt;span class=&quot;s2&quot;&gt;&quot;protocol&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;tcp&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}]}&lt;/span&gt;
Compute: &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;env&quot;&lt;/span&gt;:&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;FOO&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;bar&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;image&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;nginx:1.25&quot;&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;ports&quot;&lt;/span&gt;:[&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;container_port&quot;&lt;/span&gt;:80,&lt;span class=&quot;s2&quot;&gt;&quot;host_port&quot;&lt;/span&gt;:8888,&lt;span class=&quot;s2&quot;&gt;&quot;protocol&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;tcp&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}]}&lt;/span&gt;
Created At: 2026-02-06T16:00:09Z
Updated At: 2026-02-06T16:00:09Z
Version: 1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This request then kicks off a workflow to the pluggable workflow provider, which includes a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;worker&lt;/code&gt; that performs the execution. By dispatching this off to a workflow provider, we mitigate against any errors in the request flow, meaning we can provision tenants in parallel without worrying about building durable code.&lt;/p&gt;
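
&lt;p&gt;If durable execution is new to you, the core idea can be sketched in a few lines of Python. To be clear, this is not Restate’s API or Landlord’s code: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Journal&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_step&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;flaky_pull&lt;/code&gt; are names made up for this example, and the “journal” is an in-memory dict where a real workflow engine would persist to storage. Each step is retried on failure and its result is recorded, so re-running the workflow replays finished steps instead of executing them again.&lt;/p&gt;

```python
import time


class Journal:
    """Toy stand-in for durable storage: remembers the result of each finished step."""

    def __init__(self):
        self.results = {}


def run_step(journal, name, action, max_attempts=3, delay_seconds=0.0):
    """Run action until it succeeds, then record the result so later calls replay it."""
    if name in journal.results:
        # Already completed in an earlier run: replay the recorded result.
        return journal.results[name]
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            result = action()
        except Exception as error:
            last_error = error
            time.sleep(delay_seconds * attempt)  # simple linear backoff
        else:
            journal.results[name] = result
            return result
    raise RuntimeError(f"step {name!r} failed after {max_attempts} attempts") from last_error


# Hypothetical flaky step: times out on the first call, then succeeds.
calls = {"pull": 0}


def flaky_pull():
    calls["pull"] += 1
    if calls["pull"] == 1:
        raise TimeoutError("image pull timed out")
    return "nginx:1.25"


journal = Journal()
first = run_step(journal, "pull-image", flaky_pull)   # one failure, then success
replay = run_step(journal, "pull-image", flaky_pull)  # served from the journal
print(first, replay, calls["pull"])
```

&lt;p&gt;The second call never touches &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;flaky_pull&lt;/code&gt;, which is the property that lets a crashed workflow resume without redoing work it has already finished.&lt;/p&gt;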

&lt;p&gt;The tenant has been requested, meaning the workflow job is being dispatched:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;go run ./cmd/cli list
ID                                    Name  Status     Workflow  Retries
0b79eb9b-2c4c-43fb-85f0-aeb0ed63de78  lbr   requested
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The workflow job runs and finally, I can see my tenant is up and running!&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
ID: 0b79eb9b-2c4c-43fb-85f0-aeb0ed63de78
Name: lbr
Status: ready
Status Message: Workflow execution completed: inv_1gD1CwSdyy410oGeothz81216ifhbZkCAx
Workflow Sub-State: succeeded
Config: &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;env&quot;&lt;/span&gt;:&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;FOO&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;bar&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;image&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;nginx:1.25&quot;&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;ports&quot;&lt;/span&gt;:[&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;container_port&quot;&lt;/span&gt;:80,&lt;span class=&quot;s2&quot;&gt;&quot;host_port&quot;&lt;/span&gt;:8888,&lt;span class=&quot;s2&quot;&gt;&quot;protocol&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;tcp&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}]}&lt;/span&gt;
Compute: &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;env&quot;&lt;/span&gt;:&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;FOO&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;bar&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;image&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;nginx:1.25&quot;&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;ports&quot;&lt;/span&gt;:[&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;container_port&quot;&lt;/span&gt;:80,&lt;span class=&quot;s2&quot;&gt;&quot;host_port&quot;&lt;/span&gt;:8888,&lt;span class=&quot;s2&quot;&gt;&quot;protocol&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;tcp&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}]}&lt;/span&gt;
Created At: 2026-02-06T16:00:09Z
Updated At: 2026-02-06T16:00:22Z
Version: 3
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Something I took from modern reconciliation approaches like Kubernetes was the ability to set desired state. If I want to modify the configuration, I can simply issue a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;set&lt;/code&gt; command with a new image or config. Let’s add a new environment variable:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;go run ./cmd/cli &lt;span class=&quot;nb&quot;&gt;set&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--tenant-name&lt;/span&gt; lbr &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--config&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;'{
    &quot;image&quot;: &quot;nginx:1.25&quot;,
    &quot;env&quot;: {
      &quot;FOO&quot;: &quot;bar&quot;,
      &quot;LAND&quot;: &quot;lord&quot;
    },
    &quot;ports&quot;: [
      {
        &quot;container_port&quot;: 80,
        &quot;host_port&quot;: 8888,
        &quot;protocol&quot;: &quot;tcp&quot;
      }
    ]
  }'&lt;/span&gt;
Tenant updated
ID: 0b79eb9b-2c4c-43fb-85f0-aeb0ed63de78
Name: lbr
Status: updating
Status Message: Update requested
Config: &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;env&quot;&lt;/span&gt;:&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;FOO&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;bar&quot;&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;LAND&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;lord&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;image&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;nginx:1.25&quot;&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;ports&quot;&lt;/span&gt;:[&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;container_port&quot;&lt;/span&gt;:80,&lt;span class=&quot;s2&quot;&gt;&quot;host_port&quot;&lt;/span&gt;:8888,&lt;span class=&quot;s2&quot;&gt;&quot;protocol&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;tcp&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}]}&lt;/span&gt;
Compute: &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;env&quot;&lt;/span&gt;:&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;FOO&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;bar&quot;&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;LAND&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;lord&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;image&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;nginx:1.25&quot;&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;ports&quot;&lt;/span&gt;:[&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;container_port&quot;&lt;/span&gt;:80,&lt;span class=&quot;s2&quot;&gt;&quot;host_port&quot;&lt;/span&gt;:8888,&lt;span class=&quot;s2&quot;&gt;&quot;protocol&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;tcp&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}]}&lt;/span&gt;
Created At: 2026-02-06T16:00:09Z
Updated At: 2026-02-06T16:03:36Z
Version: 4
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This kicks off a new workflow:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;go run ./cmd/cli get &lt;span class=&quot;nt&quot;&gt;--tenant-name&lt;/span&gt; lbr
Tenant details
ID: 0b79eb9b-2c4c-43fb-85f0-aeb0ed63de78
Name: lbr
Status: ready
Status Message: Workflow execution completed: inv_17jEoSZg1AhW2mztVGxctly24kh6hua6el
Workflow Sub-State: succeeded
Config: &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;env&quot;&lt;/span&gt;:&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;FOO&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;bar&quot;&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;LAND&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;lord&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;image&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;nginx:1.25&quot;&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;ports&quot;&lt;/span&gt;:[&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;container_port&quot;&lt;/span&gt;:80,&lt;span class=&quot;s2&quot;&gt;&quot;host_port&quot;&lt;/span&gt;:8888,&lt;span class=&quot;s2&quot;&gt;&quot;protocol&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;tcp&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}]}&lt;/span&gt;
Compute: &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;env&quot;&lt;/span&gt;:&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;FOO&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;bar&quot;&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;LAND&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;lord&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;image&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;nginx:1.25&quot;&lt;/span&gt;,&lt;span class=&quot;s2&quot;&gt;&quot;ports&quot;&lt;/span&gt;:[&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;container_port&quot;&lt;/span&gt;:80,&lt;span class=&quot;s2&quot;&gt;&quot;host_port&quot;&lt;/span&gt;:8888,&lt;span class=&quot;s2&quot;&gt;&quot;protocol&quot;&lt;/span&gt;:&lt;span class=&quot;s2&quot;&gt;&quot;tcp&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}]}&lt;/span&gt;
Created At: 2026-02-06T16:00:09Z
Updated At: 2026-02-06T16:03:52Z
Version: 6
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;and uses the underlying compute primitives to replace the existing tenant with new config:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker ps &lt;span class=&quot;nt&quot;&gt;--filter&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;landlord.owner&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;landlord
CONTAINER ID   IMAGE        COMMAND                  CREATED              STATUS              PORTS                  NAMES
0962efd7c26b   nginx:1.25   &lt;span class=&quot;s2&quot;&gt;&quot;/docker-entrypoint.…&quot;&lt;/span&gt;   About a minute ago   Up About a minute   0.0.0.0:8888-&amp;gt;80/tcp   landlord-tenant-0b79eb9b-2c4c-43fb-85f0-aeb0ed63de78
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker inspect 0962efd7c26b | jq &lt;span class=&quot;s1&quot;&gt;'.[0].Config.Env'&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;
  &lt;span class=&quot;s2&quot;&gt;&quot;FOO=bar&quot;&lt;/span&gt;,
  &lt;span class=&quot;s2&quot;&gt;&quot;LAND=lord&quot;&lt;/span&gt;,
  &lt;span class=&quot;s2&quot;&gt;&quot;PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin&quot;&lt;/span&gt;,
  &lt;span class=&quot;s2&quot;&gt;&quot;NGINX_VERSION=1.25.5&quot;&lt;/span&gt;,
  &lt;span class=&quot;s2&quot;&gt;&quot;NJS_VERSION=0.8.4&quot;&lt;/span&gt;,
  &lt;span class=&quot;s2&quot;&gt;&quot;NJS_RELEASE=3~bookworm&quot;&lt;/span&gt;,
  &lt;span class=&quot;s2&quot;&gt;&quot;PKG_RELEASE=1~bookworm&quot;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
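
&lt;p&gt;The desired-state idea behind that update can be sketched roughly as follows. This is a simplified guess at the shape of the decision, not Landlord’s actual reconciliation logic: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;plan&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;changed_keys&lt;/code&gt; are hypothetical helpers, and the sketch assumes any config difference means the running compute gets replaced.&lt;/p&gt;

```python
def plan(desired, actual):
    """Decide how to move a tenant from its actual config to the desired one."""
    if actual is None:
        return "create"   # nothing running yet
    if desired == actual:
        return "noop"     # already converged
    return "replace"      # config differs: swap the running compute


def changed_keys(desired, actual):
    """Top-level config keys whose values differ between the two states."""
    keys = desired.keys() | actual.keys()
    return sorted(key for key in keys if desired.get(key) != actual.get(key))


# Configs taken from the walkthrough above.
running = {
    "image": "nginx:1.25",
    "env": {"FOO": "bar"},
    "ports": [{"container_port": 80, "host_port": 8888, "protocol": "tcp"}],
}
desired = dict(running, env={"FOO": "bar", "LAND": "lord"})

print(plan(desired, running), changed_keys(desired, running))
```

&lt;p&gt;Because the caller only ever states what the tenant should look like, running the same request twice is harmless: the second pass compares equal and plans a no-op.&lt;/p&gt;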

&lt;p&gt;I personally believe this is a tenancy model that is &lt;em&gt;very&lt;/em&gt; common around our industry, and I hope this experimental approach goes some way towards commoditizing this particular part of the stack.&lt;/p&gt;

&lt;h2 id=&quot;the-ai-of-it-all&quot;&gt;The AI of it all&lt;/h2&gt;

&lt;p&gt;I couldn’t in good faith write this post without talking about AI. As I mentioned before, I’ve had aspirations to build something like this for quite a while, but the level of effort involved seemed so incredibly daunting that I never really got started. Spec-driven development has completely changed what I could achieve in this space, and if you take a look at the &lt;a href=&quot;https://github.com/jaxxstorm/landlord/tree/main/openspec&quot;&gt;openspec directory&lt;/a&gt; in the landlord repo, you’ll be able to see just how long it took me to build something relatively complex.&lt;/p&gt;

&lt;p&gt;Now, how do I feel about the quality of this? Well, as I was reviewing the code generated by Codex 5.2, I did have to corral it a little bit. I’m not particularly enthused about the &lt;em&gt;quality&lt;/em&gt; of the code here. There are several areas in here where the model made trade-offs I would have likely not wanted to make, and a lot of times I had to catch it during the process of building a spec and say “actually no, please don’t do that”. The real thing that’s worth talking about here of course is productivity - the fact I could build a relatively complex system with domain knowledge of a problem I’ve already solved is remarkable to me.&lt;/p&gt;

&lt;h2 id=&quot;whats-next&quot;&gt;What’s next?&lt;/h2&gt;

&lt;p&gt;I’d like to continue adding pluggable interfaces to landlord. Support for Amazon Step Functions and Temporal as workflow engines is high on the agenda, as well as expanding the compute support to ECS, Kubernetes and EC2. After that? Who knows, it was fun reminiscing about career years gone by.&lt;/p&gt;

&lt;h2 id=&quot;faqs&quot;&gt;FAQs&lt;/h2&gt;

&lt;h3 id=&quot;remind-me-again-why-you-wouldnt-just-use-kubernetes-or-nomad-or-some-other-cluster-scheduler&quot;&gt;Remind me again why you wouldn’t just use Kubernetes or Nomad or some other cluster scheduler?&lt;/h3&gt;

&lt;p&gt;Nomad and Kubernetes solve the scheduling problem. They’re very good at placing workloads once you’ve decided what should exist. Landlord is focused on the control plane problem that sits above that:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;orchestrating multi-step provisioning&lt;/li&gt;
  &lt;li&gt;handling retries and partial failure&lt;/li&gt;
  &lt;li&gt;managing tenant lifecycle as a first-class concept&lt;/li&gt;
  &lt;li&gt;integrating workflow durability with compute execution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can absolutely use Kubernetes or Nomad underneath Landlord. The point is to avoid baking all of this logic directly into cluster automation where it becomes fragile and hard to reason about.&lt;/p&gt;

&lt;h3 id=&quot;is-this-production-ready&quot;&gt;Is this production-ready?&lt;/h3&gt;

&lt;p&gt;No - and that’s intentional.&lt;/p&gt;

&lt;p&gt;This is an experiment in system design, workflow durability, and spec-driven development. The abstractions matter more than the current implementations. Some parts are deliberately simple so the seams are visible.&lt;/p&gt;

&lt;p&gt;If this ever becomes “production-ready”, it will be because the design holds up as more providers and workflows are added.&lt;/p&gt;

&lt;h3 id=&quot;why-spec-driven-development-instead-of-just-writing-code&quot;&gt;Why spec-driven development instead of just writing code?&lt;/h3&gt;

&lt;p&gt;Because the hard part here isn’t syntax—it’s decision-making. Specs force clarity:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;what guarantees exist&lt;/li&gt;
  &lt;li&gt;what can fail&lt;/li&gt;
  &lt;li&gt;what is allowed to change&lt;/li&gt;
  &lt;li&gt;what must remain stable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once that’s written down, AI becomes genuinely useful as an accelerator instead of a liability. Without specs, you’re just arguing with the model.&lt;/p&gt;

&lt;h3 id=&quot;what-problem-are-you-actually-trying-to-commoditise&quot;&gt;What problem are you actually trying to commoditise?&lt;/h3&gt;

&lt;p&gt;The boring but critical middle:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;tenant provisioning&lt;/li&gt;
  &lt;li&gt;lifecycle management&lt;/li&gt;
  &lt;li&gt;durable automation&lt;/li&gt;
  &lt;li&gt;“what happens when this goes wrong?”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most teams end up re-building this ad-hoc. Landlord is an attempt to make that layer explicit, inspectable, and reusable.&lt;/p&gt;

&lt;h3 id=&quot;what-models-and-tools-did-you-use-to-build-this&quot;&gt;What models and tools did you use to build this?&lt;/h3&gt;

&lt;p&gt;I used a combination of &lt;a href=&quot;https://openai.com/index/introducing-gpt-5-2-codex/&quot;&gt;GPT-5.2-Codex&lt;/a&gt; with the &lt;a href=&quot;https://developers.openai.com/codex/ide/&quot;&gt;Codex VSCode extension&lt;/a&gt;, as well as &lt;a href=&quot;https://github.com/copilot&quot;&gt;GitHub Copilot&lt;/a&gt; with &lt;a href=&quot;https://www.anthropic.com/news/claude-sonnet-4-5&quot;&gt;Claude Sonnet 4.5&lt;/a&gt; for some light fixes and documentation.&lt;/p&gt;
</description>
                    <pubDate>Fri, 06 Feb 2026 00:00:00 +0000</pubDate>
                    <link>https://leebriggs.co.uk/blog/2026/02/06/landlord.html</link>
                    <guid isPermaLink="true">https://leebriggs.co.uk/blog/2026/02/06/landlord.html</guid>
                </item>
            
		
            
                <item>
                    <title>The enshittification of enshittification</title>
                    
                        <description>&lt;p&gt;Over the last six months, I’ve spent a lot of time in the Tailscale community - helping users debug issues, answering questions about how we sell the product and explaining, repeatedly, how we think about the business behind it.&lt;/p&gt;

&lt;p&gt;It’s rewarding work. I genuinely enjoy it.&lt;/p&gt;

&lt;p&gt;But threaded through almost every conversation is the same quiet fear:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;“Eventually you’re going to take this away from me, aren’t you?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;enshittification&quot;&gt;Enshittification&lt;/h2&gt;

&lt;p&gt;That fear usually traces back to &lt;a href=&quot;https://en.wikipedia.org/wiki/Cory_Doctorow&quot;&gt;Cory Doctorow’s&lt;/a&gt; concept of &lt;a href=&quot;https://en.wikipedia.org/wiki/Enshittification&quot;&gt;&lt;em&gt;enshittification&lt;/em&gt;&lt;/a&gt;: the idea that any service which genuinely solves a problem will, over time, mutate into a platform that extracts value from the people who depend on it.&lt;/p&gt;

&lt;p&gt;It’s frequently paired with another well-worn belief:&lt;br /&gt;
“If you’re not paying for anything, you’re the product.”&lt;/p&gt;

&lt;p&gt;These ideas have become common shorthand for how people reason about SaaS businesses. They’re not wrong, exactly - but when they’re treated as universal laws rather than patterns, they flatten a lot of important context. Before applying them indiscriminately to everything we like or rely on, it’s worth understanding what actually drives this behavior in the first place.&lt;/p&gt;

&lt;h2 id=&quot;the-vc-backed-engine&quot;&gt;The VC-backed engine&lt;/h2&gt;

&lt;p&gt;Most companies that take venture capital do so for the same basic reason: to grow. Growing a business requires capital, and capital is hard to come by when you’re small. You need to invest in marketing, hire engineers to improve the product, and hire salespeople to bring it to market.&lt;/p&gt;

&lt;p&gt;The tradeoff is straightforward. Once you take venture capital, you are no longer optimizing solely for users. You are also obligated to return capital to investors, and that obligation shapes every decision that follows. Growth in revenue, market share, and valuation isn’t just encouraged. It’s required.&lt;/p&gt;

&lt;p&gt;When it works, this becomes a flywheel. You spend money to grow. Growth improves your market position. That success unlocks more opportunity. Growing companies are fun places to work. I’ve spent most of my career in them because that energy is infectious and motivating.&lt;/p&gt;

&lt;p&gt;What’s discussed far less is how hard sustainable revenue growth actually is. Adding users to a free tier is often easy. Converting those users into paying customers is not, unless your go-to-market motion is exceptional. For a lot of companies, that pressure is where shortcuts start to look tempting.&lt;/p&gt;

&lt;h2 id=&quot;product-led-growth-actually&quot;&gt;Product-led growth, actually&lt;/h2&gt;

&lt;p&gt;Before going further, it’s worth being clear about my vantage point. I’m not the person who ultimately sets pricing or product strategy. But as Director of Solutions Engineering, I sit in the middle of customers, sales, and the community. When decisions land poorly, I feel it immediately.&lt;/p&gt;

&lt;p&gt;From where I sit, Tailscale’s go-to-market motion is product-led growth in a very literal, very unromantic sense. People show up because they have a real connectivity problem in their own lives and want it to stop being annoying. If the product works, they keep using it. If it doesn’t, they leave.&lt;/p&gt;

&lt;p&gt;What makes those users valuable isn’t that we extract revenue from them directly. It’s that those same problems inevitably show up at work. And once someone has used a tool that just works, their tolerance for brittle, frustrating alternatives drops fast.&lt;/p&gt;

&lt;p&gt;If a company like Tailscale were to start degrading the personal tier in pursuit of short-term revenue, that incentive chain would collapse. It wouldn’t unlock some hidden pool of money. It would remove the very top of the funnel that drives our revenue growth in the first place. Individual users aren’t a loss leader - they’re the mechanism by which trust, familiarity, and adoption propagate into larger rollouts inside organizations.&lt;/p&gt;

&lt;p&gt;This is also why comparisons to consumer platforms tend to fall apart. Those businesses operate at enormous scale, where marginal users can be monetized through ads, fees, or data. The market Tailscale operates in is different. Secure business connectivity is a multi-tens-of-billions-of-dollars global market, but that revenue comes from companies, not individuals. There simply aren’t enough people with homelabs, side projects, or personal VPN needs to generate meaningful revenue on their own. And trying to squeeze value out of them would actively harm the thing that makes the business work.&lt;/p&gt;

&lt;p&gt;I think that’s why we’re careful about what we do and don’t gate behind a paid plan. In practice, removing useful features from the personal tier doesn’t create value - it just makes the product worse. And making the product worse at the individual level directly undermines the &lt;em&gt;business&lt;/em&gt; trying to grow.&lt;/p&gt;

&lt;p&gt;From where I sit, we don’t want people paying us to make their personal networking tolerable. We want companies paying us because their employees already like using the product, and want that same experience at work. What keeps this model working isn’t clever pricing or gradual value extraction. It’s people who like the product enough to say, “Hey, this would make our lives easier,” even when that’s not their job.&lt;/p&gt;

&lt;p&gt;I’m literally incentivized, as a sales engineer at Tailscale, to sell the product. And I have very little interest in trying to generate revenue from personal users, because unhappy users don’t advocate for anything. Making them happy isn’t a feature - it’s the whole point.&lt;/p&gt;

&lt;h2 id=&quot;inevitability-is-lazy-thinking&quot;&gt;Inevitability is lazy thinking&lt;/h2&gt;

&lt;p&gt;One of the things I struggle with in the enshittification narrative is how quickly it turns into inevitability, like some self-fulfilling prophecy. There’s this pervasive idea that every company will eventually betray its users, so we may as well stop expecting anything better and that’s just capitalism baby.&lt;/p&gt;

&lt;p&gt;I think once we start to throw the idea of enshittification at literally everything, we’ve enshittified the concept of enshittification. We’ve removed all accountability from the equation.&lt;/p&gt;

&lt;p&gt;Companies don’t enshittify because time passes. They enshittify because incentives change, leadership priorities shift, or short-term outcomes are allowed to override long-term trust. Cynicism feels realistic, but more often than not it’s just resignation dressed up as wisdom.&lt;/p&gt;

&lt;h2 id=&quot;trust-is-a-business-constraint&quot;&gt;Trust is a business constraint&lt;/h2&gt;

&lt;p&gt;Trust isn’t a marketing asset - it’s a constraint. Once users believe you will eventually make their experience worse in pursuit of growth, every decision you make is filtered through that assumption and at that point, even good changes are met with suspicion. I’ve seen Reddit comments recently that have framed “new Tailscale features” as the beginning of Tailscale’s enshittification cycle because &lt;em&gt;any&lt;/em&gt; change is now considered enshittification.&lt;/p&gt;

&lt;p&gt;For a company like Tailscale, trust compounds slowly and breaks quickly. Our product sits directly in the critical path of how people access their networks and their work. If users stop believing we’re acting in good faith, no amount of pricing optimization or feature bundling will fix that. The most important thing about this is that &lt;em&gt;everyone who makes decisions at Tailscale knows it&lt;/em&gt;.&lt;/p&gt;

&lt;h2 id=&quot;what-would-actually-force-this-to-change&quot;&gt;What would actually force this to change&lt;/h2&gt;

&lt;p&gt;None of this is to say the model is immortal. There are conditions under which it would break. If the personal tier stopped being a meaningful driver of business adoption. If the problems we solve no longer overlapped between individuals and organizations. Or if the economics of running the platform changed in a way that made the current structure unsustainable.&lt;/p&gt;

&lt;p&gt;If that day ever comes, the &lt;em&gt;honest response&lt;/em&gt; wouldn’t be to degrade the experience and hope people don’t notice or don’t care. It would be to explain the tradeoffs openly, change the model explicitly, and accept the consequences of that decision. I suspect in that set of circumstances, Tailscale has enough competitors (including our open source control plane!) that it would lead to a significant drop off in users.&lt;/p&gt;

&lt;p&gt;Until then, it’s worth being honest about the consequences. Calling enshittification prematurely doesn’t protect users, &lt;em&gt;it can accelerate the conditions that make it more likely&lt;/em&gt;. If you scare people away from trying the product, you erode the very trust and adoption that keeps it working. It might make you feel better, but it’s also self-fulfilling cynicism - and it leaves everyone worse off.&lt;/p&gt;
</description>
                    <pubDate>Sun, 18 Jan 2026 00:00:00 +0000</pubDate>
                    <link>https://leebriggs.co.uk/blog/2026/01/18/enshittification.html</link>
                    <guid isPermaLink="true">https://leebriggs.co.uk/blog/2026/01/18/enshittification.html</guid>
                </item>
            
		
            
                <item>
                    <title>An easy, realistic model for MCP connectivity</title>
                    
                        <description>&lt;p&gt;You can’t escape it. Everywhere you turn in the tech ecosystem, AI is there.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/img/dwight-angela.jpeg&quot; alt=&quot;Dwight&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Whether you’re an AI skeptic or an AI convert, you almost certainly understand how explosive the change in the tech ecosystem has been, and how &lt;em&gt;fast&lt;/em&gt; everything is moving right now.&lt;/p&gt;

&lt;p&gt;I’ll keep my personal opinions about AI generally out of this blog post (mostly) but I’ve been staying very familiar with the &lt;a href=&quot;https://modelcontextprotocol.io/&quot;&gt;Model Context Protocol&lt;/a&gt; (MCP) for a few reasons, the primary one being that it seems &lt;em&gt;absolutely terrifying&lt;/em&gt; in ways I can’t really comprehend.&lt;/p&gt;

&lt;p&gt;Many of the security lessons we’ve learned over the years seem to have been overlooked in MCP’s rapid development. “Protect your data, it is what’s unique to you!” was the battle-cry of every tech-oriented person for most of my career. I’m old enough to remember when Cambridge Analytica (mainly because it wasn’t that long ago..) was raked over the coals because it weaponised our social media data, and now a few years later we seem quite content with the idea of letting large VC-funded organisations slurp up mountains of our private data to train their word guessers.&lt;/p&gt;

&lt;p&gt;I’m being as glib as I always am on this blog with the last paragraph, but in all seriousness, MCP has a problem - you want to get your data into an LLM, but you don’t want everyone else to be able to see it.&lt;/p&gt;

&lt;h2 id=&quot;a-quick-history-of-the-mcp-evolution&quot;&gt;A quick history of the MCP evolution&lt;/h2&gt;

&lt;p&gt;At its core, MCP is a funnel. How do I get local information - or information that isn’t crawlable on the public internet - into an LLM so it can analyse it? Anthropic defined a spec that will help you do just that, and in its infancy, the trick was simple - run a local server that speaks JSON so that a local client can call it and pipe the results into the LLM.&lt;/p&gt;

&lt;p&gt;The initial version of the spec seemed destined for local connectivity. The servers were designed to be accessed over &lt;a href=&quot;https://modelcontextprotocol.io/specification/2025-03-26/basic/transports#stdio&quot;&gt;stdio&lt;/a&gt;, which you’d run locally and then connect to with an MCP client, like Claude Desktop. Stdio is just “standard input and output streams” and isn’t at all designed to be called remotely, so generally your data is pretty safe. It’s really not easy to intentionally expose your data to the big scary world.&lt;/p&gt;
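
&lt;p&gt;To make that concrete, here’s a minimal Python sketch of the framing, assuming the newline-delimited JSON-RPC messages that the stdio section of the spec describes. The helper names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;encode_message&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;decode_message&lt;/code&gt; are made up for illustration, and a real server does a lot more than this (initialisation, capability negotiation and so on).&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import json


def encode_message(message):
    # One JSON-RPC message per line: json.dumps never emits a raw newline,
    # so the trailing '\n' is the only delimiter on the stream.
    return json.dumps(message, separators=(',', ':')) + '\n'


def decode_message(line):
    # Parse a single newline-delimited JSON-RPC message.
    return json.loads(line)


if __name__ == '__main__':
    # 'tools/list' is the MCP method a client uses to ask for available tools.
    request = {'jsonrpc': '2.0', 'id': 1, 'method': 'tools/list'}
    print(encode_message(request), end='')
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The client writes a line like that to the server’s stdin and reads replies from its stdout, which is why nothing here ever needs to listen on a network port.&lt;/p&gt;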

&lt;p&gt;Running a stdio MCP server was pretty straightforward. Most of them are written in Typescript or Python, and you’d just add the command you’d normally run to your MCP client’s config and it’d execute it on startup - easy.&lt;/p&gt;

&lt;p&gt;In Claude Desktop for example, here’s how you’d run a filesystem MCP so Claude can analyse your local filesystem:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;mcpServers&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;filesystem&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;command&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;npx&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;args&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;-y&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;@modelcontextprotocol/server-filesystem&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;/Users/username/Desktop&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;/Users/username/Downloads&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Pretty straightforward if you can, you know - easily write JSON and understand what the hell &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;npx&lt;/code&gt; is. Obviously for non-technical users, this looks like a magical incantation, but that’s okay, MCP is early.&lt;/p&gt;

&lt;p&gt;As things have progressed (very very quickly, I might add) it’s become obvious to people that eventually, you want to be able to run these MCP servers somewhere else other than your local machine. Various companies have sprung up around this, with varying degrees of success and questionable security tactics.&lt;/p&gt;

&lt;p&gt;So the spec evolved to meet these needs, and the next iteration introduced a mechanism to access MCP servers remotely, which became &lt;a href=&quot;https://modelcontextprotocol.io/docs/concepts/transports#server-sent-events-sse&quot;&gt;Server-Sent Events&lt;/a&gt; (SSE).&lt;/p&gt;

&lt;p&gt;You can see on this page there are suggestions for how to make sure this doesn’t go badly for you:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Security Warning: DNS Rebinding Attacks
SSE transports can be vulnerable to DNS rebinding attacks if not properly secured. To prevent this:&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;ul&gt;
    &lt;li&gt;Always validate Origin headers on incoming SSE connections to ensure they come from expected sources&lt;/li&gt;
    &lt;li&gt;Avoid binding servers to all network interfaces (0.0.0.0) when running locally - bind only to localhost (127.0.0.1) instead&lt;/li&gt;
    &lt;li&gt;Implement proper authentication for all SSE connections&lt;/li&gt;
    &lt;li&gt;Without these protections, attackers could use DNS rebinding to interact with local MCP servers from remote websites.&lt;/li&gt;
  &lt;/ul&gt;
&lt;/blockquote&gt;
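
&lt;p&gt;As a rough illustration of the first two bullets, here’s a Python sketch using only the standard library. The allow-list contents, the port and the decision to let requests without an Origin header through are all assumptions of mine, not something the spec prescribes, and it deliberately skips the authentication bullet.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical allow-list; replace with the origins you actually expect.
ALLOWED_ORIGINS = {'http://localhost:3000'}


def origin_allowed(origin):
    # Assumption: a missing Origin means a non-browser client, so allow it.
    # Any Origin that is present must be on the allow-list.
    return origin is None or origin in ALLOWED_ORIGINS


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Reject requests whose Origin header is not one we expect.
        if not origin_allowed(self.headers.get('Origin')):
            self.send_error(403, 'Origin not allowed')
            return
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain')
        self.end_headers()
        self.wfile.write(b'ok\n')


if __name__ == '__main__':
    # Bind to loopback only, never 0.0.0.0.
    HTTPServer(('127.0.0.1', 8080), Handler).serve_forever()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;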

&lt;p&gt;SSE started to take off despite these warnings on the server side, but due to the pace of innovation, clients were surprisingly slow to introduce support for this. At the time of writing, I still can’t find a client that is broadly used that will allow you to connect easily to an SSE server.&lt;/p&gt;

&lt;p&gt;So we started to see a new type of tool appear - the proxy. It would proxy requests from stdio clients into SSE events, allowing people to run those SSE servers remotely.&lt;/p&gt;

&lt;p&gt;Finally, the most recent innovation has been to completely rework SSE (after only 5 months, by my count!) and switch to &lt;a href=&quot;https://modelcontextprotocol.io/specification/2025-03-26/basic/transports#streamable-http&quot;&gt;Streamable HTTP&lt;/a&gt; as an alternative, which might just set the land speed record for a protocol deprecation, but still, an evolution nonetheless.&lt;/p&gt;

&lt;h2 id=&quot;and-yet-theres-still-a-problem&quot;&gt;And yet, there’s still a problem&lt;/h2&gt;

&lt;p&gt;As all this change has happened, I’ve been watching it and thinking to myself “well, okay, but this is still a security nightmare”. There are obviously things happening in the space that are changing here, and there’s &lt;em&gt;intent&lt;/em&gt; to fix it, but take another look at the &lt;a href=&quot;https://modelcontextprotocol.io/specification/2025-03-26/basic/transports#security-warning&quot;&gt;streamable HTTP&lt;/a&gt; warnings and spec - there is absolutely no written document at the time of writing this blog post to introduce any sort of authentication to the protocol. It &lt;em&gt;has&lt;/em&gt; been &lt;a href=&quot;https://modelcontextprotocol.io/specification/draft/basic/authorization&quot;&gt;drafted&lt;/a&gt; and is following a fairly typical pattern of late - “let’s slap some oauth on top of it and call it good”.&lt;/p&gt;

&lt;p&gt;Personally, that doesn’t make me particularly happy. Firstly, I think oauth is really confusing and easy to screw up, and secondly, you still have a &lt;em&gt;communication&lt;/em&gt; problem. I don’t care how much authentication you put on top of something, if there’s particularly sensitive data behind something, I still don’t want it hanging around on the internet. If you &lt;em&gt;do&lt;/em&gt; feel okay with this, I presume all your databases are on the internet as well, but it’s okay because they have passwords on them. Right?&lt;/p&gt;

&lt;p&gt;So, this got me thinking. I work for Tailscale, Tailscale’s really quite good at protecting your data and connecting things together. How can we improve this situation?&lt;/p&gt;

&lt;h2 id=&quot;my-first-mcp-server&quot;&gt;My first MCP server&lt;/h2&gt;

&lt;p&gt;As things were evolving, &lt;a href=&quot;https://github.com/jaxxstorm/tailscale-mcp&quot;&gt;I wrote a little MCP server for the Tailscale API&lt;/a&gt; that allows you to query a few things there, primarily to try and understand the protocol and see what useful things we could do. I stuck with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stdio&lt;/code&gt; and then introduced SSE, but was pretty unhappy with how things looked at that point, so left it at that.&lt;/p&gt;

&lt;p&gt;However, when Streamable HTTP was published a few weeks ago, I realised there was an opportunity here to think differently about the security model to make it a little more robust and considerably more &lt;em&gt;private&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;You can get some of the benefits of Tailscale with Streamable HTTP by just having Tailscale on both ends of the equation. Install Tailscale on your local machine, spin up another machine somewhere with Tailscale on it and run a Streamable HTTP server there, then configure your MCP client to run an MCP proxy (of which there are many, such as &lt;a href=&quot;https://github.com/sparfenyuk/mcp-proxy&quot;&gt;sparfenyuk/mcp-proxy&lt;/a&gt; or if you prefer Typescript, &lt;a href=&quot;https://github.com/punkpeye/mcp-proxy&quot;&gt;punkpeye/mcp-proxy&lt;/a&gt;). When you configure the proxy, set your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;endpoint&lt;/code&gt; to the remote Tailscale address and the port/url your MCP server is listening on and you’re golden.&lt;/p&gt;

&lt;p&gt;This is a massive improvement over the “run it on the internet with oauth” model because now you don’t have to run the damn thing on the public internet, which should be fairly obvious. I was going to go with this approach, but then I had an idea - what if we could use Tailscale’s application awareness to improve the security model &lt;em&gt;even more&lt;/em&gt;?&lt;/p&gt;

&lt;p&gt;So I did two things:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;I wrote a little MCP proxy in Go that forwards Tailscale’s headers like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;X-Tailscale-User&lt;/code&gt; to the remote HTTP MCP server&lt;/li&gt;
  &lt;li&gt;Then I updated my Tailscale MCP server to support reading Tailscale’s grants mechanism to determine &lt;em&gt;what that Tailscale user is allowed to do&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;
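
&lt;p&gt;The proxy itself is written in Go, so what follows is not its code - just a Python sketch of the general shape of the first step: take a JSON-RPC line from the stdio side and turn it into an HTTP POST that carries an identity header. The function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build_request&lt;/code&gt; is hypothetical, and where the user value comes from is deliberately left as a plain parameter.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import json
import urllib.request


def build_request(server_url, line, user):
    # Parse the stdin line first so malformed JSON fails here, not remotely.
    payload = json.loads(line)
    return urllib.request.Request(
        server_url,
        data=json.dumps(payload).encode('utf-8'),
        headers={
            'Content-Type': 'application/json',
            'Accept': 'application/json, text/event-stream',
            # Header name taken from the post; obtaining the value is out of scope.
            'X-Tailscale-User': user,
        },
        method='POST',
    )


if __name__ == '__main__':
    line = json.dumps({'jsonrpc': '2.0', 'id': 1, 'method': 'tools/list'})
    req = build_request('http://ts-mcp:8080/mcp', line, 'mail@lbrlabs.com')
    # Only inspect the request here; sending it needs a reachable server.
    print(req.get_method(), req.full_url)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The Accept header lists both JSON and event streams because a Streamable HTTP server can answer with either.&lt;/p&gt;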

&lt;h2 id=&quot;a-proper-security-model-in-action&quot;&gt;A proper security model in action&lt;/h2&gt;

&lt;p&gt;So how does this look? Well, I can run my Tailscale MCP server on a remote machine. I spun up a VM in DigitalOcean and fired it up:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;curl &lt;span class=&quot;nt&quot;&gt;-L&lt;/span&gt; https://github.com/jaxxstorm/tailscale-mcp/releases/download/v0.0.3/tailscale-mcp-v0.0.3-linux-amd64.tar.gz | &lt;span class=&quot;nb&quot;&gt;tar&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-xzf&lt;/span&gt; -
&lt;span class=&quot;nv&quot;&gt;TS_AUTHKEY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&amp;lt;your-auth-key&amp;gt; ./tailscale-mcp &lt;span class=&quot;nt&quot;&gt;--tailnet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&amp;lt;your-tailnet&amp;gt; &lt;span class=&quot;nt&quot;&gt;--api-key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&amp;lt;tailscale-api-key&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And some output logs&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;4:48:37	INFO	tailscale-mcp/main.go:265	Starting ts-mcp	&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;version&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;0.0.3&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;tailnet&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;lbrlabs.com&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;hostname&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;ts-mcp&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;port&quot;&lt;/span&gt;: 8080, &lt;span class=&quot;s2&quot;&gt;&quot;debug&quot;&lt;/span&gt;: &lt;span class=&quot;nb&quot;&gt;false&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;stdio&quot;&lt;/span&gt;: &lt;span class=&quot;nb&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
2025/06/09 14:48:37 tsnet running state path /root/.config/tsnet-tailscale-mcp/tailscaled.state
2025/06/09 14:48:37 tsnet starting with &lt;span class=&quot;nb&quot;&gt;hostname&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;ts-mcp&quot;&lt;/span&gt;, varRoot &lt;span class=&quot;s2&quot;&gt;&quot;/root/.config/tsnet-tailscale-mcp&quot;&lt;/span&gt;
2025/06/09 14:48:37 Authkey is &lt;span class=&quot;nb&quot;&gt;set&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; but state is NoState. Ignoring authkey. Re-run with &lt;span class=&quot;nv&quot;&gt;TSNET_FORCE_LOGIN&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;1 to force use of authkey.
14:48:37	INFO	tailscale-mcp/main.go:558	Serving MCP via Tailscale	&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;address&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;:8080&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
14:48:37	INFO	tailscale-mcp/main.go:586	Serving MCP locally	&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;address&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;127.0.0.1:8080&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
2025/06/09 14:48:42 AuthLoop: state is Running&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now, I just need to configure my MCP client (in my case, Claude Desktop) to run my MCP proxy locally.&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;mcpServers&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;tailscale&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;command&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;/usr/local/bin/tailscale-mcp-proxy&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;args&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;--server&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;http://ts-mcp:8080/mcp&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I need Tailscale running on my machine so the two can communicate with each other and so the server can capture information about who I am.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;tailscale status
100.84.243.110  lbr-macbook-pro      mail@        macOS   -
100.72.57.77    lbr-iphone           mail@        iOS     offline
100.81.81.4     lon-derp1            tagged-devices linux   -
100.105.3.74    sea-derp1            tagged-devices linux   -
100.66.15.114   ts-mcp               mail@        linux   idle&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; offline, tx 52624 rx 72604
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now, I can fire up Claude Desktop and ask it questions about my Tailnet:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/img/claude-access-denied.png&quot; alt=&quot;Access denied&quot; /&gt;&lt;/p&gt;

&lt;p&gt;But wait, it’s telling me I don’t have permission? If we look at the MCP server’s logs, we can see I don’t have the right access to run the query:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;15:17:08	INFO	tailscale-mcp/main.go:159	No MCP capabilities found
15:17:08	WARN	tailscale-mcp/main.go:172	No MCP capabilities found	{&quot;user&quot;: &quot;mail@lbrlabs.com&quot;}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The reason for this is that my Tailscale MCP server uses Tailscale’s grants to determine which tools I’m allowed to call. So let’s add a grant to my Tailscale ACL to indicate I can call all tools:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;src&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;autogroup:admin&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;dst&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;100.66.15.114&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;the&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;address&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;my&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;Tailscale&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;MCP&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;I&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;could&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;also&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;use&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;tags&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;ip&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;tcp:8080&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;app&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;jaxxstorm.com/cap/mcp&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;tools&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;     &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;*&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;I&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;can&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;call&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;all&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;tools&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
            &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;resources&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;*&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;I&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;can&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;call&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;all&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;resources&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now, let’s try that query again!&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/img/claude-access-granted.png&quot; alt=&quot;Access granted&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Success!&lt;/p&gt;
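
&lt;p&gt;The check behind that can be pictured with a small Python sketch. It assumes the server ends up with the grant’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;app&lt;/code&gt; section as a dictionary keyed by capability name, and that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;*&lt;/code&gt; is the only wildcard; the tool names are invented for the example. The real server is written in Go and may well do this differently.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Capability name as it appears in the grant above.
CAP_NAME = 'jaxxstorm.com/cap/mcp'


def is_tool_allowed(capabilities, tool):
    # capabilities maps a capability name to its list of rule objects,
    # mirroring the 'app' section of the grant.
    for rule in capabilities.get(CAP_NAME, []):
        allowed = rule.get('tools', [])
        if '*' in allowed or tool in allowed:
            return True
    # No matching rule means deny by default.
    return False


if __name__ == '__main__':
    # 'list_devices' is a hypothetical tool name used only for this example.
    caps = {CAP_NAME: [{'tools': ['*'], 'resources': ['*']}]}
    print(is_tool_allowed(caps, 'list_devices'))
    print(is_tool_allowed({}, 'list_devices'))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Denying by default matches what the logs showed earlier: with no MCP capabilities on the connection, nothing gets called.&lt;/p&gt;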

&lt;h2 id=&quot;the-caveats&quot;&gt;The caveats&lt;/h2&gt;

&lt;p&gt;This is all well and good, but what are some of the considerations to this approach?&lt;/p&gt;

&lt;p&gt;As it stands, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tsnet&lt;/code&gt; is only really usable for Go servers, so you’d need to write your MCP server in Go, which is not officially supported right now. Most MCP servers are written in TypeScript or Python. To each their own.&lt;/p&gt;

&lt;div class=&quot;alert alert-info&quot; role=&quot;alert&quot;&gt;&lt;i class=&quot;fa fa-info-circle&quot;&gt;&lt;/i&gt; &lt;b&gt;Note:&lt;/b&gt; Tailscale &lt;em&gt;does&lt;/em&gt; have a proof of concept &lt;a href=&quot;https://github.com/tailscale/libtailscale&quot;&gt;C-based library&lt;/a&gt; which could be used for Python and TypeScript based MCP servers, and if you’re interested in using it, you should contact us at Tailscale by posting on &lt;a href=&quot;https://www.reddit.com/r/Tailscale/&quot;&gt;Reddit&lt;/a&gt;.&lt;/div&gt;

&lt;p&gt;The other side of this, of course, is that only local clients can implement these proxies. Anthropic supports calling remote MCP servers on its enterprise plans, but it expects them to be on the public internet. I would personally &lt;em&gt;love&lt;/em&gt; the idea of being able to connect from your Claude team to your Tailnet (imagine just giving Claude your Tailscale OAuth credentials and it provisions a private network for you for your MCP connections!) but I’ll need someone from Anthropic to implement that...&lt;/p&gt;

&lt;p&gt;I’m also personally a huge fan of this Tailscale application model of permissions and capabilities, but I suspect the first objection will be “we want these standards to be open, not have Tailscale in the middle of them!” which I totally get.&lt;/p&gt;

&lt;h2 id=&quot;the-code&quot;&gt;The code&lt;/h2&gt;

&lt;p&gt;Finally, and to close this post down, if you want to try any of this yourself, you can find all the code for the MCP server &lt;a href=&quot;https://github.com/jaxxstorm/tailscale-mcp&quot;&gt;here&lt;/a&gt; and the MCP proxy &lt;a href=&quot;https://github.com/jaxxstorm/tailscale-mcp-proxy&quot;&gt;here&lt;/a&gt; - I hope this inspires you to get &lt;em&gt;something&lt;/em&gt; important off the public internet!&lt;/p&gt;

&lt;h2 id=&quot;hypocrisy&quot;&gt;Hypocrisy&lt;/h2&gt;

&lt;p&gt;You’ll recall at the beginning of this post, I was lamenting the idea that we’re going to let LLMs hoover up all our data, and yet here I am enthusiastically writing MCP servers to make it easier.&lt;/p&gt;

&lt;p&gt;I suppose that’s the thing about technology - you can either participate in shaping how it develops, or you can stand on the sidelines complaining about how everyone else is doing it wrong. I’ve chosen to participate, even if it means being part of a system I have mixed feelings about.&lt;/p&gt;

&lt;p&gt;The best I can do is try to build the secure version of what’s inevitably going to happen anyway. Maybe that makes me a hypocrite, but at least I’m a hypocrite who’s thinking about protecting your data from the scary internet, so that &lt;em&gt;only&lt;/em&gt; the LLM can access it. That makes it better…right?&lt;/p&gt;

</description>
                    <pubDate>Sun, 08 Jun 2025 00:00:00 +0000</pubDate>
                    <link>https://leebriggs.co.uk/blog/2025/06/08/secure-mcp-connectivity.html</link>
                    <guid isPermaLink="true">https://leebriggs.co.uk/blog/2025/06/08/secure-mcp-connectivity.html</guid>
                </item>
            
		
            
                <item>
                    <title>The Death of Developer Relations</title>
                    
                        <description>&lt;p&gt;Every year, I gear up for “conference season,” which includes KubeCon NA (typically held between mid-October and mid-November) and AWS re:Invent, always the week after Thanksgiving in the US. As a sales engineer, this time of year is exhilarating. It’s a chance to speak with customers, prospects, and technology leaders in the ever-evolving cloud-native and cloud computing spaces.&lt;/p&gt;

&lt;p&gt;While I usually leave these events motivated and energized, something this year was different — marked by a noticeable absence. I could count on one hand the number of conversations I had with anyone in Developer Relations (DevRel) - whether they were community builders, developer advocates, or part of any similar role.&lt;/p&gt;

&lt;p&gt;Admittedly, some of this could be circumstantial. For instance, KubeCon NA was held in Salt Lake City this year, a location that understandably kept some attendees away. But for me, it felt like a reflection of broader, more fundamental shifts happening in the tech industry.&lt;/p&gt;

&lt;h1 id=&quot;what-is-devrel&quot;&gt;What is DevRel?&lt;/h1&gt;

&lt;p&gt;DevRel has always been a role that defies simple definition. It’s not marketing, but it overlaps with marketing. It’s not sales, but it supports sales. It’s not customer success, but it builds relationships with users. And it’s certainly not engineering, though it requires technical skills.&lt;/p&gt;

&lt;p&gt;It demands a rare combination of traits: technical aptitude, charisma, and customer empathy. Having worked in a DevRel role for six months, I can attest - it’s one of the hardest jobs I’ve ever had.&lt;/p&gt;

&lt;p&gt;When done well, DevRel can be transformative. Companies like HashiCorp, Stripe, and Twilio have built exceptional brands in part due to their effective DevRel strategies. But the very ambiguity that has been DevRel’s strength is now starting to feel like a liability.&lt;/p&gt;

&lt;h1 id=&quot;if-you-cant-measure-it-you-cant-manage-it&quot;&gt;If You Can’t Measure It, You Can’t Manage It&lt;/h1&gt;

&lt;p&gt;During my brief stint in DevRel, I often asked my manager, “What does good performance look like in this role?” The answer, more often than not, was fuzzy.&lt;/p&gt;

&lt;p&gt;For example, I spent hours answering questions in community Slack channels. It felt impactful, but we couldn’t quantify how it moved the needle on metrics like sales pipeline or customer retention. I gave virtual conference talks, but we struggled to tie those efforts to lead generation or brand impact.&lt;/p&gt;

&lt;p&gt;I don’t think this is unique to my experience. Many DevRel professionals resist being measured by traditional KPIs, arguing their work is about “building community,” “fostering trust,” or “engaging developers.” While those outcomes are valuable, they’re hard to quantify in ways a CFO or board member cares about.&lt;/p&gt;

&lt;p&gt;Then, external factors came along and forced everyone to take a harder look at roles like DevRel.&lt;/p&gt;

&lt;h1 id=&quot;the-vc-money-printer&quot;&gt;The VC Money Printer&lt;/h1&gt;

&lt;p&gt;For much of the 2010s, startups existed in a world of near-limitless capital. Low interest rates made borrowing cheap, and VCs funded companies generously, hoping to catch the next unicorn. In this environment, initiatives without immediate ROI—like DevRel—had room to thrive.&lt;/p&gt;

&lt;p&gt;DevRel was a long-term investment: build trust, foster adoption, grow communities, and the payoff will come eventually. It worked as long as companies didn’t have to justify those investments on a quarterly basis.&lt;/p&gt;

&lt;p&gt;But the money printer stopped. Rising interest rates and tightening VC funding forced startups to prioritize sustainability over growth at all costs. Every team came under scrutiny, and functions that didn’t directly contribute to revenue found themselves in jeopardy.&lt;/p&gt;

&lt;p&gt;DevRel, for all its intangible benefits, became a glaring target.&lt;/p&gt;

&lt;h1 id=&quot;devrel-as-a-cost-center&quot;&gt;DevRel as a Cost Center&lt;/h1&gt;

&lt;p&gt;In tough times, businesses ask hard questions: “Which teams are directly contributing to the bottom line?” DevRel often struggled to answer.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Sponsoring conferences? Expensive, with unclear ROI.&lt;/li&gt;
  &lt;li&gt;Running Slack communities? Valuable, but hard to tie directly to revenue.&lt;/li&gt;
  &lt;li&gt;Conference talks? Great for brand awareness, but hard to tie to lead generation.&lt;/li&gt;
  &lt;li&gt;Social media posts? Effective at generating leads if done from company accounts, but very hard to measure from personal accounts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By contrast, roles like sales, marketing, and engineering have clear outputs tied to outcomes. Sales closes deals. Marketing generates leads. Engineering ships features. DevRel, sitting somewhere between these disciplines, risks being seen as neither fish nor fowl—doing a little of everything but owning none of it.&lt;/p&gt;

&lt;h1 id=&quot;product-led-growth&quot;&gt;Product Led Growth&lt;/h1&gt;

&lt;p&gt;Compounding DevRel’s challenges is the rise of Product-Led Growth (PLG). In a PLG model, the product sells itself through intuitive design, freemium tiers, and frictionless onboarding. Developers don’t need a community advocate or a conference talk—they sign up, try the product, and decide for themselves.&lt;/p&gt;

&lt;p&gt;Now, I have strong opinions on PLG—freemium tiers are becoming harder to sustain as the cost of compute rises. But the theory remains sound. Companies like my current employer, Tailscale, have nailed this approach. You can grow your user base organically without the overhead of a dedicated DevRel team.&lt;/p&gt;

&lt;p&gt;This isn’t to say DevRel is useless in a PLG world, but it does mean the traditional role—focused on advocacy and education—may no longer be as critical.&lt;/p&gt;

&lt;h1 id=&quot;the-future-of-devrel&quot;&gt;The Future of DevRel&lt;/h1&gt;

&lt;p&gt;Despite the title of this post, I don’t think DevRel is truly dead. The need for technical advocacy and community building hasn’t disappeared - it’s just evolving. Many of the developer relations engineers who have survived this change have embraced the shift into marketing, leaning on their expertise as content generators to create documentation and tutorials, write blog posts and produce video content for YouTube. At Tailscale, we often see incredible engagement with our content team and wonderful feedback from our YouTube channel. What’s been so obvious about the success here is that these are measurable outcomes, and the people in these roles have embraced the challenges around having measurable goals.&lt;/p&gt;

&lt;p&gt;Content generation is only one part of the DevRel flywheel, and I can see a world where we have specific personas tied to different parts of the business, each focused on the parts of a DevRel job that its line of business cares about, such as:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Community Solutions Engineer&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A blend of solutions engineering and DevRel, focused on driving adoption at the top of the funnel while contributing to measurable ARR metrics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Community Customer Success&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ensuring existing customers are successful, reducing churn, and driving expansion revenue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Community Data Engineer&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Using analytics to tie activities to outcomes like product adoption, retention, or revenue growth.&lt;/p&gt;

&lt;p&gt;The common thread? Measurability. The days of vague metrics like “community engagement” are over.&lt;/p&gt;

&lt;h1 id=&quot;is-this-really-the-end&quot;&gt;Is This Really the End?&lt;/h1&gt;

&lt;p&gt;What we’re witnessing isn’t the death of DevRel—it’s a wake-up call. The traditional model, reliant on soft metrics and long-term value propositions, can’t survive unchanged.&lt;/p&gt;

&lt;p&gt;The good news is that there’s still a place for DevRel. Professionals who adapt—aligning their work with sales, customer success, or product teams—will continue to thrive. Those who resist change may not.&lt;/p&gt;

&lt;p&gt;The industry has shifted. The question is: will DevRel shift with it?&lt;/p&gt;
</description>
                    <pubDate>Tue, 10 Dec 2024 00:00:00 +0000</pubDate>
                    <link>https://leebriggs.co.uk/blog/2024/12/10/the-death-of-devrel.html</link>
                    <guid isPermaLink="true">https://leebriggs.co.uk/blog/2024/12/10/the-death-of-devrel.html</guid>
                </item>
            
		
            
                <item>
                    <title>Why the hell is your Kubernetes API public?</title>
                    
                        <description>&lt;p&gt;Do you ever really think about how you get access to your Kubernetes control plane? Whatever mechanism you use to provision your cluster, you get a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;KUBECONFIG&lt;/code&gt; and usually just go on your merry way to overcomplicating your infrastructure.&lt;/p&gt;

&lt;p&gt;However, if you’ve ever looked at your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;KUBECONFIG&lt;/code&gt; you’ll see you have a server address.&lt;/p&gt;

&lt;p&gt;You can check the health of your cluster by doing the following:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;curl &lt;span class=&quot;nt&quot;&gt;-k&lt;/span&gt; &lt;span class=&quot;si&quot;&gt;$(&lt;/span&gt;kubectl config view &lt;span class=&quot;nt&quot;&gt;--output&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;jsonpath&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;'{.clusters[*].cluster.server}'&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;)&lt;/span&gt;/healthz
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Assuming everything is working as expected (and if it isn’t, you should probably stop reading and go figure out what), it should return &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ok&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Has it ever come to your attention that this &lt;em&gt;just works&lt;/em&gt; from a networking perspective? More than likely you didn’t have to connect to a VPN, or SSH into a bastion/jump host.&lt;/p&gt;

&lt;p&gt;How did I know that? Well, because the vast majority of Kubernetes clusters are just hanging out on the public internet, without a care in the world.&lt;/p&gt;

&lt;p&gt;If you search &lt;a href=&quot;https://www.shodan.io/search?query=kubernetes&quot;&gt;shodan.io for Kubernetes clusters&lt;/a&gt; you’ll see there are almost 1.4 million clusters, readily accessible and open to the scary, scary world.&lt;/p&gt;

&lt;p&gt;For reasons I can’t quite understand, we’ve sort of collectively decided that it’s okay to put our Kubernetes control planes on the public internet. At the very least, we’ve sort of decided it’s okay to give them a public IP address - sure, you might add some security groups or firewall rules that only allow specific IP addresses, but the control plane is still accessible from the internet.&lt;/p&gt;

&lt;p&gt;To put this into some sort of perspective, how many of the people reading this get a cold shudder when they think about putting their database on the public internet? Or a Windows server with RDP? Or a Linux server with SSH?&lt;/p&gt;

&lt;p&gt;Established practices say this is generally not a good idea, and yet in order to make our lives easier, we’ve decided that it’s &lt;em&gt;okay&lt;/em&gt; to give every person and their dog free access to try and make our clusters theirs.&lt;/p&gt;
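
&lt;p&gt;If you want a rough idea of which camp your own cluster falls into, you can resolve the server hostname from your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;KUBECONFIG&lt;/code&gt; and check whether the address sits in a private (RFC 1918) range. Here’s a minimal sketch of that check. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;is_private_ipv4&lt;/code&gt; helper is a made-up name for illustration, it only understands dotted-quad IPv4 addresses, and the sample addresses are placeholders rather than real clusters:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
```shell
# is_private_ipv4 is a hypothetical helper, not part of kubectl or Tailscale.
# It succeeds (exit 0) when a dotted-quad IPv4 address falls inside one of
# the RFC 1918 private ranges, and fails (exit 1) otherwise.
is_private_ipv4() {
  case "$1" in
    10.*) return 0 ;;                                  # 10.0.0.0/8
    192.168.*) return 0 ;;                             # 192.168.0.0/16
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;; # 172.16.0.0/12
    *) return 1 ;;
  esac
}

# Placeholder addresses: two typical VPC addresses and one public resolver.
for ip in 10.0.12.7 172.20.1.5 8.8.8.8; do
  if is_private_ipv4 "$ip"; then
    echo "$ip is private"
  else
    echo "$ip is public"
  fi
done
```
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A public address doesn’t prove the API is wide open (a security group may still restrict it), but a private one does mean you can’t reach it without being on the right network first.&lt;/p&gt;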

&lt;h2 id=&quot;private-networks&quot;&gt;Private Networks&lt;/h2&gt;

&lt;p&gt;As soon as you put something in a private subnet in the cloud, you add a layer of complexity to the act of actually using it.
You have to explain to everyone who needs access to it, including that developer who’s just joined the team, where it is and how to use it. You might use an SSH tunnel, or a bastion instance, or god forbid that VPN server someone set up years ago that nobody dares touch.&lt;/p&gt;

&lt;p&gt;We sort of accept these for things like databases because we very rarely need to get into them except in case of emergency, and we think it’s okay to have to route through something else because the data in them is important enough to protect.&lt;/p&gt;

&lt;p&gt;In addition to this, when cloud providers started offering managed Kubernetes services, most of them didn’t even &lt;em&gt;have&lt;/em&gt; private control planes. It was only in the last few years that they started offering this as a feature, and even then, it’s not the default.&lt;/p&gt;

&lt;p&gt;So the practice of putting a very important API on the internet has proliferated because it’s just &lt;em&gt;easier&lt;/em&gt; to do it that way. The real concern here is that we’re one severe vulnerability from having a bitcoin miner on every Kubernetes cluster on the internet.&lt;/p&gt;

&lt;h2 id=&quot;an-alternative&quot;&gt;An alternative&lt;/h2&gt;

&lt;p&gt;I &lt;a href=&quot;/blog/2024/02/26/cheap-kubernetes-loadbalancers.html&quot;&gt;previously&lt;/a&gt; wrote about the &lt;a href=&quot;https://tailscale.com/kb/1236/kubernetes-operator&quot;&gt;Tailscale Kubernetes Operator&lt;/a&gt; and its ability to expose services running inside your Kubernetes cluster to your Tailnet, but it has another amazing feature:&lt;/p&gt;

&lt;p&gt;It can act as a Kubernetes proxy for your cluster.&lt;/p&gt;

&lt;h2 id=&quot;say-what-now&quot;&gt;Say what now?&lt;/h2&gt;

&lt;p&gt;Well, let’s say you provision a Kubernetes cluster in AWS. You decide that in order to give yourself another layer of protection, you’re going to make sure the control plane is only accessible within the VPC.&lt;/p&gt;

&lt;p&gt;If you install the Tailscale Kubernetes operator and set the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;apiServerProxyConfig&lt;/code&gt; flag, it’ll create a device in your Tailnet that makes the API server accessible to anyone on the Tailnet. This means before you’re able to use the cluster, you need to be connected to the Tailnet. All of that pain I mentioned previously with bastion hosts and networking just vanishes into thin air.&lt;/p&gt;

&lt;p&gt;Let’s take it for a spin!&lt;/p&gt;

&lt;h2 id=&quot;installing-and-using-the-tailscale-operator&quot;&gt;Installing and using the Tailscale Operator&lt;/h2&gt;

&lt;h3 id=&quot;prerequisites&quot;&gt;Prerequisites&lt;/h3&gt;

&lt;p&gt;You’ll need to have your own Tailnet, and be connected to it. You can &lt;a href=&quot;https://login.tailscale.com/start?utm=leebriggs.co.uk&quot;&gt;sign up for Tailscale&lt;/a&gt; and it’s free for personal use.&lt;/p&gt;

&lt;p&gt;Once that’s done, you’ll need to make a slight change to your ACL. The Kubernetes operator uses Tailscale’s tagging mechanism so let’s create a tag for the operator to use, and then a tag for it to give to client devices it registers:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nl&quot;&gt;&quot;tagOwners&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
		&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;tag:k8s-operator&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;           &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;autogroup:admin&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;allow&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;anyone&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;admin&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;own&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;the&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;s-operator&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;tag&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
		&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;tag:k8s&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;                    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;tag:k8s-operator&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then, register an OAuth client with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Devices&lt;/code&gt; write scope and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tag:k8s-operator&lt;/code&gt; tag.&lt;/p&gt;

&lt;h3 id=&quot;step-1-get-a-kubernetes-cluster&quot;&gt;Step 1: Get a Kubernetes Cluster&lt;/h3&gt;

&lt;p&gt;There are a lot of ways to do this; choose the way you prefer. If you’re a fan of eksctl, &lt;a href=&quot;https://eksctl.io/usage/eks-private-cluster/&quot;&gt;this page&lt;/a&gt; shows you how to create a fully private cluster.&lt;/p&gt;

&lt;p&gt;You can of course use the defaults and try this out with a public cluster if you like, but I’m going to assume you’re doing this because you want to make your cluster private.&lt;/p&gt;

&lt;p&gt;You may have to do this from inside your actual VPC, because remember, any post-install steps that interact with the Kubernetes API server won’t work from outside it. I leverage a &lt;a href=&quot;https://tailscale.com/kb/1019/subnets&quot;&gt;Tailscale Subnet Router&lt;/a&gt; to make this easier, more on this later.&lt;/p&gt;

&lt;h3 id=&quot;step-2-install-the-tailscale-kubernetes-operator&quot;&gt;Step 2: Install the Tailscale Kubernetes Operator&lt;/h3&gt;

&lt;p&gt;The easiest way to do this is with &lt;a href=&quot;https://tailscale.com/kb/1236/kubernetes-operator#installation&quot;&gt;Helm&lt;/a&gt;. You’ll need to add the Tailscale Helm repository:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;helm repo add tailscale https://pkgs.tailscale.com/helmcharts
helm repo update
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then install the operator with your OAuth keys from the prerequisites step, and enable the proxy:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;helm upgrade &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--install&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  tailscale-operator &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  tailscale/tailscale-operator &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--namespace&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;tailscale &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--create-namespace&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--set-string&lt;/span&gt; oauth.clientId&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;OAUTH_CLIENT_ID&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--set-string&lt;/span&gt; oauth.clientSecret&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;OAUTH_CLIENT_SECRET&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--set-string&lt;/span&gt; apiServerProxyConfig.mode&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;true&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--wait&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now, if you look at your Tailscale dashboard, or use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tailscale status&lt;/code&gt; you should see a couple of new devices - the operator and a service just for the API server proxy.&lt;/p&gt;

&lt;p&gt;You can now access your Kubernetes cluster from anywhere that has a Tailscale client installed, no faffing required.&lt;/p&gt;

&lt;h3 id=&quot;step-3-use-it&quot;&gt;Step 3: Use it&lt;/h3&gt;

&lt;p&gt;You’ll need to configure your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;KUBECONFIG&lt;/code&gt; using the following command:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;tailscale configure kubeconfig &amp;lt;hostname&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This will set up your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;KUBECONFIG&lt;/code&gt; to use the Tailscale API server proxy as the server for your cluster. If you examine your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;KUBECONFIG&lt;/code&gt; you should see something like this:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;clusters&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;cluster&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;server&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;https://eks-operator-eu.tail5626a.ts.net&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;eks-operator-eu.tail5626a.ts.net&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;contexts&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;cluster&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;eks-operator-eu.tail5626a.ts.net&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;tailscale-auth&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;eks-operator-eu.tail5626a.ts.net&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;current-context&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;eks-operator-eu.tail5626a.ts.net&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Config&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;users&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;tailscale-auth&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;token&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;unused&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
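
&lt;p&gt;A quick way to sanity check the result is to pull the server value out of the file and confirm it points at a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ts.net&lt;/code&gt; hostname. This is a rough sketch, assuming a single-cluster kubeconfig like the one above. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kubeconfig_server&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;points_at_tailnet&lt;/code&gt; are hypothetical helper names, and the match is plain text rather than a real YAML parse:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
```shell
# kubeconfig_server and points_at_tailnet are hypothetical helpers.
# kubeconfig_server prints the first "server:" value found in the kubeconfig
# file passed as its argument (a plain text match, not a full YAML parse).
kubeconfig_server() {
  awk '$1 == "server:" { print $2; exit }' "$1"
}

# points_at_tailnet succeeds (exit 0) when that server is an https URL on a
# *.ts.net hostname, and fails (exit 1) for anything else.
points_at_tailnet() {
  case "$(kubeconfig_server "$1")" in
    https://*.ts.net|https://*.ts.net:*|https://*.ts.net/*) return 0 ;;
    *) return 1 ;;
  esac
}
```
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;points_at_tailnet ~/.kube/config&lt;/code&gt; (or whichever file your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;KUBECONFIG&lt;/code&gt; points to) exits with status 0 when the server is a Tailnet hostname.&lt;/p&gt;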

&lt;h2 id=&quot;hang-on-a-minute&quot;&gt;Hang on a minute..&lt;/h2&gt;

&lt;p&gt;You probably have some questions. Firstly, what’s that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unused&lt;/code&gt; token all about? What does &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;apiServerProxyConfig.mode&lt;/code&gt; mean in our operator installation? How does this work?&lt;/p&gt;

&lt;p&gt;Well, if you run a basic &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kubectl&lt;/code&gt; command now (and assuming you’re connected to your Tailnet) you’ll get something back, but it won’t help much:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl get nodes
error: You must be logged &lt;span class=&quot;k&quot;&gt;in &lt;/span&gt;to the server &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;Unauthorized&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;What’s happened here? Well, the good news is, we’ve been able to route to our private Kubernetes control plane, but we’re not sending any information back about who we are. So let’s make a small change to our Tailscale ACL in the Tailscale console. Add the following:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;w&quot;&gt;	&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;grants&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
		&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;src&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;autogroup:admin&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;allow&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;any&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;tailscale&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;admin&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;dst&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;tag:k8s-operator&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;contact&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;any&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;device&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;tagged&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;with&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;s-operator&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;app&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
				&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;tailscale.com/cap/kubernetes&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
					&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;impersonate&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
						&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;groups&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;system:masters&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;use&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;the&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;`system:masters`&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;group&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;the&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;cluster&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
					&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
				&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
		&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;By adding this grant, I’m assuming you’re an admin of your Tailnet.&lt;/p&gt;

&lt;p&gt;Now, give that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kubectl&lt;/code&gt; command another try.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl get nodes
NAME                                              STATUS   ROLES    AGE     VERSION
ip-172-18-183-129.eu-central-1.compute.internal   Ready    &amp;lt;none&amp;gt;   13h     v1.29.0-eks-5e0fdde
ip-172-18-187-244.eu-central-1.compute.internal   Ready    &amp;lt;none&amp;gt;   3d18h   v1.29.0-eks-5e0fdde
ip-172-18-42-205.eu-central-1.compute.internal    Ready    &amp;lt;none&amp;gt;   3d18h   v1.29.0-eks-5e0fdde
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That’s more like it.&lt;/p&gt;

&lt;p&gt;As you can see here, I’ve solved two distinct problems:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;I’ve made my public Kubernetes control plane accessible over a VPN, without needing to worry about routing and networking - Tailscale has handled it for me.&lt;/li&gt;
  &lt;li&gt;I’ve also been able to leverage Tailscale’s ACL mechanism to provide authentication to Kubernetes groups in a clusterrolebinding.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;noauth-mode&quot;&gt;noauth Mode&lt;/h3&gt;

&lt;p&gt;Now, if you’re already happy with your current authorization mode, you can still use Tailscale’s access mechanism to solve the routing problem. In this particular case, you’d install the Operator in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;noauth&lt;/code&gt; mode and then use your cloud provider’s existing mechanism to retrieve a token.&lt;/p&gt;

&lt;p&gt;Modify your Tailscale Operator installation like so:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;helm upgrade &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--install&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  tailscale-operator &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  tailscale/tailscale-operator &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--namespace&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;tailscale &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--create-namespace&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--set-string&lt;/span&gt; oauth.clientId&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;OAUTH_CLIENT_ID&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--set-string&lt;/span&gt; oauth.clientSecret&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;OAUTH_CLIENT_SECRET&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--set-string&lt;/span&gt; apiServerProxyConfig.mode&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;noauth&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--wait&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Once that’s installed, if you run your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kubectl&lt;/code&gt; command again, you’ll see a different error message:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl get nodes
Error from server (Forbidden): nodes is forbidden: User &quot;jaxxstorm@github&quot; cannot list resource &quot;nodes&quot; in API group &quot;&quot; at the cluster scope
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The reason for this is fairly obvious: the EKS cluster I’m using has absolutely no idea who &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jaxxstorm@github&lt;/code&gt; is - because it uses IAM to authenticate me to the cluster.&lt;/p&gt;

&lt;p&gt;So let’s modify our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;KUBECONFIG&lt;/code&gt; to retrieve a token as EKS expects. We’ll modify the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;user&lt;/code&gt; section to leverage an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;exec&lt;/code&gt; directive - it should look a little bit like this:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;clusters&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;cluster&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;server&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;https://eks-operator-eu.tail5626a.ts.net&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;eks-operator-eu.tail5626a.ts.net&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;contexts&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;cluster&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;eks-operator-eu.tail5626a.ts.net&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;tailscale-auth&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;eks-operator-eu.tail5626a.ts.net&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;current-context&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;eks-operator-eu.tail5626a.ts.net&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Config&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;users&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;tailscale-auth&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;exec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;client.authentication.k8s.io/v1beta1&quot;&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;command&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;aws&quot;&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;
        &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;eks&quot;&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;get-token&quot;&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;--cluster-name&quot;&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;kc-eu-central-682884c&quot;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# replace with your cluster name&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;KUBERNETES_EXEC_INFO&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;'&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{&quot;apiVersion&quot;:&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;client.authentication.k8s.io/v1beta1&quot;}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now we’re able to route to the Kubernetes control plane, and authenticate with it using the cloud provider’s own mechanism.&lt;/p&gt;
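
&lt;p&gt;If the new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;KUBECONFIG&lt;/code&gt; doesn’t work first time, it can help to run the credential command by hand before blaming Tailscale. This is a rough sketch, assuming the AWS CLI is installed with working credentials, and reusing the example cluster name from above (swap in your own):&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Show which IAM identity the AWS CLI is currently using
aws sts get-caller-identity

# Request a token the same way the exec block in the KUBECONFIG does
# (this prints a short-lived credential, so don't paste it anywhere public)
aws eks get-token --cluster-name kc-eu-central-682884c

# Then check that kubectl can reach the control plane through the operator
kubectl get nodes
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If the first two commands succeed but &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kubectl&lt;/code&gt; still gets a Forbidden error, the likely culprit is that the IAM identity isn’t mapped to anything inside the cluster, rather than a routing problem.&lt;/p&gt;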

&lt;h2 id=&quot;some-faqs&quot;&gt;Some FAQs&lt;/h2&gt;

&lt;h3 id=&quot;why-wouldnt-i-just-use-a-subnet-router&quot;&gt;Why wouldn’t I just use a Subnet Router?&lt;/h3&gt;

&lt;p&gt;A common question I get asked is why wouldn’t I just use a subnet router in the VPC to route anything to the private address of the control plane? I leveraged this mechanism when I installed the operator, because my control plane was initially unrouteable from the internet anyway.&lt;/p&gt;

&lt;p&gt;This is a legitimate solution to the problem, and if you don’t want to use the operator in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;auth&lt;/code&gt; mode, keep living your life. However, one benefit you get by installing the operator and talking directly to it via your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;KUBECONFIG&lt;/code&gt; is being able to use Tailscale’s ACLs to dictate who can actually communicate with the operator.&lt;/p&gt;

&lt;p&gt;If you recall our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;grant&lt;/code&gt; from earlier, we were able to dictate who was able to impersonate users inside the cluster in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;auth&lt;/code&gt; mode, but with Tailscale’s ACL system we can also be prescriptive about connectivity. Consider what happens if you remove the default, permissive ACL:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;err&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;action&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;accept&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;src&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;*&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;dst&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;*:*&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;commented&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;because&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;hujson&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then define a group of users who can access the cluster, and add a more explicit ACL:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nl&quot;&gt;&quot;groups&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
		&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;group:engineers&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;jaxxstorm@github&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;some&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;stuff&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;here&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;acls&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
		&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;Allow&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;all&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;connections.&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
		&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;action&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;accept&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;src&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;*&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;dst&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;*:*&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
		&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;action&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;accept&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;src&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;group:engineers&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;dst&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;tag:k8s-operator:443&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
		&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I can be much more granular about my access to my cluster, and have a Zero Trust model for my Kubernetes control plane at the &lt;em&gt;network&lt;/em&gt; level as well as the authorization level. Your information security team will &lt;em&gt;love&lt;/em&gt; you for this.&lt;/p&gt;
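
&lt;p&gt;One way to sanity-check an ACL like this is to probe the operator’s HTTPS port from two devices: one signed in as a member of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;group:engineers&lt;/code&gt;, and one that isn’t. This is only a sketch, assuming &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;nc&lt;/code&gt; (netcat) is installed and reusing the example hostname from my &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;KUBECONFIG&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# -z: only test whether the port accepts a connection, send no data
# -w 5: give up after 5 seconds
# From an engineer's device this should connect; from any other
# device on the Tailnet it should fail.
nc -vz -w 5 eks-operator-eu.tail5626a.ts.net 443
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;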

&lt;p&gt;When you provision the operator in the cluster, you can modify the tags it uses to segment your Tailnet even further; see the &lt;a href=&quot;https://github.com/tailscale/tailscale/blob/main/cmd/k8s-operator/deploy/chart/values.yaml#L23&quot;&gt;operator config in the Helm chart’s values.yaml&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Just remember to ensure your OAuth client has the correct permissions to manage those tags!&lt;/p&gt;

&lt;h3 id=&quot;can-i-scope-the-access-on-the-cluster-side&quot;&gt;Can I scope the access on the cluster side?&lt;/h3&gt;

&lt;p&gt;If you have the operator installed in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;auth&lt;/code&gt; mode, you can scope the access both at the network level (using the aforementioned tags) &lt;em&gt;and&lt;/em&gt; the Kubernetes RBAC system.&lt;/p&gt;

&lt;p&gt;Let’s say we want to give our aforementioned engineers group access to only a single namespace called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;demo&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;First, we’d create the ClusterRole (or Role) and then a RoleBinding in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;demo&lt;/code&gt; namespace:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;rbac.authorization.k8s.io/v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;ClusterRole&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;engineers-clusterrole&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;rules&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;apiGroups&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;apps&quot;&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;batch&quot;&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;extensions&quot;&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt; 
  &lt;span class=&quot;na&quot;&gt;resources&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;*&quot;&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;verbs&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;*&quot;&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;rbac.authorization.k8s.io/v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;RoleBinding&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;engineers-rolebinding&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;namespace&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;demo&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;subjects&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Group&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;engineers&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# note the group name here&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;apiGroup&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;rbac.authorization.k8s.io&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;roleRef&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;ClusterRole&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;engineers-clusterrole&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;apiGroup&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;rbac.authorization.k8s.io&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then update our Tailscale ACL to modify the grants:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nl&quot;&gt;&quot;grants&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
		&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;src&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;group:engineers&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;dst&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;tag:k8s-operator&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;app&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
				&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;tailscale.com/cap/kubernetes&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
					&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;impersonate&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
						&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;groups&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;engineers&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
					&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
				&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
			&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
		&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now, if I try to access the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;demo&lt;/code&gt; namespace, I can do the stuff I need to do:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl get pods &lt;span class=&quot;nt&quot;&gt;-n&lt;/span&gt; demo
NAME                                     READY   STATUS    RESTARTS   AGE
demo-streamer-e3a170e4-85f4f7b88-gppcn   1/1     Running   0          16h
demo-streamer-e3a170e4-85f4f7b88-xc7gz   1/1     Running   0          16h
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But not in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kube-system&lt;/code&gt; namespace:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl get pods -n kube-system
Error from server (Forbidden): pods is forbidden: User &quot;jaxxstorm@github&quot; cannot list resource &quot;pods&quot; in API group &quot;&quot; in the namespace &quot;kube-system&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
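
&lt;p&gt;If you still have an admin context for the cluster, you can also check the RBAC rules without switching users, using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kubectl&lt;/code&gt;’s impersonation flags. This is a small sketch reusing the user and group names from the examples above; the answers noted in the comments are what I’d expect from the RoleBinding shown earlier, assuming nothing else in your cluster binds that group:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Should answer &quot;yes&quot;: the RoleBinding covers the demo namespace
kubectl auth can-i list pods --namespace demo \
  --as jaxxstorm@github --as-group engineers

# Should answer &quot;no&quot;: nothing binds the group in kube-system
kubectl auth can-i list pods --namespace kube-system \
  --as jaxxstorm@github --as-group engineers
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This only works from a context that is itself allowed to impersonate users and groups, which cluster admins are.&lt;/p&gt;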

&lt;h2 id=&quot;wrap-up&quot;&gt;Wrap Up&lt;/h2&gt;

&lt;p&gt;As always with the posts I write on here, I’m writing in my personal capacity, but obviously I’m a Tailscale employee with a vested interest in the success of the company. Do I want you to sign up for Tailscale and pay us money? You bet I do. Do I want you to get your Kubernetes clusters off the public internet even if you &lt;em&gt;don’t&lt;/em&gt; want to sign up for Tailscale and pay us money? &lt;strong&gt;Yes&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I’ve always had an uneasy feeling about these public clusters, and I can’t help but feel we’re one RCE away from a disaster.&lt;/p&gt;

&lt;p&gt;So now you know how easy it is to get your Kubernetes control plane off the internet, what are you waiting for?&lt;/p&gt;
</description>
                    <pubDate>Sat, 23 Mar 2024 00:00:00 +0000</pubDate>
                    <link>https://leebriggs.co.uk/blog/2024/03/23/why-public-k8s-controlplane-copy.html</link>
                    <guid isPermaLink="true">https://leebriggs.co.uk/blog/2024/03/23/why-public-k8s-controlplane-copy.html</guid>
                </item>
            
		
            
                <item>
                    <title>Free Kubernetes Load Balancers with Tailscale</title>
                    
                        <description>&lt;p&gt;Load Balancers are expensive.&lt;/p&gt;

&lt;p&gt;If you’re using Kubernetes, they are also a necessity. Figuring out how to expose a Kubernetes workload to the world without a Load Balancer is a bit like trying to make a sandwich without bread. You can do it, but nobody is going to want to deal with it.&lt;/p&gt;

&lt;p&gt;If you’re a startup trying to get your project off the ground, firstly, why are you using Kubernetes? Stop. However, if that’s the way you’re going - you’re looking at spending at least $10 a month for a load balancer in the cloud before you take data transfer costs into account. If you’re a hobbyist running a small Kubernetes cluster in your home lab, you might read the MetalLB documentation and think “I didn’t realise I needed to have a &lt;a href=&quot;https://en.wikipedia.org/wiki/CCIE_Certification&quot;&gt;CCIE certification&lt;/a&gt; to make this work.”&lt;/p&gt;

&lt;p&gt;Well, guess what! You don’t need to anymore. Thanks to &lt;a href=&quot;https://tailscale.com&quot;&gt;Tailscale&lt;/a&gt; and its relatively new &lt;a href=&quot;https://tailscale.com/blog/kubernetes-operator&quot;&gt;Kubernetes Operator&lt;/a&gt;, you can now get access to Kubernetes workloads using native Service and Ingress objects without paying those greedy cloud providers a single cent.&lt;/p&gt;

&lt;div class=&quot;alert alert-info&quot; role=&quot;alert&quot;&gt;&lt;i class=&quot;fa fa-info-circle&quot;&gt;&lt;/i&gt; &lt;b&gt;Note:&lt;/b&gt; I recently joined Tailscale as a Solutions Engineer. Make of that what you will.&lt;/div&gt;

&lt;h2 id=&quot;a-cloud-agnostic-load-balancer&quot;&gt;A cloud agnostic load balancer&lt;/h2&gt;

&lt;p&gt;Kubernetes services work on every Kubernetes distribution, but if you want to expose that service to the world, you’ll likely need a service of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;type=LoadBalancer&lt;/code&gt;. Without rehashing Kubernetes networking concepts, this mechanism will ultimately create something with an address that is external to the Kubernetes cluster. Let’s take a look at a very simple example on a &lt;a href=&quot;https://digitalocean.com&quot;&gt;DigitalOcean&lt;/a&gt; Kubernetes cluster:&lt;/p&gt;

&lt;div class=&quot;alert alert-info&quot; role=&quot;alert&quot;&gt;&lt;i class=&quot;fa fa-info-circle&quot;&gt;&lt;/i&gt; &lt;b&gt;Note:&lt;/b&gt; I chose DigitalOcean for this example because it offers a free control plane, but this scenario works for any cloud provider’s Kubernetes offering.&lt;/div&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;apps/v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Deployment&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx-deployment&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;selector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;matchLabels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;replicas&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# tells deployment to run 2 pods matching the template&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;template&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;containers&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;image&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx:1.14.2&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;ports&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;containerPort&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;80&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Service&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;LoadBalancer&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ports&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;port&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;80&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;selector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here, we provision a standard Kubernetes deployment and service, and give the service &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;type=LoadBalancer&lt;/code&gt;. Inside the Kubernetes cluster, there is a cloud controller manager that watches for services of type &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LoadBalancer&lt;/code&gt; and then provisions a load balancer in the cloud provider’s infrastructure.&lt;/p&gt;

&lt;p&gt;You can see the result of this when the Load Balancer has provisioned:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;k get svc
NAME         TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
kubernetes   ClusterIP      10.245.0.1      &amp;lt;none&amp;gt;          443/TCP        9m5s
nginx        LoadBalancer   10.245.97.151   &amp;lt;redacted&amp;gt;   80:31554/TCP   2m54s
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If you visit the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXTERNAL-IP&lt;/code&gt; you’ll see your application, in this case, nginx.&lt;/p&gt;

&lt;p&gt;This is all well and good, but as mentioned earlier, this has provisioned an actual resource in DigitalOcean, a &lt;a href=&quot;https://www.digitalocean.com/products/load-balancer&quot;&gt;Load Balancer&lt;/a&gt;. This is going to cost me &lt;em&gt;at least&lt;/em&gt; $12 a month &lt;strong&gt;per node&lt;/strong&gt;, which effectively doubles the price of running compute.&lt;/p&gt;

&lt;p&gt;As a proud &lt;a href=&quot;https://en.wikipedia.org/wiki/Culture_of_Yorkshire#Traditions_and_stereotypes&quot;&gt;Yorkshireman&lt;/a&gt; the idea of paying for something I don’t need to is anathema to me. Until recently, I didn’t have a choice. I had to pay for a load balancer.&lt;/p&gt;

&lt;h2 id=&quot;tailscale-to-the-rescue&quot;&gt;Tailscale to the rescue&lt;/h2&gt;

&lt;p&gt;So let’s see what this could look like with Tailscale. We’ll install the Tailscale operator into our cluster, but first we need to knock out a few small steps.&lt;/p&gt;

&lt;div class=&quot;alert alert-info&quot; role=&quot;alert&quot;&gt;&lt;i class=&quot;fa fa-info-circle&quot;&gt;&lt;/i&gt; &lt;b&gt;Note:&lt;/b&gt; I am showcasing this with DigitalOcean Kubernetes, but this will work on ANY Kubernetes cluster, even your homelab. Try it!&lt;/div&gt;

&lt;h3 id=&quot;create-your-tailnet&quot;&gt;Create your Tailnet&lt;/h3&gt;

&lt;p&gt;The first step along this journey is to &lt;a href=&quot;https://login.tailscale.com/start?source=leebriggs.co.uk&quot;&gt;sign up for Tailscale&lt;/a&gt; and create a Tailnet. This is a network that your Kubernetes cluster will join, and allows anyone on the Tailnet to access the Kubernetes cluster.&lt;/p&gt;

&lt;p&gt;You should also &lt;a href=&quot;https://tailscale.com/download&quot;&gt;install the Tailscale client on your device of choice&lt;/a&gt;. If you already have a Tailnet, skip to the next section.&lt;/p&gt;

&lt;h3 id=&quot;modify-your-acl-file&quot;&gt;Modify your ACL file&lt;/h3&gt;

&lt;p&gt;Before we connect Tailscale to our cluster, we need to make a few changes to our Tailscale ACL to allow the Tailscale operator to correctly authorize the new devices it’ll create.&lt;/p&gt;

&lt;p&gt;Inside the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tagOwners&lt;/code&gt; section of your Tailnet, you should add the following stanzas:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&quot;tagOwners&quot;: {
    &quot;tag:k8s-operator&quot;: [],
    &quot;tag:k8s&quot;:          [&quot;tag:k8s-operator&quot;],
},
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This will allow the Tailscale operator to create new devices and assign them to the correct ACLs.&lt;/p&gt;

&lt;h3 id=&quot;create-an-oauth-client&quot;&gt;Create an OAuth client&lt;/h3&gt;

&lt;p&gt;Next, we need to create some credentials that the Tailscale operator will use to create devices. If you’re wondering why we need to do this, it’ll all make sense shortly. In the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Settings&lt;/code&gt; tab in the Tailscale console, navigate to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OAuth clients&lt;/code&gt; and create a new one, with the following settings:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/img/tailscale-operator-acl.png&quot; alt=&quot;tailscale-operator-acl&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Make a note of the credentials it returns, as you need them for the next step.&lt;/p&gt;

&lt;h3 id=&quot;install-the-tailscale-operator&quot;&gt;Install the Tailscale Operator&lt;/h3&gt;

&lt;p&gt;Now we have our credentials, we can install the operator. Tailscale handily provides us with a Helm Chart, so let’s go ahead and install it:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;helm repo add tailscale https://pkgs.tailscale.com/helmcharts &lt;span class=&quot;c&quot;&gt;# add the helm chart repo&lt;/span&gt;
helm repo update &lt;span class=&quot;c&quot;&gt;# update the repo&lt;/span&gt;
helm upgrade &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--install&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  tailscale-operator &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  tailscale/tailscale-operator &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--namespace&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;tailscale &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--create-namespace&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--set-string&lt;/span&gt; oauth.clientId&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;OAUTH_CLIENT_ID&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--set-string&lt;/span&gt; oauth.clientSecret&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;OAUTH_CLIENT_SECRET&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
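
&lt;p&gt;If you would rather not pass the credentials as command-line flags, the same two chart values can be supplied from a file instead. This is a minimal sketch, assuming a file called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;values.yaml&lt;/code&gt; (a hypothetical name) with placeholder values that you replace with the credentials from the previous step:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# values.yaml (hypothetical file name)
# These keys mirror the oauth.clientId and oauth.clientSecret values set with --set-string.
oauth:
  clientId: &quot;REPLACE_WITH_CLIENT_ID&quot;         # placeholder, not a real ID
  clientSecret: &quot;REPLACE_WITH_CLIENT_SECRET&quot; # placeholder, not a real secret
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You would then pass the file to Helm with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-f values.yaml&lt;/code&gt; in place of the two &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--set-string&lt;/code&gt; flags. Treat the file like any other secret and keep it out of version control.&lt;/p&gt;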

&lt;p&gt;Now we can make some magic happen - let’s see how we can connect to our nginx deployment without a cloud provider’s load balancer.&lt;/p&gt;

&lt;h3 id=&quot;create-a-loadbalancer&quot;&gt;Create a LoadBalancer&lt;/h3&gt;

&lt;p&gt;Let’s deploy an nginx service, but add a single line to the manifest - the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;loadBalancerClass&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;apps/v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Deployment&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx-deployment&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;selector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;matchLabels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;replicas&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# tells deployment to run 2 pods matching the template&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;template&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;containers&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;image&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx:1.14.2&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;ports&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;containerPort&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;80&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Service&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;loadBalancerClass&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;tailscale&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# add this!&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;LoadBalancer&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ports&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;port&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;80&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;selector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Wait a few minutes for the operator reconciliation to happen, then check the service:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;k get svc
NAME         TYPE           CLUSTER-IP      EXTERNAL-IP                                    PORT(S)        AGE
kubernetes   ClusterIP      10.245.0.1      &amp;lt;none&amp;gt;                                         443/TCP        28m
nginx        LoadBalancer   10.245.97.151   100.102.166.33,default-nginx.tail7fe3.ts.net   80:31404/TCP   22m
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Something truly remarkable has happened here, with absolutely zero input on my part. If you’re on the same Tailnet as the Kubernetes cluster, you can visit the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXTERNAL-IP&lt;/code&gt; and see the nginx deployment.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;curl default-nginx.tail7fe3.ts.net
&amp;lt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;DOCTYPE html&amp;gt;
&amp;lt;html&amp;gt;
&amp;lt;&lt;span class=&quot;nb&quot;&gt;head&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;
&amp;lt;title&amp;gt;Welcome to nginx!&amp;lt;/title&amp;gt;
&amp;lt;style&amp;gt;
    body &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
        width: 35em&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
        margin: 0 auto&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
        font-family: Tahoma, Verdana, Arial, sans-serif&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&amp;lt;/style&amp;gt;
&amp;lt;/head&amp;gt;
&amp;lt;body&amp;gt;
&amp;lt;h1&amp;gt;Welcome to nginx!&amp;lt;/h1&amp;gt;
&amp;lt;p&amp;gt;If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.&amp;lt;/p&amp;gt;

&amp;lt;p&amp;gt;For online documentation and support please refer to
&amp;lt;a &lt;span class=&quot;nv&quot;&gt;href&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;http://nginx.org/&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;nginx.org&amp;lt;/a&amp;gt;.&amp;lt;br/&amp;gt;
Commercial support is available at
&amp;lt;a &lt;span class=&quot;nv&quot;&gt;href&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;http://nginx.com/&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;nginx.com&amp;lt;/a&amp;gt;.&amp;lt;/p&amp;gt;

&amp;lt;p&amp;gt;&amp;lt;em&amp;gt;Thank you &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;using nginx.&amp;lt;/em&amp;gt;&amp;lt;/p&amp;gt;
&amp;lt;/body&amp;gt;
&amp;lt;/html&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;What exactly has happened here? Let’s examine it a little.&lt;/p&gt;

&lt;h3 id=&quot;the-operator-at-work&quot;&gt;The operator at work&lt;/h3&gt;

&lt;p&gt;If you have the ability to use the Tailscale CLI, you can run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tailscale status&lt;/code&gt; and you’ll notice a few devices have appeared in your Tailnet.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;tailscale status
100.109.46.90   macbook-pro-lbr      lee@         macOS   -
100.102.166.33  default-nginx        tagged-devices linux   idle, tx 900 rx 1452
100.95.163.103  doks-py-funnel-tailscale-operator tagged-devices linux   -
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There’s one for the Tailscale operator I installed, but there’s also a distinct device for my nginx service, that has the same IP as the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXTERNAL-IP&lt;/code&gt; of the service. This is the magic of the Tailscale operator - it’s created a device that is accessible from the Tailnet, and it’s done so without needing to provision a cloud resource.&lt;/p&gt;

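&lt;p&gt;If I take a look in the Tailscale namespace, I can also see what’s actually happened - alongside the operator’s own pod, there is a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ts-nginx&lt;/code&gt; pod that the operator created for my service:&lt;/p&gt;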

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;k get po &lt;span class=&quot;nt&quot;&gt;-n&lt;/span&gt; tailscale
NAME                        READY   STATUS    RESTARTS   AGE
operator-6cc69495c6-nck9x   1/1     Running   0          27m
ts-nginx-5jt4h-0            1/1     Running   0          4m1s
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Describing that Tailscale pod will give me another interesting nugget of information - check out the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Environment&lt;/code&gt; section:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;err&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;po&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;-n&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;tailscale&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;ts-nginx&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;-5&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;jt&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;-0&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;-o&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;json&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;jq&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;'.spec.containers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.env'&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;TS_USERSPACE&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;value&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;false&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;TS_AUTH_ONCE&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;value&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;true&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;POD_IP&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;valueFrom&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;fieldRef&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;apiVersion&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;v1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
        &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;fieldPath&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;status.podIP&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;TS_KUBE_SECRET&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;value&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;ts-nginx-5jt4h-0&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;TS_HOSTNAME&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;value&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;default-nginx&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;TS_DEBUG_FIREWALL_MODE&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;value&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;auto&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;TS_DEST_IP&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;value&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;10.245.97.151&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Notice that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TS_DEST_IP&lt;/code&gt; variable? It’s the IP of the Kubernetes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ClusterIP&lt;/code&gt; for our service.&lt;/p&gt;
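
&lt;p&gt;If you want to confirm this yourself, here is a minimal sketch. It assumes the Service is called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;nginx&lt;/code&gt; and lives in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;default&lt;/code&gt; namespace, so adjust both to match your cluster. The value printed should match &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TS_DEST_IP&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# print the ClusterIP of the Service (name and namespace are assumptions)&lt;/span&gt;
kubectl get service nginx --namespace default -o jsonpath='{.spec.clusterIP}'
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;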

&lt;h2 id=&quot;going-further&quot;&gt;Going further&lt;/h2&gt;

&lt;p&gt;This is all well and good for a simple application, but what if we need HTTPS? How complex can this get? Kubernetes services will do okay for exposing basic TCP passthrough services, but what else can I do?&lt;/p&gt;

&lt;p&gt;Well, let’s try something a bit more interesting.&lt;/p&gt;

&lt;h3 id=&quot;create-an-ingress&quot;&gt;Create an Ingress&lt;/h3&gt;

&lt;p&gt;The Tailscale operator will &lt;em&gt;also&lt;/em&gt; reconcile Ingress resources, and it’ll handle TLS termination for you as well. You need to &lt;a href=&quot;https://tailscale.com/kb/1153/enabling-https&quot;&gt;allow your Tailnet to create HTTPS certificates first&lt;/a&gt;, but once you have, you can deploy an ingress just as easily as a service:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;apps/v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Deployment&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx-deployment&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;selector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;matchLabels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;replicas&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# tells deployment to run 2 pods matching the template&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;template&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;containers&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;image&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx:1.14.2&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;ports&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;containerPort&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;80&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Service&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;ClusterIP&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# previously was a load balancer&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;# also ensure you remove the loadBalancerClass line&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ports&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;port&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;80&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;selector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;networking.k8s.io/v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Ingress&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;defaultBackend&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;service&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;port&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;80&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ingressClassName&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;tailscale&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;tls&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;hosts&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now, wait for the operator to reconcile, and then examine your ingress:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;s&quot;&gt;NAME    CLASS       HOSTS   ADDRESS                 PORTS     AGE&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;nginx   tailscale   *       nginx.tail7fe3.ts.net   80, 443   22s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
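
&lt;p&gt;A listing like the one above comes from querying the Ingress with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kubectl&lt;/code&gt;. As a rough sketch, assuming the Ingress is named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;nginx&lt;/code&gt; and sits in your current namespace, something like this should work:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# list the Ingress once (assumes the current namespace)&lt;/span&gt;
kubectl get ingress nginx
&lt;span class=&quot;c&quot;&gt;# or keep watching until the ADDRESS column is populated&lt;/span&gt;
kubectl get ingress nginx --watch
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;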

&lt;div class=&quot;alert alert-info&quot; role=&quot;alert&quot;&gt;&lt;i class=&quot;fa fa-info-circle&quot;&gt;&lt;/i&gt; &lt;b&gt;Note:&lt;/b&gt; If you request a TLS certificate, it will take a little longer for connectivity to be established while a certificate is provisioned.&lt;/div&gt;

&lt;p&gt;Look at that!&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;curl https://nginx.tail7fe3.ts.net
&amp;lt;!DOCTYPE html&amp;gt;
&amp;lt;html&amp;gt;
&amp;lt;head&amp;gt;
&amp;lt;title&amp;gt;Welcome to nginx!&amp;lt;/title&amp;gt;
&amp;lt;style&amp;gt;
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
&amp;lt;/style&amp;gt;
&amp;lt;/head&amp;gt;
&amp;lt;body&amp;gt;
&amp;lt;h1&amp;gt;Welcome to nginx!&amp;lt;/h1&amp;gt;
&amp;lt;p&amp;gt;If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.&amp;lt;/p&amp;gt;

&amp;lt;p&amp;gt;For online documentation and support please refer to
&amp;lt;a href=&quot;http://nginx.org/&quot;&amp;gt;nginx.org&amp;lt;/a&amp;gt;.&amp;lt;br/&amp;gt;
Commercial support is available at
&amp;lt;a href=&quot;http://nginx.com/&quot;&amp;gt;nginx.com&amp;lt;/a&amp;gt;.&amp;lt;/p&amp;gt;

&amp;lt;p&amp;gt;&amp;lt;em&amp;gt;Thank you for using nginx.&amp;lt;/em&amp;gt;&amp;lt;/p&amp;gt;
&amp;lt;/body&amp;gt;
&amp;lt;/html&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;No need to install an ingress controller, no need to provision a cloud resource. Just a simple ingress resource and the Tailscale operator does the rest.&lt;/p&gt;

&lt;h2 id=&quot;public-services&quot;&gt;Public Services&lt;/h2&gt;

&lt;p&gt;So far, we’ve provisioned workloads that are only accessible on the same Tailnet. What if you want to expose a service to the wider internet?&lt;/p&gt;

&lt;p&gt;Tailscale already has an amazing feature for this called &lt;a href=&quot;https://tailscale.com/kb/1223/funnel&quot;&gt;Funnel&lt;/a&gt; and you can leverage this feature from the operator to make your applications accessible from anywhere in the world.&lt;/p&gt;

&lt;h3 id=&quot;add-acl-permissions&quot;&gt;Add ACL permissions&lt;/h3&gt;

&lt;p&gt;You need to make a slight modification to your Tailscale ACL file to use this feature. Add a block to your ACL like so:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nl&quot;&gt;&quot;nodeAttrs&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
		&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;target&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;tag:k8s&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;tag&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;that&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;Tailscale&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;Operator&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;uses&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;tag&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;proxies;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;defaults&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;'tag:k&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;s'&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
		&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;attr&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;   &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;funnel&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
	&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Once you’ve updated your ACL, you need to make a single line change to any service or ingress.&lt;/p&gt;

&lt;h3 id=&quot;modify-your-service-or-ingress&quot;&gt;Modify your service or ingress&lt;/h3&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;apps/v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Deployment&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx-deployment&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;selector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;matchLabels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;replicas&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;template&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;containers&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;image&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx:1.14.2&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;ports&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;containerPort&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;80&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Service&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;type&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;ClusterIP&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ports&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;port&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;80&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;selector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;networking.k8s.io/v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Ingress&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;annotations&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;tailscale.com/funnel&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;true&quot;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# add this annotation&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;defaultBackend&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;service&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;port&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;80&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ingressClassName&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;tailscale&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;tls&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;hosts&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nginx&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This annotation works on both services and ingress objects. Once applied, you can log out of your Tailnet and still get access to the service:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# log out of my tailnet&lt;/span&gt;
tailscale &lt;span class=&quot;nb&quot;&gt;logout&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# get the ingress url&lt;/span&gt;
kubectl get ing
NAME    CLASS       HOSTS   ADDRESS                 PORTS     AGE
nginx   tailscale   &lt;span class=&quot;k&quot;&gt;*&lt;/span&gt;       nginx.tail7fe3.ts.net   80, 443   7m53s

&lt;span class=&quot;c&quot;&gt;# check access to the ingress&lt;/span&gt;
curl https://nginx.tail7fe3.ts.net
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Magic!&lt;/p&gt;

&lt;h2 id=&quot;caveats&quot;&gt;Caveats&lt;/h2&gt;

&lt;p&gt;This is all incredibly powerful, and will save you money. However, there are 2 caveats worth mentioning.&lt;/p&gt;

&lt;h3 id=&quot;supported-urls&quot;&gt;Supported URLs&lt;/h3&gt;

&lt;p&gt;Currently, the URLs generated for both services and ingresses are only tailnet addresses, so using your own domain is not possible.&lt;/p&gt;

&lt;h3 id=&quot;ingress-https&quot;&gt;Ingress HTTPS&lt;/h3&gt;

&lt;p&gt;There is currently no mechanism to redirect HTTP requests to HTTPS. If you need that sort of functionality, you’ll need to provision an ingress controller. You can expose the ingress controller with a Tailnet service, though; more to come on that in another post.&lt;/p&gt;

&lt;h2 id=&quot;wrapping-up&quot;&gt;Wrapping Up&lt;/h2&gt;

&lt;p&gt;Tailscale really has made it incredibly easy to expose your Kubernetes workloads to the world without needing to provision cloud resources. This is only the beginning of what the operator can do, so watch this space for more exciting forays into the world of Tailscale and Kubernetes.&lt;/p&gt;
</description>
                    <pubDate>Mon, 26 Feb 2024 00:00:00 +0000</pubDate>
                    <link>https://leebriggs.co.uk/blog/2024/02/26/cheap-kubernetes-loadbalancers.html</link>
                    <guid isPermaLink="true">https://leebriggs.co.uk/blog/2024/02/26/cheap-kubernetes-loadbalancers.html</guid>
                </item>
            
		
            
                <item>
                    <title>The 300% Production Problem</title>
                    
                        <description>&lt;p&gt;Earlier this year, I attended CfgMgmtCamp in Ghent and listened to Adam Jacob’s “What if Infrastructure as Code never existed” keynote.&lt;/p&gt;

&lt;div class=&quot;alert alert-info&quot; role=&quot;alert&quot;&gt;&lt;i class=&quot;fa fa-info-circle&quot;&gt;&lt;/i&gt; &lt;b&gt;Note:&lt;/b&gt; I’d like to extend a huge thanks to Adam for taking the time to review this post, and to an unnamed person for consistently reviewing these posts and for inspiring the thoughts here.&lt;/div&gt;

&lt;p&gt;Not only is Adam an excellent speaker, but he can capture thoughts that most people can’t articulate and explain them in a way that revolutionises people’s thinking.&lt;/p&gt;

&lt;p&gt;This talk was no exception. If you have yet to see it, I’d like to introduce you to something I’ve always known but have yet to appropriately identify: the 200% knowledge problem.&lt;/p&gt;

&lt;p&gt;You can listen to Adam’s excellent explanation of the 200% knowledge problem &lt;a href=&quot;https://youtu.be/5lPa2U239C4?t=2014&quot;&gt;here&lt;/a&gt;. My own explanation of this problem is:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;To successfully use an abstraction, you need to understand the problem the abstraction is trying to solve &lt;em&gt;and also&lt;/em&gt; understand how the abstraction has solved the problem.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;terraform-modules&quot;&gt;Terraform Modules&lt;/h2&gt;

&lt;p&gt;My best example of this is when examining the Terraform module ecosystem. Terraform modules, in theory, are designed to solve specific problems in the cloud provider ecosystem. Taking an example like the AWS VPC module removes the need to understand all of the glue that AWS needs to successfully create a VPC, like route tables, subnets and NAT Gateways.&lt;/p&gt;

&lt;p&gt;That’s the theory. The reality of these modules is that you need to understand the magical incantation of random functions and use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dynamic&lt;/code&gt; to succeed.&lt;/p&gt;

&lt;p&gt;In addition, the desire to create Terraform modules that meet &lt;em&gt;every single user’s possible use case&lt;/em&gt; means that often, the module will expose the entire surface area of the APIs the module is managing to the user. Usually, this leaves you in a position of having to painstakingly read the whole module’s code before using it, and if something breaks, you’re shit out of luck.&lt;/p&gt;

&lt;p&gt;Hearing Adam describe this problem has had my brain slowly creaking for a while. We’ve seen lots of literature in the past few years about the explosion of knowledge required to be a successful “DevOps Engineer”, “Site Reliability Engineer” or “Platform Engineer” or whatever that role’s title is this week. As I’ve noodled on this for the past few months, I’ve started to leverage the 200% problem and give it a moniker of my own: “the 300% production problem”.&lt;/p&gt;

&lt;h2 id=&quot;the-300-production-problem&quot;&gt;The 300% Production Problem&lt;/h2&gt;

&lt;p&gt;The definition of the 300% problem is:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;To successfully get an application into production, you need to be an expert in the application itself, the deployment target and the deployment methodology.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Each of these expertise layers is a full-time job or undertaking. If you’re lucky, you have more than one person who needs to be an expert on all these 3 layers. If you’re unlucky, you might read this and think, “holy shit, no wonder I’m burned out”.&lt;/p&gt;

&lt;p&gt;Let’s examine these layers and see if we can develop some ideas to reduce the burden.&lt;/p&gt;

&lt;h3 id=&quot;the-application&quot;&gt;The Application&lt;/h3&gt;

&lt;p&gt;Now, if the application you’re deploying is an in-house application written by a team of super-smart developers, you’re in a good spot. If you’re deploying a third-party application, things get a little trickier.&lt;/p&gt;

&lt;p&gt;One of the benefits of open-source third-party software is that it allows you to &lt;em&gt;become&lt;/em&gt; an expert in the application itself, either by reading the code or the documentation. What’s interesting about the application stack is that there are layers of expertise all the way down. The database tier of the application itself, the software framework it’s written in, and how it’s packaged and maintained or performs under load or scale are just a few parts of how this breaks down.&lt;/p&gt;

&lt;p&gt;The responsibility of expertise here essentially belongs to the “Dev” side of the “DevOps Divide”, especially for business logic applications.&lt;/p&gt;

&lt;h3 id=&quot;the-deployment-target&quot;&gt;The Deployment Target&lt;/h3&gt;

&lt;p&gt;The deployment target is where things start to explode in complexity, and often, the decisions &lt;em&gt;you&lt;/em&gt; make as an engineer can directly affect the level of expertise required to succeed.&lt;/p&gt;

&lt;p&gt;When discussing the deployment target, we’re pointing at the cloud provider and the varying layers under that. Cloud providers are &lt;em&gt;hard&lt;/em&gt;, and they get even more challenging if you start to sprinkle Kubernetes on stuff to abstract away cloud provider APIs. That’s just compute, as well. Once you start factoring in the networking, data, queuing system requirements and all that other crucial production-grade stuff, you begin to realise &lt;em&gt;just how hard this is&lt;/em&gt;.&lt;/p&gt;

&lt;h3 id=&quot;the-deployment-methodology&quot;&gt;The Deployment Methodology&lt;/h3&gt;

&lt;p&gt;The deployment methodology refers to the &lt;em&gt;way&lt;/em&gt; you get your application &lt;em&gt;and&lt;/em&gt; your infrastructure into production. If you’ve read other posts on this blog, you’ll know that my primary focus is Infrastructure as Code, and being an expert in that IaC tool is a &lt;em&gt;requirement&lt;/em&gt; to successfully get things into production. Now, I’m not going to turn this into another rant about why DSLs are stupid, but when you consider that the language you use to author your infrastructure is part of this expertise, I’d ask you this question: do you really only want 4 or 5 people in your company to be the experts in the deployment methodology, or do you want lots of experts?&lt;/p&gt;

&lt;h2 id=&quot;multiplying-the-problem&quot;&gt;Multiplying the problem&lt;/h2&gt;

&lt;p&gt;Once you start to think about problems through the lens of the 300% Production Problem, you begin to realise why there’s a burgeoning backlash against some popular ideas in our industry.&lt;/p&gt;

&lt;h3 id=&quot;microservices&quot;&gt;Microservices&lt;/h3&gt;

&lt;p&gt;Microservices took off as an idea because of the enormous scale needed for some organisations, but what ends up happening is that you multiply the 300% problem across a shitload of applications. You give those application teams the full power and responsibility of “owning their applications in production”, and while you can offset some of this by letting them choose their infrastructure and deployment methodology, it’s a lot to ask everyone to be an expert in all 3 of these domains.&lt;/p&gt;

&lt;h3 id=&quot;kubernetes&quot;&gt;Kubernetes&lt;/h3&gt;

&lt;p&gt;It’s an established meme now that adding Kubernetes increases complexity, but it multiplies the complexity in two areas of the 300% problem. Not only is Kubernetes layered on top of your existing infrastructure, but it also requires you to be an expert in all of the facets of the deployment methodology, whether it be the abject misery of Go templated Helm Charts or building an operator for everything which is the proposed solution of some people who really just want to watch the world burn.&lt;/p&gt;

&lt;h3 id=&quot;cloud&quot;&gt;Cloud&lt;/h3&gt;

&lt;p&gt;Yes. There’s a cloud backlash slowly starting to form. I’m not going to share my thoughts in this post, but if you consider the surface area of most cloud providers, the sheer number of services and the idiosyncrasies of those services, it’s unsurprising that people are starting to feel cognitive overload trying to meet their third of the 300% problem on the infrastructure side.&lt;/p&gt;

&lt;h2 id=&quot;what-do-we-do&quot;&gt;What do we do?&lt;/h2&gt;

&lt;p&gt;The ultimate solution to the 300% problem will be the same across the 3 pillars. &lt;em&gt;Simplicity&lt;/em&gt;. Making decisions that reduce the amount of knowledge required to become an expert in one of those 3 pillars will dramatically affect your overall success rate at getting your applications into production.&lt;/p&gt;

&lt;p&gt;This isn’t new ground. People have been writing about “Keep it simple, stupid” for a long time. The problem is that the “simplicity” is often defined by the people building the system, and those people are &lt;em&gt;experts&lt;/em&gt; already in the system they’re building.&lt;/p&gt;

&lt;p&gt;When I shared this post with &lt;a href=&quot;https://x.com/adamhjk&quot;&gt;Adam&lt;/a&gt;, he gave me an excellent analogy which helped me round this post out:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;You want simplicity where it benefits the &lt;em&gt;user&lt;/em&gt;, which often requires increased complexity for the &lt;em&gt;developer&lt;/em&gt;. My analogy for this is how complex modern cars are, but you push a button to “start” them. Vs a very simple model t, that breaks your arm if you start it wrong.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;As usual, Adam has done a much better job of encapsulating the core ideas in this post than I have, but to expand on it, the missing piece to me is the “productification” of the systems we’re building.&lt;/p&gt;

&lt;p&gt;What is often overlooked is how frequently decisions and responsibilities that could be shared are instead shaped to appease the opinions of a small number of potential experts rather than the broader organisation. Intelligent individuals with political power will happily introduce this framework, that deployment model, or the other tool because they like it and are experts and they &lt;em&gt;believe&lt;/em&gt; it’s simple, which ultimately it is - to them.&lt;/p&gt;

&lt;p&gt;Hopefully, if you’re reading this and you’re trying to build something that simplifies a process in your organisation, you’ll consider the 300% problem, and make it simple for everyone, not just yourself.&lt;/p&gt;

</description>
                    <pubDate>Thu, 28 Sep 2023 00:00:00 +0000</pubDate>
                    <link>https://leebriggs.co.uk/blog/2023/09/28/300_percent_problem.html</link>
                    <guid isPermaLink="true">https://leebriggs.co.uk/blog/2023/09/28/300_percent_problem.html</guid>
                </item>
            
		
            
                <item>
                    <title>DSLs are a waste of time</title>
                    
                        <description>&lt;p&gt;If you’ve read this blog before, or are unfortunate enough to have an actual personal relationship with me, you’ll know that I have strong opinions and can be, shall we say, &lt;em&gt;passionate&lt;/em&gt; about them. For posts on this blog, I try to share those opinions and avoid directly addressing the people who hold the opposing view.&lt;/p&gt;

&lt;p&gt;However, there’s a schism happening in the infrastructure as code world at the moment, following the announcement that HashiCorp is changing the software license that it uses for its products. As a result, and for reasons that have been covered in great depth by people far more qualified than I am, a set of vocal competitors to Terraform Cloud and Terraform Enterprise have announced a “fork” called OpenTF (and that’s a loose interpretation of the word, because all we have at the time of writing is a markdown file and some gifs).&lt;/p&gt;

&lt;p&gt;As we watch these two competing factions live out &lt;a href=&quot;https://en.wikipedia.org/wiki/Friedrich_Glasl%27s_model_of_conflict_escalation&quot;&gt;Friedrich Glasl’s model of conflict escalation&lt;/a&gt; in public, I have found myself asking a question that has always lingered in the back of my mind while working in my day job at &lt;a href=&quot;https://pulumi.com&quot;&gt;Pulumi&lt;/a&gt;...&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why the fuck does everyone love this domain specific language so much?&lt;/strong&gt;&lt;/p&gt;

&lt;h2 id=&quot;hold-your-horses-terraform-isnt-just-a-dsl&quot;&gt;Hold your horses, Terraform isn’t just a DSL!&lt;/h2&gt;

&lt;p&gt;Your first reaction will undoubtedly be “well, it’s not the DSL we love, it’s Terraform! It’s made me more productive and solves real world problems!”&lt;/p&gt;

&lt;p&gt;I understand the impact Terraform has had on Infrastructure as Code and DevOps in general. I’m lucky enough to consider two of the &lt;a href=&quot;https://github.com/hashicorp/terraform/graphs/contributors&quot;&gt;top 10 contributors of all time to Terraform&lt;/a&gt; close personal friends and we generally hold similar perspectives on things (although one of them has appalling taste in beer, I know you’re reading this).&lt;/p&gt;

&lt;p&gt;I used Terraform from the very first version, back when every time you ran &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;terraform apply&lt;/code&gt; you sort of closed your eyes and hoped you didn’t break everything. I watched in wonder as Terraform introduced modules and started adding hundreds of providers to support basically every cloud provider you could think of. As our industry started to migrate to “cloud first” or, as the marketing behemoth likes to call it, “cloud native”, I migrated my world view from one of “configuration management” to “infrastructure as code” and became a subject matter expert.&lt;/p&gt;

&lt;p&gt;All of this is to say, I &lt;em&gt;get it&lt;/em&gt;. I understand why people want to use Terraform, and I can sort of understand why people are so upset about the license change and want to support the forking of the product so it stays “open source”.&lt;/p&gt;

&lt;p&gt;The problem here is that while you can say “Terraform is more than a DSL”, if you ever peel back the layers, or try to have a conversation with someone who’s a true dyed in the wool Terraform zealot and start to understand where they’re coming from, you begin to realise they don’t love Terraform. They love this weird half language because they’re an expert in it, and they’ve built their entire career on being an expert in something you can only use for one specific use case.&lt;/p&gt;

&lt;h2 id=&quot;a-path-to-nowhere&quot;&gt;A path to nowhere&lt;/h2&gt;

&lt;p&gt;In college I spent hundreds of hours playing Guitar Hero. I would play songs over and over again until I could get through them on expert mode. My ultimate goal was to be able to at least &lt;em&gt;finish&lt;/em&gt; the toughest song on the game, &lt;a href=&quot;https://www.youtube.com/watch?v=cHRfbiwdheg&quot;&gt;Through The Fire and Flames&lt;/a&gt; on the toughest possible difficulty. A friend of mine was an accomplished guitar player and as a passing comment, pointed out “imagine if you’d spent the same amount of time playing an &lt;em&gt;actual&lt;/em&gt; guitar”.&lt;/p&gt;

&lt;p&gt;It never really registered for me at the time what he was saying, but as the years have gone by and I’ve put thousands of hours into being an “expert” in infrastructure, I’ve started to realise I made the same mistake in my career as I did with Guitar Hero.&lt;/p&gt;

&lt;h2 id=&quot;dsls-are-everywhere&quot;&gt;DSLs are everywhere&lt;/h2&gt;

&lt;p&gt;I’ve used a DSL before, you see. Even before Terraform. I spent a whole 7 or 8 years writing thousands of lines of DSL.&lt;/p&gt;

&lt;p&gt;Puppet has its own Ruby based DSL it now calls &lt;a href=&quot;https://www.puppet.com/docs/puppet/7/puppet_language.html&quot;&gt;the Puppet language&lt;/a&gt;. When all of the problems in the infrastructure space existed at the operating system instead of the API layer, Puppet was &lt;em&gt;everywhere&lt;/em&gt; and I &lt;em&gt;loved&lt;/em&gt; the language because it meant I didn’t have to be a “software developer”. I didn’t want to write PHP or Python, I wanted to manage infrastructure, and Puppet’s DSL let me do that.&lt;/p&gt;

&lt;p&gt;The problem with this mentality was that when the industry inevitably changed around me, I was left with expertise in and knowledge of a language that was now &lt;strong&gt;effectively useless&lt;/strong&gt;. I haven’t written a single line of Puppet language in a long time, and I don’t miss it at all. I don’t miss the syntax, parsing the docs, trying to figure out how the built in functions actually worked, or mapping those ideas to problem sets.&lt;/p&gt;

&lt;h2 id=&quot;configuration-complexity-clock&quot;&gt;Configuration complexity clock&lt;/h2&gt;

&lt;p&gt;The fact Puppet and Terraform (and other tools, of course) tend to converge on a DSL to solve infrastructure problems doesn’t appear to be a coincidence. Infrastructure (whether it be at the operating system layer or the API layer) is, at its core, a sea of &lt;em&gt;configuration&lt;/em&gt;. When dealing with configuration, the &lt;a href=&quot;http://mikehadlow.blogspot.com/2012/05/configuration-complexity-clock.html&quot;&gt;configuration complexity clock&lt;/a&gt; dictates that eventually, you’re going to try and create a DSL to manage it.&lt;/p&gt;

&lt;p&gt;If you’re familiar with the configuration complexity clock, you’ll notice that the next step in the evolution of infrastructure configuration is to “hard code” the values again, which isn’t really a possibility because, well, have you seen how much infrastructure we’re dealing with? So what’s really happening here is that the DSLs (and particularly the HCL DSL) are becoming programming languages in their own right, adding new constructs and methods which increase the complexity of the language itself.&lt;/p&gt;

&lt;p&gt;This is all well and good, because the features solve problems people have. The real issue here is that it’s still a DSL, and you’re still learning to play Guitar Hero instead of learning a useful skill you can use elsewhere: playing the guitar.&lt;/p&gt;

&lt;h2 id=&quot;complexity-is-in-the-eye-of-the-beholder&quot;&gt;Complexity is in the eye of the beholder&lt;/h2&gt;

&lt;p&gt;One of the arguments I always hear when talking to people about this problem is that Terraform’s DSL reduces the amount of complexity in the code because of its limited feature set. Putting aside the fact these people are arguing that the lack of features &lt;em&gt;is itself a feature&lt;/em&gt;, it doesn’t really hold up to any scrutiny.&lt;/p&gt;

&lt;p&gt;If you do a quick search of a sufficiently complex Terraform module, you’ll see &lt;em&gt;all kinds of craziness&lt;/em&gt;. I just picked a random module from the Terraform registry and looked at the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main.tf&lt;/code&gt; and found this:&lt;/p&gt;

&lt;div class=&quot;language-hcl highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nx&quot;&gt;resource&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;aws_route_table_association&quot;&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;redshift&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;nx&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;local&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;create_redshift_subnets&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;enable_public_redshift&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;local&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;len_redshift_subnets&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;

  &lt;span class=&quot;nx&quot;&gt;subnet_id&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;element&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;aws_subnet&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;redshift&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;nx&quot;&gt;route_table_id&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;element&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;nx&quot;&gt;coalescelist&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;aws_route_table&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;redshift&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;aws_route_table&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;private&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;nx&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;single_nat_gateway&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;create_redshift_subnet_route_table&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;err&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;nx&quot;&gt;resource&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;aws_route_table_association&quot;&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;redshift_public&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;nx&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;local&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;create_redshift_subnets&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;enable_public_redshift&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;local&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;len_redshift_subnets&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;

  &lt;span class=&quot;nx&quot;&gt;subnet_id&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;element&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;aws_subnet&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;redshift&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;nx&quot;&gt;route_table_id&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;element&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;nx&quot;&gt;coalescelist&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;aws_route_table&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;redshift&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;aws_route_table&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;public&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;nx&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;single_nat_gateway&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;create_redshift_subnet_route_table&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;err&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now, don’t get me wrong, I know what this is doing, and I also understand why it has to be implemented this way, but if you’re not an expert in this space, you probably have a whole bunch of reasonable questions:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Why are we using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;count&lt;/code&gt; here?&lt;/li&gt;
  &lt;li&gt;What is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;coalescelist&lt;/code&gt;?&lt;/li&gt;
  &lt;li&gt;What is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;element&lt;/code&gt;?&lt;/li&gt;
  &lt;li&gt;What is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;local&lt;/code&gt;?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And that’s just the first few lines. I’m not trying to pick on this module, I’m just trying to demonstrate that even in a language that is supposed to be “simple” and “easy to understand” you can still end up with code that is complex and difficult to understand.&lt;/p&gt;
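
&lt;p&gt;If you are one of the people asking those questions, a rough Python rendering of the first resource might help. Treat it as a sketch under assumptions: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;element&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;coalescelist&lt;/code&gt; are re-implemented from their documented Terraform behaviour, while &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;redshift_associations&lt;/code&gt;, its parameters (including the collapsed &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;single_table&lt;/code&gt; flag) and the sample IDs are made up for illustration.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Rough Python equivalents of the Terraform built-ins used above.
# Everything below is illustrative; the inputs are invented.


def element(items, index):
    # Terraform's element() wraps around: the index is taken
    # modulo the length of the list.
    return items[index % len(items)]


def coalescelist(*lists):
    # Terraform's coalescelist() returns the first list that is not empty.
    for candidate in lists:
        if candidate:
            return candidate
    raise ValueError('all lists were empty')


def redshift_associations(create_subnets, enable_public, subnet_ids,
                          redshift_tables, private_tables, single_table):
    # count: create one association per subnet, or none at all.
    count = len(subnet_ids) if create_subnets and not enable_public else 0
    # With a count of zero nothing else is evaluated.
    tables = coalescelist(redshift_tables, private_tables) if count else []
    return [
        {
            'subnet_id': element(subnet_ids, index),
            # single_table (hypothetical) stands in for the two flags
            # that force every subnet onto the first route table.
            'route_table_id': element(tables, 0 if single_table else index),
        }
        for index in range(count)
    ]


associations = redshift_associations(
    create_subnets=True,
    enable_public=False,
    subnet_ids=['subnet-a', 'subnet-b', 'subnet-c'],
    redshift_tables=[],
    private_tables=['rtb-1', 'rtb-2'],
    single_table=False,
)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With three subnets and only two route tables, the third association wraps back round to the first route table. That is the sort of behaviour you only learn by reading the function docs, which is rather the point.&lt;/p&gt;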

&lt;p&gt;Terraform’s DSL allows you to use abstractions (modules), nest resources inside those abstractions and then create varying levels of indirection when trying to instantiate resources. These things are all necessary when trying to truly create an ecosystem that allows reusability and sharing, but of course that brings with it complexity.&lt;/p&gt;

&lt;p&gt;My personal opinion here is that when people say “it’s less complex” - what they actually mean is “I understand it”. Which is totally fine, and a reasonable position to be in, but it’s not the same thing. If you’re a “devops engineer” or “platform engineer” (or whatever the job title is called this week) and you look at the application code you’re supporting - which is all object orientated and scary and you can’t figure out where this value comes from - put yourself in the position of an application developer trying to look at YOUR Terraform code and think about &lt;em&gt;their&lt;/em&gt; perspective. It’s just as difficult, and it was your job to &lt;em&gt;learn&lt;/em&gt; Terraform’s DSL. The application developer has a full time job AND now you’re expecting them to learn this weird DSL too?&lt;/p&gt;

&lt;h2 id=&quot;theres-more-to-terraform-than-a-dsl&quot;&gt;There’s more to Terraform than a DSL&lt;/h2&gt;

&lt;p&gt;It’s true, that Terraform is more than a DSL. There’s an entire ecosystem of modules, which seem to be just a grouping of Terraform resources (which expose every single possible API element to the user, but that’s a post for another day) and tutorials and even entire companies set up around the Terraform ecosystem. That’s all very compelling when choosing and considering how you’re going to manage your infrastructure, but what’s pretty clear after the last few weeks is that this ecosystem is becoming increasingly &lt;em&gt;fragile&lt;/em&gt;. Once the Terraform fork happens (and again, we don’t have anything yet other than a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;README&lt;/code&gt; and some gifs) - will that ecosystem continue to grow and thrive, or will it split and become two separate ecosystems?&lt;/p&gt;

&lt;p&gt;If you look at the history of software that has been forked, you’ll see that rarely do the two forks grow along the same path. When the maintainers of OpenTF inevitably decide that they want to add enhancements that require &lt;em&gt;language&lt;/em&gt; changes, you as the end user end up in a situation where you’re effectively back to square 1, learning (potentially) a new DSL.&lt;/p&gt;

&lt;h2 id=&quot;whats-the-alternative&quot;&gt;What’s the alternative?&lt;/h2&gt;

&lt;p&gt;As you can probably ascertain, I think there’s a better path forward here. I’ve &lt;a href=&quot;https://leebriggs.co.uk/blog/2022/08/26/choosing-an-iac-tool&quot;&gt;made the case before&lt;/a&gt; that programming languages are the better authoring model for cloud infrastructure because they allow you to express the complexity of your infrastructure in the most flexible, intuitive way. My argument here is that if you want to continue to cling on to DSLs, you’re going to end up in a situation where you’re going to have to learn a new DSL every time you want to do something new.&lt;/p&gt;

&lt;p&gt;I’m not going to make the explicit argument you should use &lt;a href=&quot;https://pulumi.com&quot;&gt;Pulumi&lt;/a&gt; as your IaC tool because that would make me even more of a shill than I am now. What I will say is this: in this pivotal moment for the industry where Terraform gets forked, where the ecosystem is going to fracture and diverge, maybe you as a user can ask yourself the question - should I really go and learn another DSL? My sincere hope is that either Terraform or the OpenTF folks decide to invest in the CDK model. It’s an inferior programming language authoring model to &lt;a href=&quot;https://pulumi.com&quot;&gt;Pulumi’s&lt;/a&gt;, but at least the concepts you learn in your chosen programming language are applicable elsewhere in your career. Understanding how to handle complex data structures in, say, Python is something which might help you fix bugs in application code; you can’t throw a DSL at that.&lt;/p&gt;

</description>
                    <pubDate>Mon, 04 Sep 2023 00:00:00 +0000</pubDate>
                    <link>https://leebriggs.co.uk/blog/2023/09/04/dsl.html</link>
                    <guid isPermaLink="true">https://leebriggs.co.uk/blog/2023/09/04/dsl.html</guid>
                </item>
            
		
            
                <item>
                    <title>Structuring your Infrastructure as Code</title>
                    
                        <description>&lt;p&gt;If you’re thinking of migrating to another infrastructure as code tool (and why would you, everything is &lt;em&gt;great&lt;/em&gt; in the IaC world now, right?!) you might find yourself asking yourself a fundamental question when you get started: how do I structure things in a way that scales well and stands the test of time?&lt;/p&gt;

&lt;p&gt;There’s no canonical answer. Everyone does things slightly differently, and different tools have different ideas on the best way.&lt;/p&gt;

&lt;p&gt;In my day to day role as a Solutions Engineer at &lt;a href=&quot;https://pulumi.com&quot;&gt;Pulumi&lt;/a&gt; I get to answer this question a &lt;em&gt;lot&lt;/em&gt;. Customers are migrating from other IaC tools and they want to take this opportunity to think about the way they’d like to structure things.&lt;/p&gt;

&lt;p&gt;This blog post is designed to detail my high(ish) level thoughts on the concepts and principles I like to use, and why. As we explore these concepts, I’ll talk about some of the lessons I learned from my time in configuration management and the myriad IaC tools I’ve used before today.&lt;/p&gt;

&lt;p&gt;A lot of the concepts in this post are focused on &lt;a href=&quot;https://pulumi.com&quot;&gt;Pulumi&lt;/a&gt;, but lots are broadly applicable to other tools.&lt;/p&gt;

&lt;h1 id=&quot;layers&quot;&gt;Layers&lt;/h1&gt;

&lt;p&gt;I’m sure my system administrator background is showing, but I like to think about infrastructure through the concept of &lt;em&gt;layers&lt;/em&gt; similar to the &lt;a href=&quot;https://en.wikipedia.org/wiki/OSI_model&quot;&gt;OSI Model&lt;/a&gt;. Most of the layers I’ll outline here closely mirror the OSI model, but what you’ll likely want to do before you create your Git repo or write a single line of code, is group your cloud infrastructure into layers. The reason why will become apparent later.&lt;/p&gt;

&lt;h2 id=&quot;layer-0-billing&quot;&gt;Layer 0: Billing&lt;/h2&gt;

&lt;p&gt;The billing layer is where you sign up or input your credit card. Each cloud provider does this differently:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;AWS: Organization&lt;/li&gt;
  &lt;li&gt;Azure: Account&lt;/li&gt;
  &lt;li&gt;Google Cloud: Account&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In my experience, while there are APIs for this stuff, you likely &lt;em&gt;don’t&lt;/em&gt; want to manage this layer with IaC, so do yourself a favour and do it manually.&lt;/p&gt;

&lt;h2 id=&quot;layer-1-privilege&quot;&gt;Layer 1: Privilege&lt;/h2&gt;

&lt;p&gt;The privilege layer is how you fundamentally separate access in the cloud provider. Again, each provider does this a little differently.&lt;/p&gt;

&lt;h3 id=&quot;example-resources&quot;&gt;Example Resources&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;AWS: Account&lt;/li&gt;
  &lt;li&gt;Azure: Subscription&lt;/li&gt;
  &lt;li&gt;Google Cloud: Project&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You &lt;em&gt;might&lt;/em&gt; want to manage this layer with IaC, but you need to decide how that’d work. Personally, I find that the API level support for this layer and the rarity of needing to perform this operation means it’s often easier to manage this layer manually.&lt;/p&gt;

&lt;h2 id=&quot;layer-2-network&quot;&gt;Layer 2: Network&lt;/h2&gt;

&lt;p&gt;Now we’re getting to the layers that should &lt;em&gt;definitely&lt;/em&gt; be managed by IaC. The network layer is foundational to how everything will work in your infrastructure, and includes things like a VPC, subnets, NAT Gateways, VPNs, and anything else that facilitates network communication.&lt;/p&gt;

&lt;h3 id=&quot;example-resources-1&quot;&gt;Example Resources&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;AWS: VPCs, Subnets, Route Tables, Internet Gateways, NAT Gateways, VPNs&lt;/li&gt;
  &lt;li&gt;Azure: Virtual Networks, Subnets, Route Tables, Internet Gateways, NAT Gateways, VPNs&lt;/li&gt;
  &lt;li&gt;Google Cloud: VPCs, Subnets, Route Tables, Internet Gateways, Cloud NAT, VPNs&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;layer-3-permissions&quot;&gt;Layer 3: Permissions&lt;/h2&gt;

&lt;p&gt;Now we’ve laid down a network layer, we need to allow other people or applications to talk to the cloud provider API. IAM roles, or service principals live in this layer.&lt;/p&gt;

&lt;h3 id=&quot;example-resources-2&quot;&gt;Example Resources&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;AWS: IAM Roles, IAM Users, IAM Groups&lt;/li&gt;
  &lt;li&gt;Azure: Service Principals, Managed Identities&lt;/li&gt;
  &lt;li&gt;Google Cloud: Service Accounts&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;layer-4-data&quot;&gt;Layer 4: Data&lt;/h2&gt;

&lt;p&gt;The data layer is where the resources you’re managing really start to open up. This is where you’ll find things like databases, object stores, message queues, and anything else that’s used to store or transfer data.&lt;/p&gt;

&lt;h3 id=&quot;example-resources-3&quot;&gt;Example Resources&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;AWS: RDS, DynamoDB, S3, SQS, SNS, Kinesis, Redshift, DocumentDB, ElastiCache&lt;/li&gt;
  &lt;li&gt;Azure: SQL, CosmosDB, Blob Storage, Queue Storage, Event Grid, Event Hubs, Service Bus, Redis Cache&lt;/li&gt;
  &lt;li&gt;Google Cloud: Cloud SQL, Cloud Spanner, Cloud Storage, Cloud Pub/Sub, Cloud Datastore, Cloud Bigtable, Cloud Memorystore&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;layer-5-compute&quot;&gt;Layer 5: Compute&lt;/h2&gt;

&lt;p&gt;The compute layer is where your applications actually run - this is where you’ll find things like virtual machines, containers, and serverless functions.&lt;/p&gt;

&lt;h3 id=&quot;example-resources-4&quot;&gt;Example Resources&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;AWS: EC2, ECS, EKS, Fargate&lt;/li&gt;
  &lt;li&gt;Azure: Virtual Machines, Container Instances, AKS&lt;/li&gt;
  &lt;li&gt;Google Cloud: Compute Engine, GKE&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;layer-6-ingress&quot;&gt;Layer 6: Ingress&lt;/h2&gt;

&lt;p&gt;Layer 6 is where you’ll find the resources that allow your applications to be accessed by the outside world.&lt;/p&gt;

&lt;h3 id=&quot;example-resources-5&quot;&gt;Example Resources&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;AWS: Application Load Balancers, Network Load Balancers, Classic Load Balancers, API Gateways&lt;/li&gt;
  &lt;li&gt;Azure: Application Gateways, Load Balancers, API Management&lt;/li&gt;
  &lt;li&gt;Google Cloud: Load Balancers, API Gateways&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;layer-7-application&quot;&gt;Layer 7: Application&lt;/h2&gt;

&lt;p&gt;Once we’ve provisioned all the supporting infrastructure, we now need to actually deploy the application itself. This is where things really get a little tricky and depend entirely on your application’s deployment model, technology and architecture.&lt;/p&gt;

&lt;p&gt;You might choose not to use IaC for the application at all, but if you do...&lt;/p&gt;

&lt;h3 id=&quot;example-resources-6&quot;&gt;Example Resources&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;AWS: Lambda, ECS Tasks, Kubernetes Manifests, EC2 User Data&lt;/li&gt;
  &lt;li&gt;Azure: Azure Functions, Kubernetes Manifests&lt;/li&gt;
  &lt;li&gt;Google Cloud: Cloud Functions, Kubernetes Manifests&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;visualization&quot;&gt;Visualization&lt;/h2&gt;

&lt;p&gt;If you’re a visual learner like me, you might find this visualization helpful:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Layer&lt;/th&gt;
      &lt;th&gt;Name&lt;/th&gt;
      &lt;th&gt;Example Resources&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;Billing&lt;/td&gt;
      &lt;td&gt;AWS Organization/Azure Account/Google Cloud Account&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;Privilege&lt;/td&gt;
      &lt;td&gt;AWS Account/Azure Subscription/Google Cloud Project&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;Network&lt;/td&gt;
      &lt;td&gt;AWS VPC/Google Cloud VPC/Azure Virtual Network&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;Permissions&lt;/td&gt;
      &lt;td&gt;AWS IAM/Azure Managed Identity/Google Cloud Service Account&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;4&lt;/td&gt;
      &lt;td&gt;Data&lt;/td&gt;
      &lt;td&gt;AWS RDS/Azure Cosmos DB/Google Cloud SQL&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;5&lt;/td&gt;
      &lt;td&gt;Compute&lt;/td&gt;
      &lt;td&gt;AWS EC2/Azure Container Instances/GKE&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;6&lt;/td&gt;
      &lt;td&gt;Ingress&lt;/td&gt;
      &lt;td&gt;AWS ELB/Azure Load Balancer/Google Cloud Load Balancer&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;7&lt;/td&gt;
      &lt;td&gt;Application&lt;/td&gt;
      &lt;td&gt;Kubernetes Manifests/Azure Functions/ECS Tasks/Google Cloud Functions&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h2 id=&quot;principle-1-the-rate-of-change&quot;&gt;Principle 1: The Rate of Change&lt;/h2&gt;

&lt;p&gt;One of the difficult concepts to quantify when thinking of these layers is the &lt;em&gt;rate of change&lt;/em&gt; that these resources undergo when you’re managing them. The lower layers generally change &lt;em&gt;less frequently&lt;/em&gt; than the higher layers, and they are also generally the most fraught with risk when you do change them.&lt;/p&gt;

&lt;p&gt;You might be wondering why this matters. The answer is that when structuring your IaC, you’ll want to consider how you group resources together in your chosen IaC tool’s encapsulation mechanism. For example, Pulumi uses the concept of &lt;a href=&quot;https://www.pulumi.com/docs/concepts/projects/&quot;&gt;&lt;em&gt;projects&lt;/em&gt;&lt;/a&gt; to group resources together.&lt;/p&gt;

&lt;p&gt;When creating and defining resources in a Pulumi project, the fundamental question you need to ask when adding a resource is “which layer does this resource live in?”. You generally shouldn’t have resources from different layers in the same project, because the rate of change of those resources will be different and the risk of changing them is different.&lt;/p&gt;
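
&lt;p&gt;As a rough illustration of the “which layer does this resource live in?” question, here is a minimal sketch that checks whether a project mixes layers. The resource kind names and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LAYER_OF&lt;/code&gt; mapping are assumptions made up for this example, not something Pulumi provides.&lt;/p&gt;

```python
# Hypothetical mapping of resource kinds to the layers described above.
LAYER_OF = {
    "aws_account": 1,
    "vpc": 2,
    "subnet": 2,
    "shared_iam_role": 3,
    "rds_instance": 4,
    "eks_cluster": 5,
    "load_balancer": 6,
    "kubernetes_manifest": 7,
}


def layers_in_project(resource_kinds: list[str]) -> set[int]:
    """Return the set of layers that a project's resources span."""
    return {LAYER_OF[kind] for kind in resource_kinds}


def is_single_layer(resource_kinds: list[str]) -> bool:
    """A project follows principle 1 when all of its resources share one layer."""
    return len(layers_in_project(resource_kinds)) == 1


if __name__ == "__main__":
    # A network-only project stays within one layer.
    print(is_single_layer(["vpc", "subnet"]))
    # Mixing a VPC with a database spans the network and data layers.
    print(sorted(layers_in_project(["vpc", "rds_instance"])))
```

&lt;p&gt;A check like this is only a conversation starter: it flags projects that span layers so you can decide whether the mix is deliberate.&lt;/p&gt;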

&lt;h2 id=&quot;principle-2-resource-lifecycle&quot;&gt;Principle 2: Resource Lifecycle&lt;/h2&gt;

&lt;div class=&quot;alert alert-info&quot; role=&quot;alert&quot;&gt;&lt;i class=&quot;fa fa-info-circle&quot;&gt;&lt;/i&gt; &lt;b&gt;Note:&lt;/b&gt; This principle comes to you thanks to my wonderful colleague Ringo De Smet, who reminded me of the importance of breaking the rules when reviewing this post&lt;/div&gt;

&lt;p&gt;As with all principles in life, there are situations where principle 1 doesn’t broadly apply.&lt;/p&gt;

&lt;p&gt;There are resources within the above layers where you might think “ah! this is a network resource so I’ll put it in my network project” but the &lt;em&gt;lifecycle&lt;/em&gt; of the resource doesn’t necessarily fit as a shared resource. A great example of this is an AWS security group.&lt;/p&gt;

&lt;p&gt;Security groups are generally specific to another resource - perhaps an application you’re deploying, a load balancer that’s shared or maybe a database in the data layer. With these resources, it’s generally best to consider the overall lifecycle of the dependent resources when deciding where to put them.&lt;/p&gt;

&lt;p&gt;My rule of thumb here is this - if I wanted to provision this resource in a different environment, or better yet, destroy it - what other resources do I want to destroy at the same time?&lt;/p&gt;

&lt;p&gt;Another great consideration for this is the permissions layer. I already mentioned when discussing permissions that you’ll need to think about that layer as &lt;em&gt;shared&lt;/em&gt; permissions. Application specific permissions are entirely different - they really want to go directly with your application deployment code.&lt;/p&gt;

&lt;p&gt;The summary here is: don’t be afraid to break the first principle, but make sure when you’re doing it you’re thinking about the resource lifecycle.&lt;/p&gt;

&lt;h2 id=&quot;principle-3-repositories&quot;&gt;Principle 3: Repositories&lt;/h2&gt;

&lt;p&gt;The mono-repo vs multi-repo debate is one that will rage long after we’re all done with cloud computing and have migrated back to physical infrastructure, and I’m not going to try and solve it here. What I &lt;em&gt;will&lt;/em&gt; say is that I’ve seen both work well, and both work poorly.&lt;/p&gt;

&lt;p&gt;When it comes to IaC repositories, I again come back to our layering system and decide where the code for deploying a resource should live based on its layer.&lt;/p&gt;

&lt;h3 id=&quot;the-control-repo&quot;&gt;The Control Repo&lt;/h3&gt;

&lt;p&gt;For the foundational, shared aspects of the infrastructure, I generally like to include those projects in a single mono-repo which back in my days of using &lt;a href=&quot;https://www.puppet.com/&quot;&gt;Puppet&lt;/a&gt; we called a &lt;a href=&quot;https://github.com/puppetlabs/control-repo&quot;&gt;control repo&lt;/a&gt;. I still like to use this nomenclature.&lt;/p&gt;

&lt;p&gt;If we refer back to our layers, I generally like to ensure layers 1 and 2 live in this shared repo. Things get a &lt;em&gt;little&lt;/em&gt; trickier once we get to layer 3, the permissions layer. At this stage, we need to decide if the resource itself is shared or not. A good example of this is an IAM role that might be used for &lt;em&gt;human&lt;/em&gt; users instead of application users. This is generally going to be shared across multiple humans and teams, so it’s a good candidate for the control repo.&lt;/p&gt;

&lt;p&gt;Layer 4 really depends on your application architecture. If you have a message bus spanning multiple applications, putting it in your control repo probably makes sense, but if you have a database that is only used by a single application, you likely don’t want it in the control repo for a variety of reasons.&lt;/p&gt;

&lt;p&gt;Layer 5 again depends on your organisation’s permission model and cloud architecture. It’s not uncommon to have shared compute, like an ECS cluster or Kubernetes cluster, which spans many applications, so including it in an application repo probably isn’t going to make much sense. However, if you’re isolating compute on a per-application basis, you’re almost certainly going to want to make this application specific.&lt;/p&gt;

&lt;p&gt;Layer 6: As is likely becoming an obvious trend, you’ll need to take your application architecture and permission model into account. If you’re using a shared load balancer and routing traffic that way, you’ll likely want to include it in the control repo, but if you’re using a per-application load balancer you’ll want to include it in the application repo.&lt;/p&gt;

&lt;h3 id=&quot;application-repositories&quot;&gt;Application Repositories&lt;/h3&gt;

&lt;p&gt;At the very least, if you’re using IaC to deploy your application, having a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;deploy/&lt;/code&gt; directory in your application repo is a great starting point. If you use Pulumi and want to use the same language as your application to do your deployments, you might consider having all of your dependencies in a single &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;package.json&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;requirements.txt&lt;/code&gt; depending on your chosen language.&lt;/p&gt;

&lt;p&gt;You’ll need to think about the rate of change here when you’re defining projects to group resources together. Do you perhaps need to separate your database layer and your application layer resources? I’d argue that you do, because the rate of change of your application layer is likely going to be much higher than your database layer, but you’ll need to make a decision that makes sense for your organisation and project.&lt;/p&gt;

&lt;h3 id=&quot;why-do-this&quot;&gt;Why do this?&lt;/h3&gt;

&lt;p&gt;The primary reason for using both a mono-repo and keeping deployment code with applications comes down to &lt;em&gt;ownership&lt;/em&gt; and &lt;em&gt;orchestration&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Foundational infrastructure at layers 1, 2 and possibly up to layer 5 is an order of operations problem and a workflow orchestration problem. In most circumstances, you’ll be creating resources that depend on &lt;em&gt;other&lt;/em&gt; resources while building the IaC graph.&lt;/p&gt;

&lt;p&gt;By deciding to break the resources into different projects, you can create a workflow that allows you to deploy the resources in the correct order. You’ll be able to utilize Pulumi &lt;a href=&quot;https://www.pulumi.com/learn/building-with-pulumi/stack-references/&quot;&gt;stack references&lt;/a&gt; to share resources between stacks and projects, but you’ll need to ensure that a resource in a project in layer 2 that depends on a project in layer 1 has been created and resolved first.&lt;/p&gt;

&lt;p&gt;In a mono-repo, this is as simple as ensuring that the workflow or CI/CD tool runs the projects in the correct order, but in a multi-repo implementation, it becomes a complex orchestration problem that likely involves multi repo webhooks and a lot of duct tape.&lt;/p&gt;
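
&lt;p&gt;To make the ordering problem concrete, here is a minimal sketch of how a CI job in a mono-repo might work out the order in which to run projects. The project names are borrowed from the example control repo later in this post, but the dependencies between them are assumptions for illustration, and the sketch uses Python’s standard library rather than any Pulumi feature.&lt;/p&gt;

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each project maps to the projects it depends on.
PROJECT_DEPENDENCIES = {
    "vpc": set(),
    "vpn": {"vpc"},
    "cluster": {"vpc"},
    "shared_database": {"vpc"},
    "shared_example_app": {"cluster", "shared_database"},
}


def deployment_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return an order in which every project comes after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for project in deployment_order(PROJECT_DEPENDENCIES):
        # A CI job might run `pulumi up` in each project directory at this point.
        print(project)
```

&lt;p&gt;In a single repo the whole dependency map lives in one place, which is what keeps this simple. Spread across many repos, nothing has the full picture.&lt;/p&gt;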

&lt;p&gt;Application repos are far enough down the layering system that all of the infrastructure required to run your application will be in place. Placing application deployment infrastructure code in the application repo allows you to give the application developers full ownership of their code, from writing features to getting them into production.&lt;/p&gt;

&lt;h2 id=&quot;principle-3-encapsulation&quot;&gt;Principle 4: Encapsulation&lt;/h2&gt;

&lt;p&gt;Once you’ve made the foundational decisions above, you’ll be well on the way to structuring a well defined set of infrastructure as code patterns, but the final thing you’ll need to consider is how you’ll share resource patterns across your control repo and application repos.&lt;/p&gt;

&lt;p&gt;Every IaC tool has a different way of managing this. In Pulumi you can create a &lt;a href=&quot;https://www.pulumi.com/docs/concepts/resources/components/&quot;&gt;Component Resource&lt;/a&gt; for a single language, or if you want to support multiple languages, you might want to create a &lt;a href=&quot;https://www.pulumi.com/docs/using-pulumi/pulumi-packages/&quot;&gt;Pulumi Package&lt;/a&gt;. The reason for doing either is the same: you want to encapsulate a set of best practices that you can share across multiple projects.&lt;/p&gt;

&lt;p&gt;A good way to decide when to start encapsulating resources is to think about your organisational structure and application architecture. If you’re only one team deploying a single application, you might not need to go down the path of encapsulating anything. If you’re a platform team that’s likely to support dozens of teams deploying to a shared layer 5 compute resource, creating a Pulumi package that encapsulates the best practices for deploying your application, or a package for a best practice object storage bucket which has the required permissions, is going to save the teams you’re supporting a &lt;em&gt;lot&lt;/em&gt; of time.&lt;/p&gt;

&lt;p&gt;These encapsulations should be in their own, &lt;em&gt;distinct&lt;/em&gt; repository. You’ll want to version these encapsulations in the same way you version and release your applications - follow semver and make sure you create an API that your downstream users can use.&lt;/p&gt;

&lt;p&gt;As your downstream users start to depend on these encapsulations, you can introduce concepts like &lt;a href=&quot;https://www.pulumi.com/docs/using-pulumi/testing/unit/&quot;&gt;unit testing&lt;/a&gt; to make sure you &lt;a href=&quot;https://lkml.org/lkml/2012/12/23/75&quot;&gt;don’t break userspace&lt;/a&gt; with your infrastructure.&lt;/p&gt;

&lt;h3 id=&quot;pitfalls&quot;&gt;Pitfalls&lt;/h3&gt;

&lt;p&gt;A common mistake I see at the encapsulation layer when adopting Pulumi is trying to avoid object orientated principles and using what I like to call the “function based approach”.&lt;/p&gt;

&lt;p&gt;As an example of this, you might try and encapsulate some resources into a function. In TypeScript it’d look like this:&lt;/p&gt;

&lt;div class=&quot;language-typescript highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;export&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;createBucket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;aws&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;s3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Bucket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;and in Python like so:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;create_bucket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aws&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Bucket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aws&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Bucket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The problem with this implementation of an abstraction is that it creates a nested mechanism that is difficult to manage successfully.&lt;/p&gt;

&lt;p&gt;If you use a component, you get an abstraction mechanism that is much more native to the way the language works. In TypeScript, it looks like this:&lt;/p&gt;

&lt;div class=&quot;language-typescript highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;export&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;Bucket&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;pulumi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ComponentResource&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;readonly&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;bucket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;aws&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;s3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Bucket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;kd&quot;&gt;constructor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;BucketArgs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;opts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;?:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;pulumi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ComponentResourceOptions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;super&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;lbrlabs:index:Bucket&lt;/span&gt;&lt;span class=&quot;dl&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{},&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;opts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;bucket&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;aws&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;s3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Bucket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;parent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;and in Python:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Bucket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pulumi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ComponentResource&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BucketArgs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Optional&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pulumi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ResourceOptions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;super&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;lbrlabs:index:Bucket&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{},&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bucket&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aws&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Bucket&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opts&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pulumi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ResourceOptions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;While the instantiation of the resource is more complex, the manageability of this over time is exponentially easier. Trust me, I’ve untangled this mess before.&lt;/p&gt;

&lt;h1 id=&quot;putting-it-together&quot;&gt;Putting it together&lt;/h1&gt;

&lt;p&gt;An example is worth a thousand words, so let’s take a look at a hypothetical control repo and application repo.&lt;/p&gt;

&lt;h2 id=&quot;control-repo&quot;&gt;Control Repo&lt;/h2&gt;

&lt;p&gt;Let’s say we’re going to be super original and call our repo &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;infrastructure&lt;/code&gt;. Here’s how that might look:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;├── certs
│   ├── Pulumi.development.yaml
│   ├── Pulumi.production.yaml
│   ├── Pulumi.yaml
│   ├── __main__.py
│   ├── requirements.txt
│   └── venv
├── cluster
│   ├── Pulumi.development.yaml
│   ├── Pulumi.production.yaml
│   ├── Pulumi.yaml
│   ├── README.md
│   ├── __main__.py
│   ├── requirements.txt
│   └── venv
├── shared_database
│   ├── Pulumi.development.yaml
│   ├── Pulumi.production.yaml
│   ├── Pulumi.yaml
│   ├── __main__.py
│   ├── components
│   ├── requirements.txt
│   └── venv
├── domains
│   ├── Pulumi.development.yaml
│   ├── Pulumi.production.yaml
│   ├── Pulumi.yaml
│   ├── __main__.py
│   ├── requirements.txt
│   └── venv
├── cache
│   ├── Pulumi.development.yaml
│   ├── Pulumi.production.yaml
│   ├── Pulumi.yaml
│   ├── __main__.py
│   ├── requirements.txt
│   └── venv
├── shared_example_app
│   ├── Pulumi.development.yaml
│   ├── Pulumi.production.yaml
│   ├── Pulumi.yaml
│   ├── README.md
│   ├── __main__.py
│   ├── productionapp.py
│   ├── requirements.txt
│   └── venv
├── shared_bucket
│   ├── Pulumi.development.yaml
│   ├── Pulumi.production.yaml
│   ├── Pulumi.yaml
│   ├── __main__.py
│   ├── requirements.txt
│   └── venv
├── vpc
│   ├── Pulumi.development.yaml
│   ├── Pulumi.production.yaml
│   ├── Pulumi.yaml
│   ├── __main__.py
│   ├── requirements.txt
│   └── venv
└── vpn
    ├── Pulumi.development.yaml
    ├── Pulumi.production.yaml
    ├── Pulumi.yaml
    ├── __main__.py
    ├── requirements.txt
    └── venv
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can see here that we’re using &lt;a href=&quot;https://www.pulumi.com/docs/concepts/stack/&quot;&gt;Pulumi stacks&lt;/a&gt; to target differing environments (in this case, development and production), and creating a new project for different layers and resources.&lt;/p&gt;

&lt;p&gt;You’ll likely also notice that I’ve been quite liberal with my use of directories for each set of services. I’m not grouping all of the network/layer 2 resources into a single project, however I’m following the layering principle by not grouping any resources from different layers into the same project.&lt;/p&gt;

&lt;p&gt;You can definitely reduce the number of projects here (for example, you might choose to group the VPC and VPN projects together in a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;network&lt;/code&gt; project) but I generally find that projects/directories are “free” and reducing the blast radius of changes makes people feel comfortable about contributing to these shared elements.&lt;/p&gt;

&lt;h2 id=&quot;application-repo&quot;&gt;Application Repo&lt;/h2&gt;

&lt;p&gt;Once we get to our application repo, it’s a lot harder to be prescriptive, but let’s say we have a simple Go application called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;example-app&lt;/code&gt;. Here’s how that might look:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;.
├── Dockerfile
├── Makefile
├── docker-compose.yml
├── deploy
│   ├── Pulumi.development.yaml
│   ├── Pulumi.production.yaml
│   ├── Pulumi.yaml
│   └── main.go
├── go.mod
├── go.sum
├── main.go
├── readme.md
└── README.md
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Hopefully this is fairly self-explanatory: you’ve got your application and mechanisms for local development with a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Dockerfile&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Makefile&lt;/code&gt;, and we can put our Pulumi code in a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;deploy/&lt;/code&gt; directory.&lt;/p&gt;

&lt;h2 id=&quot;encapsulation-repo&quot;&gt;Encapsulation Repo&lt;/h2&gt;

&lt;p&gt;Finally, let’s take a look at an example encapsulation repo. These repos can be quite complex, so as an example, take a look at this Pulumi package which encapsulates some layer 5 compute &lt;a href=&quot;https://github.com/lbrlabs/pulumi-lbrlabs-eks&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;

&lt;p&gt;We’ve covered a lot of ground here, but hopefully this has given you some ideas on how you might want to structure your infrastructure as code, especially if you’re using Pulumi. As with all my content, I’m always open to hearing better ideas!&lt;/p&gt;

</description>
                    <pubDate>Thu, 17 Aug 2023 00:00:00 +0000</pubDate>
                    <link>https://leebriggs.co.uk/blog/2023/08/17/structuring-iac.html</link>
                    <guid isPermaLink="true">https://leebriggs.co.uk/blog/2023/08/17/structuring-iac.html</guid>
                </item>
            
		
            
                <item>
                    <title>Authenticating to AWS the right way for (almost) every use-case</title>
                    
                        <description>&lt;p&gt;One of my &lt;em&gt;favourite&lt;/em&gt; things about AWS is their ability to make the wrong decision easy and the right decision hard. Our world is moving towards managed services and “off the shelf” experiences for provisioning infrastructure and deploying applications. AWS itself has gotten the memo with its introduction of services like &lt;a href=&quot;https://aws.amazon.com/proton/&quot;&gt;AWS Proton&lt;/a&gt;, however its real value proposition is in its ability to provide &lt;em&gt;building blocks&lt;/em&gt; for the infrastructure needs of today.&lt;/p&gt;

&lt;p&gt;It largely does a good job of providing these building blocks, but where it really falls down is in the “reasonable defaults” category with most services. Nowhere is this more obvious than in its authentication strategy.&lt;/p&gt;

&lt;h1 id=&quot;authentication-vs-authorization&quot;&gt;Authentication vs Authorization&lt;/h1&gt;

&lt;p&gt;Before I go off into a rant about why AWS authentication is broken by default, I’d like to talk about where &lt;a href=&quot;https://aws.amazon.com/iam/&quot;&gt;AWS Identity and Access Management’s&lt;/a&gt; responsibilities lie. The same service, &lt;em&gt;IAM&lt;/em&gt;, is largely responsible for both authentication and authorization within AWS. Let’s define those before we explain IAM’s role.&lt;/p&gt;

&lt;p&gt;Authentication is the mechanism by which you tell a service &lt;em&gt;who you are&lt;/em&gt;. The most common form of authentication is a username and password, but there are many other ways of providing authentication like &lt;a href=&quot;https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html&quot;&gt;AWS access keys and secret keys&lt;/a&gt;. You send your credentials over to a service and it verifies those credentials are correct. As an aside here, one of the biggest problems with the internet is that credentials are designed to verify who you are but don’t actually verify that the person holding the credentials is the person who’s supposed to hold them. &lt;a href=&quot;https://en.wikipedia.org/wiki/Multi-factor_authentication&quot;&gt;Multi-factor authentication&lt;/a&gt; is designed to help with this, but there’s still lots of room for improvement.&lt;/p&gt;

&lt;p&gt;Authorization is the way the service knows &lt;em&gt;what you’re allowed to do&lt;/em&gt;. An AWS IAM policy is a document that gives you the ability to dictate which actions someone or something is allowed to perform once they have authenticated to AWS.&lt;/p&gt;

&lt;p&gt;I have a lot of time for AWS authorization via IAM policies because they’re extremely verbose and operate on a whitelisting basis, meaning by default you can’t really do anything at all.&lt;/p&gt;
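
&lt;p&gt;To show what “whitelisting” means in practice, here is a minimal sketch of a policy document and a toy evaluator. The bucket name is hypothetical, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;is_allowed&lt;/code&gt; is an illustration of the deny-by-default idea only. Real IAM evaluation is considerably more involved (explicit denies, conditions, multiple policy types), so treat this as a sketch rather than how AWS implements it.&lt;/p&gt;

```python
import json
from fnmatch import fnmatchcase

# Hypothetical policy: allows reading objects from one example bucket only.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": ["arn:aws:s3:::example-bucket/*"],
        }
    ],
}


def is_allowed(policy: dict, action: str, resource: str) -> bool:
    """Toy evaluator: deny unless an Allow statement matches both action and resource."""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        action_matches = any(fnmatchcase(action, p) for p in statement["Action"])
        resource_matches = any(fnmatchcase(resource, p) for p in statement["Resource"])
        if action_matches and resource_matches:
            return True
    # Nothing matched, so the request is denied by default.
    return False


if __name__ == "__main__":
    print(json.dumps(POLICY, indent=2))
    print(is_allowed(POLICY, "s3:GetObject", "arn:aws:s3:::example-bucket/report.csv"))
    print(is_allowed(POLICY, "s3:DeleteObject", "arn:aws:s3:::example-bucket/report.csv"))
```

&lt;p&gt;The important part is the final &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;return False&lt;/code&gt;: anything you haven’t explicitly allowed is refused.&lt;/p&gt;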

&lt;p&gt;However, the “default” mechanism for providing access to AWS is absolutely shit. Here’s why.&lt;/p&gt;

&lt;h1 id=&quot;the-aws-credential-problem&quot;&gt;The AWS Credential Problem&lt;/h1&gt;

&lt;p&gt;If you’re a user of AWS, you probably have an AWS Access Key or AWS Secret Key stored on your machine right now. Go on, take a look in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;${HOME}/.aws/credentials&lt;/code&gt; and have a look if you have them. Do you? That’s bad.&lt;/p&gt;

&lt;p&gt;If you don’t have AWS credentials on your personal laptop, bust open your company’s flagship application and do a quick search for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AWS_ACCESS_KEY&lt;/code&gt;. Do you see anything? That’s really fucking bad.&lt;/p&gt;

&lt;p&gt;If you answered yes to either of these, it’s best to keep reading. If you answered no, keep reading anyway.&lt;/p&gt;

&lt;h2 id=&quot;credential-rotation&quot;&gt;Credential Rotation&lt;/h2&gt;

&lt;p&gt;The &lt;em&gt;reason&lt;/em&gt; it’s bad is simple: static credentials are very easy to steal or extract for an attacker. There are lots of ways this can happen: you could leave your laptop on a train (if you live in a civilized country with access to public transit). You could accidentally check them into version control on the public internet. You might even accidentally blurt them out to someone while in a bar instead of your phone number if you’ve memorized them for some reason. However you choose to tell the world about your authentication keys for AWS, the impact is the same: those keys then need to be manually revoked by you because they have an &lt;em&gt;unlimited lifespan&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The general solution to this problem is to regularly rotate your IAM credentials. AWS tries to do its best to make you do this by shaming you on the AWS Users page:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/img/aws-creds.png&quot; alt=&quot;AWS Credentials Expired&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The problem with rotating these credentials is that it’s such a bloody &lt;a href=&quot;https://en.wiktionary.org/wiki/faff&quot;&gt;faff&lt;/a&gt;. You have to update the keys wherever they’re used and stuff might break when you’re doing it and I really just can’t be arsed, what’s the worst that could happen?&lt;/p&gt;

&lt;p&gt;Well, the answer is that it could cost you a lot of your money or your company’s money, as this lovely Reddit search of the AWS subreddit shows:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/img/aws-reddit-hacked.png&quot; alt=&quot;Hacked AWS Accounts&quot; /&gt;&lt;/p&gt;

&lt;p&gt;If you head into Reddit’s cesspit for a second and read one of those threads, you’ll see the advice to “make sure you have MFA turned on!” which is very good advice. The problem is, AWS access keys and secret keys don’t &lt;em&gt;need&lt;/em&gt; MFA to be effective. If someone has your keys, they can do a whole bunch of stuff, including change your password to lock you out. That’s not good, is it?&lt;/p&gt;

&lt;h1 id=&quot;temporary-credentials-to-the-rescue&quot;&gt;Temporary Credentials to the rescue&lt;/h1&gt;

&lt;p&gt;The obvious solution to this is to just not use AWS Access Keys or Secret Keys, but you still need to authenticate to the AWS API to tell it who you are! Luckily, AWS has a solution for this in the form of &lt;a href=&quot;https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html&quot;&gt;Temporary Credentials&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;session-tokens&quot;&gt;Session Tokens&lt;/h2&gt;

&lt;p&gt;AWS can provide you with an AWS Access Key and an AWS Secret Key via &lt;a href=&quot;https://docs.aws.amazon.com/STS/latest/APIReference/welcome.html&quot;&gt;AWS STS&lt;/a&gt;, its Security Token Service. Credentials generated via STS are very similar to the keys you see in the AWS IAM User interface, but they also come with an AWS Session Token and are only valid for a &lt;em&gt;duration&lt;/em&gt; set when they’re issued. Once that duration has expired, the issued credentials become inactive and can’t be used, so you now need to generate some more.&lt;/p&gt;
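
&lt;p&gt;As a rough sketch of what that looks like on disk, a set of temporary credentials in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;${HOME}/.aws/credentials&lt;/code&gt; has three values instead of two. The profile name here is made up and the values are placeholders, so treat it as an illustration rather than something to copy:&lt;/p&gt;

&lt;div class=&quot;language-ini highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# hypothetical profile name; every value below is a placeholder&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;[temporary-example]&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;aws_access_key_id&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;lt;access-key-id&amp;gt;&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;aws_secret_access_key&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;lt;secret-access-key&amp;gt;&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# the extra value that only temporary credentials carry&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;aws_session_token&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;lt;session-token&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;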

&lt;h2 id=&quot;hang-on&quot;&gt;Hang on…&lt;/h2&gt;

&lt;p&gt;Yes, that’s right. You’ll have AWS credentials you have to &lt;em&gt;constantly renew&lt;/em&gt; in order for them to be effective. You also need to have already authenticated with AWS to retrieve temporary credentials! If this sounds like a lot of work with a very limited amount of benefit, read on.&lt;/p&gt;

&lt;h1 id=&quot;back-to-authentication&quot;&gt;Back to Authentication&lt;/h1&gt;

&lt;p&gt;I started off making the point that AWS makes the &lt;em&gt;right&lt;/em&gt; thing hard to do, and in the case of temporary credentials, the difficulty comes in knowing what the hell the right thing to do is. AWS makes it really fucking easy to generate long-lived AWS credentials. In a lot of ways, the &lt;em&gt;default&lt;/em&gt; for AWS is to use AWS IAM Users. It doesn’t tell you about the alternatives to IAM users, when to use them or how to implement them.&lt;/p&gt;

&lt;p&gt;While it may or may not be clear to you that temporary credentials provide a huge security benefit, what may not be clear at this stage is how the hell you actually generate them.&lt;/p&gt;

&lt;p&gt;The answer to that largely depends on your current use case. Generating temporary credentials is different depending on whether you’re a human, machine or application. So now I’ve spent the better part of this post explaining things you probably don’t care about, like an online recipe that makes you scroll for 10 minutes, let’s actually talk about the different strategies for authenticating to AWS.&lt;/p&gt;

&lt;h1 id=&quot;i-want-to&quot;&gt;I want to..&lt;/h1&gt;

&lt;p&gt;From here, we’re going to detail the scenarios in the form of user stories, because everyone loves those in their JIRA tickets don’t they?&lt;/p&gt;

&lt;h2 id=&quot;authenticate-to-aws-as-a-human-user&quot;&gt;Authenticate to AWS as a Human User&lt;/h2&gt;

&lt;p&gt;As a human user, the temptation is there to generate AWS keys via the IAM user interface. That’s wrong. You should be using &lt;a href=&quot;https://aws.amazon.com/iam/identity-center/&quot;&gt;AWS IAM Identity Center&lt;/a&gt; aka the artist formerly known as AWS SSO.&lt;/p&gt;

&lt;p&gt;AWS SSO allows you to either use AWS as an identity provider, or hook in your own identity provider like &lt;a href=&quot;https://www.okta.com/&quot;&gt;Okta&lt;/a&gt;, &lt;a href=&quot;https://auth0.com/&quot;&gt;Auth0&lt;/a&gt; or even &lt;a href=&quot;https://support.google.com/a/answer/60224&quot;&gt;Google&lt;/a&gt;. Any service that provides services as an &lt;a href=&quot;https://en.wikipedia.org/wiki/Identity_provider&quot;&gt;Identity Provider&lt;/a&gt; (or IdP) can be hooked into AWS IAM Identity Center.&lt;/p&gt;

&lt;p&gt;Once you’ve hooked in your IdP or enabled AWS’s IdP, you then use the &lt;a href=&quot;https://aws.amazon.com/cli/&quot;&gt;AWS CLI&lt;/a&gt; to authenticate via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;aws sso login&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;You’ll need to configure your &lt;a href=&quot;https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html#cli-configure-files-where&quot;&gt;AWS configuration&lt;/a&gt; file a little bit, mainly to tell it what AWS account and role you want when you login:&lt;/p&gt;

&lt;div class=&quot;language-ini highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nn&quot;&gt;[profile personal-management]&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;sso_start_url&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;https://lbrlabs.awsapps.com/start&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;sso_region&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;us-west-2&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;sso_account_id&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;lt;account-id&amp;gt;&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;sso_role_name&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;AWSAdministratorAccess&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;region&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;us-west-2&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;output&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Once you run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;aws sso login&lt;/code&gt; for that profile, the AWS CLI will walk you through the authentication flow, verify who you are and then issue temporary credentials for you!&lt;/p&gt;

&lt;p&gt;These temporary credentials get stored inside &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/.aws/sso/cache&lt;/code&gt; and will expire within a duration you specify at the AWS IAM Identity Center level. Generally you will have to reauthenticate once a day via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;aws sso login&lt;/code&gt;, but that’s good because it means if anyone steals your temporary credentials, they will automatically expire after a certain time period.&lt;/p&gt;

&lt;p&gt;Once you’re authenticated via SSO, you can also generate new temporary credentials using this &lt;a href=&quot;https://github.com/jaxxstorm/aws-sso-creds&quot;&gt;handy little tool&lt;/a&gt; I wrote once upon a time.&lt;/p&gt;

&lt;h2 id=&quot;authenticate-to-aws-as-an-ec2-instance&quot;&gt;Authenticate to AWS as an EC2 Instance&lt;/h2&gt;

&lt;p&gt;Hopefully this one is straightforward, but if you’re running workloads inside anything that uses an &lt;a href=&quot;https://aws.amazon.com/ec2/&quot;&gt;EC2&lt;/a&gt; Instance, whether that be an unmanaged EC2 instance like EC2 itself, or via a managed service like &lt;a href=&quot;https://aws.amazon.com/fargate/&quot;&gt;Fargate&lt;/a&gt;, you should be assigning an &lt;a href=&quot;https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html&quot;&gt;IAM role&lt;/a&gt; and possibly an &lt;a href=&quot;https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html&quot;&gt;Instance Profile&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Assigning a role to your AWS workload will automatically mean credentials get refreshed for you on a time period. You no longer have hard coded AWS credentials stored in plaintext that can be extracted and used elsewhere. Easy!&lt;/p&gt;

&lt;h2 id=&quot;authenticate-to-aws-as-an-application-that-only-managed-content-in-an-s3-bucket&quot;&gt;Authenticate to AWS as an application that only manages content in an S3 bucket&lt;/h2&gt;

&lt;p&gt;This one’s a little unique. Let’s say you have an application that uses object storage, and you want to keep those files private from the internet, but you want people to be able to manage files. A good example? Something that allows you to upload a profile photo might fit the bill.&lt;/p&gt;

&lt;p&gt;You probably don’t want to generate AWS credentials for each user session, and you &lt;em&gt;definitely&lt;/em&gt; don’t want to allow anyone to mess with your S3 bucket (although, lots of people do that. They usually make the news when they do..).&lt;/p&gt;

&lt;p&gt;So AWS obviously came up with a solution to this! &lt;a href=&quot;https://docs.aws.amazon.com/AmazonS3/latest/userguide/ShareObjectPreSignedURL.html&quot;&gt;Presigned URLs&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;Presigned URLs generate a unique link to an object within an S3 bucket that expires. There are no temporary credentials, but the URL itself has credentials embedded in it, like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;https://thisisntarealbucket.s3.eu-west-2.amazonaws.com/image.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&amp;amp;X-Amz-Credential=&amp;lt;AWS_ACCESS_KEY&amp;gt;%2F20180210%2Feu-west-2%2Fs3%2Faws4_request&amp;amp;X-Amz-Date=202000905T171315Z&amp;amp;X-Amz-Expires=1800&amp;amp;X-Amz-Signature=12b74b0794aa147ad7d3d03b3f20c61f1f91cc9ad8873e3314255dc479a25351&amp;amp;X-Amz-SignedHeaders=host
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This has a limited number of use cases, but is &lt;em&gt;incredibly&lt;/em&gt; important for the use cases where it comes up. If you think you might need to let users upload or manage objects in a bucket, go with this option where you can.&lt;/p&gt;

&lt;h2 id=&quot;authenticate-to-aws-as-a-cicd-pipeline&quot;&gt;Authenticate to AWS as a CI/CD Pipeline&lt;/h2&gt;

&lt;p&gt;As with human users mentioned earlier, it can be incredibly tempting to just generate an IAM user and an Access Key and Secret Key and stick them in the secrets mechanism your CI/CD tool has.&lt;/p&gt;

&lt;p&gt;There is another way!&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_create_oidc.html&quot;&gt;OIDC Providers&lt;/a&gt; give you the capability to authenticate to AWS via the &lt;a href=&quot;https://openid.net/connect/&quot;&gt;OpenID Connect protocol&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I wrote about this &lt;a href=&quot;/blog/2022/01/23/gha-cloud-credentials.html&quot;&gt;extensively&lt;/a&gt; if you’re using &lt;a href=&quot;https://github.com/features/actions&quot;&gt;GitHub Actions&lt;/a&gt; and AWS (as well as other cloud providers), but the reality is that &lt;em&gt;lots&lt;/em&gt; of CI/CD tools, like &lt;a href=&quot;https://docs.gitlab.com/ee/ci/cloud_services/aws/&quot;&gt;GitLab&lt;/a&gt; and &lt;a href=&quot;https://circleci.com/docs/openid-connect-tokens&quot;&gt;CircleCI&lt;/a&gt;, support OIDC too.&lt;/p&gt;
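
&lt;p&gt;How the token reaches your job varies by tool, so take this as a minimal sketch rather than a recipe. Assuming your CI/CD tool writes its OIDC token to a file, an AWS configuration profile along these lines lets the AWS CLI and SDKs exchange that token for temporary credentials. The profile name, token path and session name are all hypothetical:&lt;/p&gt;

&lt;div class=&quot;language-ini highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# hypothetical profile; the role must trust your CI/CD tool’s OIDC provider&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;[profile ci]&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;role_arn&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;arn:aws:iam::&amp;lt;account-id&amp;gt;:role/&amp;lt;role-name&amp;gt;&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# placeholder path; wherever your tool writes the OIDC token&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;web_identity_token_file&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;/path/to/oidc-token&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;role_session_name&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;ci-pipeline&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;region&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;us-west-2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;No Access Key or Secret Key is stored anywhere in the pipeline; the only thing on disk is a short-lived token issued by the CI/CD tool itself.&lt;/p&gt;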

&lt;p&gt;If your CI/CD tool doesn’t support OIDC, you should change CI/CD tools. There’s shitloads of them, choose one that does things properly, will you?&lt;/p&gt;

&lt;h2 id=&quot;authenticate-to-aws-as-compute-i-manage-that-isnt-running-inside-aws&quot;&gt;Authenticate to AWS as compute I manage that isn’t running inside AWS&lt;/h2&gt;

&lt;p&gt;The final authentication strategy is one I never thought I’d see materialize. In this scenario, perhaps you’re running compute in another cloud provider or in an on-premises datacenter. If you have the ability to run an operating system daemon, you can use the imaginatively named &lt;a href=&quot;https://docs.aws.amazon.com/rolesanywhere/latest/userguide/introduction.html&quot;&gt;IAM Roles Anywhere&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;IAM Roles Anywhere expects you to provision a valid TLS (&lt;a href=&quot;https://en.wikipedia.org/wiki/X.509&quot;&gt;X.509&lt;/a&gt;) certificate and create a trust (via a trust anchor) in AWS.&lt;/p&gt;

&lt;p&gt;Once AWS trusts your certificate authority, you can then use a &lt;a href=&quot;https://docs.aws.amazon.com/rolesanywhere/latest/userguide/credential-helper.html&quot;&gt;credential process&lt;/a&gt; to issue some temporary credentials. Once upon a time, none of this was possible and you’d have to manually mess around rotating credentials, but now you get to mess around with Certificate Authorities instead. Is this better? Well, for the purposes of this blog post, let’s say it is.&lt;/p&gt;
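
&lt;p&gt;For a rough idea of how that hangs together, here’s a sketch of an AWS configuration profile that calls the credential helper. It assumes the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;aws_signing_helper&lt;/code&gt; binary is on your path; the profile name, file paths and ARNs are placeholders you’d swap for your own:&lt;/p&gt;

&lt;div class=&quot;language-ini highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# hypothetical profile; certificate paths and ARNs are placeholders&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;[profile on-prem]&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;credential_process&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;aws_signing_helper credential-process --certificate /path/to/certificate.pem --private-key /path/to/private-key.pem --trust-anchor-arn &amp;lt;trust-anchor-arn&amp;gt; --profile-arn &amp;lt;profile-arn&amp;gt; --role-arn &amp;lt;role-arn&amp;gt;&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;region&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;us-west-2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Anything that reads that profile runs the helper whenever it needs credentials, so the certificate and private key are the only long-lived secrets left on the machine.&lt;/p&gt;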

&lt;h1 id=&quot;wrap-up&quot;&gt;Wrap Up&lt;/h1&gt;

&lt;p&gt;I think I’ve covered most of the needs here and given you, at the very least, something to Google when you’re in a certain situation and need some AWS credentials. If you think I missed an option, let me know on &lt;a href=&quot;https://twitter.com/briggsl&quot;&gt;Twitter&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you need help implementing any of these strategies, why not &lt;a href=&quot;https://leebriggs.co.uk/contact?utm=awsauth&quot;&gt;Get in Touch!&lt;/a&gt;&lt;/p&gt;

</description>
                    <pubDate>Mon, 05 Sep 2022 00:00:00 +0000</pubDate>
                    <link>https://leebriggs.co.uk/blog/2022/09/05/authenticating-to-aws-the-right-way.html</link>
                    <guid isPermaLink="true">https://leebriggs.co.uk/blog/2022/09/05/authenticating-to-aws-the-right-way.html</guid>
                </item>
            
		
	</channel>
</rss>
