One of the first tools I came across when I started out in the IT industry was SmokePing. It’s been around for years and solves the important job of graphing latency between two points in a reasonable way. As a company grows and scales out into multiple datacenters, latency can affect the operation of software, so having it graphed makes a lot of sense.

I was surprised that no alternative to SmokePing has been developed in the years since it was conceived. This is probably a testament to how well it works, but in my case, I already had a kick-ass Graphite installation (with a Grafana frontend, obviously) and I wanted to get my latency metrics in there, rather than having to support RRDtool and install a Perl app.

So, I set about reinventing the wheel. Something on my radar was to get my head around Go and this seemed perfect for the task because:

  • It’s fast
  • You can build binaries with it super easily
  • It has concurrency built in

The last point was a big consideration, because pinging lots of endpoints consistently like SmokePing would be much easier if it’s trivial to launch concurrent operations. Go’s goroutines make this very easy.
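
Graphping does this fan-out with goroutines. As a rough illustration of the same idea outside Go, here’s a minimal Python sketch that times a probe against several targets at once using a thread pool. The `probe` callable and both function names are hypothetical and only there for illustration; this is not Graphping’s code.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def measure(probe, target):
    """Time one probe call and return (target, elapsed milliseconds)."""
    started = time.perf_counter()
    probe(target)
    return target, (time.perf_counter() - started) * 1000.0


def measure_all(probe, targets):
    """Probe every target concurrently, one worker thread per target."""
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        return dict(pool.map(lambda t: measure(probe, t), targets))
```

Passing the probe in as a function keeps the timing logic separate from how the ping is actually performed, so a real ICMP or TCP probe can be swapped in later.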

Graphping is Born

So, with all this in mind, a colleague and I wrote Graphping. You can see the source code here.

In order to run it, you need to specify a config file, and an address for a statsd server. Making use of statsd means you can write metrics to any of the available statsd backends which allows you to use your existing monitoring infrastructure.
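
To make the statsd side a bit more concrete, here’s a minimal Python sketch of how a latency sample could be turned into a statsd timer and sent over UDP. Only the `name:value|ms` timer format and the default port 8125 come from statsd itself; the dotted `prefix.group.target` naming and the function names are assumptions for illustration, not necessarily what Graphping does.

```python
import socket


def metric_name(global_prefix, group_prefix, target):
    """Join the prefixes and target name into a dotted statsd metric name."""
    # Assumed naming scheme; empty parts are skipped.
    return ".".join(part for part in (global_prefix, group_prefix, target) if part)


def timing_line(name, millis):
    """Format one latency sample as a statsd timer ("<name>:<value>|ms")."""
    return f"{name}:{millis:.2f}|ms"


def send(line, host="127.0.0.1", port=8125):
    """Fire-and-forget a single metric line to a statsd server over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


line = timing_line(metric_name("graphping", "search", "google"), 12.3456)
```

Because statsd speaks plain UDP, the sender never blocks on the monitoring backend, which suits a tool that pings on a tight interval.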

The config file makes use of HCL which means you can either write a human readable config file, or use a machine generated JSON config file. An example config file looks like this:

interval = 10 # A global interval. Can be overridden per target group
prefix = "graphping" # A global prefix for statsd metrics

# Declare a target group with a name
target_group "search_engines" {
  # a custom ping interval for this group
  interval = 2
  # A prefix for the statsd metric for this group
  prefix = "search"
  # A name for the target. This becomes the statsd metric
  target "google" {
    address = ""
  }
  target "bing" {
    address = ""
  }
}

# You can specify multiple target groups
target_group "news_sites" {
  prefix = "uk"
  target "bbc" {
    address = ""
  }
}

This all comes together to allow you to create graphs very similar to SmokePing. Here’s an example:

This is only my second project in Go, so there might be some issues or bugs and the code quality might not be fantastic. Hopefully as time goes on, further improvements will come.

Every company that uses Puppet eventually gets to the stage in their development where they want to store “secrets” within Puppet. Usually (hopefully!) your Puppet manifests and data will be stored in version control in plaintext and therefore adding these secrets to your manifests has some clear security concerns which need to be addressed.

You could just restrict the data to a few select people, and have it in a separate control repo, but at the end of the day, your secrets will still be in plaintext and you’re at the mercy of your version control ACLs.

Fortunately, a bunch of very smart people came across this problem a while ago and gave us the solutions we need to be able to solve the problem.

hiera-eyaml has been around a while now and gives you the capability to encrypt secrets stored in hiera. It provides an eyaml command line tool to make use of this, and will encrypt values for you using a pluggable backend. By default, it uses asymmetric encryption (PKCS#7) and will make any value indecipherable to anyone who doesn’t have the key. You can see the example in the linked GitHub repo, but for completeness’ sake, it looks like this:

plain-property: You can see me

encrypted-property: >

In order to see the encrypted-property, you need to have access to the preshared key you used to encrypt the value, which means you have to copy the pre-shared key to your master. This is fine if you’re a single user managing a small number of Puppetmasters, but as your team scales this actually introduces a security consideration.

How do you pass the pre-shared key around? The more people that touch that key, the less secure it becomes. Distributing it to 20-odd people means that if a single user’s laptop is compromised, all your secrets will be under threat. Fortunately, there’s a better way of managing this, facilitated by the plugin system hiera-eyaml supports: hiera-eyaml-gpg.

Using GPG Keys

The problem with hiera-eyaml-gpg is that the documentation only shows you how to set up hiera-eyaml-gpg, but you then have to go off and do a bunch of reading about how GPG keys work. If you already know how GPG keys work, skip ahead, this isn’t for you! If you don’t, let’s cover quickly how GPG keys work, and how this helps us solve the single key problem above.

Quick Overview

In a nutshell, GPG is a hybrid public and private key encryption system. In a bullet point format:

  • Each user or entity has a public and private key pair.
  • Public keys are used for encryption, private keys are used for decryption. Messages can be signed by encrypting a hash of the message using the sender’s private key, allowing the receiver to verify the integrity of the data by using the sender’s public key to decrypt the hash.
  • Private keys need to be kept secure by the owner.
  • Public keys need to be transferred reliably such that they cannot be altered, and substitute keys inserted.
  • You can encrypt data for multiple recipients, by using all of their public keys together. Any of them will be able to decrypt the data using their own private key.
  • A user can add a set of other users’ public keys to their GPG public keyring. Users create a “web of trust” by validating and signing the keys in their keyrings.

Thinking of this from a Puppet perspective:

  • Each puppetmaster (or sets of puppetmasters) will have their own public and private GPG key pair. The private keys will be kept local on the puppetmasters and should not be transferred anywhere else.
  • Each user that will be editing the secure data within puppet will also have a public and private key pair. They will keep their private key secure and private to themselves.
  • Each user will need to have all of the public keys of all puppetmasters, along with all other eyaml users, added to their own public keyring.
  • When a new user or new puppetmaster is added or a key is changed, all users will need to update their keyrings with the new public keys. Additionally, all encrypted data in hiera will need to be re-encrypted so that the new puppetmasters and users are able to decrypt the encrypted data.
  • If a puppetmaster gets compromised, or a user leaves the company, only the key for that puppetmaster (or set of puppetmasters) or user needs to be removed from the keyrings, and the data re-encrypted without it. None of the other puppetmaster or user keys need to be updated.

As you can see, this drastically improves the security of your important data stored in hiera. With that in mind, let’s get started.

Generate a GPG Key

There are plenty of docs out there to explain how to generate a GPG key for each OS. In a short form, you should do this:

gpg --gen-key

You’ll get a handy menu prompt that will help you generate a key. SET A PASSPHRASE. Having a blank passphrase will compromise the whole web of trust for your encrypted data.

Generate a GPG Key for your Puppetmaster

Because GPG operates on the concept of each user using different keys, you’ll now need to generate a key for your Puppetmaster.

If you’re lucky, you can just use the above command and have done with it. In order to be more specific, here’s the way I know works to generate keys:

# Use a reasonable directory for gpghome
mkdir -m 0700 /etc/puppet/.gpghome
chown puppet:puppet /etc/puppet/.gpghome

Now, the GPG key we generate for the puppetmasters needs some special attributes, so we’ll need a custom batch config file at /etc/puppet/.gpghome/keygen.inp. Make sure you replace _keyname_ with something useful, like maybe puppetmaster.

%echo Generating a default key
Key-Type: default
Subkey-Type: default
Name-Real: _keyname_
Expire-Date: 0
%echo done

Now, generate the key:

sudo -u puppet gpg --homedir /etc/puppet/.gpghome --gen-key --batch /etc/puppet/.gpghome/keygen.inp

Now that’s done, you should see a GPG key in your puppetmaster’s keyring:

sudo -u puppet gpg --homedir /etc/puppet/.gpghome --list-keys
gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0  valid:   1  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 1u
pub   2048R/XXXXXXXX 2016-11-14
uid                  puppetmaster
sub   2048R/XXXXXXX 2016-11-14

A web of trust

GPG keys operate under the model that everyone has their own public and private key, and everyone in your team trusts each other (hopefully you trust your colleagues!). In the previous step, you generated a key. Now you need to make sure all your colleagues sign your key to verify its authenticity and confirm it’s valid. In order to do this, you need to distribute your public key to everyone and they need to sign it.

The way you distribute the public keys is up to you, but there are tools like Keybase or private keyservers available which you may choose to use. Obviously, it’s not recommended to send your puppetmaster’s GPG key to Keybase. The most important consideration here is that the public keys can’t be modified in transit somehow. This means sending the GPG keys via email over the internet is probably not a fantastic idea, but sending them to your colleagues via internal email probably wouldn’t be so terrible.

At a very minimum, you’ll need to sign the keys from your puppetmaster that you generated earlier. In order to do that, export the key in ASCII format:

sudo -u puppet gpg --homedir /etc/puppet/.gpghome --export -a -o /etc/puppet/.gpghome/

In order to sign it, copy the file to your local machine from $distribution_method so it’s ready to import, and then run this:

gpg --import /path/to/

From here, you need to verify the key’s fingerprint, and if you’re happy, sign the key:

gpg --sign-key <keyname>

You’ll need to enter your own GPG key’s passphrase in order to sign the key.

Everyone who’s going to be using encrypted yaml will need to perform this step for each of the keypairs you generate. This means when a new user joins the company, you’ll have to import and sign the key that user generates. There are puppet modules which ease this process, and you can simply add a public key to a puppetmaster’s keyring by using golja-gnupg.

Puppet, Hiera and GPG Keys

If you’ve now created a web of trust, you need to make Puppet aware of the GPG keys. Firstly, you’ll need to generate a GPG key for your masters. We group our masters into different tiers, dev/stg and prod, and we ensure these keys are distinctly separate. Then, make sure the key is signed by the relevant people, otherwise it’s pretty much useless :)

Once your keys and gpg config are set up, you’ll need to get hiera-eyaml-gpg working.

Install hiera-eyaml-gpg

The installation requirements are clearly spelled out here, but for clarity’s sake I’ll cover the process too. The process is basically the same for both the users who’ll be using eyaml to encrypt values and the puppetmasters which will be decrypting them. From an OS perspective, you’ll need to make sure you have the ruby, ruby-devel, rubygems and gpgme packages installed. On CentOS, that looks like this:

sudo yum install ruby gpgme rubygems ruby-devel

Then, install the required rubygems in the relevant ruby path. If you’re using the latest version of puppetserver, you’ll need to install this using puppetserver gem install

sudo gem install gpgme
sudo gem install hiera-eyaml
sudo gem install hiera-eyaml-gpg

The Recipients File

One of the main ways that hiera-eyaml-gpg differs from standard hiera-eyaml is the gpg.recipients file. This file essentially lists the GPG keys that are able to decrypt secrets within a directory in hiera. This is an incredibly powerful tool, especially if you wish to allow users to encrypt/decrypt some secrets in your environment, but not others.

When the eyaml command is invoked, it will search in the current working directory for this file, and if one is not found it will go up through the directory tree until one is found. As an example, your hieradata directory might look like this:

├── development
│   ├── app1
│   │   └── hiera-eyaml-gpg.recipients
│   └── app2
│       └── hiera-eyaml-gpg.recipients
└── production
    ├── dc1
    │   ├── base.eyaml
    │   └── hiera-eyaml-gpg.recipients
    ├── hiera-eyaml-gpg.recipients
    └── role.eyaml

With this kind of layout, it’s possible to allow users access to certain app credentials, datacenters or even environments, without compromising all the credentials in hiera.
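
The lookup described above (check the current directory, then walk up until a file is found) can be sketched in a few lines of Python. This is only an illustration of the behaviour as described here, not the plugin’s actual Ruby implementation, and `find_recipients_file` is a made-up name.

```python
from pathlib import Path

RECIPIENTS_FILE = "hiera-eyaml-gpg.recipients"


def find_recipients_file(start):
    """Return the nearest recipients file at or above `start`, or None."""
    start = Path(start).resolve()
    # Check the starting directory first, then each parent up to the root.
    for directory in (start, *start.parents):
        candidate = directory / RECIPIENTS_FILE
        if candidate.is_file():
            return candidate
    return None
```

With the layout above, a lookup from production/dc1 would stop at the file in dc1, while a directory without its own file would fall back to the one in production.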

The format of the hiera-eyaml-gpg.recipients file is simple. It lists the GPG keys that are allowed to encrypt/decrypt values:


The value of this can be found in the uid field of the gpg --list-keys command.

Modify hiera.yaml

The final step in the process is to make hiera aware of this GPG plugin. Update hiera.yaml to look like this:

:backends:
  - yaml
  - eyaml
:hierarchy:
  - "nodes/%{clientcert}"
:yaml:
  :datadir: "/etc/puppet/environments/%{environment}/hieradata"
:eyaml:
  :datadir: "/etc/puppet/environments/%{environment}/hieradata"
  :gpg_gnupghome: /etc/puppet/.gpghome
  :extension: 'eyaml'

At this point, puppet should use the GPG extension, assuming you installed it correctly earlier.

Adding an Encrypted Parameter

At this stage, you’ve done the following:

  • Generated GPG keys for all the human users who will be encrypting/decrypting values
  • Generated GPG keys for the puppetmasters which will be decrypting values
  • Shared the public keys around all the above to ensure they’re trusted
  • Installed the components required for Puppet to use GPG keys
  • Set up the hiera-eyaml-gpg.recipients file so hiera-eyaml-gpg knows who can read/write values.

The final step here is adding an encrypted value to hiera. When you did gem install hiera-eyaml you also got a handy command line tool to help with this.

In order to use it simply run the following:

eyaml edit hieradata/<folder>/<file>.eyaml

You’ll be asked to enter your GPG key password, and then you’ll get dropped into an editor with something like this in the header:

#| This is eyaml edit mode. This text (lines starting with #| at the top of the
#| file) will be removed when you save and exit.
#|  - To edit encrypted values, change the content of the DEC(<num>)::PKCS7[]!
#|    block (or DEC(<num>)::GPG[]!).
#|    WARNING: DO NOT change the number in the parentheses.
#|  - To add a new encrypted value copy and paste a new block from the
#|    appropriate example below. Note that:
#|     * the text to encrypt goes in the square brackets
#|     * ensure you include the exclamation mark when you copy and paste
#|     * you must not include a number when adding a new block
#|    e.g. DEC::PKCS7[]! -or- DEC::GPG[]!

As we noted, you’re using the GPG plugin, so add your value like so:

class::class::parameter: DEC::GPG[correct_horse_battery_staple]!

When you save the file, you can cat it again and you’ll see the value is now encrypted:


From here, you can push it to git and have it downloaded using the method you use to grab your config (I hope you’re using r10k!) and the puppetmaster (assuming you set up the GPG encryption correctly!) will be able to decrypt these secrets and serve them to hosts.

Happy encrypting!

I love Gitlab. With every release they announce some amazing new features and it’s one of the few software suites I consider to be a joy to use. Since we adopted it at $job we’ve seen our release cycle within the OPS team improve dramatically and pushing new software seems to be a breeze.

My favourite part of Gitlab is the flexibility and robustness of the gitlab-ci.yml file. Simply by adding a file to your repository, you can now have a complex pipeline running tasks which test, build and deploy your software. I remember doing things like this with Jenkins and being incredibly frustrated - with gitlab I seem to be able to do everything I need to without all the fuss.

I also make heavy use of travis-ci in my public and open source projects, and I really like the matrix feature that Travis offers. Fortunately, there’s a similar (but not quite the same) feature available in Gitlab CI but I feel like the documentation is lacking a little bit, so I figured I’d write up a step by step guide to how I’ve started to use these features for our pipelines.

A starting example

Let’s say you have a starting .gitlab-ci.yml like so:

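stages:
  - test
  - build
  - deploy

# job names below are illustrative
build_centos6:
  image: centos:6
  script:
    - rpmbuild -ba
  except:
    - tags
    - master
  stage: build
  tags:
    - docker

build_centos7:
  image: centos:7
  script:
    - rpmbuild -ba
  except:
    - tags
    - master
  stage: build
  tags:
    - docker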

This is a totally valid file, but there’s a whole load of repetition in here which really shouldn’t need to be here. We can use some features of yaml called anchors and aliases which allow us to reduce the amount of code here. This is documented here in the Gitlab CI Readme, but I want to break it down into sections.

Define a hidden job

Firstly, we need to define a “hidden job” - this is essentially a job gitlab-ci is aware of but doesn’t actually run. It defines a yaml hash which we can merge into another hash later. We’ll take all of the hash values from the above two jobs that are the same, and place them in that hidden job:

# here we define a hidden job called "build" (prefixed with a dot)
# and then we give it an anchor, &build_definition
.build: &build_definition
  script:
    - rpmbuild -ba
  except:
    - tags
    - master
  stage: build
  tags:
    - docker

What this has done is essentially create something like a function. When we reference the anchor with *build_definition, it’ll spit out the following yaml hash:

  script:
    - rpmbuild -ba
  except:
    - tags
    - master
  stage: build
  tags:
    - docker

As you can see, the above yaml hash is only missing 2 things: A parent hash key and the value for “image”.

Reduce the code

In order to make use of this alias, we first need to actually define our build jobs. Remember, the above job is hidden so if we pushed to our git repo right now, nothing would happen. Let’s define our two build jobs.

# job names are illustrative
build_centos6:
  image: centos:6

build_centos7:
  image: centos:7

Obviously, this isn’t enough to actually run a build. What we now need to do is merge the two hashes: the one from the hidden job/alias and our build definition.

# job names are illustrative
build_centos6:
  <<: *build_definition # this essentially says: insert the hash values from the &build_definition anchor
  image: centos:6

build_centos7:
  <<: *build_definition
  image: centos:7

That’s a lot less code duplication, and if you know what you’re looking at, it’s much easier to read.
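
If the merge still feels abstract, it may help to think of it as a plain dictionary merge. The Python sketch below mimics what the YAML parser does with the merge key: start from the anchored hash, then apply the job’s own keys on top. The `merge` helper is hypothetical, and the `tags` key name is an assumption; no YAML library is involved, it only models the result.

```python
# What the hidden job's anchor holds once the YAML is parsed.
build_definition = {
    "script": ["rpmbuild -ba"],
    "except": ["tags", "master"],
    "stage": "build",
    "tags": ["docker"],  # key name assumed
}


def merge(anchor, overrides):
    """Mimic the YAML merge key: copy the anchor, then apply the job's own keys."""
    return {**anchor, **overrides}


centos6 = merge(build_definition, {"image": "centos:6"})
centos7 = merge(build_definition, {"image": "centos:7"})
```

Keys set directly on the job win over merged ones, so a single job can still override, say, `stage` without touching the hidden job.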

Visualising your gitlab-ci.yml file

This all might seem a little confusing at first because it’s hard to visualise. The best way to get your head around the output of your CI file is to remember that all Gitlab CI does when you push the file is load it into a hash and read the values. With that in mind, try this little one-line script on your file:

ruby -e "require 'yaml'; require 'pp'; hash = YAML.load_file('.gitlab-ci.yml'); pp hash"

This is what the original yaml file hash looks like:

{"stages"=>["test", "build", "deploy"],
   "script"=>["rpmbuild -ba"],
   "except"=>["tags", "master"],
   "except"=>["tags", "master"],

And this is what the hash from the file with the anchors and such like contains:

{"stages"=>["test", "build", "deploy"],
  {"script"=>["rpmbuild -ba"],
   "except"=>["tags", "master"],
  {"script"=>["rpmbuild -ba"],
   "except"=>["tags", "master"],
  {"script"=>["rpmbuild -ba"],
   "except"=>["tags", "master"],

Hopefully that makes it easier to understand! As mentioned earlier, this isn’t as powerful (yet?) as Travis’s matrix feature, which can quickly expand your jobs multiple times over, but with nested aliases you can easily have quite a complex matrix.

We’re finally beginning to build out our production Kubernetes infrastructure at work, after some extensive testing in dev. Kubernetes relies heavily on TLS for securing communications between all of the components (quite understandably) and while you can disable TLS on many components, obviously once you get to production, you don’t really want to be doing that.

Most of the documentation shows you how to generate a self signed certificate using a CA certificate you create especially for kubernetes. Even Kelsey Hightower’s excellent “Kubernetes the Hard Way” post shows you how to generate the TLS components using a self signed CA. One of the nicest things about using Puppet is that you already have a CA set up and, best of all, there are some really nice APIs inside the puppet master/server, meaning provisioning new certs for hosts is relatively straightforward. I really wanted to take advantage of this with our kubernetes setup, so I made sure etcd was using Puppet’s certs:


This works out of the box, because the certs for all 3 etcd hosts have been signed by the same CA.

Securing Kubernetes with Puppet’s Certs.

I figured it would be easy to use these certs for Kubernetes also. I set the following parameters in the API server config:

--service-account-key-file=/var/lib/puppet/ssl/private_keys/hostname.server.lan.pem --tls-cert-file=/var/lib/puppet/ssl/certs/hostname.server.lan.pem --tls-private-key-file=/var/lib/puppet/ssl/private_keys/hostname.server.lan.pem

but there were a multitude of problems, the main one being that when a pod starts up, it connects to the API using the kubernetes service cluster IP. You can see this in the log messages when starting a pod:

# kubectl logs kube-dns-v15-017ri --namespace=kube-system kubedns
I0821 08:48:12.808230       1 server.go:91] Using for kubernetes master
I0821 08:48:12.808304       1 server.go:92] Using kubernetes API <nil>
I0821 08:48:12.809448       1 server.go:132] Starting SkyDNS server. Listening on port:10053

I figured it would be easy enough to fix, I’ll just add a SAN for the puppet cert using the dns_alt_names configuration option. Unfortunately, this didn’t work, and I got the following error message:

E1125 17:33:16.308389 1 errors.go:62] Status: x509: cannot validate certificate because it doesn't contain any IP SANs

Puppet doesn’t have an option to set IP SANs in the SSL certificate, so I had to generate the cert manually and sign it with the Puppet CA. Thankfully, this is fairly straightforward (albeit manual).
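
To see why adding DNS names didn’t help, here’s a simplified Python sketch of the matching rule behind that error: when a client connects to an IP address, the address is only compared against the certificate’s IP SANs, never its DNS SANs. The function name and the 10.0.0.1 address are hypothetical, and real TLS libraries do more (wildcards, for one), so treat this as a model rather than the actual verification code.

```python
import ipaddress


def cert_matches(host, dns_sans, ip_sans):
    """Check a host against a certificate's SANs (simplified model)."""
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: compare against DNS SANs (wildcards ignored here).
        return host.lower() in {name.lower() for name in dns_sans}
    # An IP literal is only ever compared against IP SANs.
    return address in {ipaddress.ip_address(ip) for ip in ip_sans}
```

So a cert carrying only DNS names fails for `cert_matches("10.0.0.1", ["kubernetes"], [])`, and starts passing once the same address appears in the IP SAN list.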

Generating Certs Manually

First, create a Kubernetes config file for OpenSSL on your puppetmaster. I created a directory /var/lib/puppet/ssl/manual_ca to do all this.

[ ca ]

default_ca      = CA_default

[ CA_default ]

dir            = /var/lib/puppet/ssl/manual_ca
certs          = $dir/certs
crl_dir        = $dir/crl
database       = $dir/index.txt
new_certs_dir  = $dir/newcerts
certificate    = /var/lib/puppet/ssl/ca/ca_crt.pem
serial         = $dir/serial
crl            = /var/lib/puppet/ssl/ca/ca_crl.pem
private_key    = /var/lib/puppet/ssl/ca/ca_key.pem
RANDFILE       = $dir/ca/.rand
default_md     = sha256
policy         = policy_any
unique_subject = no

[ policy_any ]
countryName            = supplied
stateOrProvinceName    = optional
organizationName       = optional
organizationalUnitName = optional
commonName             = supplied
emailAddress           = optional

[ req ]
req_extensions = v3_req
distinguished_name = req_distinguished_name
string_mask             = utf8only

[ req_distinguished_name ]
countryName             = Country
stateOrProvinceName     = State
localityName            = Locality
organizationName        = Org
organizationalUnitName  = Me
commonName              = hostname

[ v3_req ]
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names

[ alt_names ]
DNS.1 = kubernetes
DNS.2 = kubernetes.default
DNS.3 = kubernetes.default.svc
DNS.4 = kubernetes.default.svc.cluster.local
DNS.5 =
DNS.6 = hostname
IP.1 =
IP.2 = # external IP

Note the two IPs here. The first is the cluster IP from the kubernetes service, you can retrieve it like so:

# kubectl get svc
kubernetes       <none>        443/TCP    21d

I also added the actual IP of the kubernetes host for some future proofing. The DNS names have been generated from the Kube DNS config, so also make sure you match that to your kube-dns name.
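
If you end up doing this for more than one cluster, the SAN entries are easy to generate rather than type by hand. Here’s a small Python sketch that renders the numbered DNS.n and IP.n lines for the OpenSSL config; `render_alt_names` is a made-up helper, and the addresses in the usage note are placeholders, not values from my setup.

```python
def render_alt_names(dns_names, ips):
    """Render an OpenSSL [ alt_names ] section with numbered DNS and IP entries."""
    lines = ["[ alt_names ]"]
    lines += [f"DNS.{i} = {name}" for i, name in enumerate(dns_names, start=1)]
    lines += [f"IP.{i} = {ip}" for i, ip in enumerate(ips, start=1)]
    return "\n".join(lines)
```

For example, `render_alt_names(["kubernetes", "kubernetes.default"], ["10.0.0.1", "192.0.2.10"])` gives a section with two DNS entries followed by two IP entries, ready to paste into the config.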

Next, we need to generate a CSR and a key:

openssl req -newkey rsa:2048 -nodes -keyout private_keys/kube-api.key -out certificate_requests/kube-api.csr -config kubernetes.cnf

Verify that your CSR has the IP SANs in it:

openssl req -text -noout -verify -in certificate_requests/kube-api.csr | grep "X509v3 Subject" -A 1

Now, we need to sign the cert with the Puppet CA:

openssl x509 -req -in certificate_requests/kube-api.csr -CA /var/lib/puppet/ssl/ca/ca_crt.pem -CAkey /var/lib/puppet/ssl/ca/ca_key.pem -CAcreateserial -out certs/kube-api.pem -days 3000 -extensions v3_req -extfile kubernetes.cnf

This will create a cert in certs/kube-api.pem. Now verify it to ensure it looks okay:

openssl x509 -in certs/kube-api.pem -text -noout

We now have a cert we can use for our kube-apiserver, so we just need to configure kubernetes to use it.

Configuring Kubernetes to use the certs.

Assuming you’ve copied the certs to your kubernetes master, we now need to configure k8s to use them. First, make sure you have the following config set in the apiserver:

--service-account-key-file=/etc/kubernetes/kube-api.key --tls-cert-file=/etc/kubernetes/kube-api.pem --tls-private-key-file=/etc/kubernetes/kube-api.key

And then configure the controller manager like so:

--root-ca-file=/var/lib/puppet/ssl/certs/ca.pem --service-account-private-key-file=/etc/kubernetes/kube-api.key

Restart all the k8s components, and you’re almost set.

Regenerate service account secrets

The final thing you’ll need to do is delete the service account secrets kubernetes generates on launch. This is needed because it uses the service-account-private-key-file to generate them, and if you don’t do this you’ll get all manner of permission denied errors when launching pods. It’s easy to do this:

kubectl delete sa default
kubectl delete sa default --namespace=kube-system

NOTE if you’re already running pods in your kubernetes system, this may affect them and you may want to be careful doing this. YMMV.

From here, you’re using Puppet’s SSL Certs for kubernetes.

So you’ve decided you want to use Configuration Management to control your infrastructure. You’ve read about all of the benefits of “infrastructure as code” and you’ve decided you’re going to use Puppet as your chosen configuration management tool.

I personally believe this to be a good choice. When making comparisons between Ansible, Chef and other configuration management tools, the true benefit of Puppet over these is the ecosystem that has been established around it to benefit your workflow. The problem with Puppet is getting started. You want to manage a bunch of stuff, but where do you start? How do all these tools fit together? What decisions do you need to make before diving in?

This is a very opinionated set of posts. I’ll try to cover the options, and how I’ve set about doing it, but the main theme here is essentially, getting you off the ground with Puppet.

So if you’ve finished the Puppet learning VM, and you’ve browsed a few modules and think “this is the tool for me!” then open up your desired editor and let’s get cracking!

Decision Time

Okay, now close your editor. We’re not going anywhere near it yet, because the first thing you need to do is make a few decisions about your infrastructure and what you’ll be managing with Puppet.

There are a lot of components within Puppet that let you manage your infrastructure in a flexible manner, but before you use them you need to know exactly what you want to do with them.

Your Infrastructure Layout

The first thing to think about is how your infrastructure looks at a high level. There are a few things you need to consider:

  • Do you have multiple geographic datacenters?
  • Do you have multiple deployment environments? (eg. dev, stage, production)
  • Do you have multiple infrastructure types? (eg. AWS and physical infrastructure)

The reason you need to think about these things is that they will determine how you use hiera to differentiate between these environments. There will be a full blog post about hiera later, but before you start using it you need to determine how your environment looks. The question you’re looking to answer is how do you logically separate your infrastructure? Once you have an idea, it’s time to think about your individual hosts.

Node Classification and Roles

The next thing you need to decide is how you’re going to classify your nodes. Each node in a puppet infrastructure has a role or a “thing that it does”. As an example, you might have a database server (with role dbserver) and a web server (webserver). How do you determine that a webserver is a webserver? You might already know it is, but how does your infrastructure know? There are quite a few ways to do this.

  • Name based. You might always have the word “web” in the name, in which case you can use a regex match
  • IP address. Maybe all webservers are in a specific subnet, in which case you might want to match on IP
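
As a quick illustration of both approaches, here’s a minimal Python sketch that assigns a role by hostname pattern first and by subnet second. Everything in it is hypothetical (the patterns, the 10.1.2.0/24 subnet, the `classify` name); in Puppet itself you’d express the same idea with node definitions or facts rather than Python.

```python
import ipaddress
import re

# Hypothetical rules: the patterns, subnet and role names are illustrative only.
NAME_RULES = [(re.compile(r"web"), "webserver"), (re.compile(r"db"), "dbserver")]
SUBNET_RULES = [(ipaddress.ip_network("10.1.2.0/24"), "webserver")]


def classify(hostname, ip=None):
    """Return a role from the hostname first, then from the subnet, else None."""
    for pattern, role in NAME_RULES:
        if pattern.search(hostname):
            return role
    if ip is not None:
        address = ipaddress.ip_address(ip)
        for network, role in SUBNET_RULES:
            if address in network:
                return role
    return None
```

A host that matches neither rule comes back as `None`, which is exactly the gap a dedicated classification system is meant to close.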

These two options are both valid, and you can support them within Puppet. If you don’t currently have a classification system, or you want to improve it, you can use an ENC (External Node Classifier). The most popular ones are:

  • LDAP
  • Foreman
  • Something else with an HTTP API

Essentially, if you use an ENC, it’ll become your source of truth for your roles. This is personally the way I think it should be done, and I highly recommend using Foreman. More to come later.