Getting Started With Chef

A little over a year ago I was plugging through setting up another OpenCycleMap server. I knew what needed installing, and I'd done it many times before, but I suspected that there was a better way than having a terminal open in one screen and my trusty installation notes in the other.

Previously I'd taken a copy of my notes, and tried reworking them into something resembling an automated installation script. I got it to the point where I could work through my notes line-by-line, pasting most of them into the terminal and checking the output, with the occasional note requiring actual typing (typically when I was editing configuration files). But to transform the notes into a robust hands-off script would have been a huge amount of work - probably involving far too many calls to sed and grep - and making everything work when it's re-run or when I change the script a bit would be hard. I suspected that I would be re-inventing a wheel - but I didn't know which wheel!

The first thing was to figure out some jargon - what's the name of this particular wheel? It turns out that it's known as "configuration management". The main principle is to write code that describes the server setup, rather than running commands by hand. That clicked with me straight away - every time I added more software to the OpenCycleMap servers I had a sinking feeling that I'd need to type the same stuff in over and over on different servers. I'd much prefer to write some code once, and run that code over and over instead. The code also needs to be idempotent - i.e. it doesn't matter how many times you run the code, the end result is the same. That's about the sum of what configuration management entails.
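To make "idempotent" a bit more concrete, here's a minimal sketch in plain Ruby (no Chef involved). The ensure_line helper, the temporary file and the max_connections setting are all made up for illustration. The point is only that the code checks the current state before changing anything, so running it a second time is harmless.

```ruby
require "tempfile"

# Hypothetical helper: make sure `line` is present in the file at `path`.
# Returns true if the file had to be changed, false if it was already correct.
def ensure_line(path, line)
  existing = File.exist?(path) ? File.readlines(path, chomp: true) : []
  return false if existing.include?(line) # already in the desired state

  File.open(path, "a") { |f| f.puts(line) }
  true
end

# Stand-in for a real configuration file (assumed name, for illustration only).
config = Tempfile.new("example.conf")
config.close

first_run  = ensure_line(config.path, "max_connections = 100")
second_run = ensure_line(config.path, "max_connections = 100")

puts "first run changed the file:  #{first_run}"   # true
puts "second run changed the file: #{second_run}"  # false
puts File.read(config.path)                        # the line appears exactly once
```

Compare that with a naive "echo the line onto the end of the file" approach, which would add a duplicate every time you re-ran it. Configuration management tools do this kind of check-then-change for you, for packages, files, services and so on.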

There are a few open-source options for configuration management, but one in particular caught my eye. Opscode's Chef is Ruby-based, which works for me since I do a fair amount of Ruby development and it's a language that I enjoy working with. Chef is also what the OpenStreetMap sysadmins use to configure their servers, so having people around who use the same system was a bonus.

What started off as a few days' effort turned into a massive multi-week project as I learned Chef for the first time, and plugged through creating cookbooks for all the components of my server. It was a massive task and took much longer than I'd initially expected, but 18 months on it has clearly been worth it - I'd never have been able to run enough servers for all the styles I have now, nor keep up with the upgrades to the software and hardware, without it. It's awesome.

So here are some tips for those who have their own servers and are in a similar position to where I was.

How many servers before it's worth it? Configuration management really comes into its own when you have dozens of servers, but how few are too few to be worth the hassle? It's a tough one. Nowadays I'd say if you have only one server it's still worth it - just - since one server really means three, right? The one you're running, the VM on your laptop that you're messing around with for the next big software upgrade, and the next one you haven't installed yet. If you're running a server with anything remotely important on it, then having some Chef scripts to get a replacement up and running if the first goes up in smoke saves you time exactly when you need it most.
How do you get started with Chef? Well, it's tough - the learning curve is like a cliff. Chef setups have three main parts - the server(s) you're setting up (the "node"), the machine you're pressing keys on (the "workstation") and the confusingly-named "chef server" which is where "nodes" grab their scripts ("cookbooks") from. It makes sense to cut down the learning, so I'd recommend using the free 5-node trial of Opscode's Hosted Chef offering. That way you only need to concentrate on the nodes and workstation setup at first - and when you run out of nodes, there's always the open-source chef-server if the hosted platform is too expensive.
Which recipes should I use? There are loads available on github, and there are links all over the Chef website. In general, I recommend avoiding them, at least at first. Like I mentioned, the learning curve is cliff-like, and while you can do super-complex whizz-bang stuff with Chef, the public recipes are almost all vastly overcomplicated and, more importantly, hard to learn from. Start out writing your own - mine were little more than a list of packages to install at first. Then I started adding in some templates, a few script resources here and there, and built up from there as I learned new features. Make sure your Chef repository is in git, and that you're committing your cookbook changes as you go along.
Where's the documentation? I'd recommend following the tutorial to get things all set up, while trying not to worry too much about the details. Then start writing recipes. For that, the resources page on the wiki tells you everything you need to know - start with the package resource, then the template resource, then on to the rest. There's a whole bunch of stuff that you won't need for a long time - attributes, tags, searches - so don't try learning everything in one go.
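To give a flavour of what "a list of packages, then a template" looks like, here's a rough sketch of a first recipe. It isn't one of my actual cookbooks. The cookbook name, package names, file paths and the max_connections variable are all made up for illustration, and it only does anything when run by chef-client as part of a cookbook (it's not a standalone Ruby script).

```ruby
# cookbooks/example/recipes/default.rb  (hypothetical cookbook and path)

# Step one: little more than a list of packages to install.
# The package names here are placeholders - use whatever your server needs.
%w[apache2 postgresql].each do |pkg|
  package pkg
end

# Step two: manage a config file with the template resource.
# Chef renders templates/default/example.conf.erb from the same cookbook
# and only rewrites the target file if the rendered content differs.
template "/etc/example/example.conf" do
  source "example.conf.erb"
  owner "root"
  group "root"
  mode "0644"
  variables(max_connections: 100) # available as @max_connections in the .erb
end
```

Both resources are idempotent out of the box: if the packages are already installed and the file already matches, a second run changes nothing. That's most of what you need for a surprisingly long time.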

I'll be writing more about developing and testing cookbooks in the future - it's a whole subject in itself!

This post was posted on 19 November 2012 and tagged chef, opencyclemap

gravitystorm.co.uk
