Thoughts on Using Chef

After masquerading as a DevOps engineer for the last few months (largely out of necessity), I thought it would be good to write up my thoughts on the experience as a whole. Most of this will be about Chef: how I chose to use it and my experience working with it.

Framework Complexity

If you’ve been following the ebb and flow of trends in DevOps, you’ll find that things seem to go in and out of fashion very quickly. Initially, system deploys were largely the realm of system administrators maintaining a mixture of shell and Perl scripts. Occasionally, you’d also find instances of CFEngine in the wild. As the startup movement took hold and developers at leaner, smaller companies found themselves doing operations, frameworks such as Chef, Puppet, and Ansible started rising in popularity.

When I was first considering the various options for deploying our stack, shell scripts or small frameworks like Babushka or Sprinkle were very appealing. Chef and Puppet in particular seemed to have a lot of dependencies and the ramp-up time felt long (they also both suffered from what I considered subpar documentation at the time). Fast forward a few months and I’m a happy Chef user. Why the change?

Why the framework?

When I interview candidates for system administrator or developer operations roles, I am often curious if they understand why the framework is superior in the first place. The answer lies in the fact that the dominant configuration management frameworks are declarative. For the uninitiated, declarative programming is a strange thing.

Declarative programming

In procedural programming, the programmer gives the computer a series of actions to perform (e.g. Do this, then do that). Most scripts take this form, as do programs written in a systems language such as C.

Conversely, the object oriented programmer informs the computer about a series of objects that act and interact with each other to accomplish the desired effect (e.g. This is a coffee maker which can brew coffee and can be cleaned).

In functional programming, the programmer defines a collection of functions that are composed and operate on inputs to produce a result (e.g. Here are instructions for taking the mean, and also computing the standard deviation).

Declarative programming is distinct from the others in that the programmer is defining not what the computer should do, but what the computer should be (e.g. There should be a file located at this path which reflects the current time).

Declarative programming is a natural fit for configuration management. Suppose, for example, that I want an haproxy.cfg file to exist at /etc/haproxy/haproxy.cfg with some contents $CONTENTS. If I were to do this with shell scripts, I could just do:

echo "$CONTENTS" > /etc/haproxy/haproxy.cfg

Easy enough. But what if I also needed the permissions of that file to be 0700? I could modify the script:

echo "$CONTENTS" > /etc/haproxy/haproxy.cfg
chmod 0700 /etc/haproxy/haproxy.cfg

But what if that file is already created and owned by a different user? What if the file is there already and the state is correct? I wouldn’t want to repeat the operation again each time, so I’d have to write some conditional logic to check the diff against the contents I expect. Eventually, this simple script may evolve well beyond simple bash statements into a full blown program.

Imagine instead if one could just write:

# @content stands in for the desired contents of the file
file "/etc/haproxy/haproxy.cfg" do
  content @content
  mode 00700
  owner "haproxy"
  group "haproxy"
end

Ignoring the specifics of the syntax, this is a declaration of what some aspect of the system should be. The entire declaration is a declared resource and it is provisioned by the underlying resource provider, in this case, the file. The file resource provider knows how to compute how the current state is inconsistent with the desired state and converge the former to the latter.
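To make the idea of converging concrete, here is a minimal sketch in plain Ruby of what a file provider conceptually does. It is not Chef's implementation: converge_file is a hypothetical helper of my own naming, it handles only contents and permissions (ownership is left out since changing it needs root), and it assumes a POSIX system.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper, not Chef's API: converge one file to a desired state.
# Returns true if anything had to change, false if the file was already correct.
def converge_file(path, content:, mode:)
  updated = false

  # Compare current contents with desired contents; write only on a mismatch.
  current = File.exist?(path) ? File.read(path) : nil
  if current != content
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, content)
    updated = true
  end

  # Compare current permission bits with desired ones; chmod only on a mismatch.
  if (File.stat(path).mode & 0o7777) != mode
    File.chmod(mode, path)
    updated = true
  end

  updated
end

# Demo in a scratch directory so nothing under /etc is touched.
Dir.mktmpdir do |dir|
  path = File.join(dir, "haproxy.cfg")
  puts converge_file(path, content: "global\n", mode: 0o700) # true: file was created
  puts converge_file(path, content: "global\n", mode: 0o700) # false: nothing to do
end
```

The return value matters: knowing whether a resource actually changed something is what lets a framework skip work and, as discussed later, decide whether to trigger other resources.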

Chef

Programming in Chef then, is just a matter of building up a long list of these resources. Most of the resources will be files or packages that are provisioned by the built-in resource providers (file and package are two examples). Chef will then faithfully try to converge the system to the desired declared state, doing nothing if there is nothing to do.

Not understanding how the framework operates can create some gotcha moments. For example, suppose I wanted to have a file at the path /tmp/haproxy_last_update that contains the last time haproxy was updated (contrived, I know). Since Chef lets the developer inline Ruby code in the Chef recipe, the neophyte Chef programmer may be tempted to write something like:

file "/etc/haproxy/haproxy.cfg" do
  content @content
  mode 00700
  owner "haproxy"
  group "haproxy"
end
# Plain Ruby: this line runs while the recipe is being read
haproxy_update_time = File.mtime("/etc/haproxy/haproxy.cfg")
file "/tmp/haproxy_last_update" do
  content haproxy_update_time.to_s
end

What will actually happen is that an exception will be thrown saying that "/etc/haproxy/haproxy.cfg" does not exist when File.mtime is called. If the file was already created, the contents of /tmp/haproxy_last_update would be the old update time as opposed to the new one with no exception thrown. What gives? The answer lies in how the Chef run is actually performed.

Chef Runs

Chef runs take a machine from one state to another in two phases. First, all the recipes are compiled and their Ruby logic is run to produce a queue of resources that are to be applied. Second, the queued resources are applied in order. The issue with the usage of File.mtime above is that it will be executed during the first phase, before the resource just above it gets performed. The Chef run will instead do the following:

  1. Push the file "/etc/haproxy/haproxy.cfg" resource onto the queue (does not actually modify the file!)
  2. Call File.mtime and store the result in a local variable
  3. Push the file "/tmp/haproxy_last_update" resource onto the queue (also does not actually modify any files)
  4. Assuming no other resources were declared, run the resource from step 1.
  5. Run the resource from step 3.
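The ordering above can be reproduced with a toy model in plain Ruby. This is only a sketch: ToyRun is a hypothetical class, the "machine" is just a Hash from path to contents, and reading that Hash stands in for the File.mtime call.

```ruby
# Toy model of a Chef run (hypothetical class, not Chef's API).
# The "machine" is just a Hash from path to contents.
class ToyRun
  attr_reader :machine, :queue

  def initialize(machine)
    @machine = machine
    @queue = []
  end

  # Compile phase: declaring a resource only records it on the queue.
  def file(path, content)
    @queue << [path, content]
  end

  # Converge phase: apply queued resources in declaration order.
  def converge
    @queue.each { |path, content| @machine[path] = content }
  end
end

run = ToyRun.new({ "/etc/haproxy/haproxy.cfg" => "old config" })

run.file("/etc/haproxy/haproxy.cfg", "new config")   # step 1: queued only
seen = run.machine["/etc/haproxy/haproxy.cfg"]       # step 2: plain Ruby runs now
run.file("/tmp/haproxy_last_update", "saw: #{seen}") # step 3: queued with the stale value
run.converge                                         # steps 4 and 5

puts run.machine["/tmp/haproxy_last_update"] # prints "saw: old config"
```

The value read in step 2 is baked into the second resource before either resource has touched the machine, which is exactly the stale-timestamp behavior described earlier.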

Now we see why the recipe behaved so strangely. Ruby logic embedded in a recipe in this manner should assume that the machine has not left its original state, since all resources are still being pushed to a queue at the time the logic is invoked. So what’s the right way to write it? In this case, the right thing to do is to move the update time logic into a resource as well so that it will be invoked after the first resource.

Notifications and Subscriptions

This is running a bit long, but the last pattern I wanted to discuss is that of recipe events. While it is possible to have tons of Ruby logic in all the Chef recipes to determine what resources should run, or have all this logic embedded in ruby_block resources, the correct thing to use is the notification or the subscription.

file "/etc/haproxy/haproxy.cfg" do
  content @content
  mode 00700
  owner "haproxy"
  group "haproxy"
  # Trigger the ruby_block below only when this file actually changes
  notifies :run, "ruby_block[/tmp/haproxy_last_update]", :immediately
end
ruby_block "/tmp/haproxy_last_update" do
  block do
    haproxy_update_time = File.mtime("/etc/haproxy/haproxy.cfg")
    File.write("/tmp/haproxy_last_update", haproxy_update_time.to_s)
  end
  # Do nothing unless notified
  action :nothing
end

This snippet will have the second resource run only if the first one actually changed the file. Both resources are still pushed onto the queue, but the second is declared with action :nothing and stays idle until it is notified. By chaining notifications, Chef runs will be efficient and the recipes themselves will be lean. The best sorts of Chef recipes, in fact, are those with minimal code logic. That is because these recipes are by necessity more idiomatically declarative.
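As a rough illustration of the mechanism, here is a toy model of notifications in plain Ruby. It is not how Chef is implemented: Resource, converge, and build are hypothetical names, the machine is a Hash, and the notified resource is run right away rather than at the end of the run.

```ruby
# Toy model of notifications (hypothetical names, not Chef's API).
# work is a lambda that returns true if it changed anything.
Resource = Struct.new(:name, :action, :work, :notifies)

# Walk the queue in order and return the names of the resources that executed.
def converge(resources)
  by_name = resources.to_h { |r| [r.name, r] }
  executed = []
  resources.each do |r|
    next if r.action == :nothing # declared, but idle unless notified
    updated = r.work.call
    executed << r.name
    next unless updated && r.notifies
    target = by_name.fetch(r.notifies)
    target.work.call
    executed << target.name
  end
  executed
end

# Build the two resources from the example against an in-memory machine.
def build(machine, desired)
  write_config = lambda do
    changed = machine["haproxy.cfg"] != desired
    machine["haproxy.cfg"] = desired
    changed
  end
  record_update = lambda do
    machine["last_update"] = "updated"
    true
  end
  [
    Resource.new("file[haproxy.cfg]", :create, write_config, "ruby_block[last_update]"),
    Resource.new("ruby_block[last_update]", :nothing, record_update, nil)
  ]
end

machine = { "haproxy.cfg" => "old" }
p converge(build(machine, "new")) # ["file[haproxy.cfg]", "ruby_block[last_update]"]
p converge(build(machine, "new")) # ["file[haproxy.cfg]"]
```

On the second run the file resource still checks its state, but since nothing changed, the notification never fires and the ruby_block stays idle.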

Writing your own resource providers

The last bit to learn once you get comfortable with using existing resource providers is to write your own! This will happen organically over time as you start reusing components across multiple recipes. What will probably happen is that your recipes will start as a simple list of resources. It will then expand as more complicated logic sets in, then contract afterwards as logic is refactored out into custom resource providers.

Conclusion

Having dabbled with Puppet, Ansible, and some other lighter weight frameworks, I think you can’t go wrong with Chef. There is a hefty upfront learning curve but I believe it pays dividends in the long run and the motivated individual can be up to speed in a weekend. I would recommend leveraging Opscode heavily on the free tier for learning purposes. In addition, if you cannot spare a machine to learn Chef, you can easily set up a virtual machine (I use Vagrant and test-kitchen for this).
