Example 05: Build systems


Remember that (very annoyingly) the indented bits of Makefiles must use a <TAB> and not spaces.

Let’s start with a toy project that contains R code to create a figure in PDF format, a markdown file for an article that includes that figure, and a Makefile that contains instructions to produce the articlea as a PDF file. It lives in files/examples/_make-example/ and looks like this:

ls files/examples/_make-example

I’ve put this project on GitHub as well if you want to clone it and practice with it. The Makefile looks like this:

# NOTE: install the pandoc-crossref filter if you don't already have it
# e.g. brew install pandoc-crossref

# The output file `mypaper.pdf` depends on `mypaper.md` and `fig1.pdf`
mypaper.pdf: mypaper.md fig1.pdf 
	pandoc -F pandoc-crossref mypaper.md -o mypaper.pdf

# The output file `fig1.pdf` depends on `fig1.r`
fig1.pdf: fig1.r
	R CMD BATCH fig1.r

# Clean target
.PHONY: clean

	rm -f mypaper.pdf
	rm -f fig1.pdf
	rm -f fig1.r.Rout

In this Makefile the final output is a PDF called mypaper.pdf. In order for it to be made we have to run pandoc on mypaper.md. But the text of mypaper.md includes a reference to the figure, fig1.pdf. So whenever we make mypaper.pdf we should make sure that fig1.pdf is up-to-date. We say that mypaper.pdf being current thus depends on fig1.pdf also being current. That’s what the line

mypaper.pdf: mypaper.md fig1.pdf

means in the Makefile. It’s saying mypaper.pdf depends on both mypaper.md and fig1.pdf. And the tabbed-in lines under the target line are the actual commands that produce mypaper.pdf, which in this case is just a pandoc command that takes mypaper.md produces mypaper.pdf from it.

Meanwhile fig1.pdf is created by an R script called fig1.R, which is what the second target line says. And fig1.pdf is created by the tabbed-in line R CMD BATCH fig1.r

In class and on the slides we covered the idea of targets (intermediate or final output files) and their prerequisites in more detail. A Makefile is a recipe for producing outputs from targets. We can see what make will do with our project by running make --dry-run. If we have not done anything get should see this:

# Rendering live means we have to switch into the project directory
cd files/examples/_make-example
make --dry-run

This is the sequence of shell commands make will run we type make at the console and hit enter. The result will be that first fig1.pdf and then mypaper.pdf will be created. In order to produce mypaper.pdf we first need to make fig1.pdf and so the command associated with the fig1.pdf target gets run first. The --dry-run switch means they are just echoed to the console, not executed. Once we’re happy with how they look, we can do it for real:

cd files/examples/_make-example

And our output files have been created, along with a log for the R script:

ls files/examples/_make-example

Now if we type make again (without editing either mypaper.md or fig1.r) we will get this:

make: `mypaper.pdf' is up to date.

Nothing is re-run, because nothing needs to be done.

The other target in the Makefile above is called clean and is associated with a list of commands that remove the output targets and some intermediate files too. You run make clean when you want to return the project to a state where no targets have been made and there are no extraneous log files (etc) hanging around the project folder. You have to specify what that clean state is, in the sense that you must write the commands to remove the specific files or folders you want gone. Here you can see we define the clean target to mean “delete (without asking for confirmation) the files mypaper.pdf, fig1.pdf, and the R log file fig1.r.Rout”.

cd files/examples/_make-example
make clean

And now the outputs are gone:

ls files/examples/_make-example

You can make any target in this way, by writing make <target_name>. Here we have two output targets, fig1.pdf and mypaper.pdf. So we could type, for example, make fig1.pdf and only that target would be produced (and only its prerequisites would be evaluated).

Here I’ve made the names of the targets the same as the names of the files that eventually get produced, but that is not required. It’s the commands associated with the targets that produce the outputs. Those targets can be named whatever makes sense for the actions being performed. So for example we could call the fig1.pdf target figures and write make figures, or the mypaper.pdf target paper and write make paper.

Finally, the .PHONY line says “clean is a target that should be run if called, even though there are no files named clean. This is to get around the problem that make targets are usually files that get created, but for some operations (like cleaning up) no files are made, we just want make to run some necessary commands.

Things to try

  • As a next step, change some text in mypaper.md, for example by replacing my name with yours. What does make --dry-run show now?

  • Next, try editing the R code in fig1.r to make a slightly different figure. Now what does it show?

  • Makefiles can get much more complex, but the complexity is more or less (1) all along the lines of making your dependency graph more complex, with more steps in the chain before the final output is created, or (2) introducing variables that allow you to act on more than one file in a shorthand way. The make manual is well-written and clear. The only barrier is that many of its examples are focused on the chain of targets and prerequisites associated with compiling C programs.