St. Pauli school of TDD
A systematic approach to Test-driven Development that leads to continuous progress
Test-driven development is designed to provide regular feedback, at intervals of minutes or even seconds, as to whether the current software is free of errors. If too much code is written between test runs, development slows down, because the problems that surface when the tests can finally be run are larger. We often notice that many developers handle the first two or three TDD cycles smoothly, but the subsequent cycles are so slow that the process can hardly be called test-driven development. We have therefore developed a new systematic approach that leads to continuous progress in short TDD cycles. Following the two well-known TDD approaches, the "Chicago school" and the "London school", we have named this approach the "St. Pauli school of TDD".
Comparison to other schools of TDD
All schools have an accepted method on three common aspects:
[Comparison table of the schools; recoverable column headings: First Test Case, Use of Mocks]
As shown above, the St. Pauli school of TDD differs in 1 out of 3 aspects from every other school.
Besides these aspects, the St. Pauli school has two additional requirements, which are not an integral part of the other schools:
We want to demonstrate the method using the Diamond Kata as an example:
We start with a new Clojure project. We choose Clojure because it has
minimal syntax and a high signal-to-noise ratio. Clojure is also highly readable once you are
familiar with its prefix notation. For example,
f(x) is written (f x).
We enter the first TDD cycle with a failing (red) test that has been auto-generated by the Clojure build tool. We are therefore in the red state.
To get into the green state as quickly as possible, we assert that
0 is indeed equal to 0.
This is not very useful, but we are just warming up.
The first step of the St. Pauli school of TDD is to start with a simple API test. So we change the test's
name and specify the API of the
diamond function. We already made some design decisions there:
The input of the function should be a single character, the output should be a vector containing a string
for each line and the function name should be "diamond". Since a function with this name does not exist yet,
the test runner prints an error and we are back in the red state.
To make some progress towards the green state, we write a minimal implementation of the
diamond function. The defn macro creates a new function with
the name specified by the first argument to defn (here:
diamond). The second argument to
defn is a vector of all the arguments of the function. There is only one argument to the
diamond function and it is named $char. The
$ sign has no
special meaning; we just use it to prefix the variable name since there is already a
char function provided by Clojure.
Since the diamond function does not return anything, we are still in the red state, but now the test result
is much more helpful:
expected: (= ["a"] (diamond \a)), actual: (not (= ["a"] nil)). This means that
(diamond \a) should return
["a"], but it returns
nil, and nil is not equal to ["a"].
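As a minimal sketch of this red state, assuming clojure.test as the test framework (the test name diamond-api-test is hypothetical and not taken from the demo):

```clojure
(require '[clojure.test :refer [deftest is]])

;; Minimal implementation: only the signature and no body,
;; so calling it returns nil.
(defn diamond [$char])

;; API test: a single character goes in, a vector of line strings comes out.
;; With the empty implementation above, this assertion fails (red state).
(deftest diamond-api-test
  (is (= ["a"] (diamond \a))))

(println (diamond \a)) ; prints nil
```

Running the test at this point reports an assertion failure rather than a compile error, which is the more helpful feedback the text describes.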
The quickest way to get back in the green state is to return the expected value
["a"]. This is
part of both the Fake-it-Pattern and the Triangulate-Pattern. If we now refactored the constant
value into the real implementation, we would be using the Fake-it-Pattern. But this would be too big a step
at this point. That is why we continue with the Triangulate-Pattern. With this pattern, we add more tests
until returning hard-coded answers becomes ridiculous and the real implementation becomes more obvious.
cond is similar to a switch
statement in other languages. Depending on the variable
$char, it returns different hard-coded
vectors. There are now three tests and a structure is emerging. That is why we continue with the second part
of the Fake-it-Pattern and the real implementation.
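The triangulated state might look roughly like the following sketch. The vectors for \a and \c appear in the text; the vector for \b is an assumption derived from the kata's usual output format.

```clojure
;; Triangulation: one hard-coded branch per test case written so far.
(defn diamond [$char]
  (cond
    (= $char \a) ["a"]
    ;; Assumed value for \b, not quoted in the text.
    (= $char \b) ["_a_" "b_b" "_a_"]
    (= $char \c) ["__a__" "_b_b_" "c___c" "_b_b_" "__a__"]))

(println (diamond \c))
```

Three branches are enough to make the symmetry of the result visible, which is what motivates moving on to the real implementation.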
We replace the hard-coded vector
["__a__" "_b_b_" "c___c" "_b_b_" "__a__"] with
(into ["__a__" "_b_b_" "c___c" "_b_b_" "__a__"]). When
into is called with only one
argument, it returns that same argument. This change therefore qualifies as a refactoring, because the internal
structure of diamond has been changed but the visible result is the same.
into can also be called with two collections as arguments. In that case, into returns the
first collection with all elements of the second collection included. We take the next tiny step and call
into with the same vector and an empty vector. Unsurprisingly, this also does not change the result, but it is
a little bit closer to the real implementation.
Here, we split the first vector in two parts and combine them again with
into. We are still in
the green state, but now we can see a possibility to reduce the problem. The second vector can be derived
from the first vector if we remove the first vector’s last element and reverse it afterwards.
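The three refactoring steps with into can be sketched as follows. The names hard-coded, step-1, step-2 and step-3 are hypothetical and only serve to compare the intermediate results.

```clojure
(def hard-coded ["__a__" "_b_b_" "c___c" "_b_b_" "__a__"])

;; Step 1: one-argument into returns its argument unchanged.
(def step-1 (into hard-coded))

;; Step 2: adding all elements of an empty vector changes nothing.
(def step-2 (into hard-coded []))

;; Step 3: split the vector in two parts and combine them again.
(def step-3 (into ["__a__" "_b_b_" "c___c"] ["_b_b_" "__a__"]))

;; All steps produce the same visible result, so the tests stay green.
(println (= hard-coded step-1 step-2 step-3)) ; prints true
```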
Both vectors are still independent from each other, but we already removed the last element of the second
vector with pop without breaking any
tests by adding a neutral element at the end. Except for the fact that it gets removed, this element can be
ignored (hence the name).
At this point, to introduce the reversing of the second vector in a non-breaking way, we need to switch the
first and the second element of the second vector before wrapping it in a call to
reverse. With the let macro we are able to
define local variables. We define one and call it
$pyramid, because the shape of the vector we assign to the variable looks
like a pyramid. At first, we just define the variable without using it anywhere. With all these small steps,
we can be relatively confident of keeping the code in the "green".
Now we replace the first occurrence of the pyramid with the variable.
And now the second occurrence.
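After both occurrences have been replaced, the body for the \c case might look like this sketch. The function name diamond-c is hypothetical; in the demo this expression sits inside the diamond function.

```clojure
;; The lower half is derived from the upper half:
;; pop drops the last element (the base line), reverse flips the rest.
(defn diamond-c []
  (let [$pyramid ["__a__" "_b_b_" "c___c"]]
    (into $pyramid (reverse (pop $pyramid)))))

(println (diamond-c))
;; prints [__a__ _b_b_ c___c _b_b_ __a__]
```

Because the base line is popped before reversing, it appears only once in the middle of the diamond.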
Here we paste a prepared
todo function into the project. The
todo function can be
called with arbitrary arguments (marked by the
& sign), which can be accessed within the
function body as a list. The
todo function will always return the last argument.
We assign the responsibility of the subproblem to construct a pyramid to a function called
pyramid. This function does not exist yet, but we would like it to. We wrap it within the
todo function, where we state what the
pyramid function would return. Since we are
not yet sure which arguments we should use to call this new function, for now we can just call it here with
the single argument
:todo. This name signals to us that we will need to return later and make
a decision. We can be sure that writing tests for the
pyramid function first will help us come up with
a good API for that function.
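A possible shape of such a todo helper is sketched below. The parameter name args is an assumption, and the wished-for call is quoted here only so that the sketch runs before pyramid exists; the demo may handle this differently.

```clojure
;; Placeholder helper: accepts any number of arguments (collected by &)
;; and returns the last one.
(defn todo [& args]
  (last args))

;; First argument: the call we wish we could make (quoted, so not evaluated).
;; Last argument: the value that call is supposed to return.
(def upper-half
  (todo '(pyramid :todo)
        ["__a__" "_b_b_" "c___c"]))

(println upper-half) ; prints [__a__ _b_b_ c___c]
```

The helper keeps the tests green while documenting, in the code itself, which function still has to be written.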
The St. Pauli school of TDD defines a recursive approach. The
pyramid function is now the new
SUT and we start again with a most basic API test. We deliberately choose a different test input than in the
diamond context. This prevents us from forgetting hard-coded values in the code. We decide that the
function should have a
start and an
end parameter. The top of the resulting
pyramid should be defined by the
start argument and the base of the pyramid by the
end argument. The height and width of the pyramid can then be calculated from the distance between the
start and the end argument. Since the
pyramid function does not yet
exist, we are now in the red state.
Similar to the
diamond function, we only implement the function signature without returning
anything, to get clear feedback about what the test is expecting and what is still missing. Then we continue with
the Triangulate-pattern to get back in the green state and learn more about the behaviour of the
pyramid function. Again, three tests are sufficient to notice a pattern. If the distance between the
start and the end argument increases by one, one more element is appended to the vector and all existing
elements are surrounded by one more underscore character. For example, given a pyramid with three lines,
the top line has two underscores at the front and two at the back, the middle line has one underscore at the
front and one at the back, and the bottom line has no underscores at the front or at the back. If we start
constructing the pyramid with the top line and surround each line of the pyramid with underscores when we
add another line to the pyramid, we create this distinctive shape. How can we iteratively construct the
pyramid then? First we need to know how to append an element to a vector in Clojure. This is accomplished
with conj: (conj [] 1) returns [1].
To add multiple lines to a vector we can use the
reduce function: the result of
(reduce + 0 [1 2 3]) is
6 and the result of
(reduce conj [] [1 2 3]) is
[1 2 3]. That means that exchanging the vector
["__x__" "_y_y_" "z___z"] with
(reduce conj [] ["__x__" "_y_y_" "z___z"]) gets us closer to the real iterative
construction of the pyramid without changing the behaviour of the
pyramid function. In this
way, we are both making progress and staying in the green.
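The behaviour of conj and reduce described above can be checked with a short sketch. The name lines is hypothetical.

```clojure
;; conj appends an element to the end of a vector.
(println (conj [] 1))                 ; prints [1]

;; reduce folds a collection into an accumulator, starting from an initial value.
(println (reduce + 0 [1 2 3]))        ; prints 6
(println (reduce conj [] [1 2 3]))    ; prints [1 2 3]

;; Rebuilding the hard-coded pyramid line by line yields the same vector,
;; so swapping it in is a behaviour-preserving refactoring.
(def lines ["__x__" "_y_y_" "z___z"])
(println (= lines (reduce conj [] lines))) ; prints true
```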
Instead of calling
reduce with the
conj function directly, we add an
indirection and call
reduce with a function that calls the conj function.
In line 18, we map over each line in
pyramid and call the identity function,
which returns exactly the argument that it was called with
((mapv identity [1 2 3]) returns
[1 2 3] and
(mapv identity ["a" "b"]) returns
["a" "b"]). We use
mapv instead of the
map function, because a pyramid is a vector:
mapv returns a vector and
map returns a sequence.
To prepare for the underscore-surrounding logic, the next little step is to inline the
identity function (the parameter name
p does not fit well, though: Because a
pyramid consists of lines, the parameter name
l to avoid a name clash
would have been better).
The str function converts any value to
a string. We are only mapping over strings, so the strings stay the same. To illustrate:
(mapv str ["a" "b"]) returns
["a" "b"] and
(mapv str [1 2 3]) returns ["1" "2" "3"].
Now we implement the first half of the surrounding logic by only adding an underscore to the front of
l and removing all underscores in front of the strings we are mapping over. To do that, we use the
str function. It can also be called with an arbitrary number of arguments and returns a
concatenation of the string representation of all these arguments.
To complete the underscore-surrounding logic, we do the same with the underscores at the back.
We make the underscore-surrounding logic explicit by extracting a function for it.
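The state after extracting the function might look like the following sketch. The name surround is used later in the text; build-pyramid and the parameter names pyramid and line are assumptions for illustration.

```clojure
;; Surrounds a string with one underscore on each side.
(defn surround [s]
  (str "_" s "_"))

;; Each time a new line is added, all existing lines are surrounded once more.
(defn build-pyramid [middle-lines]
  (reduce (fn [pyramid line]
            (conj (mapv surround pyramid) line))
          []
          middle-lines))

(println (build-pyramid ["x" "y_y" "z___z"]))
;; prints [__x__ _y_y_ z___z]
```

With this step the hard-coded input shrinks from complete pyramid lines to the bare middle lines, which is the observation the next paragraph builds on.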
At this point we notice that the vector at line 23, which we are reducing over, consists of the
middle lines of
(diamond \x), (diamond \y) and
(diamond \z), given
that the first letter of the alphabet would be
\x. That means that if we had a function
that we could pass a character and that would return the corresponding middle line, our
pyramid function would be close to completion. So, we use the
todo function to formulate our need for a middle-line function.
After writing down how to use the
middle-line function, it becomes obvious that the API of middle-line
is flawed. If
(map middle-line [\x \y \z]) should result in
["x" "y_y" "z___z"],
then that would also mean that
(middle-line \x) should result in "x" and
(middle-line \y) should result in
"y_y". Given only one character as an argument, how should middle-line
decide how many underscores it should return? We also have to pass the information about which character is
supposed to be the first character of the alphabet, which is the second character and so on. That is why we
change the API of
middle-line to pass it the character as well as its index within an
arbitrary alphabet. We call this sequence an indexed-alphabet.
Instead of hard-coding the indexed-alphabet, we can generate it from the
start and the
end parameter of the
pyramid function. We formulate our wish for an
indexed-alphabet function and start a new cycle by making the
indexed-alphabet function our new SUT.
Again, the first step in a new cycle according to the St. Pauli school of TDD is to write a simple test at the API-level of the SUT before implementing the SUT. But this time we made a mistake by naming the test identical to the SUT, which results in an error at 11:33.
The SUT does not yet exist so we expect an error. The next step is to write the definition of the SUT. We expect to get rid of the error and get an assertion failure instead.
Because of the identical naming of the SUT and its test, the error does not disappear. Since we were progressing in baby steps, it is faster to go back to the last green state and redo the last step than to wonder about or debug what we did wrong. In this way, we minimise the time in the red.
This time the test is named correctly.
We are now getting the expected assertion failure, because we have not implemented
indexed-alphabet yet. Therefore, we continue with the familiar Triangulate-pattern.
vector is a function that can be
called with arbitrary arguments and returns a vector containing all the arguments.
map-indexed is a function similar to
map, except that it calls the mapping function with
0 and the first
element of the mapped collection,
1 and the second element, etc. By combining map-indexed and
vector, we can replace the hard-coded
[[0 \x] [1 \y] [2 \z]] with
(map-indexed vector [\x \y \z]).
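The combination of map-indexed and vector can be tried out in isolation. The name indexed is hypothetical.

```clojure
;; vector collects its arguments into a vector.
(println (vector 0 \x))                        ; prints [0 x]

;; map-indexed calls the mapping function with the index and the element.
(def indexed (map-indexed vector [\x \y \z]))
(println indexed)                              ; prints ([0 x] [1 y] [2 z])

;; The lazy sequence compares equal to the hard-coded vector of pairs,
;; so the existing tests keep passing.
(println (= [[0 \x] [1 \y] [2 \z]] indexed))   ; prints true
```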
The next step is to replace
[\x \y \z] with something that generates a character sequence
beginning with the
start parameter, ending with the
end parameter, and containing all the
necessary characters in between. In Clojure, we can generate ranges of integers easily with
(range start end). For example,
(range 4 7) returns
(4 5 6). But
range does not work with characters, which is why we prepare to convert a range of integers to a
range of characters. The first tiny step is to introduce the mapping by mapping over the hard-coded vector
with identity as the mapping function. As we used this technique before, we know that this
refactoring is safe and we will stay in the green.
The char function converts an integer to a character and the
int function converts a character to
an integer. So we change the vector of chars to a vector of integers and map over it with the
char function. The two changes compensate for each other. The result is exactly the same sequence as
before and all tests still pass.
With all the transformations in place, we can now replace the vector with a call to range,
which only needs the start and the end, and none of the elements in the middle.
In contrast to our expectation, the test fails and informs us that
[[0 \x] [1 \y] [2 \z]] is
not equal to
[[0 \x] [1 \y]]. We made an off-by-one-error apparently.
(range start end) creates a sequence including
start, but excluding
end. To include
end in our sequence, we need to increment end by one by calling inc.
Now we can replace the hard-coded characters with the
start and end parameters of indexed-alphabet.
And now we can remove the hard-coded branches that check for
\x. After that, we can also remove the cond macro.
Only now do we complete our first St. Pauli TDD cycle by finishing with a validation test that is structurally different to the previous test data. This is helpful to avoid overfitting to the training set we used to drive the implementation.
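Putting the steps of this cycle together, indexed-alphabet might end up looking like the sketch below. The exact form in the demo may differ, and the validation input \a to \d is an assumed example of structurally different test data.

```clojure
;; range excludes its upper bound, which caused the off-by-one error:
(println (range 4 7)) ; prints (4 5 6)

;; Generates pairs of [index character] from start to end, both inclusive.
(defn indexed-alphabet [start end]
  (map-indexed vector
               (map char
                    ;; inc makes the end character part of the range
                    (range (int start) (inc (int end))))))

;; Data used while triangulating:
(println (indexed-alphabet \x \z)) ; prints ([0 x] [1 y] [2 z])

;; Validation with different data (other letters, other length):
(println (indexed-alphabet \a \d)) ; prints ([0 a] [1 b] [2 c] [3 d])
```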
We start the next St. Pauli TDD cycle by writing a test for the middle-line function.
The definition of the
middle-line function makes use of destructuring. This is a technique to
assign names to elements of a collection parameter. As formulated in the tests, the API of the
middle-line function expects a vector as the single argument, representing one element of an indexed alphabet. The first
element of that vector is the index within the alphabet, so we assign the name
index to that
element. The second element is the actual character, to which we assign the name
$char (again, the
$ is just there to avoid a name collision with the function char).
We continue again with the Triangulate-pattern.
After three examples we see that all middle lines for characters with an index larger than
0 have a
similar structure. We do not think that any additional examples would lead to any more insight. Instead we
notice that the middle line string consists of three parts: for all characters with an index larger than
0, the first and the last part are always the same and only the middle part changes dynamically
depending on the input. Hence we use tiny baby steps to split the string into three parts.
This structure resembles our surround-logic, except that we do not surround the middle part with
underscores but with the
$char parameter instead. Since our
surround function can
only surround values with underscores, we upgrade it so that it can surround a value with arbitrary values.
We do not write a test for that upgrade because we feel confident that we can simply write the correct
implementation in short time. This approach is called "Obvious Implementation". As a rule of thumb, we only
use this pattern when writing the real implementation is faster than an average TDD cycle.
Now we focus on the dynamic part of the middle line. We notice a pattern in how the middle line is created
depending on the arguments. If the index is
0, the middle line is simply the character. If the index is
1, one underscore is surrounded by no additional underscores and the input character.
If the index is
2, one underscore is surrounded by one set of underscores and the input
character. If the index is
3, one underscore is surrounded by two sets of underscores and the
input character. The else-case describes, up to this point, only the behavior when the index is
2. That means that if we surround an underscore with one set of underscores, we get the same result
as the hard-coded "___".
At this point we introduce two more functions.
first takes a collection as argument and returns the first element of that collection:
(first [3 4 5]) returns
3. An alternative to
first is to use the
nth function and call it with the
collection and the index 0:
(nth [3 4 5] 0) returns 3.
iterate takes a mapping function and an initial value and returns an infinite sequence that starts with the initial
value and whose consecutive values are the mapping function applied to the previous element of the sequence:
(first (iterate inc 0)) returns 0 and
(nth (iterate inc 50) 3) returns 53.
Since (first (iterate surround "_")) returns
"_", we can replace the hard-coded string with this expression without
changing any visible behaviour.
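These calls can be verified with a short sketch. The one-argument surround shown here is a simplified stand-in for the function from the demo.

```clojure
;; Simplified stand-in: surrounds a string with underscores.
(defn surround [s]
  (str "_" s "_"))

(println (first [3 4 5]))            ; prints 3
(println (nth [3 4 5] 0))            ; prints 3

;; iterate builds the lazy, infinite sequence x, (f x), (f (f x)), ...
(println (first (iterate inc 0)))    ; prints 0
(println (nth (iterate inc 50) 3))   ; prints 53

;; The first element is the initial value itself, so this is just "_".
(println (first (iterate surround "_"))) ; prints _
(println (take 3 (iterate surround "_"))) ; prints (_ ___ _____)
```

Because the sequence is lazy, only the elements that are actually requested are computed.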
In line 34, we reference the
surround function three times. The first call surrounds the
underscores with the input character. This is different from the next two references to the
surround function. We can remove the duplication of calling
surround twice by replacing
(surround (first (iterate surround "_"))) with
(second (iterate surround "_")), which is equivalent to
(nth (iterate surround "_") 1).
We also notice that the nth element depends on the index of the input character. We can replace the
hard-coded 1 by decrementing the index by
1 via the built-in dec function.
Now it is time to replace the hard-coded
"x" with the input character.
After that, our else-case can also handle the case when the index equals
1, so we can delete that branch.
We can replace the last hard-coded value in line 32 with the input character converted to a string.
The branching can be simplified by replacing the
cond macro with a simple if.
The condition can also be simplified by only checking whether the index is positive.
We finish the current St. Pauli TDD cycle for
middle-line by adding a validation test.
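The outcome of this cycle could look like the following sketch. The two-arity form of surround is an assumption about how the "surround with arbitrary values" upgrade was implemented, and the validation input [3 \d] is an assumed example.

```clojure
;; Surrounds s with underscores by default, or with an arbitrary value.
(defn surround
  ([s] (surround s "_"))
  ([s with] (str with s with)))

;; Takes one element of an indexed alphabet, destructured into index and $char.
(defn middle-line [[index $char]]
  (if (pos? index)
    ;; (dec index) selects how often the inner "_" has been surrounded
    (surround (nth (iterate surround "_") (dec index)) $char)
    (str $char)))

(println (middle-line [0 \x])) ; prints x
(println (middle-line [1 \y])) ; prints y_y
(println (middle-line [2 \z])) ; prints z___z

;; Validation with different data:
(println (middle-line [3 \d])) ; prints d_____d
```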
Because we successfully finished the last SUT, we search for references of "todo" to verify that we can now
perform the real implementation. We find two references, one in line 72 where we wrap the call to the
pyramid function with the
todo function. We cannot resolve this reference, because
pyramid itself is also using the
todo function. Even if we thought the St. Pauli
TDD cycle for the
pyramid function was completed, we would realize at this point that it is
not. The second reference is at line 50. Because both
middle-line and indexed-alphabet have completed their St. Pauli TDD cycle, we can remove the
todo wrapper and just call the real implementation.
Now the else-case in line 47 can also handle all other cases so we can remove line 45 and line 46. As soon
as there is no longer any branching, we can remove the
cond macro altogether.
In hindsight, this refactoring was questionable, because the parameter name
line no longer
fits. In line 46 we are now calling
(middle-line line). This does not make sense, because
middle-line expects to be called with a pair of index and character
and not with a line.
After adding a validation test, we can now finish the current St. Pauli TDD cycle for the
pyramid function. Again, we make sure that the test data is as different as possible compared
to the previous test data.
Now we can remove the last
todo reference. Thanks to our tests, we know at this point how to call the
pyramid function. One could argue that we violated YAGNI since our diamond function
always starts with the
\a character but the
pyramid function is able to generate
pyramids that start with arbitrary characters. On the other hand, this design reveals on the highest
abstraction level that the diamonds always start with the
\a character. If we formulated this
restriction within the
pyramid function, it would have been buried one abstraction level deeper
in the code. One could also argue, based on the Single Responsibility
Principle, that otherwise, if we changed the starting character of a diamond to a capital
A, not only
would the diamond tests break but so would the pyramid tests.
Now, since our else-case can handle all other cases, we can remove the hard-coded branches.
Approaching the end of the kata, we group the test and production code together.
All tests pass. Since we do not need the
todo function any longer, we remove it.
Congratulations, we completed the Diamond-Kata!
Now we complete the final St. Pauli TDD cycle and make sure we really completed the kata by adding the final validation test at the API-level.
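For orientation, the finished kata might be assembled roughly as follows. This is a sketch reconstructed from the steps described above, not the code from the demo: in particular, mapping middle-line over the indexed alphabet before reducing is an assumption, and the \d case is an assumed final validation input.

```clojure
(defn surround
  ([s] (surround s "_"))
  ([s with] (str with s with)))

(defn indexed-alphabet [start end]
  (map-indexed vector (map char (range (int start) (inc (int end))))))

(defn middle-line [[index $char]]
  (if (pos? index)
    (surround (nth (iterate surround "_") (dec index)) $char)
    (str $char)))

;; Upper half of the diamond, from the start character down to the end character.
(defn pyramid [start end]
  (reduce (fn [lines line] (conj (mapv surround lines) line))
          []
          (map middle-line (indexed-alphabet start end))))

;; The restriction "diamonds start at \a" lives at the top level.
(defn diamond [$char]
  (let [$pyramid (pyramid \a $char)]
    (into $pyramid (reverse (pop $pyramid)))))

(doseq [line (diamond \d)]
  (println line))
```

Printing the result line by line shows the diamond shape directly, which makes a final API-level check easy to read.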
Does following TDD imply that you always have to program in as cumbersome a way as shown in the demo?
The demo was solely optimized to show how short feedback cycles can be kept.
Since there are better development techniques out there, isn't TDD a waste of time?
Even though Thinking before programming, Hammock-driven Development, Property-Based Testing, REPL-Driven development, and Type-driven Development are useful techniques, this does not imply that TDD is useless, nor that the aforementioned techniques and Test-driven Development are mutually exclusive.
Where can I find the source code used in the demo?
The code is available on GitHub.