St. Pauli school of TDD

A systematic approach to Test-driven Development that leads to continuous progress


Test-driven development is designed to provide regular feedback, at intervals of minutes or even seconds, on whether the current software is free of errors. If too much code is written between test runs, development slows down, because larger problems surface once the tests can finally be run again. We often notice that many developers handle the first two or three TDD cycles smoothly, but that the subsequent cycles become so slow that the process can hardly be called test-driven development. We have therefore developed a systematic approach that leads to continuous progress in short TDD cycles. Following the two well-known TDD approaches, the "Chicago school" (also known as the Detroit school) and the "London school", we have named this approach the "St. Pauli school of TDD".


Start with a simple test at the API-level of your Subject under Test (SUT).
Without using mocks, work from the highest abstraction level to the lowest abstraction level of your SUT. Reduce the scope of a SUT by using stubs that simulate that they have solved one or more subtasks of the SUT.
If there are stubs in the SUT, replace each stub with its real implementation by making it the new SUT, and start again with a simple test at the API-level of that new SUT.
Always add a validation test after each SUT has been finished to avoid overfitting to the training set that you used to develop the SUT.
During development, treat your suite of tests as append-only. Do not delete or comment out a test that is formally correct, but conflicts with your current development state. If you painted yourself into a corner, revert to a state where you can make continuous progress again instead.

Comparison to other schools of TDD

Each school takes a defined position on three common aspects:

| School          | St. Pauli  | Detroit    | London     | Munich     |
|-----------------|------------|------------|------------|------------|
| Direction       | Outside-In | Inside-Out | Outside-In | Outside-In |
| First Test Case | Simple     | Simple     | Simple     | Complex    |
| Use of Mocks    | avoid      | avoid      | embrace    | avoid      |

As shown above, the St. Pauli school of TDD differs from each of the other schools in exactly one of the three aspects.

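Besides these aspects, the St. Pauli school has two additional requirements, which are not an integral part of the other schools: a validation test after each finished SUT, and a test suite that is treated as append-only during development.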


We want to demonstrate the method using the Diamond Kata as an example:


We start with a new Clojure project. We choose Clojure because its minimal syntax gives it a high signal-to-noise ratio. Clojure is also highly readable once you are familiar with its prefix notation. Example: f(x) is written (f x).


We enter the first TDD cycle with a failing (red) test that has been auto-generated by the Clojure build tool. We are therefore in the red state.


To get into the green state as quickly as possible, we assert that 0 is indeed 0. This is not very useful, but we are just warming up.


The first step of the St. Pauli school of TDD is to start with a simple API test. So we change the test's name and specify the API of the diamond function. We already made some design decisions there: The input of the function should be a single character, the output should be a vector containing a string for each line and the function name should be "diamond". Since a function with this name does not exist yet, the test runner prints an error and we are back in the red state.


To make some progress towards the green state, we write a minimal implementation of the diamond function. defn is a macro (a special kind of function) that creates a new function with the name given as its first argument (here: diamond). The second argument to defn is a vector of the function's parameters. The diamond function has only one parameter, named $char. The $ sign has no special meaning; we just use it as a prefix because Clojure already provides a function called char. Since the diamond function does not return anything yet, we are still in the red state, but now the test result is much more helpful: expected: (= ["a"] (diamond \a)), actual: (not (= ["a"] nil)). This means that (diamond \a) should return ["a"], but it returns nil, and nil is not ["a"].


The quickest way to get back into the green state is to return the expected value ["a"]. This first step is common to both the Fake-it-Pattern and the Triangulate-Pattern. If we now refactored the constant value into the real implementation, we would be following the Fake-it-Pattern, but that would be too big a step at this point. That is why we continue with the Triangulate-Pattern: we add more tests until returning hard-coded answers becomes ridiculous and the real implementation becomes more obvious.
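
As a minimal sketch of this state: the original listing is not reproduced here, so the test name diamond-test and the use of the standard clojure.test library are assumptions. The faked implementation and its test might look like this:

```clojure
(require '[clojure.test :refer [deftest is]])

;; Fake it: return the expected value as a constant.
;; $char is not used yet; the $ prefix only avoids a clash with clojure.core/char.
(defn diamond [$char]
  ["a"])

;; Hypothetical test name; it asserts the API decided on above:
;; a single character in, a vector with one string per line out.
(deftest diamond-test
  (is (= ["a"] (diamond \a))))
```

Removing the body of diamond again makes it return nil, which reproduces the assertion failure described in the previous step.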


The macro cond is similar to a switch statement in other languages. Depending on the variable $char, it returns different hard-coded vectors. There are now three tests and a structure is emerging. That is why we continue with the second part of the Fake-it-Pattern and the real implementation.
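
A sketch of what the triangulated state could look like. The hard-coded vectors for \a and \b are assumptions inferred from the \c vector quoted in the next step:

```clojure
;; Triangulation: one hard-coded answer per test case.
;; cond evaluates the conditions in order and returns the value
;; that belongs to the first condition that is true.
(defn diamond [$char]
  (cond
    (= $char \a) ["a"]
    (= $char \b) ["_a_" "b_b" "_a_"]
    :else        ["__a__" "_b_b_" "c___c" "_b_b_" "__a__"]))

(diamond \b) ;=> ["_a_" "b_b" "_a_"]
```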


We replace the hard-coded vector ["__a__" "_b_b_" "c___c" "_b_b_" "__a__"] with (into ["__a__" "_b_b_" "c___c" "_b_b_" "__a__"]). When into is called with only one argument, it returns that same argument. This change therefore qualifies as a refactoring: the internal structure of diamond has changed, but the visible result is the same.


into can also be called with two collections as arguments. In that case, into returns the first collection with all elements of the second collection added to it. We take the next tiny step and call into with the same vector and an empty vector. Unsurprisingly, this does not change the result either, but it is a little bit closer to the real implementation.


Here, we split the first vector in two parts and combine them again with into. We are still in the green state, but now we can see a possibility to reduce the problem. The second vector can be derived from the first vector if we remove the first vector’s last element and reverse it afterwards.


Both vectors are still independent of each other, but we have already introduced the removal of the last element of the second vector via pop. To do this without breaking any tests, we add a neutral element at the end of the second vector. Apart from the fact that it gets removed, this element can be ignored (hence the name).


At this point, to introduce the reversing of the second vector in a non-breaking way, we need to swap the first and the second element of the second vector before wrapping it in a call to the reverse function.
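
The last few refactoring steps can be summarised as a chain of expressions that all evaluate to the same value. This is a sketch, not the original listing: the keyword :neutral is an assumed stand-in for the neutral element, and the names diamond-c and steps exist only for this illustration.

```clojure
;; The value every step has to preserve.
(def diamond-c ["__a__" "_b_b_" "c___c" "_b_b_" "__a__"])

(def steps
  [;; 1. into with a single argument returns that argument
   (into ["__a__" "_b_b_" "c___c" "_b_b_" "__a__"])
   ;; 2. into with an empty second collection adds nothing
   (into ["__a__" "_b_b_" "c___c" "_b_b_" "__a__"] [])
   ;; 3. split the vector in two parts
   (into ["__a__" "_b_b_" "c___c"] ["_b_b_" "__a__"])
   ;; 4. pop removes the last element, here the neutral one
   (into ["__a__" "_b_b_" "c___c"] (pop ["_b_b_" "__a__" :neutral]))
   ;; 5. swap the first two elements, then reverse the popped vector
   (into ["__a__" "_b_b_" "c___c"] (reverse (pop ["__a__" "_b_b_" :neutral])))])

(every? #(= diamond-c %) steps) ;=> true
```

After step 5, the second vector differs from the first one only in its neutral element, which is what makes it possible to derive it from the first vector.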


With the let macro we are able to define local variables. We call it $pyramid, because the shape of the vector we assign to the variable looks like a pyramid. At first, we just define the variable without using it anywhere. With all these small steps, we can be relatively confident in keeping the code in the "green".


Now we replace the first occurrence of the pyramid with the variable.


And now the second occurrence.


Here we paste a prepared todo function into the project. The todo function can be called with arbitrary arguments (marked by the & sign), which can be accessed within the function body as a list called args. The todo function will always return the last argument.


We assign the responsibility for the subproblem of constructing a pyramid to a function called pyramid. This function does not exist yet, but we would like it to. We wrap the call to it in the todo function, where we also state what the pyramid function would return. Since we are not yet sure which arguments we should pass to this new function, for now we just call it with the single argument :todo. This name signals to us that we will need to come back later and make a decision. We can be sure that writing tests for pyramid first will help us come up with a good API for that function.
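
A sketch of the todo helper and its use. How the original code keeps the call to the not-yet-existing pyramid function from being evaluated is not shown in the text; quoting the call, as done here, is an assumption, as are the hard-coded vectors for \a and \b.

```clojure
;; Accepts any number of arguments and returns the last one.
;; Everything before the last argument only documents our intent.
(defn todo [& args]
  (last args))

(defn diamond [$char]
  (cond
    (= $char \a) ["a"]
    (= $char \b) ["_a_" "b_b" "_a_"]
    :else (let [$pyramid (todo '(pyramid :todo)            ; the call we wish for (quoted, so not evaluated)
                               ["__a__" "_b_b_" "c___c"])] ; what that call should return
            (into $pyramid (reverse (pop $pyramid))))))

(diamond \c) ;=> ["__a__" "_b_b_" "c___c" "_b_b_" "__a__"]
```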


The St. Pauli school of TDD defines a recursive approach. The pyramid function is now the new SUT, and we start again with the most basic API test. We deliberately choose a different test input than in the diamond context. This prevents us from forgetting hard-coded values in the code. We decide that the pyramid function should have a start and an end parameter. The top of the resulting pyramid should be defined by the start argument and the base of the pyramid by the end argument. The height and width of the pyramid can then be calculated from the distance between the start and end arguments. Since the pyramid function does not exist yet, we are now in the red state.


As with the diamond function, we first implement only the function signature without returning anything, to get clear feedback about what the test expects and what is still missing. Then we continue with the Triangulate-pattern to get back into the green state and learn more about the behaviour of the pyramid function.


Again, three tests are sufficient to notice a pattern. If the distance between the start and the end argument increases by one, one more line is appended to the vector and all existing lines are surrounded by one more underscore character on each side. For example, given a pyramid with three lines, the top line has two underscores at the front and two at the back, the middle line has one underscore at the front and one at the back, and the bottom line has no underscores at the front or at the back. If we start constructing the pyramid with the top line and surround each existing line with underscores whenever we add another line, we create this distinctive shape. How can we construct the pyramid iteratively, then? First we need to know how to append an element to a vector in Clojure. This is accomplished with conj: (conj [] 1) results in [1]. To add multiple lines to a vector, we can use the reduce function: the result of (reduce + 0 [1 2 3]) is 6 and the result of (reduce conj [] [1 2 3]) is [1 2 3]. That means that exchanging the vector ["__x__" "_y_y_" "z___z"] with (reduce conj [] ["__x__" "_y_y_" "z___z"]) gets us closer to the real iterative construction of the pyramid without changing the behaviour of the pyramid function. In this way, we are both making progress and staying in the green.


Instead of calling reduce with the conj function directly, we add an indirection and call reduce with a function that calls the conj function.
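
The two reduce steps can be sketched as follows. The names direct and indirect and the parameter names lines and line are hypothetical and chosen only for this illustration:

```clojure
(conj [] 1)               ;=> [1]
(reduce + 0 [1 2 3])      ;=> 6
(reduce conj [] [1 2 3])  ;=> [1 2 3]

;; Step 1: rebuild the hard-coded pyramid element by element.
(def direct
  (reduce conj [] ["__x__" "_y_y_" "z___z"]))

;; Step 2: the same reduction, but through an explicit function.
;; This indirection gives us a place to add more logic later.
(def indirect
  (reduce (fn [lines line] (conj lines line))
          []
          ["__x__" "_y_y_" "z___z"]))

(= direct indirect ["__x__" "_y_y_" "z___z"]) ;=> true
```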


In line 18, we map over each line in pyramid and call the identity function, which returns exactly the argument that it was called with: (mapv identity [1 2 3]) returns [1 2 3] and (mapv identity ["a" "b"]) returns ["a" "b"]. We use mapv instead of map because a pyramid is a vector: mapv returns a vector, whereas map returns a sequence.


To prepare for the underscore-surrounding logic, the next little step is to inline the identity function. The parameter name p does not fit well, though: because a pyramid consists of lines, the name line (or l, to avoid a name clash) would have been better.


The str function converts any value to a string. We are only mapping over strings, so the strings stay the same. To illustrate: (mapv str ["a" "b"]) returns ["a" "b"] and (mapv str [1 2 3]) returns ["1" "2" "3"].
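
These small mapping steps can be checked in isolation. The following sketch uses only functions from clojure.core and the parameter name l mentioned above:

```clojure
(mapv identity [1 2 3])            ;=> [1 2 3]
(mapv (fn [l] l) ["a" "b"])        ;=> ["a" "b"]   identity, inlined
(mapv (fn [l] (str l)) ["a" "b"])  ;=> ["a" "b"]   str leaves strings as they are
(mapv str [1 2 3])                 ;=> ["1" "2" "3"]

;; mapv keeps the result a vector, map does not.
(vector? (mapv identity [1 2 3]))  ;=> true
(vector? (map identity [1 2 3]))   ;=> false
```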


Now we implement the first half of the surrounding logic: we add an underscore to the front of the line l and, to compensate, remove all leading underscores from the strings we are mapping over. To do that, we use the str function. It can also be called with an arbitrary number of arguments and returns the concatenation of the string representations of all these arguments.


To complete the underscore-surrounding logic, we do the same with the underscores at the back.


We make the underscore-surrounding logic explicit by extracting a function for it.
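
A sketch of the two halves of this refactoring, reduced to the hard-coded case for \x to \z. The names half-way and complete and the parameter names are assumptions. The function name surround is taken from the later steps of the walkthrough.

```clojure
;; First half: only the leading underscores are generated,
;; the trailing ones are still hard-coded in the input strings.
(def half-way
  (reduce (fn [lines line] (conj (mapv (fn [l] (str "_" l)) lines) line))
          []
          ["x__" "y_y_" "z___z"]))

;; Extracted surround logic: one underscore on each side.
(defn surround [line]
  (str "_" line "_"))

;; Complete: every existing line is surrounded once per added line.
(def complete
  (reduce (fn [lines line] (conj (mapv surround lines) line))
          []
          ["x" "y_y" "z___z"]))

(= half-way complete ["__x__" "_y_y_" "z___z"]) ;=> true
```

The vector that is reduced over now contains only the unpadded lines, which is what the next observation builds on.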


At this point we notice that the vector at line 23, which we are reducing over, consists of the middle lines of (diamond \x), (diamond \y) and (diamond \z), given that the first letter of the alphabet would be \x. That means that if we had a function middle-line that takes a character and returns the corresponding middle line, our pyramid function would be close to completion. So we use the todo function to formulate our need for a middle-line function.


After writing down how to use the middle-line function, it becomes obvious that its API is flawed. If (map middle-line [\x \y \z]) should result in ["x" "y_y" "z___z"], then (middle-line \x) should result in "x", while (middle-line \y) should result in "y_y". Given only one character as an argument, how should middle-line decide how many underscores to return? We also have to pass the information about which character is the first character of the alphabet, which is the second, and so on. That is why we change the API of middle-line so that it receives the character as well as its index within an arbitrary alphabet. We call the sequence of these index-character pairs an indexed-alphabet.


Instead of hard-coding the indexed-alphabet, we can generate it from the start and the end parameter of the pyramid function. We formulate our wish for an indexed-alphabet function and start a new cycle by making the indexed-alphabet function our new SUT.


Again, the first step in a new cycle according to the St. Pauli school of TDD is to write a simple test at the API-level of the SUT before implementing the SUT. But this time we made a mistake by giving the test the same name as the SUT, which results in an error at 11:33.


The SUT does not yet exist so we expect an error. The next step is to write the definition of the SUT. We expect to get rid of the error and get an assertion failure instead.


Because of the identical naming of the SUT and its test, the error does not disappear. Since we were progressing in baby steps, it is faster to go back to the last green state and redo the last step than to wonder about or debug what we did wrong. In this way, we minimise the time in the red.


This time the test is named correctly.


We are now getting the expected assertion failure, because we have not implemented indexed-alphabet yet. Therefore, we continue with the familiar Triangulate-pattern.


vector is a function that can be called with arbitrary arguments and returns a vector containing all the arguments. map-indexed is a function similar to map, except that it calls the mapping function with 0 and the first element of the mapped collection, 1 and the second element, etc. By combining both map-indexed and vector, we can replace the hard-coded [[0 \x][1 \y][2 \z]] with (map-indexed vector [\x \y \z]).
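
A short sketch of this replacement, using only functions from clojure.core:

```clojure
(vector 0 \x)                    ;=> [0 \x]
(map-indexed vector [\x \y \z])  ;=> ([0 \x] [1 \y] [2 \z])

;; The lazy sequence compares equal to the hard-coded vector,
;; so the existing tests keep passing.
(= [[0 \x] [1 \y] [2 \z]]
   (map-indexed vector [\x \y \z])) ;=> true
```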


The next step is to replace [\x \y \z] with something that generates a character sequence beginning with the start parameter, ending with the end parameter, and containing all the necessary characters in between. In Clojure, we can generate ranges of integers easily with (range start end). For example, (range 4 7) returns (4 5 6). But range does not work with characters, which is why we prepare to convert a range of integers to a range of characters. The first tiny step is to introduce the mapping by mapping over the hard-coded vector with identity as the mapping function. As we have used this technique before, we know that this refactoring is safe and that we will stay in the green.


The char function converts an integer to a character and the int function converts a character to an integer. So we change the vector of chars to a vector of integers and map over it with the char function. The two changes cancel each other out: the result is the exact same sequence as before and all tests still pass.


With all the transformation in place, we can now replace the vector with a call to range, which only needs the start and the end, none of the elements in the middle.


Contrary to our expectation, the test fails and informs us that [[0 \x] [1 \y] [2 \z]] is not equal to [[0 \x] [1 \y]]. Apparently we made an off-by-one error.


(range start end) creates a sequence that includes start but excludes end. To include end in our sequence, we need to increment end by one by calling (inc end).
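
The conversion between characters and integers and the off-by-one error can be reproduced with a few expressions. This is a sketch using only clojure.core:

```clojure
(int \x)                  ;=> 120
(char 120)                ;=> \x
(map char [120 121 122])  ;=> (\x \y \z)
(range 4 7)               ;=> (4 5 6)

;; range excludes its end, so the last character is missing ...
(map char (range (int \x) (int \z)))        ;=> (\x \y)

;; ... until we increment the end by one.
(map char (range (int \x) (inc (int \z))))  ;=> (\x \y \z)
```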


Now we can replace the hard-coded characters with the start and end parameters of the indexed-alphabet function.


And now we can remove the hard-coded branches for when end equals \y and end equals \x. After that, we can also remove the cond macro.


Only now do we complete our first St. Pauli TDD cycle by finishing with a validation test that is structurally different to the previous test data. This is helpful to avoid overfitting to the training set we used to drive the implementation.
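
Putting the previous steps together, the finished function might look like the following sketch. The validation input \A to \D is a hypothetical example of structurally different test data, not necessarily the one used in the demo.

```clojure
;; Pairs every character from start to end (inclusive) with its index.
(defn indexed-alphabet [start end]
  (map-indexed vector
               (map char (range (int start) (inc (int end))))))

;; Data used while triangulating:
(indexed-alphabet \x \z) ;=> ([0 \x] [1 \y] [2 \z])

;; Validation with different data (upper case, four characters):
(indexed-alphabet \A \D) ;=> ([0 \A] [1 \B] [2 \C] [3 \D])
```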


We start the next St. Pauli TDD cycle by writing a test for the middle-line function.


The definition of the middle-line function makes use of destructuring. This is a technique to assign names to elements of a collection parameter. As formulated in the tests, the API of the middle-line function expects a vector as the single argument, representing one element of an indexed alphabet. The first element of that vector is the index within the alphabet, so we assign the name index to that element. The second element is the actual character which we assign the name $char (again, the prefix $ is just there to avoid a name collision with the function char).
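
Destructuring in a parameter vector can be illustrated with a small helper. The function describe-entry is hypothetical and exists only to show how the two names are bound:

```clojure
;; The single parameter is a vector; its first element is bound
;; to index and its second element to $char.
(defn describe-entry [[index $char]]
  (str "index " index ", character " $char))

(describe-entry [1 \y]) ;=> "index 1, character y"
```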


We continue again with the Triangulate-pattern.


After three examples we see that all middle lines for characters with an index larger than 0 have a similar structure. We do not think that any additional examples would lead to any more insight. Instead, we notice that the middle line string consists of three parts: for all characters with an index larger than 0, the first and the last part are always the same and only the middle part changes dynamically depending on the input. Hence we use tiny baby steps to split the string into three parts.


This structure resembles our surround logic, except that we do not surround the middle part with underscores but with the $char parameter instead. Since our surround function can only surround values with underscores, we upgrade it so that it can surround a value with arbitrary values. We do not write a test for that upgrade because we feel confident that we can simply write the correct implementation in a short time. This approach is called "Obvious Implementation". As a rule of thumb, we only use this pattern when writing the real implementation is faster than an average TDD cycle.
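
One way to upgrade surround is a second arity. This is a sketch: the argument order and the parameter names with and value are assumptions, since the original listing is not shown.

```clojure
(defn surround
  ;; One argument: surround with underscores, as before.
  ([value] (surround "_" value))
  ;; Two arguments: surround value with an arbitrary value.
  ([with value] (str with value with)))

(surround "x")      ;=> "_x_"
(surround \y "_")   ;=> "y_y"
(surround \z "___") ;=> "z___z"
```

Keeping the one-argument form means that all existing callers continue to work unchanged.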


Now we focus on the dynamic part of the middle line. We notice a pattern in how the middle line is created depending on the arguments. If the index is 0, the middle line is simply the character. If the index is 1, one underscore is surrounded by no additional underscores and the input character. If the index is 2, one underscore is surrounded by one set of underscores and the input character. If the index is 3, one underscore is surrounded by two sets of underscores and the input character. The else-case describes, up to this point, only the behavior when the index is 2. That means, if we surround an underscore with one set of underscores, we get the same result as the hard-coded "___" string.


At this point we introduce two more functions. first takes a collection as its argument and returns the first element of that collection: (first [3 4 5]) returns 3. An alternative to first is to call the nth function with the collection and the index 0: (nth [3 4 5] 0) returns 3. iterate takes a mapping function and an initial value and returns an infinite sequence that starts with the initial value and in which each subsequent value is the mapping function applied to the previous element: (first (iterate inc 0)) returns 0 and (nth (iterate inc 50) 3) returns 53. Since (first (iterate surround "_")) returns "_", we can replace the hard-coded string with this expression without changing any visible behaviour.


In line 34, we reference the surround function three times. The first call surrounds the underscores with the input character, which is different from the other two references. We remove the duplication in those two references by replacing (surround (first (iterate surround "_"))) with (second (iterate surround "_")), which is equivalent to (nth (iterate surround "_") 1).


We also notice that the argument to nth depends on the index of the input character. We can replace the hard-coded 1 with the index decremented by 1, using the built-in function dec.
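
The equivalences used in these steps can be checked directly. The sketch assumes a surround function whose one-argument form adds an underscore on each side:

```clojure
;; Assumed shape of surround at this point in the walkthrough.
(defn surround
  ([value] (surround "_" value))
  ([with value] (str with value with)))

(first (iterate inc 0))   ;=> 0
(nth (iterate inc 50) 3)  ;=> 53

(first (iterate surround "_"))             ;=> "_"
(surround (first (iterate surround "_")))  ;=> "___"
(second (iterate surround "_"))            ;=> "___"
(nth (iterate surround "_") 1)             ;=> "___"

;; For index 2, (dec index) is 1, so nothing changes for that case.
(nth (iterate surround "_") (dec 2))       ;=> "___"
```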


Now it is time to replace the hard-coded "x" with the input character.


After that, our else-case can also handle the case when the index equals 1, so we can delete line 33.


We can replace the last hard-coded value in line 32 with the input character converted to a string.


The branching can be simplified by replacing the cond macro with a simple if.


The condition can also be simplified by only checking whether the index is positive.


We finish the current St. Pauli TDD cycle for middle-line by adding a validation test.
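
After these simplifications, middle-line might look like the following sketch. The surround helper is repeated so that the example runs on its own. Its argument order and the validation inputs are assumptions.

```clojure
(defn surround
  ([value] (surround "_" value))
  ([with value] (str with value with)))

;; Index 0: just the character. Otherwise: the character around
;; an underscore that has been surrounded (dec index) times.
(defn middle-line [[index $char]]
  (if (pos? index)
    (surround $char (nth (iterate surround "_") (dec index)))
    (str $char)))

(middle-line [0 \x]) ;=> "x"
(middle-line [1 \y]) ;=> "y_y"
(middle-line [2 \z]) ;=> "z___z"

;; Hypothetical validation data, different from the data above:
(middle-line [1 \B]) ;=> "B_B"
```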


Because we successfully finished the last SUT, we search for references of "todo" to verify that we can now perform the real implementation. We find two references, one in line 72 where we wrap the call to the pyramid function with the todo function. We cannot resolve this reference, because pyramid itself is also using the todo function. Even if we thought the St. Pauli TDD cycle for the pyramid function was completed, we would realize at this point that it is not. The second reference is at line 50. Because both middle-line and indexed-alphabet have completed their St. Pauli TDD cycle, we can replace the todo wrapper and just call the real implementation.


Now the else-case in line 47 can also handle all other cases so we can remove line 45 and line 46. As soon as there is no longer any branching, we can remove the cond macro altogether.


In hindsight, this refactoring was questionable, because the parameter name line no longer fits. In line 46 we are now calling (middle-line line). This does not make sense, because middle-line expects to be called with a pair of index and character and not with a line.


After adding a validation test, we can now finish the current St. Pauli TDD cycle for the pyramid function. Again, we make sure that the test data is as different as possible compared to the previous test data.


Now we can remove the last todo reference. Thanks to our tests, we know at this point how to call the pyramid function. One could argue that we violated YAGNI, since our diamond function always starts with the \a character but the pyramid function is able to generate pyramids that start with arbitrary characters. On the other hand, this design reveals at the highest abstraction level that diamonds always start with the \a character. If we had formulated this restriction within the pyramid function, it would have been buried one abstraction level deeper in the code. One could also argue with the Single Responsibility Principle: had the restriction lived in pyramid, then changing the starting character of a diamond to a capital A would break not only the diamond tests but also the pyramid tests.


Now, since our else-case can handle all other cases we can remove the cond macro.
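
For orientation, here is a sketch of how the finished pieces could fit together. It is reconstructed from the steps described above and is not the original listing. Parameter names such as lines and entry and the argument order of surround are assumptions.

```clojure
(defn surround
  ([value] (surround "_" value))
  ([with value] (str with value with)))

(defn indexed-alphabet [start end]
  (map-indexed vector
               (map char (range (int start) (inc (int end))))))

(defn middle-line [[index $char]]
  (if (pos? index)
    (surround $char (nth (iterate surround "_") (dec index)))
    (str $char)))

;; Each new line widens all previous lines by one underscore per side.
(defn pyramid [start end]
  (reduce (fn [lines entry] (conj (mapv surround lines) (middle-line entry)))
          []
          (indexed-alphabet start end)))

;; A diamond is a pyramid followed by its mirror image without the base.
(defn diamond [$char]
  (let [$pyramid (pyramid \a $char)]
    (into $pyramid (reverse (pop $pyramid)))))

(pyramid \x \z) ;=> ["__x__" "_y_y_" "z___z"]
(diamond \c)    ;=> ["__a__" "_b_b_" "c___c" "_b_b_" "__a__"]
```

Note that only diamond knows that the alphabet starts at \a, while pyramid works for any start character.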


Approaching the end of the kata, we group the test and production code together.


All tests pass. Since we do not need the todo function any longer, we remove it. Congratulations, we completed the Diamond-Kata!


Now we complete the final St. Pauli TDD cycle and make sure we really completed the kata by adding the final validation test at the API-level.


Why another school? Don't we have enough already?

The St. Pauli school of TDD started as a tongue-in-cheek response to the newly founded Munich school of TDD by fellow software crafter David Völkel, but evolved into a useful TDD style of its own.

Does following TDD mean that you always have to program as laboriously as shown in the demo?

The demo was solely optimized to show how short feedback cycles can be kept.

Since there are better development techniques out there, isn't TDD a waste of time?

Even though Thinking before programming, Hammock-driven Development, Property-Based Testing, REPL-Driven development, and Type-driven Development are useful techniques, this does not imply that TDD is useless, nor that the aforementioned techniques and Test-driven Development are mutually exclusive.

Where can I find the source code used in the demo?

The code is HERE on GitHub.

Logo it-agile GmbH