## Convert any function with side effects into a testable, side-effect-free one

When dealing with legacy code (== code without existing tests), the first task before making behavioral changes is to prepare characterization tests. The easiest way to achieve that is to separate the pure computation from communication inside the existing functions.

But it is a daunting task, because you have no tests in the first place to help you transform the code into a testable form. Attempts to “wrap” existing side effects usually create an awkward layer of testing middleware. In the worst case this helper library becomes so complex that it needs tests of its own. That is far more “test helper” code than is desirable.

I’ve come up with a simple transformation to extract the “computational kernel” from an existing function into a separately callable piece of code. Unwanted dependencies are injected as new function parameters. This way, you can keep the original impure function so that its callers are not affected. For the tests, you can exercise its extracted kernel without fear of crosstalk or a need for slow fixtures dealing with side effects.

I use pseudo-C++ below, but the idea should be applicable to many imperative languages. Consider the following function definition as a starting point.

    global_config_t config = {...};

    return_type_t do_work_with_side_effects(param_type param) {
        ...
        handle h = open_resource(param);
        ...
        if (config.param) {   // config is an object, not a pointer
            ...
        }
        ...
        close_resource(param);
        ...
    }


I’ve replaced with ... all the parts that neither cause side effects nor access global data. When it comes to testing, this function has two problems.

1. It depends on the global variable config. Things are only half bad if it is accessed in a read-only manner. If the global gets updated, things are worse. If there are several test cases in the same file, care must be taken to re-initialize config between them. Running test cases in parallel in the same memory space is out of the question.

2. It calls other functions (open_resource and close_resource) which produce side effects (accessing the filesystem, network, or database, maybe locking, etc.). As a result, calling do_work_with_side_effects will also cause side effects.

Both these dependencies make it very inconvenient to include do_work_with_side_effects in any tests.
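
To make the first problem concrete, here is a minimal sketch of two checks interfering through a global. The field call_count and the functions do_work_with_global, first_test_case, second_test_case and reset_config are hypothetical stand-ins invented for this illustration; they are not part of the code above.

```cpp
#include <cassert>

// Hypothetical config with one mutable field, standing in for global_config_t.
struct global_config_t {
    int call_count = 0;
};

global_config_t config;  // shared by every caller in the process

// Hypothetical impure function: it reads and updates the global.
int do_work_with_global() {
    config.call_count += 1;
    return config.call_count;
}

// Two "test cases" that run identical code but observe different results,
// because the first one leaves the global modified.
int first_test_case() { return do_work_with_global(); }
int second_test_case() { return do_work_with_global(); }

// Resetting the global between cases restores independence, at the cost of
// every test having to remember to do it.
void reset_config() { config = global_config_t{}; }

void demonstrate_crosstalk() {
    assert(first_test_case() == 1);
    assert(second_test_case() == 2);  // same code, different result
    reset_config();
    assert(second_test_case() == 1);  // independent again after the reset
}
```

The outcome of second_test_case depends on what ran before it, which is exactly the ordering sensitivity that makes such tests fragile.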

Let’s extract the computational kernel of this function line by line. Whenever we see a symbol that is neither one of its formal parameters nor a local variable, we convert it into a new parameter. The global variable becomes a parameter, and functions previously called directly become function references.

    return_type_t do_work(param_type param,
                          global_config_t &local_config,
                          const open_function_t &open_fn,
                          const close_function_t &close_fn) {
        ...
        handle h = open_fn(param);
        ...
        if (local_config.param) {   // a reference, so member access uses '.'
            ...
        }
        ...
        close_fn(param);
        ...
    }


The original function now becomes a wrapper around do_work. For all newly added parameters, it passes the original references to the global data and impure functions.

    global_config_t config = {...};

    return_type_t do_work_with_side_effects(param_type param) {
        return do_work(param, config, open_resource, close_resource);
    }


Note how all of the computational logic of do_work_with_side_effects has been moved to do_work. The only responsibility left for the former is to bind the fixed list of parameters, and pass through the original parameters (in this case param), to the latter.
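
Put together, the kernel and its wrapper might look like the following compilable sketch. Everything the article leaves as a placeholder is an assumption here: the field double_result, the integer handle, the std::function aliases, and the bodies of open_resource and close_resource, which merely mutate a hidden registry to stand in for real side effects.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical concrete types standing in for the article's placeholders.
struct global_config_t {
    bool double_result = false;
};
using handle = int;
using param_type = std::string;
using return_type_t = int;
using open_function_t = std::function<handle(const param_type &)>;
using close_function_t = std::function<void(const param_type &)>;

// Stand-ins for impure functions: they mutate a hidden process-wide registry.
static std::map<param_type, handle> resource_registry;

handle open_resource(const param_type &name) {
    handle h = static_cast<handle>(resource_registry.size()) + 1;
    resource_registry[name] = h;
    return h;
}

void close_resource(const param_type &name) { resource_registry.erase(name); }

global_config_t config;

// Extracted kernel: every former global or direct call is now a parameter.
return_type_t do_work(const param_type &param, global_config_t &local_config,
                      const open_function_t &open_fn,
                      const close_function_t &close_fn) {
    handle h = open_fn(param);
    return_type_t result = h;
    if (local_config.double_result) {
        result *= 2;
    }
    close_fn(param);
    return result;
}

// Original entry point: only binds the real dependencies.
return_type_t do_work_with_side_effects(const param_type &param) {
    return do_work(param, config, open_resource, close_resource);
}

// With harmless callables bound instead, the kernel never touches the registry.
void check_kernel_leaves_registry_alone() {
    global_config_t local;
    auto fake_open = [](const param_type &) { return 7; };
    auto fake_close = [](const param_type &) {};
    assert(do_work("x", local, fake_open, fake_close) == 7);
    assert(resource_registry.empty());
}
```

Taking the callables by const reference lets both plain functions and lambdas bind to the same parameter; that is a choice made for this sketch, not something the technique requires.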

For this to work, the language has to support some sort of function and data pointers or references. Most imperative languages do.
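
As an illustration of how little language support is needed, here is a sketch that uses only plain function pointers and a data pointer, in a style that would also carry over to C. The concrete types, the field double_result and the test doubles stub_open and spy_close are assumptions made for this example.

```cpp
#include <cassert>

// Hypothetical concrete types; the article leaves these unspecified.
struct global_config_t {
    bool double_result;
};
typedef int handle;
typedef const char *param_type;
typedef int return_type_t;

// Plain function pointer types instead of references to callable objects.
typedef handle (*open_function_t)(param_type);
typedef void (*close_function_t)(param_type);

return_type_t do_work(param_type param, global_config_t *local_config,
                      open_function_t open_fn, close_function_t close_fn) {
    handle h = open_fn(param);
    return_type_t result = h;
    if (local_config->double_result) {
        result *= 2;
    }
    close_fn(param);
    return result;
}

// Test doubles are ordinary functions. A plain function pointer cannot
// capture state, so the spy records into a file-local counter.
static int close_calls = 0;
static handle stub_open(param_type) { return 21; }
static void spy_close(param_type) { ++close_calls; }

void test_do_work_with_function_pointers() {
    global_config_t mock_config = {true};
    close_calls = 0;
    assert(do_work("resource-a", &mock_config, stub_open, spy_close) == 42);
    assert(close_calls == 1);
}
```

The trade-off is that the spy needs a little shared state of its own, which has to be reset at the start of each test.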

do_work is testable now. If you avoid passing functions with side-effects to it, its operation will stay side-effect-free! This is exactly what is needed for tests. Pass mocks, stubs and spies to retain fine control of what state gets changed.

    // In a test

    static void test_do_work() {
        global_config_t mock_config = {...};
        open_function_t stub_open = {... /* lambda or any other executable code */ };
        close_function_t spy_close = {...};   // the type, not the function close_resource
        ...
        auto result = do_work(param, mock_config, stub_open, spy_close);
        ...
    }
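
A more concrete version of the test above might look like the following sketch. The types, the field double_result and the body of do_work are assumptions chosen to make the example compile; the point is only how a stub and a spy are bound, and that each test case owns its configuration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical concrete types standing in for the article's placeholders.
struct global_config_t {
    bool double_result = false;
};
using handle = int;
using param_type = std::string;
using return_type_t = int;
using open_function_t = std::function<handle(const param_type &)>;
using close_function_t = std::function<void(const param_type &)>;

// Assumed kernel body; the real one is whatever was extracted.
return_type_t do_work(const param_type &param, global_config_t &local_config,
                      const open_function_t &open_fn,
                      const close_function_t &close_fn) {
    handle h = open_fn(param);
    return_type_t result = h;
    if (local_config.double_result) {
        result *= 2;
    }
    close_fn(param);
    return result;
}

void test_do_work_doubles_when_configured() {
    global_config_t mock_config;
    mock_config.double_result = true;
    // Stub: returns a canned handle and touches nothing.
    open_function_t stub_open = [](const param_type &) { return 21; };
    // Spy: records each call so the test can inspect it afterwards.
    std::vector<param_type> closed;
    close_function_t spy_close = [&closed](const param_type &p) {
        closed.push_back(p);
    };

    return_type_t result = do_work("resource-a", mock_config, stub_open, spy_close);

    assert(result == 42);
    assert(closed.size() == 1 && closed[0] == "resource-a");
}

void test_do_work_with_default_config() {
    global_config_t mock_config;  // fresh config, no re-initialization needed
    open_function_t stub_open = [](const param_type &) { return 21; };
    close_function_t no_op_close = [](const param_type &) {};

    assert(do_work("resource-b", mock_config, stub_open, no_op_close) == 21);
}
```

Because every test builds its own config and callables, the two cases can run in any order, or in parallel, without interfering.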



There is no need to modify any callers in the existing production code, because the signature and behavior of do_work_with_side_effects have not changed. But notice that future code can also call do_work directly instead.
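
For instance, a new caller could bind the real resource functions but supply its own configuration instead of the global one. The sketch below assumes hypothetical concrete types, a hypothetical function do_work_for_request, and stand-in bodies for open_resource and close_resource; none of these come from the original code.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical concrete types standing in for the article's placeholders.
struct global_config_t {
    bool double_result = false;
};
using handle = int;
using param_type = std::string;
using return_type_t = int;
using open_function_t = std::function<handle(const param_type &)>;
using close_function_t = std::function<void(const param_type &)>;

// Stand-ins for the real impure functions (hypothetical bodies).
static int live_resources = 0;
handle open_resource(const param_type &name) {
    ++live_resources;
    return static_cast<handle>(name.size());
}
void close_resource(const param_type &) { --live_resources; }

// Assumed kernel body; the real one is whatever was extracted.
return_type_t do_work(const param_type &param, global_config_t &local_config,
                      const open_function_t &open_fn,
                      const close_function_t &close_fn) {
    handle h = open_fn(param);
    return_type_t result = h;
    if (local_config.double_result) {
        result *= 2;
    }
    close_fn(param);
    return result;
}

// New caller: real resources, but a per-call configuration copy rather
// than the process-wide global.
return_type_t do_work_for_request(const param_type &param,
                                  global_config_t request_config) {
    return do_work(param, request_config, open_resource, close_resource);
}

void example_usage() {
    global_config_t doubled;
    doubled.double_result = true;
    assert(do_work_for_request("abc", doubled) == 6);
    assert(do_work_for_request("abc", global_config_t{}) == 3);
    assert(live_resources == 0);  // every open was matched by a close
}
```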