In my day job, I spend quite a lot of time writing – and reading – automated tests written in RSpec. At some point, I realised that I have Opinions™ about automated testing and how to do it better, and a few personal bugbears. It’s worth noting that I can’t claim vast experience in this area, but we have a proud industry tradition of not letting things like that stop us.

So, any_instance. This feature (properly of rspec-mocks) occupies the rather interesting position of being advised against in its own documentation. That never suggests good things to start with, but across my team I probably have the most visceral reaction to it. any_instance worries me: not because I think it's unreasonable (sure, there are some odd semantics), but because of what it suggests about the code being tested.

Partial Mocks

The obvious issue I take with any_instance and friends is that they construct partial mocks of objects that are – typically – generated during the course of the test. The remainder of the object is left intact, and this does my head in: part of the object has just been arbitrarily rewritten, and the remainder may be completely incompatible with the particular stub. Generally, I see this when trying to understand some strange action-at-a-distance phenomenon that has left apparently unrelated tests in tatters.
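To make that failure mode concrete, here is a minimal plain-Ruby sketch (the Invoice class and its methods are invented for illustration). It approximates what a partial mock does: one method is rewritten on the instance, while a sibling method that bypasses it is left intact and quietly disagrees with the stub.

```ruby
# Hypothetical class: #summary reads the instance variable directly,
# so it never goes through the (stubbable) #total method.
class Invoice
  def initialize(amount)
    @amount = amount
  end

  def total
    @amount
  end

  def summary
    "Total: #{@amount}" # bypasses #total entirely
  end
end

invoice = Invoice.new(100)

# Roughly what a partial mock does: rewrite one method in place,
# leaving the remainder of the object untouched.
invoice.define_singleton_method(:total) { 0 }

puts invoice.total   # the stubbed value: 0
puts invoice.summary # still reports 100 -- incompatible with the stub
```

The object now holds two contradictory stories about itself, which is exactly the sort of thing that surfaces later as action-at-a-distance.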

It would be nice to think that whenever any_instance is deployed, the test cases in which it is implicated would be carefully examined for other calls that the override could turn into liars; in my view, the ease of partially mocking semi-determinate objects makes this unlikely. This is the sort of “just be disciplined” approach that, while true, is better handled by tooling.

Systematic Truncation

Every now and then I find myself considering the value of simply overriding specific methods on a class by default. To be more concrete, imagine a system that maintains an audit trail, and that this incurs a significant overhead on every action in a particular domain (we might, for instance, invoke an external dependency as a transaction precondition). That system is generally not the entity under test, so the risks of omitting the code path are possibly low, and the benefits (especially if the test instance, if it even exists, is aggressively rate-limited) can be significant.
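As a sketch of what that default override might look like in suite configuration (AuditTrail and its #record method are hypothetical names, and this assumes rspec-mocks is loaded):

```ruby
RSpec.configure do |config|
  config.before do
    # Skip the expensive external call on every action by default; the
    # handful of tests that actually exercise auditing can opt back in
    # with a real implementation or a dedicated fake.
    allow_any_instance_of(AuditTrail).to receive(:record).and_return(true)
  end
end
```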

This is the sort of situation where I really feel that what I want to do is provide a complete fake class. But if an existing codebase does rely on the specific outcomes of methods on the real object, you have something of a problem. Given this, using any_instance for truncation suggests to me overcoupling in a particular manner (although I acknowledge that my perspective on this is likely to be fairly idiosyncratic).
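By way of contrast, a complete fake might look something like this minimal sketch (FakeAuditTrail, #record, and its return value are all assumptions for illustration): the whole interface is replaced at once, and calls are recorded in memory rather than invoking the external dependency.

```ruby
# A hand-rolled fake honouring the (hypothetical) audit-trail interface.
class FakeAuditTrail
  attr_reader :entries

  def initialize
    @entries = []
  end

  def record(action)
    @entries << action
    true # assumed to mirror the real implementation's return value
  end
end

trail = FakeAuditTrail.new
trail.record(:user_created)
trail.record(:user_deleted)
puts trail.entries.inspect # => [:user_created, :user_deleted]
```

Because the fake is a single canonical object, tests can also assert against its recorded entries rather than stubbing methods piecemeal.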

Aesthetics

In a lot of scenarios I’ve seen, invocations of any_instance can be replaced as follows:

# the original any_instance stub:
allow_any_instance_of(Foo).to receive(:bar).and_return(1)

# the equivalent without any_instance:
allow(Foo).to receive(:new).and_wrap_original do |m, *args|
  original = m.call(*args)
  allow(original).to receive(:bar).and_return(1)
  original
end

This is almost painfully unpleasant (although if you’re overriding several methods, you at least only pay this once). If you have an effectively total override, then the following form can be sufficient:

# the original any_instance stub:
allow_any_instance_of(Foo).to receive(:bar).and_return(1)

# a total override via a verifying double:
allow(Foo).to receive(:new) { instance_double(Foo, bar: 1) }

This form looks a bit less painful at first glance, but it is of course solving a different problem – it may just happen that that problem is the one you would actually prefer to solve.

The Core Problem

So, I think there’s something central about any_instance that bothers me, and I alluded to it above. any_instance serves a clear (and potentially useful!) role, but it is what it enables that worries me. By providing a mechanism by which the developer can easily apply arbitrary variations to entities in a fundamentally ad hoc manner, it makes it difficult to understand how the various tests interlock.

The absence of a canonical, uniform mechanism for handling this (à la “the audit fake”) makes it difficult to understand which parts of the existing code are causing test difficulty; it fragments the solutions to what could be a meaningful set of problems; it makes changes to the overridden methods fraught (consider a clarifying rename of a method that, when not stubbed, and in one test, breaks on Tuesday); and it requires the developer making the change (and the reviewer) to weigh the various options afresh each time.

I’m unsure if constructing canonical fakes will actually make matters better – my experience here is really in cases where the alternative was clearly going to be a technical disaster – but it feels like a strong possibility. Then again, I certainly tend to solve a lot of these problems by imposing structure on them, and maybe that’s not always compatible with other people’s approaches to design. If nothing else, there’s a chance to experiment here.