How to run experiments when you can’t afford a randomized trial


Photo copyright: ©Leigh Blackall

Author: Tom Wein

Experiments are powerful – but field randomized trials, or RCTs, can be expensive. From survey experiments to work in the lab, there are other ways of gathering causal evidence.

Good evidence delivers better programming. Often, to get the best evidence, we turn to experimentation. There has been an explosion in the number of field trials for international development in recent years. In humanitarian work, to give just a couple of examples, World Vision and IRC are both running major experimental trials around their cash programming.

But not everyone can or should run a full field experimental trial. Trials are expensive; one 3ie blog cites an average cost of US$400,000. They can take a long time and be disruptive. In humanitarian emergencies, even though they might provide the most reliable evidence on the effects of interventions, they will not always be the best choice.

Luckily, there are other ways of including experimentation in your work.

Survey experiments: One is to run a survey experiment. If you are going to be running a survey anyway, it’s very little trouble to program in a few versions of the same question, each with a different prompt and shown to respondents at random, to see if this leads to different responses. Even simpler is to run it on Amazon’s Mechanical Turk (MTurk) service.
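As a rough illustration of the idea, here is a minimal sketch in Python. It assumes a yes/no question with two prompt versions, and it compares the share of "yes" answers with a standard two-proportion z-test. The function names and the counts in the usage example are hypothetical, chosen only for illustration; a real survey would need a proper analysis plan.

```python
import math
import random


def assign_version(versions, rng):
    """Pick one version of the question at random for a respondent."""
    return rng.choice(versions)


def two_proportion_z(yes_a, n_a, yes_b, n_b):
    """Two-sided z-test for a difference in 'yes' shares between two versions.

    Returns (z, p_value). Uses the pooled proportion for the standard error.
    """
    p_a = yes_a / n_a
    p_b = yes_b / n_b
    pooled = (yes_a + yes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("No variation in responses; the test is undefined.")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    rng = random.Random(2018)  # fixed seed so the assignment is reproducible
    print(assign_version(["prompt_A", "prompt_B"], rng))

    # Made-up counts, for illustration only: 60 of 100 said yes under
    # prompt A, 40 of 100 said yes under prompt B.
    z, p = two_proportion_z(60, 100, 40, 100)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

The point of assigning versions at random is that any difference in answers can then be attributed to the prompt rather than to who happened to see it.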

Natural experiments: The basic logic of natural experiments is easy: you just follow a group of people who are doing what they would normally do and compare them with a similar group. If you make a careful, credible argument about why your comparison group really is similar, you can compare outcomes in both groups. It’s not perfect, but it can tell you more than only asking the people who benefited from your program. This is something more and more humanitarian organisations are already doing – Oxfam found success with this approach in Zambia.

Laboratory experiments: Lab experiments can have huge power as a controlled test-bed for low risk trials. You can set them up to look at the effects of actions analogous to what would happen in the field. Careful design work allows you to draw credible links between your experiment and the reality. If the lab infrastructure and willing participants are already available, you can run them far more quickly than a typical RCT.

Forecasting: Although not technically experimental, you might want to look at the burgeoning science of forecasting. Even though people are mostly bad at predictions, some people are better than others. If you ask the question in the right way, and a crowd is wise enough, it can allow you to accurately rank the effectiveness of different ideas.
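One simple way to turn a crowd's views into a ranking is to ask each forecaster to order the ideas and then add up the positions. The sketch below does this in Python. It is only one of several possible aggregation rules (a Borda-style count), it assumes every forecaster ranks every idea, and the idea names are hypothetical.

```python
from collections import defaultdict


def aggregate_rankings(rankings):
    """Combine individual rankings into one crowd ranking.

    Each ranking is a list of ideas, best first. An idea scores its
    position (0 for first place) in each ranking; the lowest total wins.
    Ties are broken alphabetically so the result is deterministic.
    """
    totals = defaultdict(int)
    for ranking in rankings:
        for position, idea in enumerate(ranking):
            totals[idea] += position
    return sorted(totals, key=lambda idea: (totals[idea], idea))


if __name__ == "__main__":
    # Hypothetical forecasts from three people, for illustration only.
    forecasts = [
        ["cash", "vouchers", "food"],
        ["cash", "food", "vouchers"],
        ["vouchers", "cash", "food"],
    ]
    print(aggregate_rankings(forecasts))
```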

Digitization: If your programme has digital elements and data are recorded automatically, experimentation becomes far easier. Tech companies run hundreds of so-called A/B tests a day, and lots of people nowadays run email or SMS experiments. If you don’t have the funds to experiment right now, consider using what cash you do have to digitize your programme as much as possible. It will then be far cheaper to do all kinds of evaluation later – an underdiscussed benefit of the digitization agenda.
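To give a flavour of how little machinery an SMS or email A/B test needs once participants have digital IDs, here is a minimal Python sketch. It assigns each participant to an arm by hashing their ID together with an experiment name, so the same person always gets the same message without anyone having to store the assignment. The function name, the experiment label and the ID format are all assumptions made for illustration.

```python
import hashlib


def assign_arm(participant_id, experiment, arms=("A", "B")):
    """Deterministically assign a participant to one arm of an A/B test.

    Hashing the experiment name with the ID means assignments are stable
    within an experiment but unrelated across different experiments.
    """
    key = f"{experiment}:{participant_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return arms[int(digest, 16) % len(arms)]


if __name__ == "__main__":
    # Hypothetical participant IDs, for illustration only.
    for pid in ["user-001", "user-002", "user-003"]:
        print(pid, assign_arm(pid, "sms-reminder-wording"))
```

Because the split depends only on the hash, it behaves like a coin flip across participants while remaining reproducible, which makes it easy to check later who should have received which message.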


RCTs are still hugely powerful. When you can do them, you should. They are the most realistic test available of whether your programme works. But sometimes second best is good enough.

In considering whether to do an RCT, there are great resources out there to help you. Raising Voices, a Ugandan charity, has published their learnings from running an RCT. Karlan and Gugerty’s ‘The Goldilocks Challenge’ explores ‘Right Fit Evidence’ and the many approaches you can take, starting with a clear theory of change and good monitoring systems. Evidence Aid’s new practice guide has a whole section on quasi-experimental methods. When an RCT doesn’t fit, but experimentation is still valuable, these tools offer different ways of doing it.

About the author:


Tom Wein is a research consultant. He works to advance justice and create better governance through useful research. He tweets @tom_wein.


Keywords: evaluation, experimentation, laboratory, RCT, forecasting, surveys

Comment (1)
  1. carolinefiennes

    This is a surprising article. The issue isn’t ‘how to run an experiment (cheaply)’; the issue is ‘how to answer the question that you’ve got’.

    First, if the question is about the impact of a programme, then a ‘survey experiment’ (i.e., various variants of a survey question) won’t answer that.

    Second, the thing about RCTs taking a long time is an unhelpful myth. It’s just not true.
    RCTs only take a long time if the outcome of interest takes a long time to emerge. Put the other way around, if the outcome of interest takes a long time to emerge, then you have no choice but to wait for it – irrespective of your experimental method. But if the outcome of interest appears quickly, then the experiment (of any method) can be quick. The time is nothing to do with the method and entirely to do with the outcome itself.
    The commercial A/B tests that you cite are RCTs – they just happen to have outcomes that appear fast. {I once worked with a guy who’d built a credit card company sold for >£200m, basically by doing masses of A/B tests to find the messaging which got people most likely to sign up.} I wrote about all this here:

    Natural experiments take advantage of a change in circumstances which effectively creates a control group, which crucially is involuntary so there’s no selection effect. An example is China’s introduction of its one-child policy. Clearly you can’t randomly ban some families from having a second child, but one-child families in China which conceived a second child immediately before that policy was introduced vs those which didn’t (whose lone children didn’t get a younger sibling because of the policy) are more-or-less comparable, so one can estimate the effect of that policy from the differences in those children.

    The issue is never ‘what experiment should we run’ but rather ‘how can we answer the question’. And the answer to that should always be to look first at the literature which already exists. (see talk here:

    November 21, 2018
