A couple of recent blogs discuss how the emphasis on randomized control trials (RCTs) in economics affects the type of research that is done. This blog by Chris Blattman talks about a potential arms race where each new RCT has to be ‘bigger and better’, with more ‘innovation’ in order to be published, and how that innovation requires researchers to have even more control over all design parameters.
Following on this, David McKenzie of the World Bank blogs that if more and more control over study parameters is required to get published, this would naturally steer researchers away from conducting RCTs of ‘messy’ interventions where parameters are not all fully controlled, most notably large-scale government programs, which in turn would seriously reduce the policy relevance of the research. He concludes that, with the exception of U.S. policy, “it is becoming harder to sell policy impact papers to top journals.” I have also previously written about the danger that the randomista revolution would limit the types of questions pursued by our best researchers (In defense of external validity). Rachel Glennerster of J-PAL then wrote this blog about conducting RCTs with governments, in which she offers 5 short benefits alongside 6 lengthy costs and advises ‘even senior researchers’ to “think hard” before taking on an RCT with a government!
Glennerster is with J-PAL, which specializes in working with NGOs to design RCTs, but still, her blog prompted me to engage in some introspection—I have spent the last 7 years of my professional life helping to build The Transfer Project, which evaluates cash transfer programs in sub-Saharan Africa (SSA); the Transfer Project works exclusively with government programs and never considers working with NGOs. Why do I feel so passionately about working with governments while senior colleagues shy away? Could I articulate this in a way that makes sense to researchers and evaluators in the economics profession? Here is my effort…
The aforementioned blogs focus on a particular ‘design’ or ‘tool’ for inferring causality. Blattman and McKenzie lament that the emphasis on the ‘tool’ is influencing the type of questions we are likely to see addressed, while Glennerster treats the tool itself as the primary objective of the research. Back when I was in graduate school, we were taught to fit the tool to the research question—to pick the strongest tool for the question you want to answer. Glennerster’s position epitomizes, I believe, the current state of affairs in economics: young economists are now picking the research question to fit the tool rather than the other way around—asking “Where can I run an RCT?” rather than “What big question do I want to answer?”
The Transfer Project is driven by a set of big questions which can be summarized as: “Can large-scale cash transfers work in very poor, low-capacity settings?” This overarching question encompasses a set of specific issues that need to be explored to fully answer it. For example, can governments in SSA actually deliver cash at regular, predictable intervals? What amount of transfer are citizens willing to support, and is this amount big enough to have an impact on the lives of the poor? Clearly we need to study actual government-implemented programs at a reasonable scale to make headway on these questions. Further, since the way programs are set up and rolled out is not always amenable to an RCT, we use the best methods available given the context and research question.
Indeed, for us, the whole point is to not have control over the program parameters at all, to see what actually happens when the political and technical processes converge on a program design, and when a low-capacity and poorly-resourced Ministry actually has to deliver that design to large numbers of beneficiaries. Our experience across 8 countries tells us that in many cases, governments can deliver cash at regular intervals, but providing a transfer level that ensures positive impacts is a continuing challenge. However, we also find that when cash is delivered regularly at suitable levels, there are positive impacts on both protective and productive domains.
But let me play devil’s advocate for the sake of discussion: Can’t we just mimic a ‘likely’ program across several countries, as was done recently for a graduation program, and then have our favorite NGO implement it? A key problem with this approach is first deciding what the ‘likely’ program parameters would be. An even bigger problem is mimicking implementation—indeed, for me this is the real Achilles’ heel of the RCT set. This is similar to the issue of adherence in the pharmaceutical world; yes, the drug is effective if taken at specific times, in a specific combination, for a specific period. But with busy schedules and limits on cognitive capacity, who can actually ever do that? Similarly, if the accounts reconciliation process for disbursements is so onerous that the Ministry of Social Welfare can rarely clear it in time to ensure bimonthly beneficiary payments, can a cash transfer program relying on those processes work successfully at scale?
The point is that in resource-poor settings (or arguably in any setting), we really have no idea what the implementation process will be like until we get the people who will have to do it to actually do it! Similarly, we can invent some really neat, exciting design features (e.g., let’s give them a cow and some cash, a behavioral nudge, and visit them every week with informational and social marketing campaigns for the next few years), but will the electorate ever support that, and can any civil service implement it?
I’m afraid there is only one way to find out, and that’s why some of us only work with governments, in all their slow-moving, inflexible, risk-prone, disagreeable, politically driven glory.