Impact Measurement – Part Three of Three
Measuring the impact of nonprofit programs is clearly an important element in determining their benefit to society. It has also become a proxy for demonstrating return on investment to governments, foundations, corporations, and other donors.
The art and science of such impact determination remain largely a work in progress.
Certain nonprofit sectors, namely healthcare and education, are well advanced in such measurement compared to other sectors. Epidemiological methodology, such as that used by the Centers for Disease Control and Prevention, has important elements that other sectors can adopt or adapt.
Tools such as Geographic Information Systems (GIS), applied to the enormous amount of data already available through government agencies and combined with population statistics, another well-measured variable, across various time series, may offer an important way to derive the impact of specific nonprofit programs. (See Part One of this series.)
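As one illustration of the kind of analysis this makes possible, here is a minimal sketch, not drawn from Part One, of joining place-coded government data with census population counts to track a population-adjusted rate over time. The file names, column names, program start year, and county codes are hypothetical placeholders.

```python
# A minimal sketch (not the author's method) of combining publicly available,
# place-coded outcome data with population counts to track a per-capita rate
# over time. All inputs below are hypothetical placeholders.
import pandas as pd

# Hypothetical inputs: yearly outcome counts by county (e.g., from a state
# open-data portal) and yearly census population estimates by county.
outcomes = pd.read_csv("county_outcomes_by_year.csv")      # county_fips, year, outcome_count
population = pd.read_csv("county_population_by_year.csv")  # county_fips, year, population

# Join on place and time, then compute a population-adjusted rate.
merged = outcomes.merge(population, on=["county_fips", "year"])
merged["rate_per_10k"] = 10_000 * merged["outcome_count"] / merged["population"]

# A crude before/after comparison around a program's start year for the
# counties it serves versus all others (a descriptive difference, not causality).
PROGRAM_START = 2010
SERVED = {"27053", "27123"}  # hypothetical FIPS codes of served counties
merged["served"] = merged["county_fips"].astype(str).isin(SERVED)
merged["period"] = merged["year"].apply(lambda y: "after" if y >= PROGRAM_START else "before")

summary = merged.groupby(["served", "period"])["rate_per_10k"].mean().unstack()
print(summary)
```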
Other methods in current use are process measures, which are widely employed and respected: for example, the financial and accountability variables assessed by Charity Navigator and reported as an ascending number of stars, or the all-or-nothing Standards approach of our own Minnesota Charities Review Council.
Reputational methodologies, which use expert panels or other informants to assess output, are also employed; the rigorous survey methodology of Philanthropedia, a GuideStar subsidiary, is an excellent example of this approach. (See Part Two of this series.)
What is Really Needed
All of these, while state of the art and improving, fall short of achieving what is ultimately necessary:
- Assessing an agency’s performance in achieving specific societal impacts, as well as indicating return on investment to donors.
- Obtaining useful information to improve the performance of the agency in producing even better impacts efficiently and effectively.
- Using the impact information to generalize findings so that they can be applied more broadly and brought to scale by a number of agencies, rather than just the specific agency that originally achieved the impacts.
One can say that the first objective is beginning to yield some acceptable information. The second is not yet clearly in sight, although it is sometimes achieved, often by serendipity. The third needs methodologies that determine causality.
The measurement required for the third objective, and thereby for all three, is the subject of this posting. It is clearly the most difficult, given the large number of variables that may contribute to the impact of any specific program. There is also the diversity of agencies delivering similar impacts, and the very difficult task of comparing them in meaningful, measurable ways.
The Randomized Field Trial
One method that may help, at least in part, is the randomized field trial (RFT), the randomized controlled experiment. Given the multitude of variables, our lack of full knowledge of intervening variables, and many other factors, this would not have been a productive avenue until recently, when advanced information systems and statistical methodologies became generally available to us all.
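To make the mechanics concrete, here is a minimal sketch of the core of such a trial: eligible participants are randomly assigned to the program variant or to usual practice, and their outcomes are compared afterward. The client identifiers and outcome measure are hypothetical; a real trial would add consent, pre-registered analysis, and appropriate significance testing.

```python
# A minimal sketch of randomized assignment and comparison for a field trial.
# Names and outcomes are hypothetical placeholders, not an agency's data.
import random
from statistics import mean

def assign_randomly(participant_ids, seed=42):
    """Split participants into treatment and control groups at random."""
    ids = list(participant_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]          # (treatment, control)

def compare_outcomes(outcomes, treatment, control):
    """Difference in mean outcome between the two groups."""
    return mean(outcomes[i] for i in treatment) - mean(outcomes[i] for i in control)

# Hypothetical usage: 200 clients, outcomes recorded after the program period.
clients = [f"client-{n}" for n in range(200)]
treated, usual_care = assign_randomly(clients)
# outcomes = {...}  # measured outcome per client, collected by the agency
# print(compare_outcomes(outcomes, treated, usual_care))
```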
A seminal book in this area, and one that will guide the rest of this discussion, is Jim Manzi’s new volume, Uncontrolled: The Surprising Payoff of Trial-and-Error for Business, Politics, and Society. It is a tour de force covering the history of the scientific method in all its complexity, the search for an empirical basis for strategy alongside non-experimental approaches, what Manzi calls “The Experimental Revolution in Business,” his own evolution in developing experimental software for business strategy, and his suggestions for experimental studies to determine social policy.
Manzi doesn’t specifically address nonprofits and their outcomes, but he does use education and health care examples throughout. He employs the idea of “structured trial and error, in which human minds consciously develop ideas for improved practices, and then use rigorous experiments to identify those that work.”
Manzi contends, and I agree, that the development of information technology combined with the randomized medical clinical trial methodology described in Part One now has application in the evaluation of social programs and the work of specific agencies.
The requirement of this empirical methodology, like all attempts at impact assessment, is that it be inexpensive, so that many iterative tests may be run in each agency being studied. Here we can borrow from Manzi’s description of what has happened in business, in the analyses by Capital One and Google of what works and what doesn’t:
“The enabling technological development has been the radical decreases in the costs of storing, processing, and transmitting information created by Moore’s Law. The method has been to use information technology to routinize, and ultimately automate, many aspects of testing. … (A) closer union of formal social science and business experimentation can improve both. Greater rigor can pay enormous dividends for business experiments. And reorienting social science experimentation around using automation and other techniques to run very large numbers of experiments can substantially improve our practical ability to identify better policies in at least some areas.”
The Firm
How can we go about setting up these randomized experiments in the nonprofit world? And what about cost, technology, and expertise?
While we eventually have to make certain that there’s a reasonable chance that the pay-off from the experiment is worth the cost of running it, we should look to those very donors who want better impact information to help subsidize the start-up costs of these new and radically different methodologies.
We would also need the data collection and analysis technology, as well as trained people to do the work. So we must start modestly, with small experiments that build into an iterative process, reporting results in our nonprofit journals and storing them in a functional equivalent of the Cochrane Collaboration, a repository for therapeutic RFTs.
One of the best ways to begin such randomized field trials would be to adopt the British medical experimental methodology known as the firm. Patients entering a National Health Service clinic are randomized at their initial visit and kept in a cohort throughout their tenure at the site. With two cohorts, A and B, an experimental technique would be introduced and tracked in one group while the second continued the usual and customary method. Multiple different experiments could then be run at the same time, with the electronic medical record capturing the results of each trial.
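Here is a minimal sketch of how an agency might implement firm-style assignment, assuming a simple client registry rather than an electronic medical record: each new client is randomized into a persistent cohort at intake and stays there, so a practice change can be trialled in cohort A while cohort B continues as usual. The cohort labels and client identifiers are hypothetical.

```python
# A sketch of firm-style cohort assignment (an assumed implementation, not a
# prescribed system). Each client is randomized once at intake and keeps that
# cohort for all later visits, so experiments can run cohort against cohort.
import random

COHORTS = ("A", "B")

def assign_cohort(client_id, registry, rng=random):
    """Assign a new client to a cohort at intake; reuse it on later visits."""
    if client_id not in registry:
        registry[client_id] = rng.choice(COHORTS)
    return registry[client_id]

# Hypothetical usage: cohort A receives the new intake procedure,
# cohort B continues usual practice; every service record notes the cohort.
registry = {}
for client in ("c-101", "c-102", "c-103"):
    print(client, "->", assign_cohort(client, registry))
```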
Sample sizes could vary to accommodate specificity and sensitivity, with results that would, ideally, help us understand impacts and the variables producing them, generate improvements in the agency’s methods and processes, and yield findings that could be reported and adopted by others.
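For a sense of the arithmetic behind those sample sizes, here is a minimal sketch, under the usual normal-approximation assumptions, of the per-group size needed to detect a change between two success rates. The 40% and 50% rates in the example are illustrative, not figures from any agency or from Manzi.

```python
# A sketch of a standard per-group sample-size calculation for comparing two
# success rates (normal approximation). Example rates are illustrative only.
from math import ceil
from statistics import NormalDist

def sample_size_per_group(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate per-group n to detect p_treatment vs p_control."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = abs(p_treatment - p_control)
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

# Example: detecting an improvement from a 40% to a 50% success rate
# requires roughly 385 clients per group at 5% significance and 80% power.
print(sample_size_per_group(0.40, 0.50))
```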
Surely, the world in which we nonprofits live is complex and dynamic, so what works “here” may not work “there,” and an innovation that works in 2012 may not work in 2018. Nevertheless, these are not excuses to avoid this work. In fact, they are high motivation to get on with the task now.
If we are truly serious about understanding how to create and improve social benefit on a scale that impacts large populations, then we must start to utilize these sophisticated, improved empirical methods of determining what works and what does not.
We cannot afford, long term, not to.
Copyright 2012 The Good Counsel, division of Toscano Advisors, LLC. May be duplicated with citation.