In the words of The Economist, data is now the world’s most valuable resource. Amazon, the most valuable company in the world, is based on data. Today, every sector of the economy is being disrupted by big data applications and tools. For example, the financial services sector, especially big banks, is leveraging the power of data to take advantage of its markets by developing products and services that are responsive to users’ behaviors. However, while the public sector sits on top of the most data, it has not been able to take advantage of it. This is not for lack of awareness in the public sector of the power data brings to strategy. In this paper, we explore the challenges and the opportunity to transform evaluation in the public sector with new data tools, and the pitfalls we must avoid when using these tools. We contend that the evaluation field needs to take greater advantage of data science than it currently does, and that embracing data science can make game-changing contributions to evidence-based social change and anti-poverty interventions by the public sector and philanthropy.
The rapid growth in data science has focused on big-data analytics, the use of artificial intelligence (AI), and machine learning. Technological advances have led to the development of innovative solutions for capturing, integrating, analyzing, and sharing data. The ubiquity of powerful and relatively low-cost cloud-based technologies makes it easy to access various types of data and generate insights from that data. Within this context, the potential of data science to solve new and existing challenges in evidence-building and evidence-based programming has rapidly increased.
Public sector organizations and philanthropy are increasingly recognizing the importance of evaluation, driving more demand for services from evaluators. Some of the barriers and challenges that traditional evaluation has faced in using data for evidence building are now surmountable. Yet, while technological advances have increased access to these data tools, access is not uniform across the globe. That’s why Sankofa and LiveStories are teaming up to channel this potential into greater impact on the public sector’s most intractable issues.
Through our respective professional practices, evaluation and data science, we seek to act on our passion and commitment to improve the lives and circumstances of poor and vulnerable people across the globe by harnessing data and analytic tools. We can use information and insights to build evidence that will lead to improvements in, and greater impact of, anti-poverty interventions. We have both seen many instances where evaluations and data analytics have fallen short of the transformative impact that we believe is possible. In some cases, projects are discounted or ignored under the guise of “the data are bad”. In other cases, evaluators leave much of their data unexplored once they have answered their specific evaluation questions. Data science applications can help evaluators get more from their data by linking multiple sources of data. This will enable them to explore the complexities of systems change more deeply, see potential connections among outcomes, and uncover unintended consequences of interventions. As such, evaluators can leverage data science to generate richer evidence and more actionable insights for their clients.
How can traditional evaluation better leverage data science?
Traditional evaluation methods begin with questions. A question is asked, an evaluation is designed around the question, data is collected and analyzed, and the original question is answered. Data science doesn’t require an initial question. Often data science is about exploring the data, making sense of it, and looking for interesting patterns and relationships without the constraints of initial questions. Data scientists leverage computer programs and models to link, integrate, and analyze large amounts of data across disparate datasets. These tasks used to require expensive systems and specialized skills. Cloud-based data technologies have made data science more accessible than ever before.
This ability to organize and make sense of large amounts of data can be used as an exploratory entry point to evaluations. Data science can suggest potential hypotheses and questions that the evaluation can examine more deeply. Data science can also be used to explore and leverage existing, secondary data to provide context to findings from the primary data collection, thus adding greater value for policy makers, program implementers, and funders.
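As a rough illustration of using secondary data to add context to primary findings, the sketch below links survey records to district-level context and compares an outcome across groups. It is a minimal example under stated assumptions: the field names (`district`, `income_change`, `road_access`) and every value are made up for illustration and do not come from any real evaluation.

```python
from statistics import mean

# Hypothetical primary data: one row per surveyed household (made-up values).
survey = [
    {"district": "A", "income_change": 12.0},
    {"district": "A", "income_change": 8.0},
    {"district": "B", "income_change": 3.0},
    {"district": "C", "income_change": 5.0},
]

# Hypothetical secondary data: district-level context from an existing source.
context = {
    "A": {"road_access": True},
    "B": {"road_access": False},
}


def link(survey_rows, context_by_district):
    """Attach district context to each survey row; collect rows with no match."""
    linked, unmatched = [], []
    for row in survey_rows:
        extra = context_by_district.get(row["district"])
        if extra is None:
            unmatched.append(row)  # keep these visible rather than dropping them silently
        else:
            linked.append({**row, **extra})
    return linked, unmatched


def mean_by(rows, group_key, value_key):
    """Average value_key within each group defined by group_key."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_key], []).append(row[value_key])
    return {key: mean(values) for key, values in groups.items()}


linked, unmatched = link(survey, context)
summary = mean_by(linked, "road_access", "income_change")
print(summary)         # average income change, grouped by road access
print(len(unmatched))  # survey rows that could not be linked
```

A difference between groups in such a summary is a prompt for a hypothesis the evaluation can test, not a finding in itself. The unmatched rows are reported separately because gaps in linkage are one way bias can enter the analysis.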
While we are optimistic about the potential of combining the instinct of data science with the wisdom of traditional evaluation to reshape the evaluation field, we are struck by the fact that, except for a few niche instances, the two fields have not realized the power of their combined practices; both still operate as if in different orbits. Data science has been increasingly leveraged by private sector businesses over the last 10 years. It is increasingly seen as a critical business analytics tool that directly impacts the bottom line. In the public sector, data science and big data analytics are often viewed by evaluators as quaint tools that lack the discipline of theory-based evaluation and the systematic social science research tradition in which evaluation has its origins and remains anchored.
It is worth noting that the benefits of combining evaluation and data science flow both ways; the data science field can benefit from incorporating more of conventional evaluation’s theoretical frameworks, which help guide the exploration of big data (see Bamberger 2018).
While both fields remain siloed, there is a rising need and increasing opportunity for traditional evaluation to leverage the power of data science. The opportunity lies in picking up specific components of the big data movement and applying them to our traditional evaluation approaches.
- More power means handling more complex calculations. An iPhone today could guide 120,000,000 Apollo 13 missions to the moon and back, at the same time. This awesome power hasn’t increased linearly in the last 50 years. It has increased exponentially and will grow further in the future. In evaluation, we work with multivariate systems. Getting a complete picture of a system requires building and understanding complex models. The computing power available to us has been transformed, but our models in traditional evaluation have not.
- More speed means shorter feedback loops. It is not just about the power. These machines resolve calculations in record time. Public sector programs can take advantage of this speed by providing real-time or near-real-time feedback to systems, so that decision-makers and implementers can make appropriate changes more quickly.
- More power and more speed don’t cost a premium. Cloud infrastructure has made storage and computing power a commodity. It is cheaper to let AWS or Azure manage your infrastructure than to manage your own data warehouse or server farm. There might be a strategic reason to host your own servers, but in most cases the economics of doing so just do not work out. The resources once spent on managing costly infrastructure can now be directed to core programs.
The result of the above trends is that we can run more experiments faster than ever before. We can ask many complex questions and expect to get answers quickly. We can set shorter milestones and increase the number of checkpoints that allow us to identify problems more quickly and double down on things that are working. Implementers and funders should design programs to take advantage of these capabilities. This will reduce time to complete evaluation projects, drive down the cost of evaluation, and ultimately, lead to a more efficient use of resources.
Despite its promise, data science is not a panacea for evaluation. There are risks that need to be managed. As Heider (2017) notes, data science applications can amplify longstanding issues that evaluators have faced in the past. The main issues with data science include:
- Ethics – guarding against intrusion on privacy;
- Governance – ensuring proper access to data and data security;
- Biases – being aware of biases in data and avoiding their amplification; and
- Capacity – keeping in mind the capacity of people and systems, not just the technology.
While we acknowledge these risks, we believe the benefits outweigh the risks. Data science in the age of AI and machine learning can significantly enhance traditional evaluation to benefit evidence-based decision-making that will ultimately increase the effectiveness and impact of programs that serve the vulnerable.