Hello! I’m Rick Davies, Evaluation consultant, from Cambridge, UK.
Predictive analytics is the use of algorithms to find patterns in data (e.g. clusters and association rules) by inductive means, rather than by theory-led hypothesis testing. I can recommend three free programs: RapidMiner Studio, BigML and EvalC3. My main use of these has been to develop prediction models, i.e. to find sets of attributes that are associated with an outcome of interest.
Here are some situations where I think prediction modelling can be useful, when looking at international development aid programs:
- During project selection:
- To identify what attributes of project proposals are the best predictors of whether a project will be chosen for funding, or not
- To identify how well a project proposal appraisal and screening process predicts the subsequent success of projects in achieving their objectives
- During project implementation:
- Participants’ specific and overall experiences with workshops and training events
- Donors’ and grantees’ specific and overall experiences of their working relationships with each other
- During a project evaluation:
- “Causes of effects” analysis: To identify what combination(s) of project activities (and their contexts) were associated with a significant improvement in beneficiaries’ lives.
- “Effects of causes” analysis: To identify what combinations of improvements in beneficiaries’ lives were associated with a specific project activity (or combination of activities).
- To identify “positive deviants” – cases where success is being achieved when failure is the most common outcome.
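As a rough illustration of the “positive deviants” idea, here is a minimal Python sketch. It groups cases by their attribute configuration and lists the successful cases in any group where failure is the most common outcome. The case data, the attribute names (`remote`, `low_budget`) and the function name are all hypothetical, made up for illustration only; they do not come from any real programme or from the tools mentioned in this post.

```python
from collections import defaultdict

# Hypothetical cases: binary context attributes plus a binary outcome.
cases = [
    {"id": "P1", "remote": 1, "low_budget": 1, "success": 0},
    {"id": "P2", "remote": 1, "low_budget": 1, "success": 0},
    {"id": "P3", "remote": 1, "low_budget": 1, "success": 1},
    {"id": "P4", "remote": 0, "low_budget": 1, "success": 1},
    {"id": "P5", "remote": 0, "low_budget": 1, "success": 1},
    {"id": "P6", "remote": 0, "low_budget": 1, "success": 0},
]


def positive_deviants(cases, attributes, outcome="success"):
    """Return ids of successful cases in groups where most cases fail."""
    # Group cases that share the same configuration of attribute values.
    groups = defaultdict(list)
    for case in cases:
        key = tuple(case[a] for a in attributes)
        groups[key].append(case)

    deviants = []
    for group in groups.values():
        successes = [c for c in group if c[outcome] == 1]
        # Failure is the most common outcome if successes are under half.
        if len(successes) * 2 < len(group):
            deviants.extend(c["id"] for c in successes)
    return deviants


print(positive_deviants(cases, ["remote", "low_budget"]))
```

In this made-up data, only the remote, low-budget group fails more often than it succeeds, so its one successful project is the candidate for closer within-case investigation.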
BigML and RapidMiner have more capabilities than I needed. So, I developed EvalC3, an Excel app available here, where a set of tools is organised into a workflow:
In the Input and Select stages, choices are made about what case attributes and outcomes are to be analysed. In the Design and Evaluate stages, users can manually test prediction models of their own design, or they can use four different algorithms to find the best performing models. Different measures are available to evaluate model performance. All models can be saved, and the case coverage of any two or more models can be compared. The case membership of any one model can also be examined in more detail. This last step is important because it enables the transition from cross-case analysis to within-case analysis. The latter is necessary to identify whether there is any causal mechanism underlying the association described by the prediction model.
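The Design and Evaluate steps can be sketched in a few lines of Python. This is not EvalC3's implementation (EvalC3 is an Excel app), only a minimal sketch of the general idea: a prediction model is treated as a set of attribute values, it is scored against the observed outcomes, and its case membership is then listed for closer inspection. The case data, attribute names and function names are hypothetical, and the simple exhaustive search is a stand-in chosen for illustration, not a claim about which algorithms EvalC3 uses.

```python
from itertools import combinations

# Hypothetical cases: binary attributes plus a binary outcome.
cases = {
    "A": {"training": 1, "grants": 1, "urban": 0, "outcome": 1},
    "B": {"training": 1, "grants": 0, "urban": 1, "outcome": 0},
    "C": {"training": 1, "grants": 1, "urban": 1, "outcome": 1},
    "D": {"training": 0, "grants": 1, "urban": 1, "outcome": 0},
    "E": {"training": 0, "grants": 0, "urban": 1, "outcome": 0},
    "F": {"training": 1, "grants": 1, "urban": 0, "outcome": 0},
}


def members(model, cases):
    """Names of the cases that match every attribute value in the model."""
    return {name for name, case in cases.items()
            if all(case[attr] == value for attr, value in model.items())}


def evaluate(model, cases, outcome="outcome"):
    """Confusion-matrix counts and three common performance measures."""
    predicted = members(model, cases)
    tp = sum(1 for name in predicted if cases[name][outcome] == 1)
    fp = len(predicted) - tp
    fn = sum(1 for name, case in cases.items()
             if case[outcome] == 1 and name not in predicted)
    tn = len(cases) - tp - fp - fn
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": (tp + tn) / len(cases),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def best_model(cases, attributes, outcome="outcome", max_size=2):
    """Try every combination of present attributes; keep the most accurate."""
    best = None
    for size in range(1, max_size + 1):
        for attrs in combinations(attributes, size):
            model = {attr: 1 for attr in attrs}
            score = evaluate(model, cases, outcome)["accuracy"]
            if best is None or score > best[1]:
                best = (model, score)
    return best


# A manually designed model: training AND grants predict the outcome.
design = {"training": 1, "grants": 1}
result = evaluate(design, cases)

# Cases the model covers but where the outcome is absent (false positives).
false_positives = sorted(name for name in members(design, cases)
                         if cases[name]["outcome"] == 0)

print(result)
print(false_positives)
print(best_model(cases, ["training", "grants", "urban"]))
```

Listing the false positives is the point where cross-case analysis hands over to within-case analysis: those are the cases where the association fails, so they are good candidates for checking whether a causal mechanism really is at work.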
The workflow design assumes that “Association is a necessary but insufficient basis for a causal claim,” which is more useful than simply saying “Correlation does not equal causation.”
- Evaluating the impact of flexible development interventions using a ‘loose’ theory of change: Reflections on the Australia-Mekong NGO Engagement Platform summarises my argument on how program theory and predictive modelling can work together.
- Predictive Analytics and Data Mining: Concepts and Practice with RapidMiner, by Kotu and Deshpande (2014), is a very useful and detailed how-to-do-it reference.
- A Tale of Two Cultures: Qualitative and Quantitative Research in the Social Sciences, by Goertz and Mahoney (2012), is a very readable exposition of how cross-case and within-case analysis can be used for causal analysis, from a simple set theory perspective that fits with EvalC3.
aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.