AEA365 | A Tip-a-Day by and for Evaluators

TAG | sampling

I am Kate Cartwright, a 2016 AEA Minority Serving Institution Fellow and an Assistant Professor of Health Administration in the School of Public Administration at the University of New Mexico in Albuquerque. I study racial and ethnic health equity in regard to healthcare access, quality, and outcomes.

As an evaluator who values health equity, I am deeply concerned by how little funding and research prioritize the health of underrepresented and underserved populations. Researchers and evaluators alike can follow best practices in the field; too often, however, the “best” practices reify inequities, including practices that leave out underrepresented groups.

A provocative essay published in The Atlantic in the summer of 2016 investigates why health studies are frequently so white when our population is so diverse. The article offers several theories, but repeatedly reveals that best practices in research fail to hold researchers accountable for non-inclusive sampling strategies. A recent PLoS Medicine article notes that even though the 1993 National Institutes of Health (NIH) Revitalization Act mandates that federally funded clinical research prioritize the inclusion of women and minorities, the act has not yielded parity in clinical study inclusion (for example, less than 2% of National Cancer Institute funded cancer trials from 1993 to 2013 met the NIH inclusion criteria).

Lesson Learned: Design Inclusive Sampling Strategies

Evaluators must design evaluations with inclusive sampling strategies if they hope to improve the efficacy, effectiveness, and equity of their evaluations.

Hot Tip: Always Include the Community as a Stakeholder

In one workshop on culturally responsive evaluation I attended at Evaluation 2016, some participants lamented that they would like to be more inclusive of community members when evaluating community health programs, but that they had to respond to the priorities of their stakeholders first. Thankfully, we were in a session with a great leader who gently, but firmly, challenged them (and all of us) to remember that community members must be counted as primary stakeholders in all evaluations.

Rad Resources:

The American Evaluation Association is celebrating AEA Minority Serving Institution (MSI) Fellowship Experience week. The contributions all this week to aea365 come from AEA’s MSI Fellows. For more information on the MSI fellowship, see this webpage: http://www.eval.org/p/cm/ld/fid=230. Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest to aea365@eval.org. aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.

My name is Michael Quinn Patton and I am an independent evaluation consultant. Development of more-nuanced and targeted purposeful sampling strategies has increased the utility of qualitative evaluation methods over the last decade. In the end, whatever conclusions we draw and judgments we make depend on what we have sampled.

Hot Tip: Make your qualitative sampling strategic and purposeful — the criteria of qualitative excellence.

Hot Tip: Convenience sampling is neither purposeful nor strategic. Convenience sampling means interviewees are selected because they happen to be available, for example, whoever happens to be around a program during a site visit. While convenience and cost are real considerations, first priority goes to strategically designing the sample to get the most information of greatest utility from the limited number of cases selected.

Hot Tip: Language matters. Both terms, purposeful and purposive, describe qualitative sampling. My work involves collaborating with non-researchers who say they find the term purposive academic, off-putting, and unclear. So stay purposeful.

Hot Tip: Be strategically purposeful. Some label qualitative case selection “nonprobability sampling,” making explicit the contrast to probability sampling. This defines qualitative sampling by what it is not (nonprobability) rather than by what it is (strategically purposeful).

Hot Tip: A purposefully selected rose is still a rose. Because the word “sampling” is associated in many people’s minds with random probability sampling (generalizing from a sample to a population), some prefer to avoid the word sampling altogether in qualitative evaluations and simply refer to case selection. As always in evaluation, use terminology and nomenclature that is understandable and meaningful to primary intended users in their context.

Hot Tip: Watch for and resist denigration of purposeful sampling. One international agency stipulates that purposeful samples can only be used for learning, not for accountability or public reporting on evaluation of public sector operations. Only randomly chosen representative samples are considered credible. This narrow view of purposeful sampling limits the potential contributions of strategically selected purposeful samples.

Cool Trick: Learn purposeful sampling options. Forty options (Patton, 2015, pp. 266-272) mean there is a sampling strategy for every evaluation purpose.
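To make one of those options concrete, here is a minimal sketch (in Python) of how one of them, maximum variation (heterogeneity) sampling, might be operationalized when you have a roster of candidate sites and a few attributes you want your cases to vary on. The site names, attributes, and the greedy farthest-point rule are illustrative assumptions, not a procedure from Patton's text.

```python
# Illustrative sketch of maximum variation sampling: pick a small number of cases
# that differ as much as possible on attributes you care about. The candidate
# sites, attributes (scaled 0-1), and the greedy rule are assumptions for
# illustration only.
candidates = {
    "Site A": (0.9, 0.1, 0.1),   # e.g., (rural share, program size, program maturity)
    "Site B": (0.1, 0.8, 1.0),
    "Site C": (0.5, 0.3, 0.4),
    "Site D": (0.1, 0.9, 0.2),
    "Site E": (0.8, 0.2, 0.9),
}

def distance(a, b):
    """Euclidean distance between two attribute tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def max_variation_sample(cases, k):
    """Greedily pick k cases, each time adding the case farthest from those already chosen."""
    chosen = [next(iter(cases))]               # start from an arbitrary case
    while len(chosen) < k:
        remaining = [c for c in cases if c not in chosen]
        chosen.append(max(
            remaining,
            key=lambda c: min(distance(cases[c], cases[s]) for s in chosen),
        ))
    return chosen

print(max_variation_sample(candidates, 3))     # e.g., ['Site A', 'Site B', 'Site E']
```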

Lesson Learned: Be strategic and purposeful in all aspects of evaluation design, including especially qualitative case selection.

Rad Resources:

  • Patton, M.Q. (2014). Qualitative inquiry in utilization-focused evaluation. In Goodyear, L., Jewiss, J., Usinger, J., & Barela, E. (Eds.), Qualitative inquiry in evaluation: From theory to practice (pp. 25-54). Jossey-Bass.
  • Patton, M.Q. (2015). Qualitative research and evaluation methods (4th ed.). Sage Publications.
  • Patton, M.Q. (2014). Top 10 Developments in Qualitative Evaluation for the Last Decade.


The American Evaluation Association is celebrating Qualitative Evaluation Week. The contributions all this week to aea365 come from evaluators who do qualitative evaluation. Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest to aea365@eval.org. aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.

Hi, my name is Nicole Vicinanza and I’m a Senior Research Associate with JBS International, a consulting firm. In my consulting work, one of my roles is to provide technical assistance in evaluation to community-based organizations, government programs, and service providers whose primary job is not evaluation.

Hot Tip: How do you explain random sampling to folks for whom sampling is a new requirement? Try using candy to show your clients the impact that different approaches and sample sizes can have. I’ve done this with groups using different-colored Hershey’s Kisses in paper bags, but any similarly shaped items in different colors will do. It lets folks see the impact of different samples in concrete ways, cheaply, quickly, and edibly.

  • First, set up the bags with one color (e.g., silver) representing the most common type of respondent, and other colors (e.g., purple and tan/caramel) in much smaller numbers to represent respondents who have specific issues or problems. Make the total number of “respondents” and “problems” easy to remember, or write down what you’ve put in each bag.
  • Introduce the issue of sampling to the group and hand out the bags (to individuals or table groups), but don’t tell them the proportion or type of “problems” in their bag.
  • Use the bags to try out different approaches to sampling and different sample sizes. Folks can look and quickly pick what they think is a representative sample, draw “blind” samples of different sizes, or pull larger or smaller samples from different bags.
  • After they’ve pulled their first sample, tell them what the different colors represent, then discuss how that knowledge might change the size of the sample they would pull.
  • Try different random sample sizes, and record the results on worksheets or flip charts.
  • Once you’ve finished trying samples of different sizes, tell them the proportions of the different “problems” in their bags. How close did the different sample sizes come to the actual proportions?
  • Discuss which approaches and sample sizes worked best for different information needs (e.g., uncovering different types of problems vs. estimating proportions of problems).
  • Talk about how moving from candies to sampling real people may affect the results, which can lead into a discussion of non-response bias, how people with different issues may respond differently to your data collection, and cost.

Note: If folks can see what they’re pulling, they may bias the random samples; try having one person hold the bag and another pull with their eyes shut. Also, don’t start eating until you’ve pulled the last sample; otherwise your pre-set proportions will get thrown off!
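If you want to preview how the exercise tends to play out before you buy the candy, a quick simulation can stand in for the bags. This is a minimal sketch and not part of the original exercise; the bag composition and sample sizes below are made up and should be matched to whatever you actually put in your bags.

```python
# Simulate the candy-bag exercise: draw blind samples of different sizes from a bag
# with known proportions and compare the sample proportions to the truth.
import random
from collections import Counter

# Hypothetical bag: 60 silver ("typical" respondents), 8 purple and 6 caramel
# ("respondents with specific issues"). Adjust to match your real bags.
bag = ["silver"] * 60 + ["purple"] * 8 + ["caramel"] * 6
true_props = {color: round(n / len(bag), 2) for color, n in Counter(bag).items()}

for sample_size in (5, 15, 30):
    draw = random.sample(bag, sample_size)            # a blind draw without replacement
    sample_props = {c: round(n / sample_size, 2) for c, n in Counter(draw).items()}
    print(f"n={sample_size:>2}  sample proportions: {sample_props}")

print(f"true proportions: {true_props}")
```

Running it a few times makes the workshop’s point quickly: small samples often miss the rare “problem” colors entirely, while larger samples track the true proportions more closely.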

The American Evaluation Association is celebrating Best of aea365 week. The contributions all this week are reposts of great aea365 blogs from our earlier years. Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest to aea365@eval.org. aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.

 


Hello! I’m Elizabeth Rupprecht, an Industrial Organizational Psychology graduate student at Saint Louis University. I would like to tell you more about a great resource for collecting national or international evaluation data: Amazon’s Mechanical Turk (mTurk). mTurk is normally used to provide organizations with assistance completing tasks. Typically, an organization will set up a “task,” such as transcribing one minute of audio. Then, mTurk posts this task for any interested mTurk “workers” to complete. After the organization reviews the work done, the worker is paid between one cent and a dollar, depending on the complexity and length of the task. In a recent article, researchers noted that mTurk provides I/O psychologists with a large and diverse sample of working adults from across the country for research on topics such as crowdsourcing, decision-making, and leadership (Buhrmeister et al., 2011). mTurk could also be useful for evaluations needing sizable and diverse samples. For example, in the case of policy analysis, mTurk could be used to read the pulse of American voters on specific governmental policies. For consumer-oriented evaluation, mTurk could help researchers obtain a convenient, diverse, and large sample of consumers to assess products or services.

Rad Resource: Even though mTurk may seem too good to be true, research published in Judgment and Decision Making has found that the range of participants found on mTurk is representative of the US population of Internet users. In addition, 70-80% of users are from the US (Paolacci et al., 2010).

Cool Trick: mTurk has its own survey tools, but it also allows you to add a link to an external assessment tool, which increases speed and allows for advanced functionality, such as the ability to export directly into third-party statistics programs (SPSS, SAS, Excel, etc.).

Hot Tip: As my colleague Lacie Barber discussed in her aea365 contribution, implementing quality control checks in surveys can help improve the quality of data. In my experience using mTurk, you need to specify your target population both in the mTurk advertisement/recruitment statement for the “workers” and in the actual survey. Weeding out participants who overlook your specifications in the advertisement is vital! If the “workers” do not follow your specifications, or do not complete their “task” (i.e., your survey), you do not need to pay them.
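As a concrete illustration of that weeding-out step, here is a minimal sketch of filtering a downloaded batch of responses before approving payment. The file name and column names (worker_id, currently_employed, attention_check) are hypothetical placeholders, not part of mTurk’s own export format.

```python
import pandas as pd

# Hypothetical export of survey responses linked back to mTurk worker IDs;
# the file and column names are assumptions for illustration.
responses = pd.read_csv("mturk_survey_export.csv")

# Keep only workers who met the stated eligibility criterion (e.g., currently
# employed) and answered the embedded attention-check item correctly.
eligible = responses[
    (responses["currently_employed"] == "yes")
    & (responses["attention_check"] == "strongly agree")
]

# Worker IDs to approve for payment; the rest can be reviewed against your
# recruitment statement before rejecting.
print(eligible["worker_id"].tolist())
```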

Only time will tell if mTurk becomes a highly used engine for social science and evaluation research, but at this moment, it seems like the hot new type of convenience sample!

Buhrmeister, M., et al. (2011). Amazon’s Mechanical Turk: A new source of inexpensive, yet high-quality, data? Perspectives on Psychological Science, 6(3), 3-5.

Paolacci, G., et al. (2010). Running experiments on Amazon Mechanical Turk. Judgment and Decision Making, 5(5), 411-419.

The American Evaluation Association is celebrating Society for Industrial & Organizational Psychology (SIOP) Week with our SIOP colleagues. The contributions all this week to aea365 come from our SIOP members, and you may wish to consider subscribing to our weekly headlines and resources list, where we’ll be highlighting SIOP resources. Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice.


Hi, we are Mende Davis (assistant research professor) and Mei-kuang Chen (advanced graduate student) in the Department of Psychology at the University of Arizona. We are also members of the Evaluation Group for Analysis of Data (EGAD), led by Lee Sechrest. G*Power is a useful tool for estimating the minimum sample size or the likely power of a potential study. In our own work as researchers applying for numerous grants, G*Power has been a handy tool.

Hot Tip: An evaluation without enough cases may not be able to answer the research questions. You don’t want to be in the position of telling stakeholders that the study was only powered to detect a real difference 15% of the time. Power analysis is used to estimate the number of cases needed to detect a true difference if it exists. Three things are needed for a run-of-the-mill power analysis: alpha (the probability of committing a type I error, i.e., rejecting a “true” null hypothesis), beta (the probability of committing a type II error, i.e., accepting a “false” null hypothesis), and the expected effect size. Alpha is the familiar type I error rate that is often set at .05 (p = .05, meaning you are willing to accept a false positive one time out of twenty). Beta is related to the statistical power you select (power = 1 - beta), which is often set at .80; this means you want to be able to detect a real difference 80% of the time. The effect size is the strength of the relationship between two variables (e.g., the amount of change you expect in your outcome variable). Effect sizes are usually reported in standardized units, such as r, f2, or odds ratios. Pilot studies and the literature can help us make an educated guess about the effect size, and checking the literature for an effect size can be a real eye opener.

With the statistical analysis to be used (e.g., a t-test or regression equation), plus the levels of alpha and beta and the estimated effect size at hand, you can use G*Power to estimate the minimum sample size. If you already know the available sample size, G*Power can instead tell you how much statistical power you would have in your study.
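G*Power is a point-and-click tool, but the same calculations can be scripted. As a minimal illustration (using Python’s statsmodels power module rather than G*Power itself), the sketch below uses the conventional values mentioned above, a medium effect size of d = 0.5, alpha = .05, and power = .80; none of these numbers come from any particular study.

```python
# Power-analysis sketch for an independent-samples t-test using statsmodels,
# as an alternative to G*Power's GUI. Effect size, alpha, and power values
# are the conventional defaults discussed above, not study-specific numbers.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Solve for the minimum sample size per group given effect size, alpha, and power.
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80,
                                    alternative="two-sided")
print(f"Minimum n per group: {n_per_group:.1f}")   # roughly 64 per group

# Or, if the sample size is fixed (say 30 per group), solve for the power you'd have.
achieved_power = analysis.solve_power(effect_size=0.5, nobs1=30, alpha=0.05,
                                       power=None)
print(f"Power with n = 30 per group: {achieved_power:.2f}")
```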

Rad Resource: G*Power is free. Where can you get it? The newest version, G*Power 3, can be obtained at http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3/ and the G*Power 2 manual, which is still useful for G*Power 3, can be found at http://bit.ly/GPower2Manual.

Rad Resource: How you calculate power and the required sample size depends on which statistical test you will use in your study. Some background on power analysis is helpful for evaluators: http://www.statsoft.com/textbook/power-analysis/



