AEA365 | A Tip-a-Day by and for Evaluators

CAT | Quantitative Methods: Theory and Design

Hello! I’m Judy Savageau from the Center for Health Policy and Research at UMass Medical School following up on yesterday’s post with Part II of basic data analyses. A number of posts outlining statistical/analytic details are in AEA365’s archives. For example, there are some great posts on “Readings for Numbers People (or Those Who Wish They Were)”, “Starting a Statistics [Book] Club”, and “Explaining Statistical Significance”. These posts discuss multivariate modeling, longitudinal data analysis, propensity score matching, factor analysis, structural equation modeling, and more. But what defines multivariate analyses and how do they differ from bivariate analyses?

Hot Tip:

Decisions about bivariate statistics (i.e., assessing the relationship between 2 variables; e.g., gender and school performance) are made based on the ‘type’ of data (e.g., categorical vs continuous; see yesterday’s Part I post). There are many reputable resources that show simple tables for determining which statistic to use (see Rad Resources below), including:

  • Chi-square test: 2 categorical variables (e.g., program participation: yes/no and job type)
  • T-test: 1 categorical variable with 2 levels (e.g., gender: male/female) and 1 continuous variable (e.g., IQ, SAT scores)
  • ANOVA – Analysis of Variance: 1 categorical variable with 3 or more levels (e.g., program performance: low / moderate / high) and 1 continuous variable (e.g., years of education)
  • Correlation coefficient: 2 continuous variables (e.g., years of employment and number of correct responses to knowledge about job-related standards)
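The lookup described in the list above can be sketched in a few lines of Python. This is only an illustration: the function name `choose_bivariate_test` and the `(kind, n_levels)` tuples are hypothetical conventions invented for this sketch, not part of any statistics package.

```python
def choose_bivariate_test(var_a, var_b):
    """Suggest a bivariate statistic from the types of two variables.

    Each argument is a (kind, n_levels) tuple, where kind is "categorical"
    or "continuous" and n_levels is None for continuous variables.
    (Hypothetical convention used only for this sketch.)
    """
    for kind, _ in (var_a, var_b):
        if kind not in ("categorical", "continuous"):
            raise ValueError(f"unknown variable kind: {kind!r}")

    kinds = sorted([var_a[0], var_b[0]])
    if kinds == ["categorical", "categorical"]:
        return "chi-square test"
    if kinds == ["continuous", "continuous"]:
        return "correlation coefficient"

    # One categorical and one continuous variable: the number of levels decides.
    categorical = var_a if var_a[0] == "categorical" else var_b
    levels = categorical[1]
    if levels == 2:
        return "t-test"
    if levels is not None and levels >= 3:
        return "ANOVA"
    raise ValueError("a categorical variable needs at least 2 levels")


# Gender (2 levels) and SAT score (continuous)
print(choose_bivariate_test(("categorical", 2), ("continuous", None)))
```

A table like this is a starting point only; it does not check the assumptions behind each test.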

Hot Tip:

Finally, use multivariate analyses when you want to look at a large number of variables and their relationship (collectively) to one outcome. The most appropriate multivariate statistic depends, in large part, on the categorical or continuous nature of the outcome variable. For example, in one federally-funded study assessing the multiple factors related to return to work after a work-related injury (e.g., severity of injury, years until anticipated retirement, pre-injury job satisfaction, employer assessment of re-injury potential, etc.), our outcome variable was ‘return to work’ measured in multiple ways:

  • Categorical measure: return to work – Yes/No. Which factors are most predictive of whether or not a person with a work-related injury will come back to work might best be explored using logistic regression.
  • Continuous measure: time to return to work (in weeks). How quickly a person might return to work following a work-related injury might best be explored using linear regression.
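The same decision can be written as a rough Python sketch that looks only at how the outcome variable was recorded. The function name `suggest_model` and the sample values are hypothetical, and a real analysis plan would weigh much more than the outcome's type.

```python
def suggest_model(outcome_values):
    """Suggest a regression family from the recorded outcome values.

    Assumption for this sketch: a Yes/No (or 0/1) outcome points to logistic
    regression, and any other numeric outcome points to linear regression.
    """
    if not outcome_values:
        raise ValueError("no outcome values supplied")

    distinct = set(outcome_values)
    if distinct <= {"Yes", "No"} or distinct <= {0, 1}:
        return "logistic regression"

    # bool is excluded so that True/False data is not mistaken for a measurement
    if all(isinstance(v, (int, float)) and not isinstance(v, bool)
           for v in outcome_values):
        return "linear regression"

    raise ValueError("outcome is neither binary nor numeric")


returned = ["Yes", "No", "Yes", "Yes"]      # hypothetical: returned to work?
weeks_to_return = [6, 12.5, 3, 20]          # hypothetical: weeks until return

print(suggest_model(returned))
print(suggest_model(weeks_to_return))
```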

There are many decisions to be made when developing a data analysis plan. I’m hoping that this 2-part introduction to the basics of statistical analyses gets you started in thinking about the best way to explore and analyze your quantitative data. Of course, having a statistician/data analyst sitting ‘at the table’ with the team as early as possible will ensure that you collect data in the best format to answer your research questions.

Rad Resources:

Here are just a couple of web pages that help with some decision-making about when it’s most appropriate to choose one statistical test over another – depending on the type of data you have.

Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest to aea365@eval.org . aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.

Hello! I’m Judy Savageau from the Center for Health Policy and Research  at UMass Medical School. A recent post from Pei-Pei Lei, my colleague in our Office of Survey Research, introduced some options for statistical programming in R. I wondered whether a basic introduction to statistics might be in order for those contemplating ‘where do I begin’, ‘what statistics do I need to compute’, and ‘how do I choose the appropriate statistical test’. While most AEA365 blogs don’t cover every topic in detail, perhaps a basic 2-part introduction will help here. Analyses are very different with qualitative versus quantitative data; thus, I’ve concentrated on the quantitative side of statistical computations.

Hot Tip:

Analyses fall into 3 general categories: descriptive, bivariate, and multivariate; they’re typically computed in that order as we:

  • explore our data (descriptive analyses) with frequencies, percentile distributions, means, medians, and other measures of ‘central tendency’;
  • begin to look at associations between an independent variable (e.g., age, gender, level of education) and an outcome variable (e.g., knowledge, attitudes, skills; bivariate analyses); and
  • try to identify a set of factors that might be most ‘predictive’ of the outcome of interest (multivariate analyses).

Hot Tip:

The decision about what statistical test to use to describe data and its various relationships depends on the ‘nature’ of the data. Is it:

  • Categorical data:
    • nominal; e.g., gender, race, ethnicity, smoking status, participation in a program: yes/no;
    • ordinal: e.g., a Likert-type scale score of 1=Strongly disagree to 5=Strongly agree or 5 levels of education: ‘Less than high school’, ‘High school graduate/GED’, ‘Some college/Associate degree’, ‘College graduate – 4-year program’, and ‘Post-graduate (Masters or PhD degree)’;
    • interval: ordinal data in fixed/equal-sized categories; e.g., age groups in 10-year intervals or salary in $25,000 intervals; or is it:
  • Continuous data:
    • For example: age, years of education, days of school missed due to asthma exacerbations, etc.

Of course, data are often collected in one mode and then ‘collapsed’ for particular analyses (e.g., age recoded into meaningful age groups, Likert-type scales recoded as ‘agree’/‘neutral’/‘disagree’).
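Here is a minimal sketch of that kind of collapsing in Python. It assumes ages are whole numbers and that the Likert items use the 1 to 5 coding described above; the helper names `age_group` and `collapse_likert` are made up for illustration.

```python
def age_group(age, width=10):
    """Recode a whole-number age into a fixed-width group label, e.g. 45 -> '40-49'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


# Assumed coding: 1 = Strongly disagree ... 5 = Strongly agree
LIKERT_COLLAPSE = {1: "disagree", 2: "disagree", 3: "neutral", 4: "agree", 5: "agree"}


def collapse_likert(score):
    """Collapse a 5-point Likert score into agree / neutral / disagree."""
    return LIKERT_COLLAPSE[score]


ages = [23, 37, 45, 61]          # hypothetical respondents
scores = [1, 3, 4, 5]

print([age_group(a) for a in ages])
print([collapse_likert(s) for s in scores])
```

Keeping the original variables alongside the recoded ones makes it possible to try a different grouping later.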

Hot Tip:

Decisions must take into consideration whether the data are ‘normally distributed’ (e.g., is there ‘skewness’ in the data such that most ages are under 45, with only a small number of people in their 60s, 70s, and 80s?). Many commonly used statistical tests rest on underlying assumptions that must be met, often starting with the data being approximately normally distributed. Thus, one typically begins by looking descriptively at the data: frequencies and percentile distributions, means, medians, and standard deviations. Sometimes, graphing the data shows the ‘devil in the detail’ with regard to how data are distributed. There are some statistics one can compute to measure the degree of skewness in the data and whether distributions are significantly different from ‘normal’. And, if the data are not normally distributed, there are several non-parametric statistics that can be computed to take this into account.
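As a rough illustration of that first descriptive look, the sketch below uses Python's built-in `statistics` module and a simple moment-based skewness coefficient (third central moment divided by the second central moment raised to the power 1.5). The ages are invented for the example, and the helper name `describe` is hypothetical; statistical packages often report a slightly different, bias-adjusted skewness.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics plus a moment-based skewness."""
    n = len(values)
    mean = statistics.mean(values)
    # Second and third central moments (population form)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "sd": statistics.stdev(values),   # sample standard deviation
        "skewness": skewness,             # > 0 suggests a long right tail
    }


# Hypothetical ages: mostly under 45, with a few older respondents
ages = [25, 28, 31, 33, 36, 38, 41, 44, 67, 82]
summary = describe(ages)
print(summary)
```

A mean well above the median, together with a positive skewness, is the pattern described above: a few older respondents pulling the distribution to the right.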

Tomorrow’s post will focus on bivariate and multivariable statistics. Stay tuned!

My name is Samantha Theriault, and I am the Research Assistant at Randi Korn & Associates (RK&A), a research, evaluation and intentional planning company that specializes in museums and informal learning.

Lesson Learned: As a research assistant, I spend much of my time entering and processing data. Data entry and clean-up are time consuming and challenging, and they make me feel like a worm turning dirt among the roots of our work. From this point of view, I’m watching projects grow from the ground up, so it’s even more exciting to see the final product. Preparing quantitative and qualitative data for analysis isn’t glamorous, but it is invigorating and contributes to the success of all our work. Evaluation runs on data, so keeping it organized from the beginning is vital.

Hot Tip: Set aside chunks of time to spend with data. Depending on the size of your project and methods, data entry can take a few hours, several days, or even weeks! Prioritizing data entry, rather than “squeezing it in” between tasks, increases my comfort with the data set and puts me in a rewarding flow state. With mindfulness during data entry, I notice patterns as they emerge. For example, I recently struggled to interpret a participant’s shorthand on a question about which neighborhood they live in, but noticed others spelled out all the words in their responses. I matched the shorthand to the full neighborhood name (and double-checked with Google!), which I would have missed were I entering data mindlessly or too quickly. Similarly, I might notice other trends, such as reduced responses to a certain question, which I flag for examination later.

Cool Trick: Create a “living” data entry processing handbook. When I was first learning to process data using SPSS, I created a “cleaning up data files checklist”, and I add unique tasks and tips to it each time I work on a new data set. My checklist includes recoding system-missing responses, ensuring that survey responses follow skip logic, and reminders such as “slow down and double-check your work!” Since my colleagues depend on these data sets to do their work, I include their needs in my checklist, too: spell out acronyms, label variables, and delete “working” variable labels I created while collapsing categories into single columns. I also create a log for each digital survey’s lifecycle, documenting any changes to the museum’s exhibition or program during data collection, and quirks to remember when it’s time to process the data for analysis. This detailed approach to record-keeping is especially useful when my colleagues have specific questions about the data during analysis.
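Two of those checklist items, recoding missing responses and checking skip logic, could look something like this in Python. This is only a sketch of the idea, not the SPSS workflow described above: the field names, the `-99` missing code, and the skip rule (only returning visitors are asked for the year of their last visit) are all hypothetical.

```python
MISSING = -99  # hypothetical code for a missing response


def recode_missing(record):
    """Replace blank or absent answers with an explicit missing code."""
    return {key: (MISSING if value in (None, "") else value)
            for key, value in record.items()}


def skip_logic_violations(records):
    """Return ids of respondents who answered a question they should have skipped.

    Assumed rule: 'last_visit_year' is only asked when 'visited_before' is 'yes'.
    """
    return [r["id"] for r in records
            if r["visited_before"] != "yes" and r["last_visit_year"] != MISSING]


raw = [
    {"id": 1, "visited_before": "yes", "last_visit_year": 2015},
    {"id": 2, "visited_before": "no", "last_visit_year": None},
    {"id": 3, "visited_before": "no", "last_visit_year": 2014},  # should be blank
    {"id": 4, "visited_before": "", "last_visit_year": ""},
]

cleaned = [recode_missing(r) for r in raw]
flagged = skip_logic_violations(cleaned)
print(flagged)
```

Flagged ids are a prompt to go back to the original survey, not a reason to change the data automatically.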

Rad Resource: Microsoft OneNote is a password-protected digital notebook that I use to keep track of my data cleaning process (and many other elements of managing data collection and processing). I like that I can repurpose checklists and save relevant files on the same page – it’s like a 4D Moleskine to me!

The American Evaluation Association is celebrating Labor Day Week in Evaluation: Honoring the WORK of evaluation. The contributions this week are tributes to the behind the scenes and often underappreciated work evaluators do.

Greetings AEA community, I’m Pei-Pei Lei, a biostatistician in the Office of Survey Research at the University of Massachusetts Medical School. Have you been looking to expand your skill set in statistical programming? Have you wondered if R is the appropriate statistical software package for your needs? The purpose of this post is to help you decide whether R is right for you and, if so, how you can get started using it.

R may be the right tool if you:

  • Need to manage and/or analyze quantitative data
  • Are looking for a free alternative to commercial software packages, such as SAS, SPSS, and Stata
  • Don’t mind writing computer code – does print("Hello, world!") look easy enough to you?
  • Want to create nice-looking and informative figures and graphics (see this website for example)

If you’re not sure, here are some places for you to get a feel for R language:

  • TryR: This website provides online interactive step-by-step practice on the webpage
  • DataCamp: This website provides online interactive step-by-step practice (more material than TryR)

Hot Tips:

The following is a list of MOOCs (Massive Open Online Courses) that can help you learn R for free (or pay a fee for a verified certificate):

  • R programming on Coursera: It’s a 4-week course to go through basic R programming knowledge. It provides a weekly quiz and a final project for you to test your skills. Good for beginning to intermediate users.
  • Introduction to R for Data Science on edX: It’s a self-paced 4-week course to go through basic R programming knowledge. This course is using DataCamp for class materials and exercises. Good for beginners.
  • R Basics – R Programming Language Introduction on Udemy: This is a self-paced course that goes through basic set up such as downloading the software and coding. Good for beginners.
  • Data Analysis with R on Udacity: This course takes about 2 months to finish (it’s also part of the Data Analyst nanodegree program). Its tutorial videos show coding processes in RStudio. Good for beginning to intermediate users.

You can also install the Swirl R package to learn R in R. It gives you interactive instructions for different topics. This is good for intermediate users.

Rad Resources:

  • R-bloggers: This is a repository of R-related articles, including tutorials. You can subscribe to the mailing list to receive the latest articles.
  • Stack Overflow: This is a forum where you can post your question and get answers, or even better, provide answers to others’ questions!

Lessons Learned:

Don’t be intimidated by the many choices you have in learning R. They are the means to reach your goal. So pick one that you like and get started!

We are Lisa Holliday and Olivia Stevenson. We are data architects with The Evaluation Group where we have recently begun to transition to R for data analysis.  R is a free program for statistical analysis that is powerful, but can have a steep learning curve if you want to utilize its scripting capabilities.

Is R worth it for evaluators?

In short, yes! R makes advanced analyses, such as propensity score models, social network analyses, and hierarchical linear models, possible with just a few lines of code. R opens possibilities with data visualizations that are highly customizable while maintaining fast reproducibility. R is an essential tool for all evaluators, especially as the demand on evaluators for more rigorous designs and analyses continues to grow.

Where can I start?

The Rcmdr package offers a point and click interface that makes R much more user-friendly.  Not only is it easy to install, but it also generates R scripts, which can be saved, modified, and re-run. This can be a big time-saver!

Cool Trick 1:  The Latest Version of R

To get started, you will need to install the latest version of R, which can be found here.  The most recently released version will appear in the “News” feed.  You should also install RStudio Desktop.  RStudio is an integrated development environment (IDE) that makes working with R and installing R packages quick and easy.  When you want to use R, open RStudio to get started.

Cool Trick 2: Installing Rcmdr

Within RStudio, select “Install” from the “Packages” pane in the lower right hand corner of your screen.  In the pop-up window, enter “Rcmdr” and select “Install.”

Cool Trick 3: Using Rcmdr

Once you have installed Rcmdr, all you will need to do is select it from the “Packages” pane.  While it will open automatically, the first time you use it, you may receive a message that you need to download additional packages.  If this happens, approve the installations, and Rcmdr will open in a new window when this process is complete.

Rad Resources: Rcmdr Training

There are a lot of great resources available on how to get started with Rcmdr.  Here is a brief introduction to Rcmdr that includes how to import data.  A good introduction to using RStudio (and R in general) is Lynda’s “Getting Started with the R Environment,” which you may be able to view for free through your public library.

Hi, we are Pei-Pei Lei and Carla Hillerns from the Office of Survey Research at the University of Massachusetts Medical School’s Center for Health Policy and Research. The other day, we asked each other what one analysis tool is most vital to our quantitative survey work. We agreed on the answer – a banner table.

A banner table is a simple thing, really – just a set of crosstabs – but it’s so useful in analysis. For example, the table below shows how people of different ages and insurance types differ in their experiences with their doctors. By displaying all our key variables in one view, a banner table helps us to visualize stories from the data. It allows us to understand if subgroups of our respondents have different behaviors/opinions without having to run multiple analyses.

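[Image: example banner table showing how often doctors listened carefully to respondents, by age and insurance type]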

Hot Tip: We’ve used age and insurance type as our banner points in the table above. Both were collected as part of the same survey that asked respondents how often their doctors listened carefully to them. However, you can use multiple sources to create banner points, such as background data on the sample or previous waves of the survey.

Hot Tip: In setting up your tables, incorporate statistical test results so you can communicate statistically significant differences easily. In the above table, superscripts indicate statistically significant differences between banner points at the 95% confidence level.

Hot Tip: There are plenty of crosstab software packages that can create a large number of banner tables easily, but they usually come with a fee. If you have a limited budget or a small dataset, consider creating your banner tables through tools you already have. Here are a few software packages you might have and how you can create your banner tables with them:

  • Excel: pivot table
  • R: table function
  • SAS: PROC TABULATE procedure
  • SPSS: CROSSTABS command
  • Stata: table command
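If none of those packages is at hand, the core of a banner table (a crosstab of one question against a banner point) can be sketched with Python's standard library. The respondent records and the helper name `crosstab` below are hypothetical and exist only to show the shape of the result.

```python
from collections import Counter


def crosstab(records, row_key, col_key):
    """Count records for every combination of a row variable and a banner point."""
    counts = Counter((r[row_key], r[col_key]) for r in records)
    rows = sorted({row for row, _ in counts})
    cols = sorted({col for _, col in counts})
    # Counter returns 0 for combinations that never occur
    return {row: {col: counts[(row, col)] for col in cols} for row in rows}


# Hypothetical survey responses: how often did the doctor listen carefully?
records = [
    {"age_group": "18-44", "listened": "Always"},
    {"age_group": "18-44", "listened": "Always"},
    {"age_group": "18-44", "listened": "Sometimes"},
    {"age_group": "45+", "listened": "Always"},
    {"age_group": "45+", "listened": "Sometimes"},
    {"age_group": "45+", "listened": "Sometimes"},
]

table = crosstab(records, row_key="listened", col_key="age_group")
for answer, by_age in table.items():
    print(answer, by_age)
```

Repeating the call with a different `col_key` (for example, insurance type) and placing the results side by side gives the multi-column banner layout.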

Rad Resources:

  • Want more information on banner tables? Check out these websites for more details and examples:

http://www.greenbook.org/marketing-research/anatomy-of-a-crosstab-03377

http://www.statpac.net/crosstabs-software.htm

  • Are you a Qualtrics user? Here’s a helpful guide to creating crosstabs using your survey software:

https://www.qualtrics.com/support/survey-platform/data-and-analysis-module/cross-tabulation/cross-tabulation-overview/

  • Are you a Confirmit user? Confirmit has a built-in tool, Instant Analytics, for creating banner tables:

http://betatesterconfirmitcommunity.ning.com/discussions/instant-analytics-now-available-to-all-confirmit-professional-use?context=category-Instant+Analytics

My name is Ama Nyame-Mensah, and I am a doctoral student in the Social Welfare program at the University of Pennsylvania.

Likert scales are commonly used in program evaluation. However, despite their widespread popularity, Likert scales are often misused and poorly constructed, which can result in misleading evaluation outcomes. Consider the following tips when using or creating Likert scales:

Hot Tip #1: Use the term correctly

A Likert scale consists of a series of statements that measure individuals’ attitudes, beliefs, or perceptions about a topic. For each statement (or Likert item), respondents are asked to choose one option from a list of ordered response choices that best aligns with their view. Numeric values are assigned to each answer choice for the purpose of analysis (e.g., 1 = Strongly Disagree, 4 = Strongly Agree). Each respondent’s responses to the set of statements are then combined into a single composite score/variable.
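As a small illustration of that last step, here is one way to turn a respondent's answers into a composite score in Python. It assumes a 4-point coding with the two middle labels filled in as "Disagree" and "Agree", and it simply sums the item values; the function name is made up for this sketch.

```python
# Assumed coding for a 4-point scale (middle labels are an assumption)
RESPONSE_VALUES = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Agree": 3,
    "Strongly Agree": 4,
}


def composite_score(responses):
    """Sum the numeric values of one respondent's answers to all Likert items."""
    return sum(RESPONSE_VALUES[answer] for answer in responses)


# One hypothetical respondent answering three items
answers = ["Agree", "Strongly Agree", "Disagree"]
print(composite_score(answers))  # 3 + 4 + 2 = 9
```

This sketch does not reverse-score negatively worded items, which many published scales require before summing.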

Hot Tip #2: Label your scale appropriately

To avoid ambiguity, assign a “label” to each response option. Make sure to use ordered labels that are descriptive and meaningful to respondents.

Hot Tip #3: One statement per item

Avoid including items that consist of multiple statements, but only allow for one answer. Such items can confuse respondents and introduce unnecessary error into your data. Look for the words “and” and “or” as a signal that an item may be double-barreled.

Hot Tip #4: Avoid multiple negatives

Rephrase negative statements into positive ones. Statements with multiple negatives are confusing and difficult to interpret.

Hot Tip #5: Keep it balanced

Regardless of whether you use an odd or even number of response choices, include an equal number of positive and negative options for respondents to choose from because an unbalanced scale can produce response bias.

Hot Tip #6: Provide instructions

Tell respondents how you want them to answer the question. This will ensure that respondents understand and respond to the question as intended.

Hot Tip #7: Pre-test a new scale

If you create a Likert scale, pre-test it with a small group of coworkers or members of your target population. This can help you determine whether your items are clear, and your scale is reliable and valid.

The Likert scale and items used in this blog post are adapted from the Rosenberg Self-Esteem Scale.

My name is Spectra Myers, and I am a graduate student in Organizational Leadership, Policy, and Development at the University of Minnesota, working on a Master’s in Evaluation Studies.

I have been working closely with a Minneapolis agency addressing homelessness among youth on a service evaluation. The project included fielding a paper survey, data entry and analysis. The agency was thrilled with the actionable information generated and program changes suggested by staff as a result of a survey. They committed to fielding the survey twice a year to track their progress and generate further insights for program improvements. The only problem: their staff does not have training in, or access to, programs like SPSS or R to generate the descriptive statistics needed for analysis. Even Excel seemed too cumbersome for their needs.

Rad Resource: Statwing is an online subscription data analysis program with straightforward data uploading and intuitive features. It automatically codes missing data; generates descriptive statistics; notes p-values, effect sizes, confidence intervals; and it even includes basic regression. They have a free 14-day trial and offer monthly and annual plans at www.statwing.com.

Hot Tip: Want to share data with collaborators or clients? Statwing makes it easy to generate a link that you can share to provide read-only access, regardless of whether others have an account.

Lesson Learned: Commit to supporting your clients’ learning process in data analysis. Just because Statwing is intuitive doesn’t mean you’re off the hook. It still takes some knowledge of statistics to know when or when not to use the provided features, including the suggested statistical analyses and visualizations. I’m taking the approach of analyzing the second round of data with agency staff to ensure a successful transition.

Hi, I’m Ama Nyame-Mensah. I am a doctoral student at the University of Pennsylvania’s School of Social Policy & Practice. In this post, I will share with you some lessons learned about incorporating demographic variables into surveys or questionnaires.

For many, the most important part of a survey or questionnaire is the demographics section. Not only can demographic data help you describe your target audience, but it can also reveal patterns in the data across certain groups of individuals (e.g., by gender or income level). So asking the right demographic questions is crucial.

Lesson Learned #1: Plan ahead

In the survey/questionnaire design phase, consider how you will analyze your data by identifying relevant groups of respondents. This will ensure that you collect the demographic information you need. (Remember: you cannot analyze data you do not have!)

Lesson Learned #2: See what others have done

If you are unsure of what items to include in your demographics section, try searching through AEA’s Publications or Google Scholar for research/evaluations being done in a similar area. Using those sources, you can locate links to specific tools or survey instruments that use demographic questions that you would like to incorporate into your own work.

Lesson Learned #3: Let respondents opt out

Allow respondents the option of opting out of the demographics section in its entirety, or, at the very least, make sure to add a “prefer not to answer” option to all demographic questions. In general, it is good practice to include a “prefer not to answer” choice when asking sensitive questions because it may make the difference between a respondent skipping a single question and discontinuing the survey altogether.

Lesson Learned #4: Make it concise, but complete

I learned one of the best lessons in survey/questionnaire design at my old job. We were in the process of revamping our annual surveys, and a steering committee member suggested that we put all of our demographic questions on one page. Placing all of your demographic questions on one page will not only make your survey “feel” shorter and flow better, but it will also push you to think about which demographic questions are most relevant to your work.

Collecting the right demographic data in the right way can help you uncover meaningful and actionable insights.

We are Carla Hillerns and Pei-Pei Lei from the Office of Survey Research at the University of Massachusetts Medical School’s Center for Health Policy and Research. We’d like to discuss a common mistake in surveys – double-barreled questions. As the name implies, a double-barreled question asks about two topics, which can lead to issues of interpretation as you’re not sure if the person is responding to the first ‘question’, the second ‘question’ or both. Here is an example:

Was the training session held at a convenient time and location?          Yes          No

A respondent may have different opinions about the time and location of the session but the question only allows for one response. You may be saying to yourself, “I’d never write a question like that!” Yet double barreling is a very easy mistake to make, especially when trying to reduce the overall number of questions on a survey. We’ve spotted double (and even triple) barreled questions in lots of surveys – even validated instruments.

Hot Tips: For Avoiding Double-Barreled Questions:

  1. Prior to writing questions, list the precise topics to be measured. This step might seem like extra work but can actually make question writing easier.
  2. Avoid complicated phrasing. Using simple wording helps identify the topic of the question.
  3. Pay attention to conjunctions like “and” and “or.” A conjunction can be a red flag that your question contains multiple topics.
  4. Ask colleagues to review a working draft of the survey specifically for double-barreled questions (and other design problems). We call this step “cracking the code” because it can be a fun challenge for internal reviewers.
  5. Test the survey. Use cognitive interviews and/or pilot tests to uncover possible problems from the respondent’s perspective. See this AEA365 post for more information on cognitive interviewing.
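Tip 3 above lends itself to a quick automated first pass. The sketch below flags any draft question containing “and” or “or” as a whole word so that a person can take a closer look; the function name is hypothetical, and a flag is only a prompt for review, since plenty of single-topic questions contain a conjunction.

```python
import re

# Whole-word match for "and" / "or", ignoring case
CONJUNCTION = re.compile(r"\b(and|or)\b", re.IGNORECASE)


def flag_possible_double_barrel(questions):
    """Return the questions that contain a conjunction and deserve a manual review."""
    return [q for q in questions if CONJUNCTION.search(q)]


draft_questions = [
    "Was the training session held at a convenient time and location?",
    "Was the training session held at a convenient time?",
    "How would you rate the organization of the session?",
]

for question in flag_possible_double_barrel(draft_questions):
    print("Review:", question)
```

Note that “organization” in the third question is not flagged, because the pattern only matches “or” as a separate word.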

Rad Resource: Our go-to resource for tips on writing good questions is Internet, phone, mail, and mixed-mode surveys: The tailored design method by Dillman, Smyth & Christian.

Lessons Learned:

  1. Never assume. Even when we’re planning on using a previously tested instrument, we still set aside time to review it for potential design problems.
  2. Other evaluators can provide valuable knowledge about survey design. Double-barreled questions are just one of the many common errors in survey design. Other examples include leading questions and double negatives. We hope to see future AEA blogs that offer strategies to tackle these types of problems. Or please consider writing a comment to this post if you have ideas you’d like to share. Thank you!
