I’m Jennifer Dolatshahi, an evaluator at the New York City Department of Health and Mental Hygiene. In addition to my evaluation activities, I also teach internal workshops on beginner and intermediate R.
No matter how many other programs cross my desk, R remains one of my favorite tools for data manipulation, analysis, visualization, reporting, etc. I’ve been using R for 5 years and teaching it for 2, and I am still learning new ways to use it. Getting started can feel daunting (remembering how I coded early on can make me cringe!), but I’m here to share some great packages to help you dive right into data manipulation and basic analyses.
For data manipulation, I turn to the tidyverse, a group of packages developed by Hadley Wickham and others at RStudio. These packages, including dplyr and tidyr, use simple, intuitive commands and rely on the concept of “tidy” (read: well-organized & normalized) data. The stringr package deals with those messy character variables we all love to hate, and lubridate is great for working with date/time variables. While the tidyverse has some limitations, it also provides a shared grammar and structure across multiple packages that makes it easy to find a solution to your data cleaning or data viz needs. It also has some options for basic analyses, like two-by-two tables, with dplyr::summarise() and tidyr::spread() (or its newer replacement, tidyr::pivot_wider()). SQL users will see familiar commands in the form of the *_join() functions and case_when(). And don’t forget that pipes (%>%) are your friends!
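Here is a minimal sketch of how those pieces can fit together: a join, case_when(), and a two-by-two table built with summarise() and spread(), all chained with pipes. The data frames, column names, and age cutoff are made up for illustration, and the sketch assumes dplyr (1.0.0 or later) and tidyr are installed.

```r
library(dplyr)
library(tidyr)

# Hypothetical data, invented for this example
clients <- tibble(
  id   = 1:6,
  age  = c(17, 34, 52, 68, 25, 71),
  site = c("A", "B", "A", "B", "A", "B")
)
sites <- tibble(
  site    = c("A", "B"),
  borough = c("Brooklyn", "Queens")
)

two_by_two <- clients %>%
  # SQL-style join: add the borough for each client's site
  left_join(sites, by = "site") %>%
  # case_when() works like SQL's CASE WHEN
  mutate(age_group = case_when(
    age < 65 ~ "under 65",
    TRUE     ~ "65+"
  )) %>%
  # Count clients in each borough-by-age-group cell
  group_by(borough, age_group) %>%
  summarise(n = n(), .groups = "drop") %>%
  # Spread the age groups into columns; empty cells become 0
  spread(key = age_group, value = n, fill = 0)

print(two_by_two)
```

In recent versions of tidyr, pivot_wider(names_from = age_group, values_from = n, values_fill = 0) should give the same layout as the spread() call.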
There are also some great packages outside of the tidyverse for basic analyses.
- psych, stats, and summarytools all provide options for a range of descriptive statistics and more advanced analyses.
- stargazer provides simple descriptive statistics on numeric variables and, when called on a regression object, has myriad options for presenting analysis results.
- swirl is an interactive package that helps you learn R. If you don’t come from a programming background, this package helps explain the structure and idiosyncrasies of the language.
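As a rough sketch of the kind of basic analysis these packages support, the example below uses the stats package that ships with R for descriptive statistics and a simple regression, then hands the results to stargazer if it happens to be installed. The choice of the built-in mtcars data and of the mpg and wt variables is mine, purely for illustration.

```r
# mtcars ships with R, so no data needs to be invented here
data(mtcars)

# Descriptive statistics with base R and the stats package
summary(mtcars$mpg)
desc <- c(
  mean   = mean(mtcars$mpg),
  sd     = sd(mtcars$mpg),
  median = median(mtcars$mpg)
)
print(desc)

# A simple linear regression: fuel economy as a function of weight
fit <- lm(mpg ~ wt, data = mtcars)
summary(fit)

# Only runs if stargazer is installed
if (requireNamespace("stargazer", quietly = TRUE)) {
  # Called on a data frame: descriptive statistics for numeric variables
  stargazer::stargazer(mtcars[, c("mpg", "wt")], type = "text")
  # Called on a regression object: a formatted results table
  stargazer::stargazer(fit, type = "text")
}
```

Setting type = "text" prints the tables to the console; stargazer can also produce LaTeX or HTML output for reports.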
For some examples of these packages in action, see my GitHub.
Rad Resources: You rarely if ever need to pay anything to learn R! The tidyverse has great resources, including R for Data Science, a free online text complete with examples and exercises, and the R Graphics Cookbook for your data viz needs. RStudio has cheat sheets on a variety of topics, and also posts webinars and videos from its annual conference.
Hot Tip: R comes with an amazing and robust online (and sometimes in-person!) community to help you along the way. Google what you need to do and I promise someone has written a blog or Stack Overflow post with a solution. Find R meetups in your area, like R-Ladies. And find out if your organization has a community of R users and get them talking, even if just on a listserv or Slack channel.
I hope this helps get you started on your R journey, and definitely share those cool packages you discover along the way.
Do you have questions, concerns, kudos, or content to extend this aea365 contribution? Please add them in the comments section for this post on the aea365 webpage so that we may enrich our community of practice. Would you like to submit an aea365 Tip? Please send a note of interest to aea365@eval.org. aea365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators.