Welcome to aea365! Please take a moment to review our new community guidelines. Learn More.

Tech TIG Week: Supporting Data Analytics with Google BigQuery by Wai Lam Wong

Hi everyone, my name is Wai Lam Wong and I’m an information systems analyst at a community health organization in California. My academic background is in computational social sciences and I’m always looking for ways to leverage technology for social impact.

This post is about using Google BigQuery to enable a small nonprofit’s project team to collaboratively analyze over a dozen data sources together. If you’re reaching an unhealthy level of frustration with too many spreadsheets, then I encourage you to read on! Please note the intended audience of this post includes folks who have some familiarity with data analysis and who are looking to shift towards a centralized data warehouse. If you’re unsure if this topic is for you, consider checking out Elizabeth DiLuzio’s posts first on “Spreadsheet vs Database: Which One is Right for Me?“ and “Using Dashboarding Software When a Database Isn’t an Option“. 

Lessons Learned

In a recent project, I needed to find a way for five people to analyze multiple data sources together. I knew better than to have everyone create their own copies, since this would spiral into a mess where one person’s “version 3” would inexplicably differ from another’s. Reproducibility would also quickly be relegated to the realm of wishlists. What we needed was a central repository which flexibly matched our team members’ skillsets and the project’s financial expectations (read: preferably free). Google BigQuery may be worth looking into if the considerations below also speak to your situation:

  • Analytical flexibility: BigQuery integrated easily with each person’s tool(s) of choice. In our case, this meant Google Sheets, Tableau, SQL, dbt, and Python. Each person could link and merge tables from across the data warehouse as needed.
  • Centralized storage: BigQuery acted as our project’s source of truth. Since team members pulled data through live connections to BigQuery, we just needed to refresh our connections to get the latest version of the data.
  • Metadata management: Every BigQuery table is expected to have a description and schema (the schema defines what each column represents and what data type it holds). Since we were pulling in data from government portals, APIs, and internal sources, it was important for us to clearly document what each table contained. Standardized documentation ensured that the team was analyzing data appropriately.
  • Free tier: the project team’s final cost for using BigQuery came out to $0 as the total size of our datasets stayed within the free 10 GiB of storage per month and our analytical needs stayed within the free 1 TB of query processing per month.
  • Scalability: While spreadsheet tools such as Microsoft Excel can run into size limitations (~1m rows, ~16k columns), BigQuery is designed to handle big data with ease.

It’s helpful to remember that BigQuery is an enterprise data warehouse and many of its capabilities may not be relevant for smaller organizations. Nevertheless, by keeping in touch with developments in the analytics and data engineering domains, we can often find simple ways to be effective with powerful tools. To get started, check out the quickstart guides such as “Query a public dataset with the Google Cloud console”.Happy exploring!

The American Evaluation Association is hostingIntegrating Technology into Evaluation TIG Week with our colleagues in the Integrating Technology into Evaluation Topical Interest Group. The contributions all this week to AEA365 come from ITE TIG members. Do you have questions, concerns, kudos, or content to extend this AEA365 contribution? Please add them in the comments section for this post on the AEA365 webpage so that we may enrich our community of practice. Would you like to submit an AEA365 Tip? Please send a note of interest to AEA365@eval.org. AEA365 is sponsored by the American Evaluation Association and provides a Tip-a-Day by and for evaluators. The views and opinions expressed on the AEA365 blog are solely those of the original authors and other contributors. These views and opinions do not necessarily represent those of the American Evaluation Association, and/or any/all contributors to this site.

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.