Welcome to SURE 2025


SURE 2025

Department of Statistics & Data Science
Carnegie Mellon University

Introduction

Plan for today

9:15: Opening remarks + program overview

10:15: Academic advisor presentations (Glenn)

11:30: Lunch + meet the department staff (Baker Hall 129)

1:30: Lab

  • Health (Scaife 234)

  • Sports (Scaife 236) + 3:00: Guest speaker

Who are we?

  • Department of Statistics & Data Science, Carnegie Mellon University

  • Lead instructor: Quang Nguyen (preferred form of address: Quang)

    • Office hours: see calendar
  • Teaching assistants

    • Health: Princess Allotey, Julian Braganza, Hao Lee, James Leiner

    • Sports: Yuchen Chen, Sara Colando, Erin Franke, Leigh Preimesberger

About SURE

Summer Undergraduate Research Experience

Explore cutting-edge statistics and data science methodology with applications in

  • Healthcare: UnitedHealth Group Bridges to Healthcare Technology

  • Sports: Carnegie Mellon Sports Analytics Camp (CMSACamp)

“The best thing about being a statistician…is that you get to play in everyone’s backyard.” — John W. Tukey

Goals

  • Develop fundamental research skills: data wrangling, visualization, modeling, communication

  • Become familiar with R, tidyverse, Quarto (Markdown syntax), Git/GitHub

  • Become familiar with cutting-edge statistical machine learning techniques

  • Create a portfolio of projects and practice reproducible research

  • Network with academic researchers and industry professionals

    • Get help navigating your next steps: industry vs. graduate school

Resources

Check these frequently!

Schedule
(subject to change)

A typical day consists of 3 main events

  • Lectures

  • Speaker/webinar sessions

  • Labs

Lectures

Mon–Fri, 9:15–10:45am, Scaife 234

  • First ~2 weeks: EDA, basic data science tasks
  • Next ~4 weeks: statistical modeling, machine learning
  • After that: special topics, guest lectures

A few scheduling notes:

  • Holidays: Juneteenth (Thu June 19) and Independence Day break (July 2–4 + weekend)
  • Wed June 18: UHG CMU in-person event (Health only, see calendar; Sports gets the day off)
  • Later on: Sports-specific days/events (Health gets time off)

Labs

Mon–Fri, 1:30–3pm

Scaife 234 (Health) or Scaife 236 (Sports)

  • Demo labs

  • Project labs

    • will begin with a mini EDA project

    • then shift to focus on main capstone project

Speaker/webinar sessions

Either mid-day (in between lecture and lab) or after lab

Scaife 234 / MS Teams (Health) or Scaife 236 / Zoom (Sports)

Note: dates/times may vary; check calendar

  • Health: UHG webinars, individual meetings with mentors

  • Sports: project pitches, guest speakers

Side note: Lunches

Dates/times may vary; check calendar

  • Location: Baker Hall 129

  • Don’t hesitate to take more food with you!

EDA project

  • Practice understanding the structure of a dataset, performing basic EDA tasks (e.g., data wrangling, data visualization) in R, and using GitHub for collaboration

  • Work in groups of 2–3

  • Timeline

    • Release date: Thursday, June 5

    • 6-minute presentation (no notes/scripts) on Tuesday, June 17 during lab

Capstone project

  • Analyze a dataset in health or sports analytics to answer a research question that is important to people in your respective field

  • Work in groups of 2–3

  • Presentation checkpoint(s) (no notes/scripts)

  • Deliverables (more details will be provided later on)

    • Report
    • Poster
    • Presentation

First week

Wednesday–Friday

  • Quang out of town

  • Lectures and labs as usual

    • Guest lectures by TA experts

    • TAs will run labs as usual

  • Thursday: EDA project released during lab

Reminders

  • Fill out the survey forms (Communication and Data Science Background)

  • Reset CMU wifi password (for non-CMU students)

  • Check Calendar, Slack, email often

Tips

  • This is a research program. Feel free to go above and beyond and explore things that aren’t taught in lectures
  • Focus on principles; tools are incidental (i.e., understanding how things work matters more than just knowing how to do them)
  • Don’t expect to grasp every concept in one go; understanding comes with repetition (do more and read more)

Expectations

  • In-person attendance

    • Be on time. PLEASE.

    • This applies to lectures, labs, other sessions (e.g., webinars, guest speakers, other activities)

    • This is part of the Code of Conduct

  • Participate and ask questions

  • Work together. Help and support each other.

  • Enjoy, learn and grow

What are office hours?