SD 212 Spring 2025 / Labs


Lab 4: Info Challenge

1 Overview

We will be participating in the UMD Info Challenge, in groups of 3 or 4.

Each team will focus on a single dataset from a real data provider and spend the week understanding, cleaning, and analyzing that data, then building an effective presentation from it.

1.1 Learning goals

  • Practice the full data science pipeline: acquisition, storage, processing/cleaning, analysis, and visualization/communication
  • Work on a real data set towards the goals of industrial or government organizations
  • Work within a team

2 Schedule and structure

(Dates and deadlines in bold)

  • Tuesday Feb 25: Comp day (no class)

  • Wednesday Feb 26: choose dataset

  • Friday Feb 28: Comp day (no class)

  • Saturday Mar 1 at 10am in Hopper Hall: Kickoff day, at USNA

    Meet mentors, live web stream, learn about datasets, get started

  • Monday Mar 3: Work on IC during class

  • Tuesday Mar 4: Work on IC during lab

  • Wednesday Mar 5: Work on IC during class

  • Friday Mar 7: Comp day (no class)

  • Friday Mar 7 by 1800: Video presentations and peer evals due. Submit your video to this shared folder. Fill out a peer eval form for each teammate. Upload your project files and slide presentation (not the video) to GitHub. Submit your markdown file through the submit system (see below).

  • Mar 8-15: Spring Break

  • Tuesday Mar 18: Live presentations and recap during lab

3 Helpful documents

4 GitHub

You should definitely make a GitHub repo to do your work on the Info Challenge! Look back at your notes from our recent unit in class and the related homeworks if you need a reminder of how to do that.

You should have just one GitHub repo per team. Once a single team member creates their GitHub repo for the info challenge, they can invite their teammates to it (as well as their instructor).
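As a reminder of the general shape, here is a minimal sketch of how one teammate might set up the shared repo locally before pushing it to GitHub. The function name `init_team_repo`, the repo name `ic-team-repo`, and the folder layout (`code`, `data`, `presentation`) are illustrative assumptions, not requirements of the lab, and the GitHub URL in the comments is a placeholder. The sketch assumes git is installed and that your name and email are already configured.

```shell
# Sketch: create a local repo with a simple project layout and one commit.
# The function name and folder names are hypothetical; adapt them to your team.
init_team_repo() {
    repo_dir="$1"
    mkdir -p "$repo_dir/code" "$repo_dir/data" "$repo_dir/presentation" || return 1
    (
        cd "$repo_dir" || exit 1
        git init -q || exit 1
        printf '# Info Challenge\n\nTeam members: (fill in)\n' > README.md
        # Git does not track empty folders, so put a placeholder file in each one.
        touch code/.gitkeep data/.gitkeep presentation/.gitkeep
        git add README.md code data presentation || exit 1
        git commit -q -m "Initial project layout" || exit 1
        # Name the branch "main" regardless of the local git default.
        git branch -M main
    )
}

# Usage (the URL is a placeholder for the empty repo you create on GitHub):
#   init_team_repo ic-team-repo
#   cd ic-team-repo
#   git remote add origin https://github.com/YOUR-USERNAME/ic-team-repo.git
#   git push -u origin main
```

Teammates and your instructor can then be invited as collaborators from the repo's settings page on GitHub, and each teammate clones the same repo rather than creating a new one.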

5 Grading

Your work will be judged by the IC judges for prize consideration. It will also count as a lab grade for SD212, independently of the IC contest judging.

Your SD212 grade breaks down as follows:

  • 80%: Info Challenge judging rubric, as scored by your instructor based on:

    1. Your answers to the questions in the markdown file (below)
    2. Your code uploaded to GitHub
    3. Your video presentation
    4. Your presentation during lab time
  • 20%: Individual teamwork score based on teamwork rubric completed by all group members.

Your grade may be reduced by up to 25% for failure to follow instructions and meet required deadlines.

6 Questions

Please answer these questions and have exactly one team member submit them before the SD212 submission deadline.

  1. Who are your team’s members?

  2. Enter the URL of the GitHub repository that contains your code and presentation materials.

  3. Briefly describe the file organization in your GitHub repository. Where is your presentation? Where is your code and what does it do?

  4. Say a few words about how your team worked together. Who took on the role as “team manager”? How did you organize and share your work?

  5. For the Info Challenge project, what outside data source(s) did you incorporate?

  6. What did you have to do to clean and process the data?

    (Include the provided datasets as well as any outside data that you found in your discussion. Just a few sentences giving the overall idea is fine.)

  7. What did you do to analyze the data?

    (Again, just a few sentences with an overall description is good.)

  8. How did you create visualizations of your analysis?

  9. What concrete recommendations or conclusions did you make?

  10. What tips and suggestions do you have for next year’s Info Challenge participants?

6.1 Markdown file to fill in

Here is the file with questions to fill in and submit for today’s lab: lab04.md

You can run this wget command to download the blank md file directly from the command line:

wget "https://www.usna.edu/Users/cs/nchamber/courses/sd212/lab/md/lab04.md"

6.2 Submit your work

Submit the markdown file (with the GitHub link to all of your work):

submit -c=sd212 -p=lab04 lab04.md

or

club -csd212 -plab04 lab04.md

or use the web interface
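
Because the markdown file has to carry the GitHub link, a quick check before submitting can catch a forgotten URL. The following is a minimal sketch, not part of the submit system: the helper name `has_github_url` is hypothetical, and it only tests that the file contains something shaped like `https://github.com/OWNER/REPO`, not that the repo exists or is shared with your instructor.

```shell
# Sketch: succeed only if the file exists and contains a GitHub repo URL.
# has_github_url is a hypothetical helper, not a course-provided command.
has_github_url() {
    file="$1"
    if [ ! -f "$file" ]; then
        echo "missing file: $file" >&2
        return 1
    fi
    # Look for https://github.com/<owner>/<repo> anywhere in the file.
    if grep -Eq 'https://github\.com/[A-Za-z0-9._-]+/[A-Za-z0-9._-]+' "$file"; then
        return 0
    fi
    echo "no GitHub URL found in $file" >&2
    return 1
}

# Usage: only run the submit command if the check passes.
#   has_github_url lab04.md && submit -c=sd212 -p=lab04 lab04.md
```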