Lab 11: JSON and Baseball

Sports statistics is big business, and with the sports betting industry booming, more people are turning to automated methods for data analysis. This lab introduces you to retrieving data from an online API, and shows you how to pull data from an easy-to-use Major League Baseball (MLB) player search service. The best part of the lab is that there are no data files. Instead, you'll make LIVE queries to a web server database to get the information you need. The lab uses a few of our recently learned skills: JSON, web queries, and classes.

Goal

Let's start with the goal this week, and then the instructions will step you through it. We want a program that searches for a team, and then searches for players on that team based on their positions. Your program should look like this when run (the text after each prompt is user input):

Team? Orioles
Position? CF
Cedric Mullins (CF)	bats L	height 5'8''	weight 175
Position? C
Anthony Bemboom (C)	bats L	height 6'2''	weight 200
Robinson Chirinos (C)	bats R	height 6'1''	weight 220
Nick Ciuffo (C)		bats L	height 6'0''	weight 205
Adley Rutschman (C)	bats S	height 6'2''	weight 220
Pedro Severino (C)	bats R	height 6'1''	weight 235
Chance Sisco (C)	bats L	height 6'3''	weight 210
Austin Wynns (C)	bats R	height 6'0''	weight 190
Position? 2B
Jonathan Arauz (2B)	bats S	height 6'0''	weight 195
Gunnar Henderson (2B)	bats L	height 6'2''	weight 210
Jahmai Jones (2B)	bats R	height 6'0''	weight 210
Domingo Leyba (2B)	bats S	height 5'11''	weight 205
Richie Martin (2B)	bats R	height 6'0''	weight 190
Rougned Odor (2B)	bats L	height 5'11''	weight 200
Chris Owings (2B)	bats R	height 5'10''	weight 185
Rio Ruiz (2B)		bats L	height 6'2''	weight 220
Pat Valaika (2B)	bats R	height 6'0''	weight 200
Stevie Wilkerson (2B)	bats S	height 6'2''	weight 200
Position? 1B
Trey Mancini (1B)	bats R	height 6'3''	weight 230
Ryan Mountcastle (1B)	bats R	height 6'4''	weight 230
Position? SS
Freddy Galvis (SS)	bats S	height 5'10''	weight 190
Jorge Mateo (SS)	bats R	height 6'0''	weight 182
Position? quit  

Step 1: Write the get_team_ids function

Create a file part1.py

Team? Orioles

When the user enters their team, we have to first find the team's ID in the baseball database. We want to ask for all the players on this team, but we can't do that without knowing the team's ID first. Step one is thus to ask for ALL known team IDs from the Web database. You'll then save these in a dictionary.

Required: Write a function called get_team_ids():

  1. This function takes zero arguments.
  2. This function returns a dictionary of team names mapped to IDs.

The function's only job is to query the baseball website, process the JSON of teams that the website returns, and then build a dictionary that maps team names (strings) to IDs. The function returns this dictionary. Note that the sample output below shows each ID as a string, such as '110'.
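The dictionary-building step can be practiced on its own, without the network. The sketch below is only an illustration: the rows, the helper name build_lookup, and the field names label and code are all invented, and the real JSON uses different field names that you still need to discover.

```python
# Hypothetical rows standing in for parsed JSON; the real field names differ.
records = [
    {"label": "Sample Team A", "code": "901"},
    {"label": "Sample Team B", "code": "902"},
    {"label": "Sample Team C", "code": "903"},
]


def build_lookup(rows, key_field, value_field):
    """Map each row's key_field value to its value_field value."""
    lookup = {}
    for row in rows:
        lookup[row[key_field]] = row[value_field]
    return lookup


lookup = build_lookup(records, "label", "code")
print(lookup)
```

If two rows share the same key, the later row overwrites the earlier one, which is worth keeping in mind when you choose which name field to use.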

You'll need this code to make the website call and get the JSON:

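import requests

# Ask for every team in the 2022 season, sorted by name.
# Note that the sport_code value keeps its inner single quotes.
params = {'sport_code': "'mlb'", 'sort_order': 'name_asc', 'season': '2022'}
r = requests.get(url="http://lookup-service-prod.mlb.com/json/named.team_all_season.bam", params=params)
data = r.json()  # parse the JSON response body into Python objects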

That's all you need to get the data. Now climb through the data variable (a dictionary) and get those names and IDs! It's up to you to figure out the format that was returned. Use type(), .keys(), and print() liberally as you debug to see what it has! Pull out the debugger if you need it. Please use the mlb_org_id field for the team ID. There are lots of name fields, so find the right one to match our output :)
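One way to do that exploring is a small helper that reports the shape of whatever you hand it. This is a sketch, not part of the required solution: the helper name describe and the sample dictionary are made up, and the real response is shaped differently.

```python
def describe(value, depth=0, max_depth=6):
    """Return a list of lines summarizing the shape of nested JSON data."""
    indent = "  " * depth
    if isinstance(value, dict):
        lines = [f"{indent}dict with keys {list(value.keys())}"]
        if depth < max_depth:
            for key, inner in value.items():
                # Only descend into containers; plain values are already visible.
                if isinstance(inner, (dict, list)):
                    lines.append(f"{indent}  [{key!r}] ->")
                    lines.extend(describe(inner, depth + 2, max_depth))
        return lines
    if isinstance(value, list):
        lines = [f"{indent}list of {len(value)} items"]
        if value and depth < max_depth:
            # Look at the first item only; the rest usually share its shape.
            lines.extend(describe(value[0], depth + 1, max_depth))
        return lines
    return [f"{indent}{type(value).__name__}: {value!r}"]


# Hypothetical nested data, invented for this sketch.
sample = {
    "outer": {
        "count": "2",
        "items": [{"label": "A", "code": "1"}, {"label": "B", "code": "2"}],
    }
}

print("\n".join(describe(sample)))
```

Calling describe(data) on the real response should show you which keys lead to the list of teams.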

In this step you must (1) write your function, (2) call your function, and (3) print out the resulting dictionary. Your output for this Step should look like this:

$ python3 part1.py
{'AL Div. Winner #3': '11', 'AL Higher Seed': '11', 'AL Wild Card #1': '11', 'AL Wild Card #2': '11', 'AL Wild Card #3': '11', 'AL All-Stars': '', 'D-backs': '109', 'Braves': '144', 'Orioles': '110', 'Red Sox': '111', 'Cubs': '112', 'White Sox': '145', 'Reds': '113', 'CLE/TB': '11', 'Guardians': '114', 'Rockies': '115', 'Tigers': '116', 'HOU/NYY': '11', 'Astros': '117', 'Royals': '118', 'LAD/SD': '11', 'Lg Champ #1': '11', 'Lg Champ #2': '11', 'Angels': '108', 'Dodgers': '119', 'Marlins': '146', 'Brewers': '158', 'Twins': '142', 'NL All-Stars': '', 'Mets': '121', 'Yankees': '147', 'NL Central Champ': '11', 'NL Champ': '11', 'NL Div. Winner #1': '11', 'NL East Champ': '11', 'NL East Runner-Up': '11', 'NL Lower Seed': '11', 'NL Wild Card #2': '11', 'NL Wild Card #3': '11', 'No team': '', 'NYM/SD': '11', 'NYY/CLE': '11', 'Athletics': '133', 'OOC': '', 'Phillies': '143', 'Pirates': '134', 'Padres': '135', 'Giants': '137', 'Mariners': '136', 'Cardinals': '138', 'STL/PHI': '11', 'Rays': '139', 'Rangers': '140', "TBD - ALDS 'A'": '11', 'Blue Jays': '141', 'TOR/SEA': '11', 'Nationals': '120', 'WBC TBD': '11'}

Step 2: Write the get_players function

Copy your part1.py to part2.py so you have a working backup.

After Step 1, you have a dictionary of teams, so you can look up the user's team ID. You must now query for all players on that team, which takes one more remote call to the website. Your output here will be this:

Required output:

Team? Orioles
Fernando Abad (P)	bats L	height 6'2''	weight 235
Jesus Aguilar (DH)	bats R	height 6'3''	weight 277
Keegan Akin (P)		bats L	height 5'11''	weight 235
Logan Allen (P)		bats R	height 6'3''	weight 200
Shaun Anderson (P)	bats R	height 6'6''	weight 228
Jonathan Arauz (2B)	bats S	height 6'0''	weight 195
Shawn Armstrong (P)	bats R	height 6'2''	weight 225
Bryan Baker (P)		bats R	height 6'6''	weight 245
Rylan Bannon (3B)	bats R	height 5'8''	weight 180
Manny Barreda (P)	bats R	height 5'11''	weight 195
Mike Baumann (P)	bats R	height 6'4''	weight 235
Felix Bautista (P)	bats R	height 6'5''	weight 190
Anthony Bemboom (C)	bats L	height 6'2''	weight 200
Kyle Bradish (P)	bats R	height 6'4''	weight 220
Zack Burdi (P)		bats R	height 6'3''	weight 210
Yennier Cano (P)	bats R	height 6'4''	weight 185
Robinson Chirinos (C)	bats R	height 6'1''	weight 220    
...

Required: Define a Player class:

  1. Your class must be instantiated by the following statement:
    p = Player(name, position, bats, heightf, heightin, weight)
  2. Your class must have one function: pretty_print()

Required: Write a function called get_players(team_id):

  1. It has one argument: a team's int ID (e.g., 118)
  2. It returns a List of Player objects (players on that team)

Your function will query the MLB website for the given team's roster, process the JSON of all the players, and build a List of Player objects. Here is the query for a team's roster of players:

params = { 'start_season':2021, 'end_season':2022, 'team_id':team_id }
r = requests.get(url="http://lookup-service-prod.mlb.com/json/named.roster_team_alltime.bam", params=params)
data = r.json()

Look at the team_id variable in the params dictionary. It must hold the ID of the user's team, so change the name to match whatever variable your code uses.
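Connecting Step 1 to this query might look like the sketch below, which looks up the user's team in the dictionary and builds the params dictionary without touching the network. The helper name roster_params and the two-entry dictionary are assumptions for illustration; the two IDs are copied from the Step 1 sample output.

```python
def roster_params(team_ids, team_name):
    """Return roster query parameters for team_name, or None if it is unknown."""
    team_id = team_ids.get(team_name)
    if team_id is None:
        return None
    return {"start_season": 2021, "end_season": 2022, "team_id": team_id}


# A two-entry stand-in for the dictionary returned by get_team_ids().
team_ids = {"Orioles": "110", "Royals": "118"}

print(roster_params(team_ids, "Orioles"))
print(roster_params(team_ids, "Senators"))  # not in the dictionary, so None
```

Using .get() instead of square brackets avoids a KeyError when the user types a team name that is not in the dictionary.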

Your function must return a List of Player objects. Define a Player class, make one Player object for each player in the JSON, and put those objects in a list. Your class must be usable like so:

# Create one player with all attributes
p = Player(name, position, bats, heightf, heightin, weight)
# Print the player in a nice format
p.pretty_print()
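If the constructor-plus-method pattern is unfamiliar, here it is on a deliberately different class so the Player design stays yours. Everything in this sketch is hypothetical: the Team class, its three attributes, the format_line helper, and the sample values.

```python
class Team:
    """A hypothetical class, used only to show the constructor/method pattern."""

    def __init__(self, name, city, founded):
        # Store each constructor argument as an attribute on the object.
        self.name = name
        self.city = city
        self.founded = founded

    def format_line(self):
        """Build the display string; a tab separates the two columns."""
        return f"{self.name} ({self.city})\tfounded {self.founded}"

    def pretty_print(self):
        """Print the object in a readable one-line format."""
        print(self.format_line())


t = Team("Example Owls", "Sampleville", 1901)
t.pretty_print()
```

Splitting the string building from the printing is a design choice, not a requirement; it simply makes the formatting easier to check.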

Still confused? We want you to explore the JSON that is returned. You will make a Player object, giving it 6 different values ... so your job is to find those 6 values in the JSON for each player. Then make a Player object and append it to a list that you'll return after looping over the JSON.
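The loop-and-append pattern described above can be sketched on made-up data. The Team class, the rows, and the field names team_label and home are invented for illustration and do not match the real player JSON.

```python
class Team:
    """Hypothetical two-attribute class standing in for your own class."""

    def __init__(self, name, city):
        self.name = name
        self.city = city


# Hypothetical parsed JSON rows; the real roster rows use other field names.
rows = [
    {"team_label": "Example Owls", "home": "Sampleville"},
    {"team_label": "Demo Hawks", "home": "Testburg"},
]


def build_teams(rows):
    """Create one Team object per row and return them all in a list."""
    teams = []
    for row in rows:
        teams.append(Team(row["team_label"], row["home"]))
    return teams


teams = build_teams(rows)
for team in teams:
    print(team.name, team.city)
```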

Still confused? You might want to review the examples at the bottom of the lecture notes. Those examples loop over JSON values of baseball teams. You'll do something similar, but with different dictionary attributes for the players.

Step 3: Complete the program

Copy part2.py to part3.py to keep a working backup.

You're almost finished. You now have a function to get all the players for one team. This final step is to write the core program that interacts with the user to retrieve positions. The main loop that asks the user for their player positions is straightforward. Remember, they just enter a team name once at the start, and then you loop for player positions until they type "quit".
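The shape of that loop can be sketched as follows. This is an illustration under simplifying assumptions: players are plain (name, position) tuples rather than Player objects, the function name position_loop is made up, and the read and write parameters exist only so the sketch can run with scripted answers instead of a keyboard. The names and positions come from the sample run at the top of the lab.

```python
def position_loop(players, read=input, write=print):
    """Prompt for positions until the user types 'quit'."""
    while True:
        position = read("Position? ").strip()
        if position == "quit":
            break
        # Show every player whose position matches what the user typed.
        for name, pos in players:
            if pos == position:
                write(f"{name} ({pos})")


# (name, position) pairs taken from the sample run at the top of the lab.
players = [
    ("Cedric Mullins", "CF"),
    ("Trey Mancini", "1B"),
    ("Ryan Mountcastle", "1B"),
]

# Scripted answers replace the keyboard so the sketch runs unattended.
answers = iter(["1B", "CF", "quit"])
shown = []
position_loop(players, read=lambda prompt: next(answers), write=shown.append)
print(shown)
```

Calling position_loop(players) with the defaults reads from the keyboard and prints to the screen.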

You must use the functions you wrote above.

Required: your output should match EXACTLY the program output at the top of this lab.

(optional) Extra Credit: make it more useful

Create a separate extracredit.py program for this.

  1. Allow the user to switch to a different team without quitting the program. You can make up a new user command as needed.
  2. When you print the matching players, sort them by height. Search the Web for how to sort a list of custom objects like we have here.
  3. The API for the baseball database is here. Add some other neat search feature...up to you!
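For item 2, one common approach is sorted() with a key function. The sketch below uses a hypothetical Entry class in place of your own class; the three names and heights are copied from the sample run at the top of the lab.

```python
class Entry:
    """Hypothetical stand-in for an object with a height in feet and inches."""

    def __init__(self, name, feet, inches):
        self.name = name
        self.feet = feet
        self.inches = inches


entries = [
    Entry("Trey Mancini", 6, 3),
    Entry("Cedric Mullins", 5, 8),
    Entry("Jorge Mateo", 6, 0),
]

# Sort by a (feet, inches) tuple; tuples compare element by element.
by_height = sorted(entries, key=lambda e: (e.feet, e.inches))
print([e.name for e in by_height])
```

If your height values are stored as strings, convert them with int() inside the key function, because string comparison would place "10" before "8".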

Document what you did if you completed any of this extra credit. Include instructions on the proper user commands to execute your features. Points awarded will be based on difficulty and substance: #1 above is minimal, while #3 can be worth more.

What to turn in

Visit the submit website and upload your three programs.

submit -c=sd211 -p=lab11 part1.py part2.py part3.py
