Today we look at a small dataset of billion-dollar weather disasters in the US. We will step through the data processing and analysis steps together as a class, writing some Python code together, some alone, and perhaps some in small groups.
Dataset: the dataset comes from NOAA's National Centers for Environmental Information, which compiles a list of weather disasters since 1980 that each caused an estimated $1 billion or more in damage. You can download the CSV file here. The dataset begins with these lines:
    Weather and Climate Billion-Dollar Disasters to affect the U.S. from 1980-2022 (CPI-Adjusted)
    "Name","Disaster","Begin Date","End Date","Total CPI-Adjusted Cost (Millions of Dollars)","Deaths"
    "Southern Severe Storms and Flooding (April 1980)","Flooding",19800410,19800417,2551.4,7
    "Hurricane Allen (August 1980)","Tropical Cyclone",19800807,19800811,2071.0,13
    ...
    ...
The first line is just title information and the second line contains the column names. The actual weather events, which are what we care about, do not begin until the third line.
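Because of that extra title line, we can't hand the file straight to a CSV reader; we have to skip one line first, and then the column names line up correctly. Here is one possible sketch using Python's standard `csv` module. It parses the two sample rows shown above from a string; in class we would open the downloaded file instead (any filename would be our choice, not something fixed by the dataset).

```python
import csv
import io

# The first few lines of the dataset, copied from the excerpt above.
raw = """Weather and Climate Billion-Dollar Disasters to affect the U.S. from 1980-2022 (CPI-Adjusted)
"Name","Disaster","Begin Date","End Date","Total CPI-Adjusted Cost (Millions of Dollars)","Deaths"
"Southern Severe Storms and Flooding (April 1980)","Flooding",19800410,19800417,2551.4,7
"Hurricane Allen (August 1980)","Tropical Cyclone",19800807,19800811,2071.0,13
"""

f = io.StringIO(raw)        # stand-in for open("disasters.csv") in class
next(f)                     # skip the title line
reader = csv.DictReader(f)  # line 2 now becomes the column names
events = list(reader)

print(events[0]["Name"])    # the first real weather event
```

Note that `csv` hands every field back as a string, even the dates, costs, and death counts; we will have to convert those ourselves before doing arithmetic on them.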
In class we will compare two different approaches to processing this file.
Our goal will be to sort the events by cost (or deaths), compute various statistics, and perhaps build a search interface.
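Once the rows are parsed, sorting and summarizing might look like the sketch below. It converts the cost and death fields from strings to numbers first; the sample rows are the two events from the excerpt above.

```python
import csv
import io

raw = """Weather and Climate Billion-Dollar Disasters to affect the U.S. from 1980-2022 (CPI-Adjusted)
"Name","Disaster","Begin Date","End Date","Total CPI-Adjusted Cost (Millions of Dollars)","Deaths"
"Southern Severe Storms and Flooding (April 1980)","Flooding",19800410,19800417,2551.4,7
"Hurricane Allen (August 1980)","Tropical Cyclone",19800807,19800811,2071.0,13
"""

f = io.StringIO(raw)
next(f)                                # skip the title line
events = list(csv.DictReader(f))

COST = "Total CPI-Adjusted Cost (Millions of Dollars)"

# csv gives back strings, so convert the numeric columns first
for e in events:
    e[COST] = float(e[COST])
    e["Deaths"] = int(e["Deaths"])

# most expensive first; swap in e["Deaths"] to sort by deaths instead
by_cost = sorted(events, key=lambda e: e[COST], reverse=True)

total_cost = sum(e[COST] for e in events)
total_deaths = sum(e["Deaths"] for e in events)

print("costliest:", by_cost[0]["Name"])
print(f"total cost: ${total_cost:.1f}M, total deaths: {total_deaths}")
```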
In class we will slowly build up this program to parse the weather disaster CSV file.
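As a taste of the search interface mentioned above, one minimal approach is a case-insensitive substring match against the event names. The function name and the tiny in-line event list here are illustrative only; in the real program we would pass in the rows parsed from the CSV file.

```python
# Two sample events from the dataset excerpt, pre-parsed into dicts.
events = [
    {"Name": "Southern Severe Storms and Flooding (April 1980)", "Disaster": "Flooding"},
    {"Name": "Hurricane Allen (August 1980)", "Disaster": "Tropical Cyclone"},
]

def search(events, term):
    """Return the events whose name contains term, ignoring case."""
    term = term.lower()
    return [e for e in events if term in e["Name"].lower()]

for e in search(events, "hurricane"):
    print(e["Name"])
```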