Lab 10: Tabular Data with pandas and Streamlit

Today we’ll redo some past exercises using pandas and Streamlit. Here are some useful reference pages:

We’ll start with a Streamlit warmup.

Temperature Converter

To begin, create a new folder called lab10 and create a new file in that folder called temp_convert.py, starting from template.py as usual.

Add this code chunk just after the docstring (you don’t need to understand how it works). It lets you Run the file from within Thonny instead of using a System Shell.

# Shim to make "Run" in Thonny behave like "streamlit run this_file.py" on Terminal
import sys, streamlit.web.cli, streamlit.runtime.runtime
if '__file__' not in globals(): raise Exception("Save the file first!")
if not streamlit.runtime.runtime.Runtime.exists():
    sys.argv[:] = ['streamlit', 'run', __file__]; sys.exit(streamlit.web.cli.main())
# End shim.

  1. Make a Streamlit app that has a st.number_input() widget for a temperature in degrees C and st.write()s the corresponding temperature in degrees F.

    • The label for the number input should describe the purpose (e.g., “Temperature in degrees C”)
    • The output should describe what it is, a temperature in degrees F.
    • Optional (come back to this later): try a st.metric widget instead of st.write, and try laying out the widgets with st.columns.
  2. Add a st.radio widget above the number input to allow the user to choose whether to convert C to F or F to C. Modify the label for the number_input, the conversion logic, and the output message accordingly. For the list of options, you can use ["F to C", "C to F"].

The formulas are:

    • F = C × 9/5 + 32
    • C = (F − 32) × 5/9
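
As a minimal sketch of the conversion logic only, the two directions could be wrapped in small helper functions. The names c_to_f, f_to_c, and convert are illustrative assumptions, not part of the lab template; in the app, the direction string would come from st.radio and the temperature from st.number_input.

```python
def c_to_f(temp_c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9 / 5 + 32


def f_to_c(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


def convert(temp, direction):
    """Convert temp; direction is "C to F" or "F to C" (the radio options)."""
    if direction == "C to F":
        return c_to_f(temp)
    return f_to_c(temp)
```

Keeping the arithmetic in plain functions like these makes it easy to check a few known values before wiring up any widgets.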

Test the following:

Employees

In Lab 8, we read employee data from a file and wrote code to compute the average salary by rank. pandas makes this very easy, as we’ll see.

  1. In a file named employee_summary.py, import pandas and streamlit using their canonical names (pd, st).

  2. Use pd.read_csv to read the employees.txt file from Lab 8.

    • Since the CSV doesn’t include a header row, you’ll need to pass names=["given_name", "family_name", "rank", "salary"] as a second argument to read_csv to tell pandas what the column names are.
    • Assign the result to a variable called employees.
    • Use st.dataframe(employees) to show the employees data in the Streamlit app; check that you have 100 rows and the correct column names.
  3. Sort the employees data frame by the salary column: employees.sort_values(by='salary', inplace=True). The inplace argument tells pandas to mutate the data frame instead of returning a sorted copy. Do this before the st.dataframe call so you can see the result. (What changes if you also pass ascending=False?)

  4. Compute the average salary by rank, using a single line of code involving groupby() and either mean() or agg(). See the class slides or the Tutorial section on grouping. Use st.dataframe() to report the results. Use sort_values() (no arguments needed this time) to sort by salary. (Can you make it so that the rank with the highest average salary comes first?)
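
The pandas calls in steps 2 through 4 can be tried on a tiny in-memory table before pointing them at the real file. This is only a sketch: the five rows below are hypothetical stand-ins (not the contents of employees.txt), and the Streamlit display calls are left out so the snippet runs as plain Python.

```python
import io

import pandas as pd

# Hypothetical stand-in for employees.txt: no header row, four columns.
sample_csv = io.StringIO(
    "Ana,Example,Professor,120000\n"
    "Ben,Sample,Associate,95000\n"
    "Cy,Placeholder,Professor,130000\n"
    "Di,Demo,Assistant,80000\n"
    "Eli,Mock,Associate,105000\n"
)

# names= supplies the column names because the data has no header row.
employees = pd.read_csv(
    sample_csv, names=["given_name", "family_name", "rank", "salary"]
)

# Sort the rows in place, lowest salary first.
employees.sort_values(by="salary", inplace=True)

# Average salary per rank: a Series indexed by rank, sorted by its values.
salary_by_rank = employees.groupby("rank")["salary"].mean().sort_values()
print(salary_by_rank)
```

Because the grouped result holds only the salary averages, sort_values() needs no by= argument here.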

Optional Extensions:

  1. Replace the file input with a st.file_uploader widget. When the page first runs, the file_uploader will return None, so you’ll need to check if employee_file is not None: or the like before running the rest of the employee-handling code. (Or: if employee_file is None: st.stop() to stop execution early.)

  2. Report the highest and lowest paid employees. This can be done simply by passing employees.iloc[0] (or employees.iloc[-1]) to st.dataframe after sorting by salary.

  3. Make the display more user-friendly somehow. For example, you might add headers and subheaders, clean up debugging output, use st.write() to show the data for the highest and lowest paid, or change the column names on the salary-by-rank table.
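
For the highest and lowest paid extension, it may help to see why iloc (position-based) is the right tool after sorting. The sketch below uses a hypothetical three-row table; the variable names by_salary, lowest_paid, and highest_paid are illustrative assumptions.

```python
import pandas as pd

# Hypothetical rows, for illustration only.
employees = pd.DataFrame({
    "given_name": ["Ana", "Ben", "Cy"],
    "family_name": ["Example", "Sample", "Placeholder"],
    "rank": ["Associate", "Assistant", "Professor"],
    "salary": [95000, 80000, 120000],
})

# Sorting reorders the rows but keeps their original index labels.
by_salary = employees.sort_values(by="salary")

# iloc selects by position, so 0 is the first row and -1 is the last row
# of the sorted data frame, whatever their index labels are.
lowest_paid = by_salary.iloc[0]
highest_paid = by_salary.iloc[-1]

print(lowest_paid)
print(highest_paid)
```

Each of these results is a single row returned as a Series. Note that by_salary.loc[0] would instead look up the row labeled 0, which is the first row of the unsorted data.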

Climate Plot

In Lab 9, we plotted some global climate data by reading in the CSV and manually converting the data types. Now let’s do it with pandas.

  1. Create a climate_plot.py from the template, with streamlit and pandas imports as usual.

  2. Read the 1880-2024.csv file that you downloaded for that lab. This time the file does have a header row (so we don’t need to pass names=), but it has extra rows at the top that we need to skip. So use pd.read_csv(filename, skiprows=N), with the appropriate filename and with N set to the number of extra rows at the top of the file. Make sure not to skip the header row itself.

  3. Make a quick and dirty plot using st.line_chart. See if you can figure out how to tell it what to put on the x and y axes.
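
To see what skiprows does, here is a small sketch on made-up text. The two leading lines, the column names Year and Anomaly, and the numbers are all hypothetical placeholders (not real measurements and not necessarily the layout of the downloaded file), so the right value of N for the real file still has to be found by opening it.

```python
import io

import pandas as pd

# Hypothetical file contents: two extra lines, then a header row, then data.
raw_text = (
    "Example title line\n"
    "Example notes line\n"
    "Year,Anomaly\n"
    "1880,-0.10\n"
    "1881,-0.05\n"
    "1882,-0.08\n"
)

# skiprows=2 drops the two extra lines, so the header row is read as names.
climate = pd.read_csv(io.StringIO(raw_text), skiprows=2)

print(climate.columns.tolist())
print(climate.dtypes)
```

pandas infers numeric types for both columns here, which replaces the manual type conversion done in the earlier lab.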

Optional extension: Try using plotly to make a more refined chart. (Here’s the documentation for plotly)

Optional extension: use st.pyplot() to display the same figure that you made for that lab. You’ll need to:

Optional

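Streamlit Session State

import streamlit as st

# key= stores each widget's current value in st.session_state under that name.
temp = st.number_input("Temperature", key="temp")

def on_change_unit():
    # The callback runs before the script reruns, so the globals temp and unit
    # still hold the previous run's values, while st.session_state.unit
    # already holds the newly selected unit.
    print("Old unit:", unit, "New unit:", st.session_state.unit)
    if st.session_state.unit == "F":
        # Changing from C to F
        st.session_state.temp = temp * 9/5 + 32
    else:
        # Changing from F to C
        st.session_state.temp = (temp - 32) * 5/9

unit = st.radio("Unit", ["F", "C"], on_change=on_change_unit, key="unit")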

Notice that: