The purpose of today's lab is to review the use of classes, and to practice working with modules and files.
Our goal today will be to process a data file of university employee salary data. Each line of the file describes an employee. We will begin by modeling an employee using a class, and then write a driver program to use this model to compute some statistics about the university employees based on salary and rank.
As usual, begin by creating a folder called lab09. Create a class in this folder to model an employee as follows:
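- Create a file called employee.py.
- In this file, define a class called Employee.
- Add an __init__ method with no parameters except for self. In the body of the method initialize the following instance variables to values of your choice: self._first, self._last, self._rank, and self._salary.
- Add a __str__ method so that an employee prints in the format Last name, first initial: rank ($salary). For example: Jones, B.: associate ($23000.34)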
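As a rough illustration of the steps above, a first version of the class might look like the sketch below. The default values are arbitrary placeholders, and formatting the salary with two decimal places is an assumption based on the sample output rather than a stated requirement.

```python
class Employee:
    """A minimal model of a university employee."""

    def __init__(self):
        # Placeholder defaults; any values of your choice would do.
        self._first = 'Bob'
        self._last = 'Jones'
        self._rank = 'associate'
        self._salary = 23000.34

    def __str__(self):
        # Format: Last name, first initial: rank ($salary)
        initial = self._first[0]
        return f'{self._last}, {initial}.: {self._rank} (${self._salary:.2f})'


if __name__ == '__main__':
    # Runs only when the file is executed as a script, not when imported.
    emp = Employee()
    print(emp)  # Jones, B.: associate ($23000.34)
```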
With our model of an employee created, we'd like to test our implementation. Do this by adding a test section to the bottom of your file that checks if the file is being run as a script. Remember, you achieve this behaviour using the following check:
if __name__ == '__main__':
and then putting your tests in the body of the if statement. Add
code to create a default employee and verify that the employee can be
printed as required.
We now have a model of an employee, but we'd like to add some additional functionality. In particular, as we consider our problem specification, we note that we will need access to the rank and salary information of each employee.
Add accessors for the rank and salary of an employee, as well as appropriate tests in the testing section of your file.
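One possible shape for the accessors and their tests is sketched here. The name get_rank appears later in this lab; get_salary is an assumed name chosen to match it, and the default values are placeholders.

```python
class Employee:
    """Employee model, trimmed to the parts the accessors need."""

    def __init__(self):
        # Placeholder defaults (an assumption for this sketch).
        self._first = 'Bob'
        self._last = 'Jones'
        self._rank = 'associate'
        self._salary = 23000.34

    def get_rank(self):
        """Return the employee's rank."""
        return self._rank

    def get_salary(self):
        """Return the employee's salary (assumed accessor name)."""
        return self._salary


if __name__ == '__main__':
    emp = Employee()
    # Simple checks against the known default values.
    assert emp.get_rank() == 'associate'
    assert emp.get_salary() == 23000.34
    print('accessor tests passed')
```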
The last piece of functionality we will need is creating an employee from information that is originally contained in a file. There are two possible approaches:
- emp1 = Employee('Sally', 'Smith', 'full', 90352.12)
- emp1 = Employee('Sally Smith full 90352.12')
Though there are arguments that could be made for either approach, we will choose here to modify the class to deal with the details related to creating an Employee instance from a single string (i.e., the latter approach). Note that more advanced Python code often uses the pickle module to read and write objects from and to files; we do not introduce that module here.
Update the __init__ method to receive an optional line parameter in addition to self. Do this as follows:
- Add a parameter called line and give it the default value '' (i.e., the empty string).
- If line is not the empty string (it will have the form "first last rank salary"):
  - Split line into a list called strings.
  - Use each element of strings to initialize an instance variable of the employee. Make sure you store the salary as an int or float, not as a string.
  - If line cannot be processed this way, print an error message to stderr and then crash the program. Check your code from the previous lab if you do not remember how to output to stderr.

With a functional Employee class ready to go, we can turn to the analysis of the employee data file.
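For reference, the updated __init__ described in the steps above might be sketched as follows. Two details are assumptions: a line is treated as invalid when it does not split into exactly four pieces, and "crash" is implemented with sys.exit(1). Your previous lab may have used a different approach.

```python
import sys


class Employee:
    """Employee built from defaults or from a line of text."""

    def __init__(self, line=''):
        if line == '':
            # No data supplied: fall back to placeholder defaults.
            self._first = 'Bob'
            self._last = 'Jones'
            self._rank = 'associate'
            self._salary = 23000.34
            return

        strings = line.split()  # expected: first, last, rank, salary
        if len(strings) != 4:
            # Report the problem on stderr, then stop the program.
            print(f'Error: cannot parse employee line: {line!r}',
                  file=sys.stderr)
            sys.exit(1)

        self._first = strings[0]
        self._last = strings[1]
        self._rank = strings[2]
        self._salary = float(strings[3])  # store as a number, not a string


if __name__ == '__main__':
    emp = Employee('Sally Smith full 90352.12')
    assert emp._rank == 'full'
    assert emp._salary == 90352.12
```

Note that a non-numeric salary would also stop the program, because float raises a ValueError when it cannot convert its argument.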
Download the employee data file from here: code/employees.txt. Save this file in your lab09 folder. Make sure you call the file employees.txt.
Analyzing the data file is not something that should be the responsibility of a single employee. Instead, this task belongs in a separate space from the definition of the employee, that is, in a different file.
Create a new Python file in your folder called driver.py. Implement the following algorithm:
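1. Use import to gain access to the Employee class.
2. Create an empty list called employees.
3. Use a with statement to open the data file and automatically close it when we are finished processing. Inside the with statement, create an Employee from each line of the file and add it to employees.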
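The loading steps above could be sketched like this. So that the example runs on its own, it defines a small stand-in Employee class; in driver.py you would instead write from employee import Employee. The helper name load_employees and the skipping of blank lines are assumptions made for illustration.

```python
# In driver.py this class would come from: from employee import Employee
class Employee:
    """Stand-in so the sketch runs by itself."""

    def __init__(self, line):
        self._first, self._last, self._rank, salary = line.split()
        self._salary = float(salary)


def load_employees(filename):
    """Return a list with one Employee per non-blank line of the file."""
    employees = []
    # The with statement closes the file automatically when the block ends.
    with open(filename) as data_file:
        for line in data_file:
            line = line.strip()
            if line:  # skip blank lines (an assumption about the data)
                employees.append(Employee(line))
    return employees
```

With the data file in the same folder, the list would then be built with employees = load_employees('employees.txt').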
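With the data successfully imported in a format that we can use, we are ready to do our processing. Here are the statistics we would like to compute:
- The employee with the maximum salary
- The employee with the minimum salary
- The average salary of the employees of each rank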
How are we going to compute the average salaries by rank, i.e., the average salary of all Managers, the average salary of all Staff, etc.? We need the sum of all the salaries for each rank ("Manager", "Staff", etc.) and a count of how many employees hold each rank (e.g., how many "Manager"s we have). To do this, we'll use two dictionaries, one mapping rank to sum-of-salaries and one mapping rank to number-of-employees-of-that-rank. For example, when the dictionaries are fully populated they might look like this:
| Totals Dictionary | Count Dictionary |
|---|---|
| "Manager" → 10101010 | "Manager" → 9 |
| "Staff" → 9939339 | "Staff" → 53 |
| "CEO" → 100010 | "CEO" → 1 |
If you don't remember how to access, add entries, and update entries in a dictionary, you probably want to review that by pulling up the textbook chapter that covered dictionaries.
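As a quick refresher, the three dictionary operations mentioned above look like this. The rank names and amounts are made-up sample values.

```python
totals = {}                    # start with an empty dictionary

totals['Staff'] = 1000.0       # add a new entry
totals['Staff'] += 500.0       # update an existing entry

print(totals['Staff'])         # access an entry: prints 1500.0
print('Manager' in totals)     # membership check: prints False
```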
We could write separate methods to compute each of these statistics, but then we would end up reading through the (possibly very large) list of employees multiple times. Instead, we will do the processing directly, reading through the list a single time, and then writing the results to a file. Doing all of these steps at once would be a bit much, so let's start by just getting the information we will need.
Use the following algorithm to gather the information we need to compute the three statistics listed above:
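1. Create two empty dictionaries called totals and counts.
2. Set max_employee equal to the first employee in the list of employees.
3. Set min_employee equal to the first employee in the list of employees.
4. For each employee in the list of employees:
   - Check whether the employee's rank already appears in the totals or counts dictionaries. Note: You can use if emp.get_rank() in totals for this check.
   - If the rank is not there yet, add it with the employee's salary as its value in the totals dictionary, and add it with the value 1 in the counts dictionary.
   - Otherwise, add the employee's salary to the value for that rank in the totals dictionary, and add 1 to the value for that rank in the counts dictionary.
   - Compare the employee's salary to that of max_employee and update max_employee if appropriate.
   - Compare the employee's salary to that of min_employee and update min_employee if appropriate.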
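The single pass described above might be sketched as follows. The lab has you write this directly in your main code; it is wrapped in a hypothetical function gather_stats here only so the example can be run and checked by itself. The stand-in Employee class and the accessor name get_salary are also assumptions.

```python
class Employee:
    """Stand-in with just the accessors the loop needs."""

    def __init__(self, line):
        self._first, self._last, self._rank, salary = line.split()
        self._salary = float(salary)

    def get_rank(self):
        return self._rank

    def get_salary(self):
        return self._salary


def gather_stats(employees):
    """Make one pass over employees, collecting totals, counts, max, min."""
    totals = {}   # rank -> sum of salaries
    counts = {}   # rank -> number of employees
    max_employee = employees[0]
    min_employee = employees[0]

    for emp in employees:
        rank = emp.get_rank()
        salary = emp.get_salary()
        if rank in totals:
            # Rank seen before: update both running values.
            totals[rank] += salary
            counts[rank] += 1
        else:
            # First employee of this rank: create both entries.
            totals[rank] = salary
            counts[rank] = 1
        if salary > max_employee.get_salary():
            max_employee = emp
        if salary < min_employee.get_salary():
            min_employee = emp

    return totals, counts, max_employee, min_employee
```

Starting max_employee and min_employee at the first employee means the loop needs no special case for "nothing seen yet", though it does assume the list is not empty.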
We have now gathered the information we need to compute our statistics, and are ready to write the relevant information to a new file.
Writing the employee information for the employees with the maximum
and minimum salaries is (relatively) straightforward, but printing a
table indicating the average salary by rank is slightly more
complicated. Create a function called print_averages that implements the following algorithm.
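1. The function receives three parameters: the totals dictionary, the counts dictionary, and a file handle called outFile. The assumption of this function is that the keys of the totals dictionary match the keys of the counts dictionary and that the file handle was opened in write mode by the caller.
2. Write the table header: outFile.write('Rank\tAverage Salary\n')
3. For each rank, compute the average salary from the totals and counts dictionaries and write the rank and its average to outFile.
4. Do not close the file handle, as we want to leave the handle in the same state in which it was passed to the function.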
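A minimal sketch of such a function is shown below. The parameter order and the two-decimal formatting of the averages are assumptions; only the header line is taken from the lab text.

```python
def print_averages(totals, counts, outFile):
    """Write a tab-separated table of average salary by rank to outFile.

    Assumes totals and counts have the same keys and that outFile was
    opened in write mode by the caller. The handle is left open.
    """
    outFile.write('Rank\tAverage Salary\n')
    for rank in totals:
        average = totals[rank] / counts[rank]
        outFile.write(f'{rank}\t{average:.2f}\n')
```

Because the function only calls write on the handle, it can be tried out with any file-like object, such as an io.StringIO buffer, before being pointed at a real file.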
Finally, back in your main code section, add statements at the end that implement the following algorithm:
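1. Open a file called employee_stats.txt in write mode.
2. Write the employee with the maximum salary to the file. Note that write requires a string argument, but you can use str(max_employee) to create a string representation of the employee with the maximum salary.
3. Write the employee with the minimum salary to the file in the same way.
4. Call the print_averages function to print the table of averages by rank.
5. Close the file (unless you opened it with a with statement).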
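Put together, the final steps might look like the sketch below. The helper name write_report and the "Highest salary" and "Lowest salary" labels are assumptions for illustration, and a compact print_averages is repeated only so the example runs on its own.

```python
def print_averages(totals, counts, outFile):
    """Compact table writer, repeated here so the sketch is self-contained."""
    outFile.write('Rank\tAverage Salary\n')
    for rank in totals:
        outFile.write(f'{rank}\t{totals[rank] / counts[rank]:.2f}\n')


def write_report(filename, max_employee, min_employee, totals, counts):
    """Write the max and min employees and the averages table to filename."""
    # Using with means the file is closed for us, so no explicit close().
    with open(filename, 'w') as outFile:
        # write() needs a string, so convert each employee with str().
        outFile.write('Highest salary: ' + str(max_employee) + '\n')
        outFile.write('Lowest salary: ' + str(min_employee) + '\n')
        print_averages(totals, counts, outFile)
```

In driver.py this would be called with 'employee_stats.txt' as the filename, after the gathering loop has finished.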
Submit all the code and supporting files for the exercises in this lab. We will grade this exercise according to the following criteria:
If you’re working on a lab computer, don’t forget to log off of your machine when you are finished!