There is a popular cliché … that computers only do exactly what you tell them to, and that therefore computers are never creative. The cliché is true only in the crashingly trivial sense, the same sense in which Shakespeare never wrote anything except what his first schoolteacher taught him to write—words. — R. Dawkins, The Blind Watchmaker, 1986.

Validation

We now introduce the validation set: a slice of the data held out from training and used to tune model choices (features, hyperparameters, architecture), so that the test set is touched only once, for the final evaluation.
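
Below is a minimal sketch of the idea, assuming a pandas DataFrame loaded from a CSV (the file name is a placeholder; the Google exercise uses the California housing data): shuffle the data, carve off a validation set for tuning, and leave the test set alone until the end.

    import pandas as pd

    # Placeholder path: any tabular dataset works for illustrating the split.
    df = pd.read_csv("california_housing_train.csv")
    df = df.sample(frac=1, random_state=0)   # shuffle before splitting

    # Hold out roughly 30% of the examples as a validation set for tuning;
    # the test set (kept in a separate file here) is evaluated only once,
    # at the very end.
    n_train = int(0.7 * len(df))
    train_df = df.iloc[:n_train]
    validation_df = df.iloc[n_train:]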

Exercise 8.1

Validation

Questions:

  1. Submit solutions to tasks 1–5.
  2. Give a one-paragraph summary of what you learned about using training, validation and testing datasets.

Save your answers in lab08_1.txt.

Feature Engineering

Finish the Representation programming exercises.
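
As a reminder of the mechanics involved (the file path below is a placeholder, and the synthetic feature is just one possibility, not the required answer), a correlation table and a synthetic feature can be produced like this:

    import pandas as pd

    # Placeholder path: the Google exercise loads the California housing CSV.
    df = pd.read_csv("california_housing_train.csv")

    # Pearson correlation matrix: entries near +1 or -1 indicate a strong
    # linear relationship between two columns; entries near 0 indicate
    # little linear relationship.
    print(df.corr())

    # One possible synthetic feature: rooms per person.
    df["rooms_per_person"] = df["total_rooms"] / df["population"]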

Exercise 8.2

Feature Sets

Questions:

  1. What does the Pearson correlation coefficient measure? Identify one example value from the correlation table you compute and explain why it makes sense.
  2. Submit your solutions to tasks 1–2. Include the features you selected for task 1 and the synthetic features you developed for task 2; include the final RMS errors but not the training output. Did you beat the Google-provided baselines?

Save your answers in lab08_2.txt.

Finish the Feature Crosses programming exercises.
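
The sketch below shows, using TF 2.x feature-column APIs and made-up boundary values (the exercise notebook itself may use the older TF 1.x FtrlOptimizer and different boundaries), what bucketizing the coordinate columns and crossing the resulting buckets looks like:

    import tensorflow as tf

    # Raw numeric columns, as in the feature-cross exercise.
    longitude = tf.feature_column.numeric_column("longitude")
    latitude = tf.feature_column.numeric_column("latitude")

    # Bucketize (bin) each coordinate so the model can learn a separate
    # weight per region instead of one linear weight per raw coordinate.
    # The boundary values here are illustrative only.
    bucketized_longitude = tf.feature_column.bucketized_column(
        longitude, boundaries=[-124, -122, -120, -118, -116, -114])
    bucketized_latitude = tf.feature_column.bucketized_column(
        latitude, boundaries=[32, 34, 36, 38, 40, 42])

    # Crossing the two bucketized columns makes every (longitude bin,
    # latitude bin) pair its own feature, i.e. a grid cell over the map.
    long_x_lat = tf.feature_column.crossed_column(
        [bucketized_longitude, bucketized_latitude], hash_bucket_size=1000)

    # FTRL is the optimizer the exercise recommends for these sparse,
    # crossed features.
    optimizer = tf.keras.optimizers.Ftrl(learning_rate=0.05)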

Exercise 8.3

Feature Crosses

Questions:

  1. The exercise recommends the FTRL algorithm for L1-regularized optimization, but the code specifies the same rate (learning_rate) for every run. How does FTRL actually manage the learning rate?
  2. What benefit does bucketing/binning the features provide?
  3. Submit your solutions to tasks 1–2. Did the bucketing suggested in task 1 make sense to you? Identify one feature cross of your own from task 2 and explain how it could be useful.

Save your answers in lab08_3.txt.

Keras — Regression

Build a regression model using Keras.
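
A minimal sketch along the lines of Chollet's example is shown below; the layer sizes, epoch count, and value of k are illustrative, and the built-in boston_housing dataset is assumed to be available in your Keras version.

    import numpy as np
    from tensorflow import keras

    # Boston housing data, as in Chollet's regression example.
    (train_data, train_targets), (test_data, test_targets) = \
        keras.datasets.boston_housing.load_data()

    # Feature-wise normalization: without it, features with wildly different
    # ranges dominate the gradient updates.
    mean = train_data.mean(axis=0)
    std = train_data.std(axis=0)
    train_data = (train_data - mean) / std
    test_data = (test_data - mean) / std

    def build_model():
        # A small network, since the dataset is small (~400 training samples).
        model = keras.Sequential([
            keras.layers.Dense(64, activation="relu"),
            keras.layers.Dense(64, activation="relu"),
            keras.layers.Dense(1),  # single linear output unit for regression
        ])
        model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
        return model

    # K-fold cross-validation: average the validation MAE over k folds so the
    # score does not depend on one lucky (or unlucky) validation split.
    k = 4
    fold_size = len(train_data) // k
    val_maes = []
    for i in range(k):
        val_idx = slice(i * fold_size, (i + 1) * fold_size)
        val_x, val_y = train_data[val_idx], train_targets[val_idx]
        train_x = np.concatenate([train_data[:i * fold_size],
                                  train_data[(i + 1) * fold_size:]])
        train_y = np.concatenate([train_targets[:i * fold_size],
                                  train_targets[(i + 1) * fold_size:]])
        model = build_model()
        model.fit(train_x, train_y, epochs=100, batch_size=16, verbose=0)
        _, mae = model.evaluate(val_x, val_y, verbose=0)
        val_maes.append(mae)
    print("mean validation MAE:", np.mean(val_maes))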

Exercise 8.4

Predicting house prices: a regression example* — Get this code to run and then answer the following questions.

  1. What good did the K-fold validation do in this exercise?
  2. Chollet claims that it would be problematic to use data values with “wildly different ranges”. Why is this?
  3. Chollet also claims that smaller datasets “prefer” smaller networks. Do you agree? Explain your answer.
  4. Try networks with one more and one less hidden layer, and wider or narrower layers. Do any of your alternatives do better than the suggested architecture? Why or why not?

Save your answers in lab08_4.txt.

*This exercise is F. Chollet, Chapter 3.6 (3.7 online). We will generally use Keras, rather than lower-level TensorFlow, for neural models.

Checking in

We will grade your work according to the following criteria:

See the policies page for lab due-dates and times.