We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. … We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry. — A. Krizhevsky, I. Sutskever, G.E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks”, Communications of the ACM, 60(6): 84–90.
  1. Google’s Machine Learning Crash Course

    1. Regularization for Simplicity
      1. The lambda term (the regularization rate that scales the complexity penalty).
      2. Compare and contrast minimizing loss alone (empirical risk minimization) vs. structural risk minimization, which adds a complexity penalty (see the sketch below).
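      A minimal NumPy sketch of that contrast (function and variable names are ours, not MLCC's), assuming a squared-error data loss:

      ```python
      import numpy as np

      def empirical_risk(w, X, y):
          # Empirical risk minimization: minimize average loss on the data alone.
          return np.mean((X @ w - y) ** 2)

      def structural_risk(w, X, y, lam):
          # Structural risk minimization: data loss plus lambda times an L2
          # complexity penalty. Larger lam favors smaller weights; lam = 0
          # recovers plain empirical risk minimization.
          return empirical_risk(w, X, y) + lam * np.sum(w ** 2)
      ```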
    2. Logistic Regression
      1. Terms:
        • Sigmoid
        • Log Loss
        • Early Stopping
      2. Compare and contrast Logistic vs. Linear Regression (see the sketch below).
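      A hand-rolled sketch of the terms above (ours, not MLCC's code). The key contrast for item 2: logistic regression passes the same linear score a linear regression would output through a sigmoid, and trains with log loss instead of squared error; early stopping then halts training once validation log loss stops improving.

      ```python
      import numpy as np

      def sigmoid(z):
          # Squashes a linear score z = w.x + b into a probability in (0, 1).
          return 1.0 / (1.0 + np.exp(-z))

      def log_loss(y_true, p_pred, eps=1e-15):
          # Binary log loss; clipping avoids log(0) on extreme predictions.
          p = np.clip(p_pred, eps, 1.0 - eps)
          return -np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))
      ```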
    3. Classification
      1. Terms:
        • Thresholding
        • ROC curve & AUC
        • Prediction bias
      2. Compare and contrast accuracy vs. precision vs. recall (see the sketch below).
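      A sketch tying these terms together (a hypothetical helper, assuming binary 0/1 labels). Sweeping the threshold from 0 to 1 and plotting true-positive rate against false-positive rate traces the ROC curve; AUC is the area under it.

      ```python
      import numpy as np

      def threshold_metrics(y_true, probs, threshold=0.5):
          # Thresholding: scores at or above the cutoff become positive predictions.
          y_pred = (probs >= threshold).astype(int)
          tp = np.sum((y_pred == 1) & (y_true == 1))
          fp = np.sum((y_pred == 1) & (y_true == 0))
          fn = np.sum((y_pred == 0) & (y_true == 1))
          tn = np.sum((y_pred == 0) & (y_true == 0))
          accuracy = (tp + tn) / len(y_true)  # fraction of all predictions that are correct
          precision = tp / (tp + fp)          # of predicted positives, how many are real
          recall = tp / (tp + fn)             # of real positives, how many were caught
          return accuracy, precision, recall
      ```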
    4. Regularization for Sparsity
      1. Compare and contrast L0 vs. L1 vs. L2 regularization (see the sketch below).
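      To make the contrast concrete, here are the three penalties applied to one made-up weight vector. L0 counts nonzero weights but is non-convex and non-differentiable, so L1 serves as its practical stand-in for encouraging sparsity; L2 shrinks weights toward zero but rarely to exactly zero.

      ```python
      import numpy as np

      w = np.array([0.0, 0.5, -2.0, 0.0, 1.5])  # illustrative weights, not from any model

      l0 = np.count_nonzero(w)  # L0 "norm": number of nonzero weights -> 3
      l1 = np.sum(np.abs(w))    # L1 norm: sum of |w|; drives weights to exactly 0 -> 4.0
      l2 = np.sum(w ** 2)       # squared L2: penalizes large weights smoothly -> 6.5
      ```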
  2. Classifying movie reviews: a binary classification example. We'll do this exercise in this unit's lab; for now, read the exercise and review the issues below (a sketch addressing them follows the list).

    1. Investigate the size and shape of the IMDB dataset.
    2. Neural networks can only accept numeric values, not strings. How does this exercise address this issue?
    3. Where in our course have we seen something related to “binary cross-entropy” (cf. Cross Entropy)? How is it relevant here?
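    A sketch of how the exercise typically addresses items 1–3, assuming the Keras IMDB loader (the layer sizes are a guess at the book's setup, not a prescription):

    ```python
    import numpy as np
    from tensorflow import keras

    # Item 1: 25,000 train and 25,000 test reviews, each a list of word
    # indices (capped here at the 10,000 most frequent words).
    (train_data, train_labels), (test_data, test_labels) = \
        keras.datasets.imdb.load_data(num_words=10000)
    print(len(train_data), len(test_data))  # 25000 25000
    print(train_data[0][:8])                # integers, not strings

    # Item 2: strings were mapped to integer indices upstream; multi-hot
    # encoding then turns each variable-length review into a fixed 0/1 vector.
    def vectorize(sequences, dim=10000):
        out = np.zeros((len(sequences), dim), dtype="float32")
        for i, seq in enumerate(sequences):
            out[i, seq] = 1.0
        return out

    x_train = vectorize(train_data)  # shape (25000, 10000)

    # Item 3: a sigmoid output trained with binary cross-entropy -- the same
    # log loss seen in the Logistic Regression module, applied per review.
    model = keras.Sequential([
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="rmsprop", loss="binary_crossentropy",
                  metrics=["accuracy"])
    ```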