Andrew Ng: Deep Learning, Self-Taught Learning and Unsupervised Feature Learning


45-minute in-depth talk about “deep”, self-taught and unsupervised machine learning.

Not necessary to view, but it covers some important concepts for a basic understanding of the deep learning breakthrough techniques. See notes below.


  • previous machine learning used human-classified training sets (supervised learning) and was bound by the size of the training set
    • the more data the better the performance
  • unsupervised learning uses no labeled training sets, just any videos or images available on the net (i.e. effectively infinite data); it then evolves feature detectors that create a “sparse” encoding of the pixel data, and the components of this encoding can be thought of as feature detectors
    • ex: the feature detectors are edge detectors on a 14x14 patch
    • these feature detectors can be combined to describe a similar-sized area of the image: area A = 0.3 * FeatureDetector1 + 0.6 * FeatureDetector5 + 0.1 * FeatureDetector24
    • feature detectors can then be combined for larger areas as well: edges combined to create face shapes, face shapes combined to create face types
    • performance of unsupervised learning is bounded by the size of the model (number of connections, feature detectors, etc.); the algorithm matters, as does the compute power spent on training
  • so the breakthrough is that we are no longer bound by data classification done by humans but instead by the exponentially growing power of the hardware/software
  • perceptual part of brain (maybe 40-60% of brain function) is relatively well simulated by deep learning
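
To make the "area A = 0.3 * ... + 0.6 * ... + 0.1 * ..." idea concrete, here is a minimal Python sketch of rebuilding a 14x14 patch from a sparse code. The detectors are hypothetical hand-made step edges (the talk's detectors are learned from data, not hand-built), and the dictionary size of 32 and the names make_detector and reconstruct are assumptions for illustration only.

```python
# Sketch: a patch expressed as a sparse weighted sum of feature detectors.
# The detectors below are made-up placeholders, not learned features.

PATCH_SIZE = 14      # 14x14 patch, as in the notes
NUM_DETECTORS = 32   # hypothetical dictionary size


def make_detector(index):
    """Return a hypothetical 14x14 step-edge detector as a flat list.

    Even indices give vertical edges, odd indices horizontal ones;
    the edge position depends on the index.
    """
    position = 1 + index % (PATCH_SIZE - 1)
    vertical = index % 2 == 0
    pixels = []
    for row in range(PATCH_SIZE):
        for col in range(PATCH_SIZE):
            coord = col if vertical else row
            pixels.append(1.0 if coord >= position else -1.0)
    return pixels


def reconstruct(sparse_code, detectors):
    """Sum weight * detector over the few nonzero entries of the code."""
    patch = [0.0] * (PATCH_SIZE * PATCH_SIZE)
    for index, weight in sparse_code.items():
        for i, value in enumerate(detectors[index]):
            patch[i] += weight * value
    return patch


detectors = [make_detector(i) for i in range(NUM_DETECTORS)]

# Weights from the notes: only 3 of the 32 coefficients are nonzero,
# which is what makes the encoding "sparse".
sparse_code = {1: 0.3, 5: 0.6, 24: 0.1}
patch = reconstruct(sparse_code, detectors)
```

The sparse code stores three numbers instead of 196 pixel values. Learning which detectors make such short codes possible is the part this sketch leaves out.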

Stanford Machine Learning online course

Google Brain’s Co-inventor Tells Why He’s Building Chinese Neural Networks

Forbes article about Andrew’s concern for jobs

  • the U.S. took 200 years to get from 98% to 2% farming employment
    • RK: how about other countries that started later?