I am deciding whether to study Deep Learning. Can you help with a dumb question? In Google's work, I gather that their ANN learns, without supervision, a "cat face feature" that is most strongly activated by a cat face stimulus. Is the training set completely random YouTube frames, or are they perhaps all YouTube frames with cats in them? It is intriguing. Thanks!
Yes, the training set is random frames from YouTube, one per video, unbiased by any particular search. The sample was random; it just so happens that if you want to encode YouTube frames well, you need a human face detector/encoder and a cat face encoder.
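To make the "encode frames well" point concrete, here is a minimal sketch, not Google's actual network. It swaps in PCA (the best one-feature linear encoder) for the deep autoencoder, and the "frames", the `pattern` vector, and the 5% frequency are all made-up assumptions for illustration. No labels are used anywhere, yet the learned feature ends up pointing at the recurring pattern.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n = 64, 5000

# Hypothetical stand-in for a "cat face": one fixed unit-length pattern.
pattern = rng.normal(size=dim)
pattern /= np.linalg.norm(pattern)

# Unlabeled synthetic "frames": small noise everywhere,
# and roughly 5% of frames also contain the pattern.
has_pattern = rng.random(n) < 0.05
frames = 0.1 * rng.normal(size=(n, dim))
frames[has_pattern] += 3.0 * pattern

# Learn the single linear feature that reconstructs the frames best
# (the top principal component). The has_pattern flags are never used.
centered = frames - frames.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
feature = vt[0]

# How closely the learned feature matches the hidden pattern (1.0 = identical direction).
alignment = abs(float(feature @ pattern))
print(f"alignment with hidden pattern: {alignment:.3f}")
```

The only reason the feature lines up with the pattern is that the pattern explains more of the variation in the data than anything else, which is the same argument as above, just at toy scale.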
Their ANN is a classifier that answers the question "Does this frame contain a cat face?". Like any other classifier, it needs a training dataset for unsupervised learning that is reasonably balanced. However, taking random YouTube frames would probably give you a very skewed dataset (too many negative samples). To get a more balanced training set, they probably used keywords in the video titles or manual selection of videos to get more positive samples and fewer negative ones.
From what I understand, the Deep Dream network is trained on far more pictures of cats than of anything else. The resulting bias toward cat faces stems from "overfitting", which is a common problem with neural networks. Overfitting occurs when a network's weights become too closely tuned to the specifics of the training data. In a sense, the network gets too good at classifying the training data and thus has trouble generalizing to new inputs.
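Overfitting as described above can be seen in a small sketch. This is an assumption-laden toy, not a neural network: a polynomial stands in for the model, the polynomial degree stands in for the number of weights, and the sine target, noise level, and sample sizes are arbitrary choices. The high-degree fit matches the training points almost exactly, while its error on fresh data stays well above its training error.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # Hypothetical underlying signal the model should learn.
    return np.sin(2 * np.pi * x)

# Small noisy training set and a larger held-out test set.
x_train = np.linspace(0.0, 1.0, 10)
y_train = true_fn(x_train) + 0.3 * rng.normal(size=x_train.size)
x_test = rng.uniform(0.0, 1.0, 200)
y_test = true_fn(x_test) + 0.3 * rng.normal(size=x_test.size)

def mse(coeffs, x, y):
    # Mean squared error of the fitted polynomial on (x, y).
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (3, 9):
    # Degree 9 has as many coefficients as there are training points,
    # so it can pass through every one of them, noise included.
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(f"degree {degree}: train MSE {results[degree][0]:.4g}, "
          f"test MSE {results[degree][1]:.4g}")
```

The gap between training error and test error for the degree 9 fit is the "too good at the training data" effect: the extra capacity was spent memorizing noise rather than the signal.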