How to Make an Image Classifier - Intro to Deep Learning #6

By: Siraj Raval


Uploaded on 02/18/2017

We're going to make our own Image Classifier for cats & dogs in 40 lines of Python! First we'll go over the history of image classification, then we'll dive into the concepts behind convolutional networks and why they are so amazing.

Coding challenge for this video:

Charles-David's winning code:

Dalai's runner-up code:

More Learning Resources:'s-Guide-To-Understanding-Convolutional-Neural-Networks/

Join other Wizards in our Slack channel:

Please subscribe! And like. And comment. That's what keeps me going.

And please support me on Patreon:

Comments (4):

By anonymous    2017-09-20

I managed to classify time series data using a convolutional neural network (CNN). A CNN is basically the same as an ordinary artificial neural network (ANN). The only difference is that the input is convolved first to extract specific features before it reaches the ANN. Intuitively, the convolution operation highlights specific features of the data. Picture a flashlight sweeping over different parts of an image: at each position it lights up one small region, and the response tells you how strongly that region matches the feature you are looking for.
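As a rough illustration of the flashlight idea, here is a minimal pure-Python sketch of a 2D convolution with no padding and a stride of 1. The function name `convolve2d`, the toy image, and the edge-detecting kernel are all made up for this example; they do not come from the video or from any library.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    As in most deep learning libraries, the kernel is not flipped, so strictly
    speaking this is a cross-correlation.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Weighted sum of the patch currently "lit up" by the kernel.
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy 4x4 "image": dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical 2x2 detector that responds to a dark-to-bright vertical edge.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, edge_kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map is non-zero only in the middle column, which is exactly where the edge sits in the image. In a real CNN the kernel values are learned during training rather than written by hand.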

That's the main idea of a CNN: it is inherently designed to extract spatial features. Convolutions are usually stacked, so the output has the shape (rows, columns, depth) and is therefore three-dimensional. The downside of this process is long computation time. To reduce it, we use pooling (downsampling), which shrinks the feature maps (the convolved outputs) without losing essential information. For example, before pooling you might have 12 feature maps of size 6x6, and after pooling you have 12 feature maps of size 3x3. You can repeat these two steps several times before flattening, which squashes everything into an (n, 1) array. Afterwards, you apply the normal ANN steps.
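The 6x6 to 3x3 example can be checked with a small pure-Python sketch. It assumes 2x2 max pooling with non-overlapping windows, which the comment does not state explicitly. The helper names `max_pool2d` and `flatten` and the feature map values are made up for illustration.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling: keep the largest value in each size x size block."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, cols - size + 1, size)
        ]
        for r in range(0, rows - size + 1, size)
    ]


def flatten(feature_maps):
    """Squash a stack of 2D feature maps into one flat list of values."""
    return [value for fmap in feature_maps for row in fmap for value in row]


# 12 made-up 6x6 feature maps (the values themselves are arbitrary).
feature_maps = [
    [[(k + r * 6 + c) % 7 for c in range(6)] for r in range(6)]
    for k in range(12)
]

pooled = [max_pool2d(fmap) for fmap in feature_maps]
flat = flatten(pooled)

print(len(pooled), len(pooled[0]), len(pooled[0][0]))  # 12 3 3
print(len(flat))  # 108
```

Pooling cuts each map from 36 values to 9, and flattening the 12 pooled maps gives n = 12 * 3 * 3 = 108 inputs for the ANN part.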

In short, time series data can be classified using a CNN. Here are the steps:

1. Convolution

2. Pooling

3. Flattening

4. Full connection (normal ANN steps)
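For a time series, the convolution, pooling, flattening, and full connection stages can be sketched end to end in plain Python. This is only a forward pass with hand-picked kernels and random, untrained weights; the series, the kernels, and all function names are assumptions made for illustration. In practice a library such as Keras would learn the kernels and weights from data.

```python
import math
import random


def conv1d(series, kernel):
    """Slide a 1D kernel along the series (no padding, stride 1)."""
    k = len(kernel)
    return [
        sum(series[t + i] * kernel[i] for i in range(k))
        for t in range(len(series) - k + 1)
    ]


def max_pool1d(values, size=2):
    """Keep the largest value in each non-overlapping window."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def dense_sigmoid(inputs, weights, bias):
    """One fully connected output unit with a sigmoid activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def forward(series, kernels, weights, bias):
    pooled = [max_pool1d(conv1d(series, k)) for k in kernels]  # convolution, then pooling
    flat = [v for channel in pooled for v in channel]          # flattening
    return dense_sigmoid(flat, weights, bias), flat            # full connection


random.seed(0)
series = [0.1, 0.3, 0.2, 0.8, 0.9, 0.7, 0.2, 0.1, 0.0, 0.1]  # made-up time series
kernels = [
    [-1.0, 0.0, 1.0],   # hypothetical "rising trend" detector
    [1 / 3, 1 / 3, 1 / 3],  # hypothetical moving-average detector
]
weights = [random.uniform(-1, 1) for _ in range(8)]  # untrained, random weights
bias = 0.0

probability, flat = forward(series, kernels, weights, bias)
print(len(flat))                # 8
print(0.0 < probability < 1.0)  # True
```

A series of length 10 convolved with a kernel of length 3 gives 8 values, pooling halves that to 4, and two kernels give 8 flattened inputs. The output is a class probability, but it is meaningless until the weights are trained.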

You can add as many convolution and pooling layers as you like, but watch out for training time. There's this video by my favourite YouTuber, Siraj Raval. By the way, I suggest using Keras for deep learning. It is hands down the easiest deep learning library to use. Hope it helps.

Original Thread

By anonymous    2017-09-20

There are basically two ways to train neural networks: supervised and unsupervised learning. Both need a training set to "learn" from. Think of the training set as a massive book from which you can learn specific information. In supervised learning, the book comes with an answer key but no solution manual. In unsupervised learning, it comes with neither. The goal is similar in both cases: to find patterns, either between the questions and the answers (supervised learning) or among the questions themselves (unsupervised learning).

Now that we have differentiated between the two, we can go into the models. Let's discuss supervised learning, which basically has three main models:

  • artificial neural network (ANN)

  • convolutional neural network (CNN)

  • recurrent neural network (RNN)

The ANN is the simplest of the three. I assume you already understand it, so we can move on to the CNN.

Basically, in a CNN all you have to do is convolve the input with feature detectors. The feature detectors are matrices with dimensions (rows, columns, depth), where depth is the number of feature detectors. The goal of convolving the input is to extract information related to spatial structure. Let's say you want to distinguish between cats and dogs. Cats and dogs have differently shaped ears and snouts, different eyes, and so on. The downside is that more convolution layers mean slower computation. To mitigate that, we do a kind of processing called pooling or downsampling, which reduces the size of the feature maps while minimizing the loss of features or information. The next step is flattening, which squashes the 3D output into an (n, 1) array so you can feed it into an ANN. The step after that is self-explanatory: a normal ANN. Because a CNN is inherently able to detect certain features, it is mostly used for classification, for example image classification, time series classification, or even video classification. For a crash course in CNNs, check out this video by Siraj Raval. He's my favourite YouTuber of all time!
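To make the (rows, columns, depth) bookkeeping concrete, here is a small sketch that tracks only the shapes through convolution, pooling, and flattening. It assumes no padding, a stride of 1 for the convolution, and non-overlapping pooling windows. The 64x64 RGB input, the 32 feature detectors of size 3x3, and the function names are illustrative assumptions, not values taken from the video.

```python
def conv_output_shape(input_shape, num_filters, kernel_size):
    """Shape after a convolution with no padding and stride 1."""
    rows, cols, _depth = input_shape
    return (rows - kernel_size + 1, cols - kernel_size + 1, num_filters)


def pool_output_shape(input_shape, pool_size=2):
    """Shape after non-overlapping pooling; the depth is unchanged."""
    rows, cols, depth = input_shape
    return (rows // pool_size, cols // pool_size, depth)


def flatten_size(input_shape):
    """Number of values fed into the ANN after flattening."""
    rows, cols, depth = input_shape
    return rows * cols * depth


shape = (64, 64, 3)                        # hypothetical 64x64 RGB image
shape = conv_output_shape(shape, 32, 3)    # 32 feature detectors of size 3x3
print(shape)  # (62, 62, 32)
shape = pool_output_shape(shape)           # 2x2 pooling
print(shape)  # (31, 31, 32)
print(flatten_size(shape))  # 30752
```

The depth of the output equals the number of feature detectors, and pooling is what keeps the flattened vector from growing too large: without it, the ANN would receive 62 * 62 * 32 = 123008 inputs instead of 30752.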

Arguably the most sophisticated of the three, an RNN is best described as a neural network that has "memory": it introduces "loops" that allow information to persist. Why is this important? As you are reading this, your brain uses previous memory to comprehend all of this information. You don't rethink everything from scratch, whereas that is exactly what traditional neural networks do: they forget everything and start over with each input. But plain RNNs aren't very effective, so when people talk about RNNs they mostly refer to the LSTM, which stands for Long Short-Term Memory. If that seems confusing to you, Christopher Olah gives an in-depth explanation in a very simple way. I advise you to check out his link for a complete understanding of how RNNs, especially the LSTM variant, work.
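The "loop" can be sketched with a single-unit plain RNN, where the hidden state from the previous step is fed back in together with the current input. This is a minimal illustration of a plain RNN, not of an LSTM, and the weights `w_x`, `w_h`, and `b` are arbitrary values chosen for the example rather than learned ones.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: combine the current input with the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Feed a sequence through the same cell, carrying the hidden state forward."""
    h = 0.0  # the "memory" starts out empty
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


with_history = run_rnn([1.0, 0.0, 0.0])
without_history = run_rnn([0.0, 0.0, 0.0])

print(with_history)     # the last two states are non-zero although their inputs are 0
print(without_history)  # [0.0, 0.0, 0.0]
```

The last two inputs are identical in both sequences, yet the hidden states differ because the first sequence still "remembers" the 1.0 it saw earlier. The memory also fades at every step, which hints at why plain RNNs have trouble with long sequences and why the LSTM was introduced.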

As for unsupervised learning, I'm sorry that I haven't had the time to learn about it, so this is the best I can do. Good luck and have fun!

Original Thread
