Zero Initialization

Initializing weights to zero DOES NOT WORK. Then why have I mentioned it here? Because to understand the need for weight initialization, we first need to understand why initializing weights to zero WON’T work.

Fig 1. Simple Network. Image by the Author.

Let us consider a simple network like the one shown above. Each input is just one scalar X₁…
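As a rough illustration of why zero initialization fails, here is a minimal sketch. It assumes a tiny network with one scalar input, two sigmoid hidden units, one linear output and a squared-error loss; this is not necessarily the network in Fig 1, and every name below is hypothetical. The point is that both hidden units receive identical gradients at every step, so they never become different from each other.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradients(x, y, w_hidden, w_out):
    """Gradients of 0.5 * (y_hat - y)**2 for a 1-input, n-hidden, 1-output net."""
    h = [sigmoid(w * x) for w in w_hidden]          # hidden activations
    y_hat = sum(v * a for v, a in zip(w_out, h))    # linear output
    err = y_hat - y
    grad_out = [err * a for a in h]
    grad_hidden = [err * v * a * (1.0 - a) * x for v, a in zip(w_out, h)]
    return grad_hidden, grad_out

def train(x, y, w_hidden, w_out, lr=0.1, steps=100):
    """Plain gradient descent on a single (x, y) example."""
    for _ in range(steps):
        g_h, g_o = gradients(x, y, w_hidden, w_out)
        w_hidden = [w - lr * g for w, g in zip(w_hidden, g_h)]
        w_out = [v - lr * g for v, g in zip(w_out, g_o)]
    return w_hidden, w_out

# Assumed toy data point; all weights start at zero.
w_hidden, w_out = train(x=1.5, y=2.0, w_hidden=[0.0, 0.0], w_out=[0.0, 0.0])
print(w_hidden, w_out)  # the two hidden units end up with identical weights
```

The weights do move away from zero, but the two hidden units stay exact copies of each other, so the layer has no more capacity than a single unit.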


What is regularization?

Regularization is a method of constraining the model so that it fits our data accurately without overfitting. It can also be thought of as penalizing unnecessary complexity in our model. There are mainly 3 types of regularization techniques that deep learning practitioners use. They are:

  1. L1 Regularization or Lasso…
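To make the idea of "penalizing complexity" concrete, here is a minimal sketch of an L1 penalty added to a loss value. The function names and the strength parameter `lam` are assumptions for illustration, not taken from any particular library.

```python
def l1_penalty(weights, lam):
    """L1 (Lasso) penalty: lam times the sum of absolute weight values."""
    return lam * sum(abs(w) for w in weights)

def regularized_loss(data_loss, weights, lam):
    """Total loss = how badly we fit the data + how large the weights are."""
    return data_loss + l1_penalty(weights, lam)

# Assumed example values: a data loss of 0.5 and three weights.
total = regularized_loss(0.5, [1.0, -2.0, 0.0], lam=0.1)
print(total)
```

A larger `lam` pushes the optimizer harder toward small weights, and weights that are exactly zero contribute nothing to the penalty.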


For the longest time, I had not completely understood Cross-Entropy loss. Why did we take exponents (softmax)? Why did we then take the log? Why did we take the negative of this log? How did we end up with a positive loss that we have to minimize?

These questions and more…
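One common way to see how the pieces fit together is to compute them step by step. The sketch below is not necessarily the explanation the article goes on to give; the logits and function names are assumed for illustration. Exponentiating makes every score positive, dividing by the sum turns the scores into probabilities, the log of a probability is never above zero, and the minus sign flips it into a non-negative loss.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)                           # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]  # exponents are always positive
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_index):
    """Negative log of the probability assigned to the correct class."""
    probs = softmax(logits)
    return -math.log(probs[target_index])

logits = [2.0, 1.0, 0.1]                  # assumed scores for 3 classes
probs = softmax(logits)
loss_right = cross_entropy(logits, 0)     # correct class has the highest score
loss_wrong = cross_entropy(logits, 2)     # correct class has the lowest score
print(probs, loss_right, loss_wrong)
```

The loss is small when the correct class gets a high probability and grows as that probability shrinks toward zero.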


When I was 5, I remember thinking my father was the coolest, smartest person who could do anything he wanted. My father was a morally sound man who did not want to cheat people and believed in honesty and in not harming others. …

Recurrent Neural Network

Recurrent Neural Network (RNN). Image from Wikipedia under CC BY-SA 4.0 License.

Recurrent neural networks are special architectures that take into account temporal information. The hidden state of an RNN at time t takes in information from both the input at time t and activations from hidden units at time t-1, to calculate outputs for time t. This can be seen in…
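The update described above can be sketched in a few lines. This assumes a plain (Elman-style) RNN with a tanh activation, and the weight names `W_xh`, `W_hh` and `b_h` are hypothetical; real architectures and libraries differ in the details.

```python
import math

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One time step: h_t = tanh(W_xh @ x_t + W_hh @ h_prev + b_h)."""
    h_t = []
    for i in range(len(b_h)):
        z = b_h[i]
        z += sum(W_xh[i][j] * x_t[j] for j in range(len(x_t)))        # input at time t
        z += sum(W_hh[i][k] * h_prev[k] for k in range(len(h_prev)))  # hidden state from t-1
        h_t.append(math.tanh(z))
    return h_t

# Assumed toy weights: 1 input feature, 2 hidden units.
W_xh = [[0.5], [-0.3]]
W_hh = [[0.1, 0.2], [0.0, 0.4]]
b_h = [0.0, 0.0]

h = [0.0, 0.0]                      # initial hidden state
states = []
for x_t in [[1.0], [0.0], [-1.0]]:  # a sequence of 3 time steps
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)
    states.append(h)
print(states)
```

Note that the second input is zero, yet the second hidden state is not: it still carries information from the first time step.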

I have recently started working in the USA, and I knew nothing about taxes. I did not know when to pay them, or how to pay them. My moment of reckoning came when I came across a video on YouTube about how I could save on taxes if I registered…

Decision Trees

What are they?

Decision trees are tree-based algorithms that split the data according to a sequence of decisions. Look at the image below of a very simple decision tree. We want to decide if an animal is a cat or a dog based on 2 questions.

  1. Are the ears pointy?
  2. Does the animal bark?
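A tree like this is just nested if/else checks. The sketch below asks the two questions in the order listed; which answers lead to "cat" and which to "dog" is an assumption made for illustration, since the mapping comes from the image rather than the text.

```python
def classify(pointy_ears: bool, barks: bool) -> str:
    """Hypothetical two-question decision tree for cat vs. dog."""
    # Question 1: are the ears pointy? (assumed: non-pointy ears -> dog)
    if not pointy_ears:
        return "dog"
    # Question 2: does the animal bark? (assumed: barking -> dog, otherwise cat)
    if barks:
        return "dog"
    return "cat"

print(classify(pointy_ears=True, barks=False))
print(classify(pointy_ears=True, barks=True))
```

A learned decision tree has the same shape; the difference is that the questions and their order are chosen from the data instead of being written by hand.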

Convolutional Neural Networks


The idea of convolutions was first introduced by Kunihiko Fukushima in this paper. The neocognitron introduced 2 types of layers, convolutional layers and downsampling layers.

The next key advancement was by Yann LeCun et al. when they used back-propagation to learn the coefficients of the convolutional kernel from images. This…

This episode: Learning rate (LR)

LR before fastai

The general consensus on finding the best LR was to train a model fully, until the desired metric was achieved, with different optimizers at different LRs. The optimal LR and optimizer were then picked based on which combination worked best during this search phase. …
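The brute-force search described above can be sketched on a toy problem. Here the "model" is a single parameter minimizing (w - 3)², the optimizer is plain gradient descent, and only the LR is varied; the candidate values and function names are assumptions for illustration.

```python
def train(lr, steps=50):
    """Run gradient descent on f(w) = (w - 3)**2 and return the final loss."""
    w = 0.0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)
        w -= lr * grad
    return (w - 3.0) ** 2

# Train "fully" once per candidate LR, then keep the best one.
candidates = [0.001, 0.01, 0.1, 1.1]
results = {lr: train(lr) for lr in candidates}
best_lr = min(results, key=results.get)
print(results, best_lr)
```

Even in this toy setting the pattern is visible: a very small LR barely moves, a very large one diverges, and every candidate costs a full training run, which is what makes this approach expensive on real models.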

The easiest way I have found to install fastai on Windows is with Anaconda. Anaconda is a package manager that helps you install and maintain the correct versions of packages, and also lets you create virtual environments.

Steps to install Fastai

  1. Install Anaconda.
  2. Install Cudatoolkit:

conda install -c anaconda cudatoolkit

  3. Install PyTorch by following the official installation instructions. Make sure to select the newest version of CUDA, and select Conda as your package.

  4. Install fastai with the git clone + pip install method:

git clone
pip install -e "fastai[dev]"

Note: As of now, num_workers has to be 0 for fastai. So whenever you’re making a dataloader, make sure to set num_workers=0.

