In this article, I evaluate the many ways of weight initialization and current best practices.

Zero Initialization

Initializing weights to zero DOES NOT WORK. Then why have I mentioned it here? Because to understand the need for weight initialization, we first need to understand why initializing weights to zero WON’T work.

Fig 1. Simple Network. Image by the Author.

Let us consider a simple network like the one shown above. Each input is just one scalar X₁…
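To make the problem with zero initialization concrete, here is a minimal sketch in plain Python. The network shape (3 inputs, 4 sigmoid hidden units, 1 linear output), the squared-error loss, the input values, and the names `train` and `rows_identical` are all assumptions for illustration; this is not the network from Fig 1.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(W1, W2, x, y, lr=0.1, steps=20):
    """Gradient descent on 0.5 * (y_hat - y)**2 for a tiny 2-layer network."""
    W1 = [row[:] for row in W1]  # copy so the caller's weights are untouched
    W2 = W2[:]
    for _ in range(steps):
        # Forward pass: sigmoid hidden layer, linear output.
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W1]
        err = sum(w * hj for w, hj in zip(W2, h)) - y
        # Backward pass, computed from the pre-update weights.
        grad_W2 = [err * hj for hj in h]
        grad_pre = [err * w * hj * (1.0 - hj) for w, hj in zip(W2, h)]
        for j, row in enumerate(W1):
            for i, xi in enumerate(x):
                row[i] -= lr * grad_pre[j] * xi
        W2 = [w - lr * g for w, g in zip(W2, grad_W2)]
    return W1, W2

def rows_identical(W):
    """True if every hidden unit ended up with exactly the same weights."""
    return all(row == W[0] for row in W)

x = [0.5, -1.2, 2.0]  # hypothetical input
y = 1.0               # hypothetical target

# Zero initialization: every hidden unit starts identical.
W1_zero, W2_zero = train([[0.0] * 3 for _ in range(4)], [0.0] * 4, x, y)

# Random initialization for comparison.
random.seed(0)
W1_init = [[random.gauss(0, 1) for _ in range(3)] for _ in range(4)]
W2_init = [random.gauss(0, 1) for _ in range(4)]
W1_rand, W2_rand = train(W1_init, W2_init, x, y)

print("zero init, hidden units identical:", rows_identical(W1_zero))
print("random init, hidden units identical:", rows_identical(W1_rand))
```

In this sketch, every hidden unit receives the same gradient at every step when it starts from zero, so the units never become different from one another, no matter how long we train.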

In this article, I want to take an in-depth look at regularization.


What is regularization?

Regularization is a method of constraining the model so that it fits our data accurately without overfitting. It can also be thought of as penalizing unnecessary complexity in our model. There are mainly 3 types of regularization techniques that deep learning practitioners use. They are:

  1. L1 Regularization or Lasso…
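As a rough illustration of "penalizing unnecessary complexity", L1 regularization adds the sum of the absolute values of the weights, scaled by a strength factor, to the usual loss. The sketch below assumes a mean-squared-error base loss; the function names, the `lam` parameter, and the example values are made up for illustration.

```python
def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

def l1_penalty(weights, lam):
    """L1 (Lasso) penalty: lam times the sum of absolute weight values."""
    return lam * sum(abs(w) for w in weights)

def regularized_loss(preds, targets, weights, lam=0.01):
    """Data-fit term plus the complexity penalty."""
    return mse(preds, targets) + l1_penalty(weights, lam)

# Hypothetical values, chosen only to exercise the functions.
weights = [0.5, -2.0, 0.0, 1.5]
preds = [1.0, 2.0]
targets = [1.0, 3.0]

plain = mse(preds, targets)
penalized = regularized_loss(preds, targets, weights, lam=0.1)
print("loss without penalty:", plain)
print("loss with L1 penalty:", penalized)
```

The larger the weights, the larger the penalty, so the optimizer is nudged towards smaller (and, with L1, often exactly zero) weights.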

An in-depth look at Cross-Entropy, the intuitions, and reasoning behind its necessity and utility.


For the longest time, I had not completely understood Cross-Entropy loss. Why did we take exponents (softmax)? Why did we then take the log? Why did we take the negative of this log? How did we end up with a positive loss that we have to minimize?

These questions and more…
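The three steps in those questions can be sketched in a few lines of plain Python. The logits and the target index are made-up values, and the function names are my own. Exponentiating makes every score positive, dividing by the sum turns the scores into probabilities, the log of a probability is never above zero, and the minus sign flips it into a non-negative loss.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtracting the max avoids overflow; the result is unchanged
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log of the probability assigned to the correct class."""
    probs = softmax(logits)
    return -math.log(probs[target])

logits = [2.0, 1.0, 0.1]  # hypothetical model outputs for 3 classes
probs = softmax(logits)

loss_if_class_0 = cross_entropy(logits, target=0)  # model favours class 0
loss_if_class_2 = cross_entropy(logits, target=2)  # model gives class 2 a low score

print("probabilities:", probs)
print("loss when the true class is 0:", loss_if_class_0)
print("loss when the true class is 2:", loss_if_class_2)
```

The loss is small when the model puts high probability on the correct class and grows as that probability shrinks.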

A story about my relationship with my father


When I was 5, I remember thinking my father was the coolest, smartest person who could do anything he wanted. My father was a morally sound man who did not want to cheat people and believed in honesty and in not harming others. …

In this article, I will discuss what I think are the three most important architectures to be aware of for NLP.

Recurrent Neural Network

Recurrent Neural Network (RNN). Image from Wikipedia under CC BY-SA 4.0 License.

Recurrent neural networks are special architectures that take into account temporal information. The hidden state of an RNN at time t takes in information from both the input at time t and activations from hidden units at time t-1, to calculate outputs for time t. This can be seen in…
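The recurrence described here can be sketched in plain Python. The sketch assumes the common formulation h_t = tanh(W_xh · x_t + W_hh · h_{t-1} + b_h); the tanh activation, the weight names, and all the numbers are assumptions for illustration, not taken from the figure.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """Combine the input at time t with the hidden state from time t-1."""
    from_input = matvec(W_xh, x_t)
    from_past = matvec(W_hh, h_prev)
    return [math.tanh(a + r + b) for a, r, b in zip(from_input, from_past, b_h)]

def rnn_forward(xs, h0, W_xh, W_hh, b_h):
    """Run the recurrence over a whole sequence and return every hidden state."""
    h, states = h0, []
    for x_t in xs:
        h = rnn_step(x_t, h, W_xh, W_hh, b_h)
        states.append(h)
    return states

# Hypothetical sizes: 2 input features, 3 hidden units.
W_xh = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
W_hh = [[0.2, 0.0, -0.1], [0.1, 0.3, 0.0], [0.0, -0.2, 0.1]]
b_h = [0.0, 0.1, -0.1]

xs = [[1.0, 0.5], [0.0, -1.0], [0.3, 0.3]]  # a sequence of 3 time steps
states = rnn_forward(xs, [0.0, 0.0, 0.0], W_xh, W_hh, b_h)
for t, h in enumerate(states):
    print("t =", t, "hidden state:", h)
```

Because each step reuses the previous hidden state, information from early inputs can influence later outputs.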

This is a beginner blog post for people who don’t know anything or know very little about the tax system.

I have recently started working in the USA, and I knew nothing about taxes. I did not know when to pay them, or how to pay them. My time of reckoning came when I came across a video on YouTube about how I could save on taxes if I registered…

In this article, I will discuss some of the most widely used decision-tree-based algorithms for machine learning.

Decision Trees

What are they?

Decision trees are tree-structured algorithms that split the data based on a series of decisions. Look at the image below of a very simple decision tree. We want to decide if an animal is a cat or a dog based on 2 questions.

  1. Are the ears pointy?
  2. Does the animal bark?
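A tree built from these two questions is just a pair of nested decisions. The sketch below is hypothetical: the order of the questions and the label at each leaf are my assumptions, since they depend on the tree in the image.

```python
def classify(ears_pointy, barks):
    """Walk a tiny hand-written decision tree (assumed structure)."""
    if not ears_pointy:
        # Assumed leaf: animals without pointy ears are labelled dogs.
        return "dog"
    # Pointy ears alone do not settle it, so ask the second question.
    if barks:
        return "dog"
    return "cat"

print(classify(ears_pointy=True, barks=False))
print(classify(ears_pointy=True, barks=True))
print(classify(ears_pointy=False, barks=True))
```

A learned decision tree works the same way at prediction time; the difference is that the questions and their order are chosen from the data rather than written by hand.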

In this article, I list my top 5 neural network architectures for computer vision, in no particular order.

Convolutional Neural Networks


The idea of convolutions was first introduced by Kunihiko Fukushima in this paper. The neocognitron introduced 2 types of layers, convolutional layers and downsampling layers.

The next key advancement was by Yann LeCun et al. when they used back-propagation to learn the coefficients of the convolutional kernel from images. This…

This is part 1 of a multipart series: The things I love the most about my favourite deep learning library, fastai.

This episode: Learning rate (LR)

LR before fastai

The general consensus on finding the best LR was usually to train a model fully, until the desired metric was achieved, with different optimizers at different LRs. The optimal LR and optimizer were then picked based on whichever combination worked best across those runs. …
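This brute-force approach can be sketched as a small grid search. Everything here is a toy assumption: the "model" is a single weight, the loss is (w - 3)², and the two "optimizers" are plain gradient descent and gradient descent with momentum. The point is only the shape of the procedure: one full training run per combination.

```python
def train(lr, momentum, steps=50):
    """Fully train the toy model and return its final loss."""
    w, velocity = 0.0, 0.0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)  # derivative of (w - 3)**2
        velocity = momentum * velocity - lr * grad
        w += velocity
    return (w - 3.0) ** 2

candidate_lrs = [1e-3, 1e-2, 1e-1]
candidate_momenta = [0.0, 0.9]  # 0.0 is plain gradient descent

# One complete training run for every (LR, optimizer) combination.
results = {
    (lr, m): train(lr, m)
    for lr in candidate_lrs
    for m in candidate_momenta
}

best_lr, best_momentum = min(results, key=results.get)
print("runs needed:", len(results))
print("best combination:", best_lr, best_momentum)
```

Even in this toy setting the cost is one full training run per combination, which is what makes the approach expensive for real models.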

The easiest way I have found to install fastai on Windows is with Anaconda. Anaconda ships with conda, a package manager that helps you install and maintain the correct versions of packages, and also lets you create virtual environments.

Steps to install Fastai

  1. Install Anaconda
  2. Install Cudatoolkit

conda install -c anaconda cudatoolkit

3. Install PyTorch, you can find instructions at Make sure to select the newest version of CUDA, and select Conda as your package.

4. Install fastai with the git clone + pip install method.

git clone
pip install -e "fastai[dev]"

Note: As of now, num_workers has to be 0 for fastai. So whenever you’re making a dataloader, make sure to set num_workers=0.

Akash Shastri

I love anything that makes me think. Check out my github here: Get in touch with me on LinkedIn at
