Initializing weights to zero DOES NOT WORK. Then why have I mentioned it here? To understand the need for weight initialization, we need to understand why initializing weights to zero WON’T work.
Let us consider a simple network like the one shown above. Each input is just one scalar: X₁, X₂, X₃. The weights of the two neurons are W₁ and W₂. The output of each neuron is computed as below:
Out₁ = X₁*W₁ + X₂*W₁ + X₃*W₁
Out₂ = X₁*W₂ + X₂*W₂ + X₃*W₂
As you can see by now, if the weight matrix W = [W₁ W₂] is initialized to zero…
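To make the two output equations concrete, here is a minimal sketch in plain Python. The function name `neuron_outputs` and the sample input values are my own assumptions, not something from the network in the image. It only evaluates Out₁ and Out₂ with both weights set to zero, and the derivative of each output with respect to its own weight.

```python
def neuron_outputs(x, w):
    """Out_i = X1*W_i + X2*W_i + X3*W_i, following the equations above."""
    return [sum(x_j * w_i for x_j in x) for w_i in w]

# Hypothetical inputs X1, X2, X3 (any values would do).
x = [0.5, -1.2, 2.0]

# W = [W1, W2] initialized to zero.
w_zero = [0.0, 0.0]
outs = neuron_outputs(x, w_zero)

# d(Out_i)/d(W_i) = X1 + X2 + X3, which is the same number for both neurons.
grads = [sum(x) for _ in w_zero]

print(outs)
print(grads)
```

Both outputs are zero no matter what the inputs are, and both neurons see the same derivative, so nothing in this little setup tells the two neurons apart.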
What is regularization?
Regularization is a method to constrain the model so that it fits our data accurately without overfitting. It can also be thought of as penalizing unnecessary complexity in our model. There are mainly 3 types of regularization techniques that deep learning practitioners use. They are:
Sidebar: Other techniques can also have a regularizing effect on our model. You can also prevent overfitting by having more data to constrain the search space of our function. This can be done with techniques like data augmentation, which create more data to…
For the longest time, I had not completely understood cross-entropy loss. Why did we take exponents (softmax)? Why did we then take the log? Why did we take the negative of this log? How did we end up with a positive loss that we have to minimize?
These questions and more boggled my mind, to the point where I just accepted that I had to use cross-entropy for multi-class classification and didn’t think about it much.
Recently I started going through fastai’s 2020 course, where Jeremy was explaining cross-entropy, and even though I think he did a good job…
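The three steps in those questions (exponentiate, take the log, negate) can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: the function names `softmax` and `cross_entropy` and the example logits are made up here, and it handles a single example with one correct class.

```python
import math

def softmax(logits):
    """Exponentiate and normalize so the scores become probabilities."""
    m = max(logits)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log of the probability assigned to the correct class."""
    probs = softmax(logits)
    return -math.log(probs[target])

# Hypothetical scores for 3 classes.
logits = [2.0, 1.0, 0.1]
loss_if_class_0 = cross_entropy(logits, 0)  # model favours class 0
loss_if_class_2 = cross_entropy(logits, 2)  # model gives class 2 a low score
print(loss_if_class_0, loss_if_class_2)
```

Since every softmax probability lies between 0 and 1, its log is negative or zero, and the minus sign is what turns it into a positive number that shrinks as the probability of the correct class goes up.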
Recurrent neural networks are special architectures that take into account temporal information. The hidden state of an RNN at time t takes in information from both the input at time t and activations from hidden units at time t-1, to calculate outputs for time t. This can be seen in the image above. This gives the RNN memory, or the ability to remember previous inputs and their outputs.
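As a rough sketch of that recurrence, the hidden state can be written as h_t = tanh(W_xh·x_t + W_hh·h_(t-1) + b). The tanh activation, the weight names `w_xh`, `w_hh`, `b_h`, and the toy numbers below are my assumptions for illustration; real RNN layers differ in detail.

```python
import math

def rnn_step(x_t, h_prev, w_xh, w_hh, b_h):
    """One time step: combine the input at time t with the hidden state from t-1."""
    h_t = []
    for i in range(len(b_h)):
        total = b_h[i]
        total += sum(w * x for w, x in zip(w_xh[i], x_t))     # input at time t
        total += sum(w * h for w, h in zip(w_hh[i], h_prev))  # memory from t-1
        h_t.append(math.tanh(total))
    return h_t

def run_rnn(inputs, w_xh, w_hh, b_h):
    """Apply the same step to every element of a sequence of any length."""
    h = [0.0] * len(b_h)  # start with an empty memory
    states = []
    for x_t in inputs:
        h = rnn_step(x_t, h, w_xh, w_hh, b_h)
        states.append(h)
    return states

# Toy weights (hypothetical): 2 inputs, 2 hidden units.
w_xh = [[0.5, -0.3], [0.1, 0.8]]
w_hh = [[0.2, 0.0], [-0.4, 0.6]]
b_h = [0.0, 0.0]

# Two sequences with the same last input but a different first input.
states_a = run_rnn([[1.0, 0.0], [0.0, 1.0]], w_xh, w_hh, b_h)
states_b = run_rnn([[0.0, 1.0], [0.0, 1.0]], w_xh, w_hh, b_h)
print(states_a[-1], states_b[-1])
```

Because the same step is reused at every position, the loop handles sequences of any length while carrying earlier inputs forward in the hidden state.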
This is extremely important for Natural language processing, as in NLP the input data does not have a fixed size, and the next word is highly dependent on previous words. Context is…
I have recently started working in the USA, and I knew nothing about taxes. I did not know when to pay them, or how to pay them. My moment of reckoning came when I came across a video on YouTube about how I could save on taxes if I registered as an LLC. Although I am not an expert, I will use what I have learned from my research to explain the US tax jargon, as best as I have understood it, so that you can figure out if and why you should register as an LLC.
In this article, I will…
Decision trees are a tree-based algorithm that splits the data based on certain decisions. Look at the image below of a very simple decision tree. We want to decide if an animal is a cat or a dog based on 2 questions.
We can answer each question and, depending on the answer, classify the animal as either a dog or a cat. The red lines represent the answer “NO” and the green lines, “YES”.
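The same two-question process can be written as nested conditionals. The two questions used here ("does it bark?" and "does it climb trees?") are hypothetical stand-ins, since the actual questions live in the image, and the function name `classify_animal` is made up for this sketch.

```python
def classify_animal(barks, climbs_trees):
    """Walk a tiny two-question decision tree and return 'dog' or 'cat'."""
    if barks:             # question 1, "YES" branch
        return "dog"
    if climbs_trees:      # question 1 was "NO"; question 2, "YES" branch
        return "cat"
    return "dog"          # both answers were "NO"

print(classify_animal(barks=True, climbs_trees=False))
print(classify_animal(barks=False, climbs_trees=True))
```

Each `if` plays the role of a question, and each `return` is a final answer at the end of a branch.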
This way the decision process can be laid out like a tree. The question nodes are…
The next key advancement was by Yann LeCun et al., who used back-propagation to learn the coefficients of the convolutional kernel from images. This made learning automatic rather than laboriously handcrafted. According to Wikipedia, this approach became a foundation for modern computer vision.
Then came “ImageNet Classification with Deep Convolutional Neural Networks” in 2012, by Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, which is widely regarded as the most influential paper on convolutional neural…
The general consensus on finding the best LR was usually to train a model fully, until the desired metric was achieved, with different optimizers at different LRs. The optimal LR and optimizer were then picked depending on which combination worked best. This is an OK technique, although computationally expensive.
Note: As I was introduced to fastai early in my deep learning career, I do not know a lot about how things are done without/before fastai. Please let me know if this is inaccurate, and take this section with a grain of salt.
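To show why that search gets expensive, here is a toy sketch of the "train everything, then pick the best" loop. Everything in it is an assumption for illustration: `train_fully` is a stand-in that minimizes a simple quadratic instead of training a real model, and the two optimizer names and the LR values are arbitrary.

```python
from itertools import product

def train_fully(optimizer, lr, steps=100):
    """Toy stand-in for a full training run: minimize f(w) = (w - 3)^2."""
    w, velocity = 0.0, 0.0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)
        if optimizer == "momentum":
            velocity = 0.9 * velocity - lr * grad
            w += velocity
        else:  # plain "sgd"
            w -= lr * grad
    return (w - 3.0) ** 2  # final loss, lower is better

def pick_best(optimizers, lrs):
    """Run one full training per (optimizer, LR) pair and keep the best pair."""
    results = {
        (opt, lr): train_fully(opt, lr)
        for opt, lr in product(optimizers, lrs)
    }
    best = min(results, key=results.get)
    return best, results

best, results = pick_best(["sgd", "momentum"], [0.001, 0.01, 0.1])
print(best, len(results))
```

Two optimizers and three LRs already mean six full training runs, and the count multiplies with every extra value you want to try.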
The easiest way to install on Windows that I have found is with Anaconda. Anaconda is a package manager that helps you install and maintain the correct versions of packages, and also allows you to make virtual environments.
conda install -c anaconda cudatoolkit
3. Install PyTorch; you can find instructions at https://pytorch.org/. Make sure to select the newest version of CUDA, and select Conda as your package.
4. Install fastai with the git clone + pip install method.
git clone https://github.com/fastai/fastai
pip install -e "fastai[dev]"
Note: As of now, num_workers has to be 0 for fastai. So whenever you’re making a dataloader, make sure to set num_workers=0.
Things to accomplish:
1. Find a picture you want to watermark. This will be referred to as the background.
2. Find a logo or text that will be your watermark. This will be referred to as the watermark.
3. The objective is to paste a translucent watermark over the background.
Here is my approach:
Step 1: Import PIL library
Step 2: Create your watermark
If your watermark is text, then you need to create an image with just the text on a transparent background. Here’s how that’s done:
1. Create an ‘RGBA’ image with a transparent background. …
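The remaining steps are not shown here, so what follows is only one possible way to reach the objective with Pillow, not necessarily the approach this article takes. The helper names `make_text_watermark` and `apply_watermark`, the sizes, the colours, and the opacity value are all assumptions made for this sketch.

```python
from PIL import Image, ImageDraw

def make_text_watermark(text, size=(200, 50), opacity=128):
    """Draw semi-transparent white text on a fully transparent 'RGBA' image."""
    watermark = Image.new("RGBA", size, (255, 255, 255, 0))  # alpha 0 = transparent
    draw = ImageDraw.Draw(watermark)
    draw.text((10, 10), text, fill=(255, 255, 255, opacity))  # default font
    return watermark

def apply_watermark(background, watermark, position=(0, 0)):
    """Blend the translucent watermark over the background at the given position."""
    base = background.convert("RGBA")
    # Put the watermark on a transparent layer the same size as the background,
    # because alpha_composite needs both images to have identical dimensions.
    layer = Image.new("RGBA", base.size, (0, 0, 0, 0))
    layer.paste(watermark, position)
    return Image.alpha_composite(base, layer)

# Hypothetical plain blue background standing in for a real photo.
background = Image.new("RGB", (300, 200), (0, 0, 255))
watermark = make_text_watermark("demo")
result = apply_watermark(background, watermark, position=(20, 20))
```

To use a real photo, replace the blue placeholder with `Image.open(...)` on your own file; if you save the result as JPEG, convert it back to "RGB" first, since JPEG has no alpha channel.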