Posted on Mar 12, 2022

## Why is ReLU not differentiable at x = 0?

Posted in Machine Learning

ReLU is one of the most widely used activation functions. For any $$x > 0$$, the output of ReLU is $$x$$, and $$0$$ otherwise. So, we can also write it as $$ReLU(x) = max(0, x)$$. For the rest of the post, let's say $$f(x) = ReLU(x)$$....
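The non-differentiability the post title refers to can be seen numerically: the one-sided difference quotients of $$max(0, x)$$ at $$x = 0$$ converge to different values. A minimal sketch (the step size `h` is an arbitrary choice):

```python
def relu(x):
    return max(0.0, x)

h = 1e-6
# one-sided difference quotients at x = 0
left = (relu(0.0) - relu(-h)) / h   # slope approaching from the left
right = (relu(h) - relu(0.0)) / h   # slope approaching from the right

# left -> 0 and right -> 1: the two limits disagree,
# so the derivative at x = 0 does not exist
```
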

Posted on Nov 27, 2021

## Weight Decay: basics with implementations

Posted in Machine Learning

For anyone working with Machine Learning models, overfitting is a familiar challenge. We can overcome it by going out and collecting more data, but that can be costly, time-consuming, or sometimes even impossible for individuals! So, what do we do? We can apply regularization techniques. Weight Decay is...
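As a rough illustration of the idea the post covers: weight decay adds a term to each update that shrinks the weights toward zero. A minimal NumPy sketch of one SGD step with decoupled weight decay (the weights, gradient, learning rate, and decay coefficient are all assumed toy values):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=5)      # toy model weights
grad = rng.normal(size=5)   # gradient of the data loss w.r.t. w (toy values)
lr, wd = 0.1, 0.01          # learning rate and weight-decay coefficient (assumed)

# SGD step with decoupled weight decay: besides following the gradient,
# every step pulls the weights slightly toward zero
w_new = w - lr * grad - lr * wd * w
```

In practice this is what the `weight_decay` argument of most deep-learning optimizers controls.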

Posted on Oct 01, 2021

## Byte Pair Encoding (BPE) and Subword Tokenization

Posted in Machine Learning

In almost every application related to NLP, we use text as a part of the data. To the models, the input is generally a list of words or sentences like “We will live on Mars soon”. To a model, we feed the text as a sequence of tokens. The tokens can be characters, space-separated...
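The core loop of the BPE algorithm the post discusses is: count adjacent symbol pairs across the corpus, merge the most frequent pair into a new symbol, and repeat. A minimal sketch on a two-word toy corpus (the `</w>` end-of-word marker is a common convention, and the corpus is invented for illustration):

```python
from collections import Counter

# toy corpus: words split into characters, weighted by frequency
vocab = Counter({("l", "o", "w", "</w>"): 5,
                 ("l", "o", "w", "e", "r", "</w>"): 2})

def most_frequent_pair(vocab):
    """Count every adjacent symbol pair, weighted by word frequency."""
    pairs = Counter()
    for word, freq in vocab.items():
        for a, b in zip(word, word[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(vocab, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = "".join(pair)
    new_vocab = Counter()
    for word, freq in vocab.items():
        out, i = [], 0
        while i < len(word):
            if i + 1 < len(word) and (word[i], word[i + 1]) == pair:
                out.append(merged)
                i += 2
            else:
                out.append(word[i])
                i += 1
        new_vocab[tuple(out)] = freq
    return new_vocab

pair = most_frequent_pair(vocab)
vocab = merge_pair(vocab, pair)
```

Repeating these two steps a fixed number of times yields the learned merge table; the resulting subword units are the model's tokens.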

Posted on Jul 16, 2021

## Language Model Integration in Encoder-Decoder Speech Recognition

Posted in Machine Learning

Currently, attention-based recurrent Encoder-Decoder models provide an elegant way of building end-to-end models for different tasks, such as automatic speech recognition (ASR) and machine translation (MT). An end-to-end ASR model folds the traditional acoustic model, pronunciation model, and language model (LM) into a single network. An encoder maps the input speech to a sequence of higher-level...
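One common way to integrate an external LM into such an end-to-end model (not necessarily the method this post settles on) is shallow fusion: during beam search, the ASR model's log-probability for each token is interpolated with the LM's. A minimal sketch with invented per-token probabilities and an assumed interpolation weight:

```python
import math

# hypothetical per-token probabilities for one partial hypothesis
p_asr = [0.6, 0.5, 0.7]   # from the encoder-decoder ASR model (toy values)
p_lm = [0.4, 0.3, 0.5]    # from an external language model (toy values)
lam = 0.3                 # LM weight; tuned on a dev set in practice

# shallow fusion: sum of log-probs, with the LM term scaled by lam;
# beam search would rank hypotheses by this combined score
score = sum(math.log(pa) + lam * math.log(pl) for pa, pl in zip(p_asr, p_lm))
```
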

Posted on Jul 10, 2021

## Machine Learning Metrics: When to Use What

Posted in Machine Learning

Over the years, several metrics have been introduced to evaluate the performance of a Machine Learning algorithm or model. It can be tricky to choose the right metric for evaluating a model. In this article, I discuss some basic metrics used in ML-related tasks and when to use which metric....
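As a small taste of the kind of metrics the post covers, here is how precision, recall, and F1 are computed from scratch on a toy set of binary predictions (the labels are invented for illustration):

```python
# toy binary predictions vs. ground truth (hypothetical data)
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives

precision = tp / (tp + fp)   # of predicted positives, how many were right
recall = tp / (tp + fn)      # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)
```

Which of these matters most depends on the cost of false positives versus false negatives in the task at hand.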

Posted on Jul 04, 2021

## What is Label Smoothing in Machine Learning?

Posted in Machine Learning

Nowadays, we use deep neural networks, or deep learning, for a lot of tasks. While working with deep learning, we often face problems like overfitting and overconfidence. Overfitting is relatively well studied and can be tackled with several strategies, such as dropout, weight regularization, and early stopping. We have tools for tackling overconfidence...
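The mechanics behind the technique in the title are simple: instead of training against a hard one-hot target, a small amount of probability mass is spread over the other classes. A minimal sketch of building a smoothed target (the class count and smoothing factor are assumed values):

```python
import numpy as np

num_classes, eps = 4, 0.1   # smoothing factor eps is an assumed value
label = 2                   # true class index (toy example)

# label smoothing: mix the one-hot target with a uniform distribution,
# so the true class gets 1 - eps plus its share of the uniform mass
target = np.full(num_classes, eps / num_classes)
target[label] += 1.0 - eps
```

The smoothed vector still sums to 1, so it remains a valid distribution for a cross-entropy loss.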
