Adaptive Richard's Curve Weighted Activation

Deep Neural Networks (DNNs) are ubiquitous in modern machine learning tasks such as image and speech recognition. They take in input data and make decisions based on that input. The activation function used in a DNN is an essential component that determines each neuron's output. In this context, a new activation unit has been introduced called the Adaptive Richard's Curve weighted Activation (ARiA). The following discussion is an overview of ARiA and its significance over traditional Rectified Linear Units (ReLU).

Collapsing Linear Unit

CoLU is a cleverly crafted activation function with numerous properties favorable to the performance of deeper neural networks. Developed alongside similar activation functions such as Swish and Mish, CoLU is smooth, differentiable, and unbounded above while being bounded below. It is also non-saturating and non-monotonic. What is an Activation Function? Before discussing the properties and benefits of CoLU, it is essential to understand what an activation function is.

Cosine Linear Unit

What is CosLU? CosLU, short for Cosine Linear Unit, is an activation function used in artificial neural networks. It uses a combination of trainable parameters and the cosine function to map the input data to a non-linear output. CosLU is defined using the following formula: $$CosLU(x) = (x + \alpha \cos(\beta x))\sigma(x)$$ where $\alpha$ and $\beta$ are multiplier parameters that are learned during training, and $\sigma(x)$ is the sigmoid function.
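As a rough illustration of this formula, the following NumPy sketch evaluates CosLU with $\alpha$ and $\beta$ fixed to 1; in a real layer both would be learned during training.

```python
import numpy as np

def sigmoid(x):
    # Logistic sigmoid.
    return 1.0 / (1.0 + np.exp(-x))

def coslu(x, alpha=1.0, beta=1.0):
    # CosLU(x) = (x + alpha * cos(beta * x)) * sigmoid(x).
    # alpha and beta are trainable in practice; fixed scalars here for demonstration.
    return (x + alpha * np.cos(beta * x)) * sigmoid(x)

print(coslu(np.linspace(-4.0, 4.0, 5)))
```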

CReLU

Introduction to CReLU CReLU, or Concatenated Rectified Linear Units, is an activation function used in deep learning. It concatenates a layer's output with its negation and then applies the ReLU activation to each concatenated part. This results in an activation function that preserves both positive and negative information while enforcing non-linearity. What is an Activation Function? Before we dive deeper into CReLU, let's first understand what an activation function is.
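A minimal NumPy sketch of this concatenate-then-rectify behavior; note that the feature dimension of the output is twice that of the input.

```python
import numpy as np

def crelu(x, axis=-1):
    # CReLU concatenates ReLU applied to x and to -x along the feature axis,
    # so the output has twice as many channels as the input.
    return np.concatenate([np.maximum(x, 0.0), np.maximum(-x, 0.0)], axis=axis)

x = np.array([[1.5, -2.0, 0.3]])
print(crelu(x))          # [[1.5 0.  0.3 0.  2.  0. ]]
print(crelu(x).shape)    # (1, 6): feature dimension doubles
```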

DELU

The DELU activation function is a type of activation function that uses trainable parameters and employs a combination of linear and exponential functions in the positive region while using the SiLU function in the negative region. This combination allows flexibility in modeling complex functions in neural networks, making it a popular choice among machine learning practitioners. What is an Activation Function? Before understanding how the DELU activation function works, let's first review what an activation function is.

EvoNorms

EvoNorms are a new type of computation layer used in designing neural networks. Neural networks are a type of artificial intelligence that attempts to mimic the way the human brain processes information by using layers of nodes that work together to make predictions or decisions. For these networks to work effectively, normalization and activation are critical components that ensure the data is processed correctly. EvoNorms take these concepts to a new level by combining them into a single normalization-activation layer.

Exponential Linear Squashing Activation

The Exponential Linear Squashing activation function, or ELiSH, is a type of activation function commonly used in neural networks. It is similar to the Swish function, but combines the ELU and Sigmoid functions, giving it unique properties that make it useful for various machine learning tasks. What is an Activation Function? Before we dive into ELiSH, let's first review what an activation function is and why it's important for neural networks. In a neural network, each neuron has an activation function applied to its output.
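As a reference point, ELiSH is commonly written as $x\,\sigma(x)$ for $x \ge 0$ and $(e^x - 1)\,\sigma(x)$ for $x < 0$; the sketch below assumes that definition.

```python
import numpy as np

def elish(x):
    # Assumed ELiSH definition: Swish-like branch (x * sigmoid(x)) for x >= 0,
    # ELU-times-sigmoid branch ((exp(x) - 1) * sigmoid(x)) for x < 0.
    sig = 1.0 / (1.0 + np.exp(-x))
    return np.where(x >= 0, x * sig, (np.exp(x) - 1.0) * sig)

print(elish(np.array([-2.0, -0.5, 0.0, 0.5, 2.0])))
```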

Exponential Linear Unit

In machine learning, an activation function is applied to the output of each neuron in a neural network. The exponential linear unit (ELU) is an activation function that is commonly used in neural networks. Mean Unit Activations ELUs have negative values, which allows them to push mean unit activations closer to zero. This is similar to batch normalization, but with lower computational complexity. Mean shifts toward zero speed up learning by bringing the normal gradient closer to the unit natural gradient.
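A minimal sketch of the standard ELU definition, $x$ for $x > 0$ and $\alpha(e^x - 1)$ otherwise:

```python
import numpy as np

def elu(x, alpha=1.0):
    # ELU: identity for positive inputs, alpha * (exp(x) - 1) for negative inputs.
    # The negative saturation toward -alpha is what pulls mean activations toward zero.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

print(elu(np.array([-3.0, -1.0, 0.0, 2.0])))   # negative inputs map into (-alpha, 0)
```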

Gated Linear Unit

Gated Linear Unit, or GLU, is a mathematical formula that is commonly used in natural language processing architectures. It is designed to compute the importance of features for predicting the next word. This is important for language modeling tasks because it allows the system to select information that is relevant to the task at hand. What is GLU? GLU stands for Gated Linear Unit. It is a function that takes two inputs, $a$ and $b$, and outputs $a \otimes \sigma(b)$: the element-wise product of $a$ and the sigmoid of $b$.
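A small NumPy sketch of this gating, including the common pattern of producing $a$ and $b$ by splitting a single linear projection in half (the shapes here are purely illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def glu(a, b):
    # GLU(a, b) = a * sigmoid(b): b acts as a gate deciding how much of a passes through.
    return a * sigmoid(b)

# a and b are typically the two halves of one linear projection of the input.
x = np.random.randn(4, 8)          # batch of 4, 8 features after the projection
a, b = np.split(x, 2, axis=-1)     # each half keeps 4 features
print(glu(a, b).shape)             # (4, 4)
```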

Gaussian Error Linear Units

The Gaussian Error Linear Unit, or GELU, is an activation function that is commonly used in artificial neural networks. It was first introduced by Hendrycks and Gimpel in the 2016 paper "Gaussian Error Linear Units (GELUs)". What is an activation function? An activation function is a mathematical function that is applied to the output of a neuron in a neural network. It is used to introduce non-linearity into the model, which allows the network to learn complex patterns.
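A sketch of the exact GELU, $x\,\Phi(x)$ with $\Phi$ the standard normal CDF, alongside the widely used tanh-based approximation:

```python
import numpy as np
from scipy.special import erf

def gelu(x):
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + erf(x / np.sqrt(2.0)))

def gelu_tanh(x):
    # Common tanh approximation of GELU.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(gelu(x))
print(gelu_tanh(x))
```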

GeGLU

GeGLU is a powerful activation function that enhances deep learning models in neural networks. It is a variant of the GLU activation function, and it works by multiplying the output of a GELU activation with a second input; this second input is calculated by multiplying the input by another set of parameters and adding a bias term. What is an Activation Function? Before understanding the details of GeGLU, it is essential to know what an activation function is and why it matters.
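A sketch assuming the usual GeGLU form $\mathrm{GELU}(xW + b) \otimes (xV + c)$, where $W$, $V$, $b$, and $c$ are the two projections' parameters (the names and shapes below are illustrative):

```python
import numpy as np
from scipy.special import erf

def gelu(x):
    # Exact GELU: x * Phi(x).
    return 0.5 * x * (1.0 + erf(x / np.sqrt(2.0)))

def geglu(x, W, b, V, c):
    # GELU-activated projection, gated element-wise by a second linear projection.
    return gelu(x @ W + b) * (x @ V + c)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 16))
W, V = rng.normal(size=(16, 8)), rng.normal(size=(16, 8))
b, c = np.zeros(8), np.zeros(8)
print(geglu(x, W, b, V, c).shape)   # (4, 8)
```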

Growing Cosine Unit

Overview of GCU If you're interested in artificial intelligence and machine learning, you've probably heard of the GCU. It stands for Growing Cosine Unit, and it's an oscillatory activation function used in deep learning networks to improve performance on several benchmarks. Before we dive too deep into the specifics of the GCU, let's first take a look at convolutional neural networks. CNNs are a type of deep learning network commonly used in image processing applications.
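The GCU itself is usually stated simply as $f(x) = x\cos(x)$; a minimal sketch of that form:

```python
import numpy as np

def gcu(x):
    # Growing Cosine Unit: an oscillatory activation, f(x) = x * cos(x).
    return x * np.cos(x)

print(gcu(np.array([-np.pi, -1.0, 0.0, 1.0, np.pi])))
```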

Gumbel Cross Entropy

The Gumbel activation function is a mathematical formula used for transforming the unnormalized output of a model into a probability. This function is an alternative to the traditional sigmoid or softmax activation functions. What is the Gumbel activation function? The Gumbel activation function is defined using the cumulative Gumbel distribution, which can be used to perform Gumbel regression. The Gumbel activation function $\eta_{Gumbel}$ can be expressed as: $$\eta_{Gumbel}(q_i) = \exp(-\exp(-q_i))$$
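A direct NumPy translation of this formula; any real-valued score is mapped into $(0, 1)$.

```python
import numpy as np

def gumbel_activation(q):
    # Cumulative Gumbel distribution used as an activation:
    # maps an unnormalized score q to a probability in (0, 1).
    return np.exp(-np.exp(-q))

print(gumbel_activation(np.array([-2.0, 0.0, 2.0])))
```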

Hard Sigmoid

Neural networks are used for a wide range of applications, including image and speech recognition, predictive modeling, and more. One important aspect of neural networks is their activation function, which determines the output of each neuron based on the input it receives. The Hard Sigmoid is one such activation function that has gained popularity in recent years. What is the Hard Sigmoid? The Hard Sigmoid is a piecewise-linear approximation of the sigmoid that is used to transform the input of a neuron into its output.
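Definitions vary slightly between libraries; one common variant clips $0.2x + 0.5$ to $[0, 1]$, while others clip $(x + 3)/6$. The sketch below assumes the former.

```python
import numpy as np

def hard_sigmoid(x):
    # Piecewise-linear approximation of the sigmoid: clip 0.2 * x + 0.5 to [0, 1].
    # Other libraries use slightly different slopes/offsets, e.g. (x + 3) / 6.
    return np.clip(0.2 * x + 0.5, 0.0, 1.0)

print(hard_sigmoid(np.array([-5.0, -1.0, 0.0, 1.0, 5.0])))
```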

Hard Swish

Hard Swish is a type of activation function based on a concept called Swish. Swish is a smooth activation function used to help machines learn, and it is an important component of many machine learning models. Hard Swish is a variation of Swish that replaces its relatively expensive sigmoid with a simpler piecewise-linear formula. What is an Activation Function? Before discussing Hard Swish, it is important to understand what an activation function is. In machine learning, an activation function is used to determine the output of a neuron.
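A minimal sketch of the usual Hard Swish form, $x \cdot \mathrm{ReLU6}(x + 3)/6$:

```python
import numpy as np

def hard_swish(x):
    # Hard Swish replaces the sigmoid gate in Swish (x * sigmoid(x)) with a
    # piecewise-linear "hard" gate: x * clip(x + 3, 0, 6) / 6.
    return x * np.clip(x + 3.0, 0.0, 6.0) / 6.0

print(hard_swish(np.array([-4.0, -1.0, 0.0, 1.0, 4.0])))
```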

HardELiSH

HardELiSH is a mathematical equation used as an activation function for neural networks. It combines the HardSigmoid and ELU in the negative region and the Linear and HardSigmoid in the positive region. In simpler terms, it transforms each neuron's output before it is passed to the next layer, making it easier for the neural network to learn and classify data accurately. What is an Activation Function? Before diving into the specifics of HardELiSH, it is important to understand what an activation function is.
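A rough sketch based on the description above (ELU gated by a hard sigmoid on the negative side, the identity gated by the same hard sigmoid on the positive side), assuming the hard-sigmoid variant $\max(0, \min(1, (x+1)/2))$ as the gate:

```python
import numpy as np

def hard_elish(x):
    # Assumed HardELiSH form: hard-sigmoid gate applied to ELU for x < 0
    # and to the identity for x >= 0.
    hs = np.clip((x + 1.0) / 2.0, 0.0, 1.0)
    return np.where(x >= 0, x * hs, (np.exp(x) - 1.0) * hs)

print(hard_elish(np.array([-2.0, -0.5, 0.0, 0.5, 2.0])))
```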

Hardtanh Activation

A Hardtanh activation function is a mathematical formula used in artificial neural networks. It is a simplified version of the tanh activation function, which is a more complex formula requiring more computational power; the Hardtanh activation function is simpler and less expensive in terms of computational resources. What is an Activation Function? Before diving into Hardtanh activation, it is important to define what an activation function is. An activation function determines a neuron's output based on its input.
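A minimal sketch, assuming the common default range of $[-1, 1]$:

```python
import numpy as np

def hardtanh(x, min_val=-1.0, max_val=1.0):
    # Hardtanh clamps the input to [min_val, max_val]; inside that range it is
    # the identity, so no exponentials are needed, unlike tanh.
    return np.clip(x, min_val, max_val)

print(hardtanh(np.array([-3.0, -0.5, 0.0, 0.7, 4.0])))   # [-1.  -0.5  0.   0.7  1. ]
```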

Hermite Polynomial Activation

Hermite activations are a type of activation function used in artificial neural networks. They differ from the widely used ReLU functions, which are non-smooth, in that they use a smooth, finite Hermite polynomial basis. What Are Activation Functions? Activation functions are mathematical equations that determine the output of a neuron in a neural network. The inputs received by the neuron are weighted, and the activation function determines whether the neuron is activated based on that weighted sum.
