Machine Learning Overview

Activation Functions: Day 11


Activation Functions in Neural Networks: Why They Matter?

Activation functions are pivotal in neural networks: each neuron computes a weighted sum of its inputs, and the activation function transforms that sum into the neuron's output signal, determining how strongly the neuron activates. This transformation is what allows neural networks to handle tasks such as image recognition and language processing effectively.
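As a minimal sketch of that idea (the weights, inputs, and bias below are made-up numbers chosen only for illustration), a single neuron forms a weighted sum of its inputs and then passes it through an activation function such as ReLU:

import numpy as np

x = np.array([0.5, -1.2, 3.0])   # made-up inputs to the neuron
w = np.array([0.8,  0.1, -0.4])  # made-up weights
b = 0.2                          # made-up bias

z = np.dot(w, x) + b             # pre-activation: weighted sum plus bias
a = np.maximum(0.0, z)           # ReLU activation: the neuron's output signal

print(z, a)                      # z = -0.72, so ReLU outputs 0.0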

The Role of Different Activation Functions

Neural networks employ distinct activation functions in their hidden and output layers, chosen to match the requirements of the task:

  • Hidden Layers: Functions like ReLU (Rectified Linear Unit) introduce the non-linearity that lets the network learn complex patterns in the data. Without them, a stack of layers would collapse into a single linear transformation and could not model anything beyond simple linear relationships.
  • Output Layer: The choice depends on the task. For multiclass classification, a softmax function converts the raw logits into probabilities that sum to one, which is exactly the form a classifier needs (a short numeric sketch follows this list).
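To illustrate the output-layer case, the sketch below applies softmax to a made-up logit vector (the numbers are assumptions chosen only for the example) and checks that the result sums to one:

import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

logits = np.array([2.0, 1.0, 0.1])   # made-up raw scores for three classes
probs = softmax(logits)

print(probs)        # approximately [0.659 0.242 0.099]
print(probs.sum())  # 1.0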

Practical Application

Choosing the right activation function for each layer is a core design decision: hidden-layer functions determine how well and how quickly the network learns, while the output-layer function must match the prediction target and the loss function.





Neural Network Configuration Example


Building a Neural Network for Image Classification

This example demonstrates how to set up a convolutional neural network in Python using TensorFlow/Keras to classify 64×64 RGB images into three categories.

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D

model = Sequential([
    # Convolutional feature extractor: ReLU adds non-linearity at each layer.
    Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(64, 64, 3)),
    MaxPooling2D(pool_size=(2, 2)),   # downsample the feature maps
    Conv2D(64, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),                        # flatten 2D feature maps into a vector
    Dense(128, activation='relu'),    # fully connected hidden layer
    Dense(3, activation='softmax')    # probabilities for the three classes
])

# sparse_categorical_crossentropy expects integer class labels (0, 1, 2)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()

The code above outlines each step in constructing the network, from input to output, showing how each layer contributes to image classification. ReLU in the convolutional and dense hidden layers keeps training fast and lets the model learn non-linear features, while softmax in the output layer converts the final logits into a probability distribution over the three classes.
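As a usage sketch (using randomly generated placeholder data rather than a real dataset, which is an assumption for illustration only), the model above could be trained and queried like this:

import numpy as np

# Placeholder data: 100 random 64x64 RGB images with integer labels 0-2.
x_train = np.random.rand(100, 64, 64, 3).astype('float32')
y_train = np.random.randint(0, 3, size=(100,))

model.fit(x_train, y_train, epochs=3, batch_size=16)

# Predict on one image: softmax yields one probability per class.
probs = model.predict(x_train[:1])
print(probs, probs.sum())  # three probabilities that sum to ~1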








Activation Functions in Neural Networks


| Activation Function | Formula | Used in Layer | Purpose | Linearity |
|---|---|---|---|---|
| Sigmoid | $$\sigma(x) = \frac{1}{1 + e^{-x}}$$ | Output | Maps input to the range (0, 1); useful for binary classification. | Non-linear |
| Softmax | $$\text{Softmax}(x_i) = \frac{e^{x_i}}{\sum_{j} e^{x_j}}$$ | Output | Converts logits to probabilities for multiclass classification. | Non-linear |
| Tanh | $$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$ | Hidden | Maps values to (-1, 1); zero-centered. | Non-linear |
| ReLU | $$f(x) = \max(0, x)$$ | Hidden | Replaces negative values with 0, allowing for faster convergence. | Non-linear |
| Leaky ReLU | $$f(x) = \max(0.01x, x)$$ | Hidden | Allows a small, non-zero output for negative inputs. | Non-linear |
| PReLU | $$f(x) = \max(\alpha x, x)$$ | Hidden | The negative-side slope \(\alpha\) is learnable, improving learning dynamics. | Non-linear |
| ELU | $$f(x) = \begin{cases} x & x > 0 \\ \alpha(e^{x} - 1) & x \le 0 \end{cases}$$ | Hidden | Mitigates the vanishing gradient problem by allowing negative outputs for negative inputs. | Non-linear |
| Swish | $$f(x) = x \cdot \sigma(\beta x)$$ | Hidden | Continuously differentiable and non-monotonic. | Non-linear |
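For reference, the hidden-layer functions in the table can be implemented in a few lines of NumPy. This is a minimal sketch; the values of \(\alpha\) and \(\beta\) below are commonly used defaults chosen for illustration, not values specified in the table:

import numpy as np

def sigmoid(x):            return 1.0 / (1.0 + np.exp(-x))
def relu(x):               return np.maximum(0.0, x)
def leaky_relu(x):         return np.maximum(0.01 * x, x)
def prelu(x, alpha=0.25):  return np.maximum(alpha * x, x)   # alpha is learned in practice
def elu(x, alpha=1.0):     return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))
def swish(x, beta=1.0):    return x * sigmoid(beta * x)

x = np.linspace(-3, 3, 7)
for name, fn in [('relu', relu), ('leaky_relu', leaky_relu),
                 ('elu', elu), ('swish', swish)]:
    print(name, np.round(fn(x), 3))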



