Activation Functions in Neural Networks: Why They Matter
Activation functions are pivotal in neural networks: each one transforms a neuron's weighted input into its output signal, determining whether and how strongly the neuron activates. This transformation is what allows neural networks to handle tasks such as image recognition and language processing effectively.
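As a concrete illustration, here is a minimal NumPy sketch of a single neuron: the weighted sum of its inputs is passed through an activation function (ReLU here) to produce the output signal. The input values, weights, and bias are arbitrary, chosen purely for demonstration.

import numpy as np

def relu(z):
    # ReLU: pass positive values through, clamp negatives to zero
    return np.maximum(0.0, z)

x = np.array([0.5, -1.2, 3.0])   # incoming signals (made-up example values)
w = np.array([0.8, 0.4, -0.6])   # weights (made up; learned in practice)
b = 0.1                          # bias term

z = np.dot(w, x) + b             # weighted sum (pre-activation)
a = relu(z)                      # activation function produces the output signal
print(f"pre-activation z = {z:.3f}, activated output a = {a:.3f}")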
The Role of Different Activation Functions
Neural networks employ distinct activation functions in their hidden and output layers, chosen to match the specific requirements of the task:
- Hidden Layers: Functions like ReLU (Rectified Linear Unit) introduce the non-linearity that lets the network learn complex patterns in the data. Without them, a stack of layers collapses into a single linear transformation and can model nothing beyond simple linear relationships (see the sketch after this list).
- Output Layer: The choice depends on the task. For example, softmax is used for multiclass classification to convert the raw logits into probabilities that sum to one, one per class.
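The following minimal NumPy sketch illustrates the point about hidden-layer non-linearity: two stacked linear layers (with random, made-up weights) are equivalent to a single linear layer, while inserting ReLU between them changes the mapping.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))                       # a small batch of made-up inputs
W1, W2 = rng.normal(size=(3, 5)), rng.normal(size=(5, 2))

# Two linear layers with no activation collapse into one linear map:
linear_stack = x @ W1 @ W2
single_layer = x @ (W1 @ W2)
print(np.allclose(linear_stack, single_layer))     # True

# Inserting ReLU between the layers breaks that equivalence:
nonlinear_stack = np.maximum(0.0, x @ W1) @ W2
print(np.allclose(nonlinear_stack, single_layer))  # False (in general)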
Practical Application
Choosing the right activation function for each layer is crucial for designing networks that train reliably and perform well across a range of tasks.
Building a Neural Network for Image Classification
This example demonstrates how to set up a convolutional neural network in Python with TensorFlow/Keras that classifies 64×64 RGB images into three categories.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D

model = Sequential([
    # Convolutional feature extractor; ReLU supplies the hidden-layer non-linearity
    Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(64, 64, 3)),
    MaxPooling2D(pool_size=(2, 2)),       # downsample the feature maps
    Conv2D(64, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),                            # 3D feature maps -> 1D vector
    Dense(128, activation='relu'),        # fully connected hidden layer
    Dense(3, activation='softmax')        # probability distribution over 3 classes
])

# Integer class labels pair with the sparse categorical cross-entropy loss
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()
The code above builds the network layer by layer, from the 64×64×3 input to the three-class output. ReLU in the convolutional and dense hidden layers provides cheap, non-saturating non-linearity, while softmax in the output layer converts the three logits into a probability distribution that matches the sparse categorical cross-entropy loss.
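As a quick sanity check, the compiled model above can be fitted on randomly generated placeholder data; the arrays below are made up solely to show the expected input shape and integer-label format, and would be replaced by a real image dataset.

import numpy as np

# Placeholder data: 30 random 64x64 RGB "images" and integer labels in {0, 1, 2}
x_train = np.random.rand(30, 64, 64, 3).astype("float32")
y_train = np.random.randint(0, 3, size=(30,))

# Integer labels work directly with sparse_categorical_crossentropy
model.fit(x_train, y_train, epochs=2, batch_size=8)

# Predictions are per-class probabilities that sum to one (softmax output)
probs = model.predict(x_train[:2])
print(probs, probs.sum(axis=1))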
Activation Function | Formula | Used in Layer | Purpose | Linearity |
---|---|---|---|---|
Sigmoid | $$\sigma(x) = \frac{1}{1 + e^{-x}}$$ | Output | Maps input to the range (0, 1), useful for binary classification. | Non-linear |
Softmax | $$\text{Softmax}(x_i) = \frac{e^{x_i}}{\sum_{j} e^{x_j}}$$ | Output | Converts logits to probabilities for multiclass classification. | Non-linear |
Tanh | $$\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$$ | Hidden | Maps values between -1 and 1, zero-centered. | Non-linear |
ReLU | $$f(x) = \max(0, x)$$ | Hidden | Replaces negative values with 0, allowing for faster convergence. | Non-linear |
Leaky ReLU | $$f(x) = \max(0.01x, x)$$ | Hidden | Allows a small, non-zero output for negative inputs. | Non-linear |
PReLU | $$f(x) = \max(\alpha x, x)$$ | Hidden | Like Leaky ReLU, but the negative-slope parameter $\alpha$ is learned during training. | Non-linear |
ELU | $$f(x) = x \text{ if } x > 0 \text{ else } \alpha(e^x - 1)$$ | Hidden | Returns smooth negative values for negative inputs, pushing mean activations toward zero and avoiding dead units. | Non-linear |
Swish | $$f(x) = x \cdot \sigma(\beta x)$$ | Hidden | Smooth, non-monotonic, self-gated function; $\beta$ may be fixed or learned. | Non-linear |
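For reference, the formulas in the table translate directly into a few lines of NumPy; the α and β defaults below are illustrative choices, not values prescribed by any particular framework.

import numpy as np

# Reference implementations of the activations in the table.
def sigmoid(x): return 1.0 / (1.0 + np.exp(-x))
def softmax(x):
    e = np.exp(x - np.max(x))                 # shift for numerical stability
    return e / e.sum()
def relu(x): return np.maximum(0.0, x)
def leaky_relu(x): return np.maximum(0.01 * x, x)
def prelu(x, alpha=0.25): return np.maximum(alpha * x, x)          # alpha is learned in practice
def elu(x, alpha=1.0): return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))
def swish(x, beta=1.0): return x * sigmoid(beta * x)

z = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(z))                     # [0.  0.  0.  1.5]
print(np.tanh(z))                  # zero-centered, values in (-1, 1)
print(softmax(z).sum())            # 1.0 -- a valid probability distribution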