Introduction to Deep Learning and Neural Networks with a Focus on Perceptrons
Deep Learning is a subset of machine learning that uses neural networks with many layers (hence “deep”) to model and understand complex patterns in data. These networks are inspired by the human brain and are particularly powerful for tasks like image and speech recognition.
Neural Networks consist of interconnected layers of nodes, or neurons. Each neuron receives input, processes it, and passes it to the next layer. The simplest form of a neural network is the Perceptron, which is a single-layer neural network used for binary classification tasks.
Perceptron Explained
A Perceptron is a fundamental unit of a neural network, performing binary classification by making predictions based on a linear predictor function. It works by:
- Receiving Input: Taking input features $x_1, x_2, \dots, x_n$.
- Weight Multiplication: Multiplying each input by a corresponding weight $w_1, w_2, \dots, w_n$.
- Summation: Summing the weighted inputs and adding a bias term $b$.
- Activation Function: Passing the result through an activation function (typically a step function for a perceptron).
The mathematical formula for a perceptron can be written as:
$$y = f\left(\sum_{i=1}^{n} w_i x_i + b\right)$$
where $f$ is the activation function.
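As a minimal sketch of the four steps above, the function below computes a perceptron's output in plain Python. The names `step` and `perceptron_output` and the example values are illustrative assumptions, not part of any library. The step function returns 1 when the net input is greater than or equal to zero, the same convention used by the code later in this article.

```python
def step(z):
    """Step activation: 1 when the net input is non-negative, else 0."""
    return 1 if z >= 0 else 0


def perceptron_output(x, w, b):
    """Weighted sum of the inputs plus the bias, passed through the step function."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return step(z)


# Illustrative values (assumed for this sketch)
weights = [0.06, 0.03]
bias = -0.1
print(perceptron_output([2.5, 1.5], weights, bias))  # net input is about 0.095, so 1
print(perceptron_output([1.0, 0.5], weights, bias))  # net input is about -0.025, so 0
```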
Training a Perceptron
Training involves adjusting the weights and bias to minimize classification errors on the training data. This is typically done using algorithms like the Perceptron Learning Algorithm, which updates weights based on the error in predictions.
Mathematical Foundations
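Linear Predictor
$$z = \mathbf{w} \cdot \mathbf{x} + b = \sum_{i=1}^{n} w_i x_i + b$$
Purpose: Calculates the weighted sum of inputs and bias, determining the linear combination that will be used for prediction.
Activation Function
$$\hat{y} = f(z)$$
where $f$ is often a step function:
$$f(z) = \begin{cases} 1 & \text{if } z \ge 0 \\ 0 & \text{otherwise} \end{cases}$$
Purpose: Transforms the linear predictor into a binary output, classifying the input as either 1 or 0.
Weight Update Rule
$$w_i \leftarrow w_i + \Delta w_i, \qquad b \leftarrow b + \eta\,(y - \hat{y})$$
where $\Delta w_i = \eta\,(y - \hat{y})\,x_i$, with $\eta$ being the learning rate, $y$ the target value, and $\hat{y}$ the predicted value.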
Purpose: Adjusts the weights to reduce the error in future predictions. This iterative process helps the perceptron learn from the training data.
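To make the update rule concrete, here is a small sketch of a single update step. The function name `update` is a hypothetical helper introduced for this illustration. The example values are the initial weights, bias, and learning rate used by the manual training code later in this article.

```python
def update(w, b, x, y, y_hat, eta):
    """Apply one perceptron update and return the new (weights, bias)."""
    error = y - y_hat  # +1, -1, or 0 for binary labels
    new_w = [wi + eta * error * xi for wi, xi in zip(w, x)]
    new_b = b + eta * error
    return new_w, new_b


# A misclassified positive example: the target is 1 but the prediction was 0.
w, b = update(w=[0.01, -0.02], b=0.0, x=[2.5, 1.5], y=1, y_hat=0, eta=0.1)
print([round(wi, 2) for wi in w], round(b, 2))  # [0.26, 0.13] 0.1
```

When the prediction is already correct the error is 0, so the weights and bias are returned unchanged.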
Iterative Process
Each iteration involves the following steps:
- Prediction: Calculate the linear predictor $z$ and apply the activation function to obtain the predicted output $\hat{y}$.
- Error Calculation: Determine the difference between the actual target $y$ and the predicted output $\hat{y}$.
- Weight Update: Adjust the weights and bias based on the error. This helps in refining the decision boundary that the perceptron uses to classify the input data.
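The three steps above can be sketched as a loop that stops once a full pass over the data produces no mistakes. This is an illustration under stated assumptions rather than a reference implementation: the `train` function, the zero initial weights, the `max_epochs` cap, and the logical-AND toy data are all choices made for this sketch and do not come from the article's own examples.

```python
def predict(x, w, b):
    """Linear predictor followed by the step function."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z >= 0 else 0


def train(samples, labels, eta=0.1, max_epochs=100):
    """Repeat predict / error / update until an epoch makes no mistakes."""
    w = [0.0] * len(samples[0])
    b = 0.0
    mistakes_per_epoch = []
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            error = y - predict(x, w, b)          # error calculation
            if error != 0:                        # update only on a mistake
                w = [wi + eta * error * xi for wi, xi in zip(w, x)]
                b += eta * error
                mistakes += 1
        mistakes_per_epoch.append(mistakes)
        if mistakes == 0:                         # every sample classified correctly
            break
    return w, b, mistakes_per_epoch


# Logical AND: a small, linearly separable toy problem (assumed for this sketch)
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 0, 1]
w, b, history = train(X, y)
print("mistakes per epoch:", history)
print("predictions:", [predict(x, w, b) for x in X])
```

Stopping on a mistake-free epoch only works when the data is linearly separable; otherwise the loop runs until `max_epochs`.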
How Perceptron Helps the Algorithm
- Learning from Data: The perceptron adjusts its weights based on the input data and corresponding labels, learning to classify the data correctly over iterations.
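- Updating Weights: The weight update rule changes the weights only when a prediction is wrong, nudging the decision boundary toward the misclassified point. If the training data is linearly separable, repeating these corrections eventually yields a boundary that classifies every training example correctly.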
- Binary Classification: By using a step function as the activation function, the perceptron can classify inputs into two categories, making it useful for binary classification tasks.
Perceptrons form the building blocks for more complex neural networks used in deep learning. Understanding how they work and the mathematics behind them is crucial for moving on to more advanced topics in neural networks and deep learning.
For a detailed exploration of the math behind the perceptron, continue reading below.
The perceptron and the mathematics behind it are explained in 3 practical ways:
1- Manual Perceptron Training Code (without importing Perceptron)
2- Algorithm with Importing Perceptron
3- Mathematics Behind Perceptron (detailed, without code)
We walk through all three to help you gain a deep understanding of perceptrons and the mathematics behind them. See numbers 1, 2, and 3 below:
1- Manual Perceptron Training Code
Below is the code that performs manual updates and visualizes the decision boundary at each step:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
# Load the Iris dataset
iris = load_iris()
X = iris.data[:, (2, 3)]  # Using petal length and petal width as features
y = (iris.target == 0).astype(int)  # Binary target: Iris Setosa vs. others
# Initial parameters
weights = np.array([0.01, -0.02])
bias = 0.0
learning_rate = 0.1
# Function to calculate the net input
def net_input(X, weights, bias):
    return np.dot(X, weights) + bias
# Function to apply the activation function (step function)
def predict(X, weights, bias):
    return np.where(net_input(X, weights, bias) >= 0.0, 1, 0)
# Function to plot the decision boundary
def plot_decision_boundary(weights, bias, X, y, iteration):
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.01),
                         np.arange(y_min, y_max, 0.01))
    Z = predict(np.c_[xx.ravel(), yy.ravel()], weights, bias)
    Z = Z.reshape(xx.shape)
    plt.contourf(xx, yy, Z, alpha=0.3, cmap=plt.cm.Paired)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', cmap=plt.cm.Paired)
    plt.xlabel("Petal Length (cm)")
    plt.ylabel("Petal Width (cm)")
    plt.title(f"Decision Boundary after Iteration {iteration}")
    plt.show()
# Training loop with detailed logging and plotting
def manual_perceptron_train(X, y, weights, bias, learning_rate, num_iterations):
    weights = weights.copy()  # Work on a copy so the caller's array is not modified
    for iteration in range(num_iterations):
        print(f"Iteration {iteration + 1}")
        for xi, target in zip(X, y):
            z = net_input(xi, weights, bias)
            y_hat = predict(xi.reshape(1, -1), weights, bias)[0]
            error = target - y_hat
            print(f"Data Point: {xi}, True Label: {target}, Net Input: {z}, Predicted: {y_hat}, Error: {error}")
            # Update only when the prediction is wrong
            if error != 0:
                weights += learning_rate * error * xi
                bias += learning_rate * error
                print(f"Updated Weights: {weights}, Updated Bias: {bias}")
        plot_decision_boundary(weights, bias, X, y, iteration + 1)
        print(f"Weights: {weights}, Bias: {bias}")
# Two hand-picked training points, one per class, so the updates are easy to follow
X_train = np.array([[2.5, 1.5], [1.0, 0.5]])
y_train = np.array([1, 0])
# Train the perceptron and plot decision boundaries
manual_perceptron_train(X_train, y_train, weights, bias, learning_rate, 3)
This code demonstrates how the weights and bias are updated at each step based on the perceptron learning rule. The decision boundary is plotted after each iteration to visualize how it evolves as the model learns from the data.
We hope this detailed walkthrough of the perceptron training algorithm and manual implementation helps you understand the inner workings of this fundamental machine learning model.
Results
2- Algorithm with Importing Perceptron
This part demonstrates the Perceptron training using sklearn's Perceptron class. This approach abstracts the manual weight updates and leverages sklearn's built-in methods:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron
# Load the Iris dataset
iris = load_iris()
X = iris.data[:, (2, 3)]
y = (iris.target == 0).astype(int)
# Initialize the Perceptron model (one pass over the data per call, so each step can be plotted)
per_clf = Perceptron(max_iter=1, eta0=0.1, random_state=42, tol=None, warm_start=True)
# Function to plot decision boundary
def plot_decision_boundary(model, X, y, iteration):
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.01), np.arange(y_min, y_max, 0.01))
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    plt.contourf(xx, yy, Z, alpha=0.3, cmap=plt.cm.Paired)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', cmap=plt.cm.Paired)
    plt.xlabel("Petal Length (cm)")
    plt.ylabel("Petal Width (cm)")
    plt.title(f"Perceptron Decision Boundary after Iteration {iteration}")
    plt.show()
# Train the perceptron for multiple iterations
def perceptron_train(X, y, model, num_iterations):
    for iteration in range(num_iterations):
        if iteration == 0:
            model.fit(X, y)
        else:
            model.partial_fit(X, y, classes=np.array([0, 1]))
        plot_decision_boundary(model, X, y, iteration + 1)
# Train the perceptron
perceptron_train(X, y, per_clf, 3)
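Because sklearn hides the weight updates, it can help to look at what the fitted model actually learned. The sketch below is an illustration under assumptions: it fits a fresh `Perceptron` with the library's default stopping settings (not the one-pass-at-a-time setup above) and reads the learned weights and bias from the `coef_` and `intercept_` attributes. To our understanding, sklearn predicts class 1 only when the score is strictly greater than 0, whereas the manual code in part 1 uses `>= 0`.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron

iris = load_iris()
X = iris.data[:, (2, 3)]            # petal length, petal width
y = (iris.target == 0).astype(int)  # Iris Setosa vs. others

clf = Perceptron(eta0=0.1, random_state=42)
clf.fit(X, y)

w = clf.coef_[0]       # learned weight vector, one entry per feature
b = clf.intercept_[0]  # learned bias
print("weights:", w, "bias:", b)

# decision_function returns the net input w . x + b for each row
scores = clf.decision_function(X)
manual_pred = (scores > 0).astype(int)
print("matches predict():", (manual_pred == clf.predict(X)).all())
```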
Results:
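3- Mathematics Behind the Perceptron
The calculations below trace the manual training code from part 1 by hand, using the same two training points and starting parameters.
Initial weights: $\mathbf{w} = [0.01, -0.02]$
Initial bias: $b = 0.0$
Learning rate: $\eta = 0.1$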
Iteration 1
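First Training Instance ([2.5, 1.5]) with Label (1)
Prediction:
$$z = 0.01 \cdot 2.5 + (-0.02) \cdot 1.5 + 0.0 = -0.005, \qquad \hat{y} = 0 \text{ since } z < 0$$
Weight Update:
$$\mathbf{w} = [0.01, -0.02] + 0.1 \cdot (1 - 0) \cdot [2.5, 1.5] = [0.26, 0.13], \qquad b = 0.0 + 0.1 \cdot (1 - 0) = 0.1$$
Second Training Instance ([1.0, 0.5]) with Label (0)
Prediction:
$$z = 0.26 \cdot 1.0 + 0.13 \cdot 0.5 + 0.1 = 0.425, \qquad \hat{y} = 1 \text{ since } z \ge 0$$
Weight Update:
$$\mathbf{w} = [0.26, 0.13] + 0.1 \cdot (0 - 1) \cdot [1.0, 0.5] = [0.16, 0.08], \qquad b = 0.1 + 0.1 \cdot (0 - 1) = 0.0$$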
Results after Iteration 1:
Iteration | Weights | Bias |
---|---|---|
1 | [0.16, 0.08] | 0.0 |
Iteration 2
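First Training Instance ([2.5, 1.5]) with Label (1)
Prediction:
$$z = 0.16 \cdot 2.5 + 0.08 \cdot 1.5 + 0.0 = 0.52, \qquad \hat{y} = 1 \text{ since } z \ge 0$$
Weight Update:
No update since $y - \hat{y} = 0$.
Second Training Instance ([1.0, 0.5]) with Label (0)
Prediction:
$$z = 0.16 \cdot 1.0 + 0.08 \cdot 0.5 + 0.0 = 0.20, \qquad \hat{y} = 1 \text{ since } z \ge 0$$
Weight Update:
$$\mathbf{w} = [0.16, 0.08] + 0.1 \cdot (0 - 1) \cdot [1.0, 0.5] = [0.06, 0.03], \qquad b = 0.0 + 0.1 \cdot (0 - 1) = -0.1$$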
Results after Iteration 2:
Iteration | Weights | Bias |
---|---|---|
2 | [0.06, 0.03] | -0.1 |
Iteration 3
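First Training Instance ([2.5, 1.5]) with Label (1)
Prediction:
$$z = 0.06 \cdot 2.5 + 0.03 \cdot 1.5 - 0.1 = 0.095, \qquad \hat{y} = 1 \text{ since } z \ge 0$$
Weight Update:
No update since $y - \hat{y} = 0$.
Second Training Instance ([1.0, 0.5]) with Label (0)
Prediction:
$$z = 0.06 \cdot 1.0 + 0.03 \cdot 0.5 - 0.1 = -0.025, \qquad \hat{y} = 0 \text{ since } z < 0$$
Weight Update:
No update since $y - \hat{y} = 0$.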
Results after Iteration 3:
Iteration | Weights | Bias |
---|---|---|
3 | [0.06, 0.03] | -0.1 |
Summary and Final Results
The Perceptron has undergone three iterations. Below are the weights and bias at each iteration:
Iteration | Weights ($\mathbf{w}$) | Bias ($b$) |
---|---|---|
1 | [0.16, 0.08] | 0.0 |
2 | [0.06, 0.03] | -0.1 |
3 | [0.06, 0.03] | -0.1 |
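The table can be double-checked with a few lines of code. The sketch below replays the three iterations using Python's `fractions.Fraction` so that the arithmetic is exact and free of floating-point rounding; using exact fractions is a choice made for this check, not something the training code in this article does.

```python
from fractions import Fraction

# Same two training points and starting parameters as the hand calculation
X_train = [[Fraction("2.5"), Fraction("1.5")], [Fraction("1.0"), Fraction("0.5")]]
y_train = [1, 0]
w = [Fraction("0.01"), Fraction("-0.02")]
b = Fraction(0)
eta = Fraction("0.1")

rows = []
for iteration in range(1, 4):
    for x, target in zip(X_train, y_train):
        z = sum(wi * xi for wi, xi in zip(w, x)) + b  # linear predictor
        y_hat = 1 if z >= 0 else 0                    # step function
        error = target - y_hat
        if error != 0:                                # update only on a mistake
            w = [wi + eta * error * xi for wi, xi in zip(w, x)]
            b = b + eta * error
    rows.append((iteration, [float(wi) for wi in w], float(b)))

for iteration, weights, bias in rows:
    print(iteration, weights, bias)
# 1 [0.16, 0.08] 0.0
# 2 [0.06, 0.03] -0.1
# 3 [0.06, 0.03] -0.1
```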
Now let's visualise the table:
import numpy as np
import matplotlib.pyplot as plt
# Define the weights and biases from the table
iterations = [1, 2, 3]
weights = np.array([[0.16, 0.08], [0.06, 0.03], [0.06, 0.03]])
biases = [0.0, -0.1, -0.1]
# Define the training data
X_train = np.array([[2.5, 1.5], [1.0, 0.5]])
y_train = np.array([1, 0])
# Function to plot the decision boundary for a given iteration
def plot_decision_boundary(weights, bias, X_train, y_train, iteration):
    x_min, x_max = X_train[:, 0].min() - 1, X_train[:, 0].max() + 1
    y_min, y_max = X_train[:, 1].min() - 1, X_train[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.01), np.arange(y_min, y_max, 0.01))
    # Net input at every grid point, then the step function
    Z = np.dot(np.c_[xx.ravel(), yy.ravel()], weights) + bias
    Z = np.where(Z >= 0, 1, 0).reshape(xx.shape)
    plt.contourf(xx, yy, Z, alpha=0.3, cmap=plt.cm.Paired)
    plt.scatter(X_train[:, 0], X_train[:, 1], c=y_train, edgecolors='k', cmap=plt.cm.Paired)
    plt.xlabel("Petal Length (cm)")
    plt.ylabel("Petal Width (cm)")
    plt.title(f"Decision Boundary after Iteration {iteration}")
    plt.show()
# Plot the decision boundary for each iteration
for i in range(len(iterations)):
    plot_decision_boundary(weights[i], biases[i], X_train, y_train, iterations[i])
Here is the result of drawing it:
You May Still Wonder, How Decision Boundary Is Drawn?
The only question you might have is how it knows how to draw the decision boundary line to separate data.
Understanding the Decision Boundary in Perceptron Learning
Here, we'll explore how the perceptron in our examples above learned to classify data by adjusting its decision boundary from one iteration to the next.
Concepts to Understand
- Decision Boundary: This is the line (or hyperplane in higher dimensions) where the perceptron classifier decides between two classes. It's defined as:
$$\mathbf{w} \cdot \mathbf{x} + b = 0$$
where $\mathbf{w}$ is the weight vector, $\mathbf{x}$ is the feature vector, and $b$ is the bias term.
- Learning Process: The perceptron adjusts $\mathbf{w}$ and $b$ to minimize errors by updating them based on the classification results.
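With two features, the boundary equation can be solved for the second feature to get an ordinary line with a slope and an intercept. The helper below is a small sketch of that rearrangement; `boundary_line` is a hypothetical name introduced here, and the example uses the final weights and bias from the tables above.

```python
def boundary_line(w, b):
    """Return (slope, intercept) of the line w1*x1 + w2*x2 + b = 0, solved for x2."""
    w1, w2 = w
    if w2 == 0:
        raise ValueError("w2 is zero: the boundary is vertical and has no x2 = f(x1) form")
    return -w1 / w2, -b / w2


slope, intercept = boundary_line([0.06, 0.03], -0.1)
print(f"x2 = {slope:.2f} * x1 + {intercept:.2f}")  # x2 = -2.00 * x1 + 3.33
```

When $w_2 > 0$, points on or above this line get class 1 and points below it get class 0.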
Decision Boundary After Iteration 1:
Equation: $0.16\,x_1 + 0.08\,x_2 + 0.0 = 0$
Rearranged: $x_2 = -2\,x_1$
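Iteration 2
- For $\mathbf{x} = [2.5, 1.5]$:
  - Net Input Calculation: $z = 0.16 \cdot 2.5 + 0.08 \cdot 1.5 + 0.0 = 0.52$
  - Prediction: Since $0.52$ is greater than 0, the perceptron predicts class 1.
  - Actual Label: 1 (No Error)
- For $\mathbf{x} = [1.0, 0.5]$:
  - Net Input Calculation: $z = 0.16 \cdot 1.0 + 0.08 \cdot 0.5 + 0.0 = 0.20$
  - Prediction: Since $0.20$ is greater than 0, the perceptron predicts class 1.
  - Actual Label: 0 (Error = 0 - 1 = -1)
  - Update Weights and Bias: $\mathbf{w} = [0.16, 0.08] + 0.1 \cdot (-1) \cdot [1.0, 0.5] = [0.06, 0.03]$ and $b = 0.0 + 0.1 \cdot (-1) = -0.1$
Decision Boundary After Iteration 2:
Equation: $0.06\,x_1 + 0.03\,x_2 - 0.1 = 0$
Rearranged: $x_2 = -2\,x_1 + \tfrac{10}{3} \approx -2\,x_1 + 3.33$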
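Iteration 3
- For $\mathbf{x} = [2.5, 1.5]$:
  - Net Input Calculation: $z = 0.06 \cdot 2.5 + 0.03 \cdot 1.5 - 0.1 = 0.095$
  - Prediction: Since $0.095$ is greater than 0, the perceptron predicts class 1.
  - Actual Label: 1 (No Error)
- For $\mathbf{x} = [1.0, 0.5]$:
  - Net Input Calculation: $z = 0.06 \cdot 1.0 + 0.03 \cdot 0.5 - 0.1 = -0.025$
  - Prediction: Since $-0.025$ is less than 0, the perceptron predicts class 0.
  - Actual Label: 0 (No Error)
Decision Boundary After Iteration 3:
Equation: $0.06\,x_1 + 0.03\,x_2 - 0.1 = 0$ (unchanged, since no update was needed)
Rearranged: $x_2 = -2\,x_1 + \tfrac{10}{3} \approx -2\,x_1 + 3.33$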
Conclusion
As we can see from the iterations, the perceptron adjusts its weights and bias to better fit the data. The decision boundary evolves through these adjustments:
- Iteration 1: Both data points are misclassified and trigger updates, resulting in a boundary ($x_2 = -2\,x_1$) that still leaves [1.0, 0.5] on the class 1 side.
- Iteration 2: The one remaining error, on [1.0, 0.5], shifts the boundary to $x_2 \approx -2\,x_1 + 3.33$, which separates the two points.
- Iteration 3: Shows that the decision boundary has stabilized with correct classifications for the given data points.
The decision boundary is defined by the equation $\mathbf{w} \cdot \mathbf{x} + b = 0$. In this case, the perceptron arrives at a boundary that separates the two classes (Iris Setosa vs. others) by adjusting the weights and bias based on errors from the training data.