Day 14: Sequential, Functional, and Model Subclassing APIs in Keras

# Understanding Keras APIs and Their Use Cases

In our previous blog post on day 13, we introduced Keras and provided a code example using the Sequential API. In this post, we will delve into the different types of Keras APIs: Sequential, Functional, and Model Subclassing. We will explain each API, its inventor, appropriate use cases, and whether they can be used interchangeably. We will also analyze the code examples provided to illustrate the differences between these approaches.

## Sequential API

**Inventor:** François Chollet, the creator of Keras.

**Overview:** The Sequential API is the simplest and most straightforward way to build a neural network in Keras. It allows you to create a model layer-by-layer in a linear stack.

**Use Cases:**
– Simple models with a single input and a single output.
– Beginners and quick prototyping.
– Basic feedforward neural networks and simple CNNs.

**Mathematical Foundation:**
The Sequential API models are compositions of functions, where each layer L_i applies a transformation f_i:

    \[ y = L_n \circ L_{n-1} \circ \ldots \circ L_1(x) \]


This means the output of one layer is the input to the next.
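
To make the composition concrete, here is a minimal sketch in plain Python, with no Keras involved. The three toy "layers" (`double`, `add_three`, `relu`) and the `compose` helper are hypothetical names chosen for illustration; they only mimic how each layer's output feeds the next.

```python
from functools import reduce


def compose(*layers):
    """Return a function that applies `layers` in order (L_1 first, L_n last)."""
    return lambda x: reduce(lambda value, layer: layer(value), layers, x)


# Toy stand-ins for L_1, L_2, L_3 acting on a single number.
def double(x):
    return 2 * x


def add_three(x):
    return x + 3


def relu(x):
    return max(0.0, x)


toy_model = compose(double, add_three, relu)

print(toy_model(-4.0))  # relu(add_three(double(-4.0))) = relu(-5.0) = 0.0
print(toy_model(1.0))   # relu(2.0 + 3.0) = 5.0
```

A Sequential model follows the same pattern, except that each function is a layer with trainable weights.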

**Advantages:**
– Easy to use and understand.
– Ideal for simple, linear architectures.

**Limitations:**
– Limited flexibility: cannot handle models with multiple inputs/outputs or complex topologies like shared layers and residual connections.

**Performance:**
– Fast to set up for simple models. The limitation is expressiveness rather than speed: architectures with non-linear connections or multiple inputs/outputs cannot be written as a Sequential model at all.

For more information, visit the [official Keras documentation](https://keras.io/api/models/sequential/).

Functional API

Inventor: François Chollet.

Overview: The Functional API is a more flexible way to build models. It allows for the creation of complex models with multiple inputs and outputs, shared layers, and non-linear connections.

Use Cases:

  • Models with multiple inputs and outputs.
  • Complex architectures like branching and merging paths.
  • Shared layers, such as in Siamese networks.

Mathematical Foundation:

The Functional API models are represented as Directed Acyclic Graphs (DAGs) of layers:

    \[ y = f(x_1, x_2, \ldots, x_n) \]

This flexibility allows for constructing more complex architectures.
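
The idea of a graph of layers can be sketched without Keras. The snippet below is only an illustration under our own assumptions: the node names and the small functions are made up, and the standard-library `graphlib` module (Python 3.9+) stands in for the bookkeeping Keras does internally. Each node is computed only after all of its parents.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each node maps to (function, list of parent nodes). "x1" and "x2" are inputs.
# Node names and functions are hypothetical, chosen only for illustration.
graph = {
    "branch_a": (lambda x1: 2 * x1, ["x1"]),
    "branch_b": (lambda x2: x2 + 1, ["x2"]),
    "merge": (lambda a, b: a + b, ["branch_a", "branch_b"]),
    "y": (lambda m: max(0, m), ["merge"]),
}


def run_graph(graph, inputs):
    """Evaluate every node after its parents, the way a DAG of layers is evaluated."""
    dependencies = {node: parents for node, (_, parents) in graph.items()}
    values = dict(inputs)
    for node in TopologicalSorter(dependencies).static_order():
        if node in values:  # input nodes already have a value
            continue
        fn, parents = graph[node]
        values[node] = fn(*(values[p] for p in parents))
    return values["y"]


result = run_graph(graph, {"x1": 3, "x2": 4})
print(result)  # max(0, 2 * 3 + (4 + 1)) = 11
```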

Advantages:

  • Supports arbitrary model architectures.
  • Suitable for advanced architectures such as residual networks and multi-modal inputs.
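
As a rough illustration of the residual case, here is a minimal sketch of a skip connection written with the Functional API. The layer sizes and the name `residual_model` are our own choices for the example, not part of the models trained later in this post.

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(28, 28))
x = tf.keras.layers.Flatten()(inputs)
x = tf.keras.layers.Dense(64, activation="relu")(x)

# Residual block: the block's input skips over one Dense layer and is added back.
shortcut = x
h = tf.keras.layers.Dense(64, activation="relu")(x)
x = tf.keras.layers.Add()([h, shortcut])

outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
residual_model = tf.keras.Model(inputs, outputs)

# Run a dummy batch of two images through the model to check the output shape.
dummy_out = residual_model(tf.zeros((2, 28, 28)))
print(dummy_out.shape)  # (2, 10)
```

The tensor `x` is used twice (once as the input to the Dense layer and once as the shortcut), which is exactly what a linear stack cannot express.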

Limitations:

  • Slightly more complex to understand and use compared to the Sequential API.

Performance:

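  • Handles complex architectures well: because the whole graph of layers is known before training starts, Keras can check layer shapes, summarize, and plot the model up front, while still supporting multiple inputs/outputs and non-linear connections.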

For more details, visit the Keras documentation.

Model Subclassing API

Overview: The Model Subclassing API provides the highest level of flexibility and control. It involves creating a custom model by subclassing the tf.keras.Model class and defining the layers and forward pass manually.

Use Cases:

  • Research and development of novel architectures.
  • Models requiring custom behaviors and complex operations.

Mathematical Foundation:

In this API, you explicitly define the forward pass in the call method, giving full control over data flow through the layers:

    \[ y = \text{model}(x) \]

This method allows for implementing complex operations and unique behaviors.
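
Here is a small sketch of what "defining the forward pass in the call method" can look like. The class name `SkipModel`, the layer sizes, and the dropout rate are assumptions made for this example only. The point is that `call` is ordinary Python, so a skip connection or a training-only step is written directly.

```python
import tensorflow as tf


class SkipModel(tf.keras.Model):
    """Hypothetical example model; the layer sizes are arbitrary."""

    def __init__(self, units=32, num_classes=10):
        super().__init__()
        self.flatten = tf.keras.layers.Flatten()
        self.hidden1 = tf.keras.layers.Dense(units, activation="relu")
        self.hidden2 = tf.keras.layers.Dense(units, activation="relu")
        self.dropout = tf.keras.layers.Dropout(0.5)
        self.classifier = tf.keras.layers.Dense(num_classes, activation="softmax")

    def call(self, inputs, training=False):
        x = self.flatten(inputs)
        h = self.hidden1(x)
        # Skip connection written as ordinary Python arithmetic.
        h = h + self.hidden2(h)
        # Dropout is only active when training=True.
        h = self.dropout(h, training=training)
        return self.classifier(h)


skip_model = SkipModel()
probs = skip_model(tf.zeros((4, 28, 28)))  # the first call creates the weights
print(probs.shape)  # (4, 10)
```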

Advantages:

  • Maximum flexibility and control.
  • Ideal for custom behaviors and complex models.

Limitations:

  • Requires a deeper understanding of Keras and TensorFlow.
  • More complex to implement compared to the Sequential and Functional APIs.

Performance:

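  • Well suited to custom and complex models because you control the forward pass directly. The trade-off is that Keras cannot inspect the architecture ahead of time, so the weights are only created once the model is first called or built.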

For further information, refer to the Keras guide on making new layers and models via subclassing.

Comparing the APIs

  • Sequential API: Best for simple, linear models.
  • Functional API: Suitable for complex models with multiple inputs/outputs and non-linear connections.
  • Model Subclassing API: Provides full control and customization, ideal for research and highly specialized models.

Can All APIs Be Used for the Same Problem?

Flexibility and Choice:

For a simple linear stack of layers, all three APIs can build the same model. As the architecture grows more complex, the choice narrows:

  • Sequential API: Limited to simple, linear models. Not suitable for complex architectures.
  • Functional API: Offers more flexibility and is suitable for complex models. Preferred for most use cases where complexity is involved.
  • Model Subclassing API: Provides full control and is best for novel or highly customized models.

Example:

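For a simple classification task, the Sequential API is sufficient. For a model with multiple inputs and outputs, the Functional API is better suited. If the model requires a custom forward pass or other behavior that a static graph of layers cannot express, the Model Subclassing API would be the best choice. (A custom training loop by itself does not require subclassing, since it can drive a model built with any of the three APIs.)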

Other APIs in Keras

Keras also includes specialized APIs for preprocessing, tuning, and serialization, among other tasks. These APIs support a wide range of workflows, making Keras a versatile library for deep learning.

By understanding the strengths and appropriate use cases of each Keras API, you can select the most suitable approach for your machine learning projects and build models effectively and efficiently.

For further reading and detailed information, you can explore the Keras Models API documentation.


Now It Is Time to Code

Sequential API

The Sequential API is the simplest and most straightforward way to build a neural network in Keras. It allows you to create a model layer-by-layer in a linear stack. This method is best suited for models where each layer has one input tensor and one output tensor. The simplicity of this API makes it ideal for beginners and for building simple models quickly.

Mathematical Foundation

Linear Composition: The Sequential API models are a composition of functions, where each layer L_i applies a transformation f_i.

    \[ y = L_n \circ L_{n-1} \circ \ldots \circ L_1(x) \]

This means that the output of one layer is the input to the next. This linear composition simplifies the model building process but limits the flexibility in designing complex architectures.

Code

</p>
<pre><code class="language-python">import tensorflow as tf
import matplotlib.pyplot as plt
import pandas as pd

# Load Fashion MNIST dataset
fashion_mnist = tf.keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()

# Split data into training, validation, and test sets
X_train, y_train = X_train_full[:-5000], y_train_full[:-5000]
X_valid, y_valid = X_train_full[-5000:], y_train_full[-5000:]

# Scale pixel values to the 0-1 range
X_train, X_valid, X_test = X_train / 255.0, X_valid / 255.0, X_test / 255.0

# Class names for Fashion MNIST
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Display the first few images and labels
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(X_train[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[y_train[i]])
plt.show()

# Set random seed for reproducibility
tf.random.set_seed(42)

# Build the model
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=[28, 28]),
    tf.keras.layers.Dense(300, activation="relu"),
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax")
])

# Compile the model
model.compile(loss="sparse_categorical_crossentropy", optimizer="sgd", metrics=["accuracy"])

# Train the model
history = model.fit(X_train, y_train, epochs=30, validation_data=(X_valid, y_valid))

# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Test accuracy: {test_acc:.4f}")

# Plot training and validation accuracy and loss
def plot_learning_curves(history):
    pd.DataFrame(history.history).plot(figsize=(8, 5))
    plt.grid(True)
    plt.gca().set_ylim(0, 1)  # set the vertical range to [0-1]
    plt.show()

plot_learning_curves(history)

# Make predictions
y_pred = model.predict(X_test)

# Plot the first 25 test images, their predicted labels, and the true labels.
# Color correct predictions in blue and incorrect predictions in red.
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(X_test[i], cmap=plt.cm.binary)
    predicted_label = class_names[y_pred[i].argmax()]
    true_label = class_names[y_test[i]]
    color = 'blue' if predicted_label == true_label else 'red'
    plt.xlabel(f"{predicted_label} ({true_label})", color=color)
plt.show()
</code></pre>
<p>

Functional API

The Functional API in Keras is a more flexible way to build models. Unlike the Sequential API, it allows for the creation of complex models with multiple inputs and outputs, shared layers, and non-linear connections. This API is suitable for building models that cannot be represented as a simple stack of layers.

Mathematical Foundation

Directed Acyclic Graph (DAG): The Functional API models are represented as DAGs of layers, where layers can have multiple inputs and outputs.

    \[ y = f(x_1, x_2, \ldots, x_n) \]

This flexibility allows for the construction of more complex architectures, such as models with branching, merging, and multiple inputs/outputs.

Code

</p>
<pre><code class="language-python">import tensorflow as tf
import matplotlib.pyplot as plt
import pandas as pd

# Load Fashion MNIST dataset
fashion_mnist = tf.keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()

# Split data into training, validation, and test sets
X_train, y_train = X_train_full[:-5000], y_train_full[:-5000]
X_valid, y_valid = X_train_full[-5000:], y_train_full[-5000:]

# Scale pixel values to the 0-1 range
X_train, X_valid, X_test = X_train / 255.0, X_valid / 255.0, X_test / 255.0

# Class names for Fashion MNIST
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Display the first few images and labels
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(X_train[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[y_train[i]])
plt.show()

# Set random seed for reproducibility
tf.random.set_seed(42)

# Build the model using Functional API
inputs = tf.keras.Input(shape=(28, 28))
x = tf.keras.layers.Flatten()(inputs)
x = tf.keras.layers.Dense(300, activation="relu")(x)
x = tf.keras.layers.Dense(100, activation="relu")(x)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)

# Compile the model
model.compile(loss="sparse_categorical_crossentropy", optimizer="sgd", metrics=["accuracy"])

# Train the model
history = model.fit(X_train, y_train, epochs=30, validation_data=(X_valid, y_valid))

# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Test accuracy: {test_acc:.4f}")

# Plot training and validation accuracy and loss
def plot_learning_curves(history):
    pd.DataFrame(history.history).plot(figsize=(8, 5))
    plt.grid(True)
    plt.gca().set_ylim(0, 1)  # set the vertical range to [0-1]
    plt.show()

plot_learning_curves(history)

# Make predictions
y_pred = model.predict(X_test)

# Plot the first 25 test images, their predicted labels, and the true labels.
# Color correct predictions in blue and incorrect predictions in red.
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(X_test[i], cmap=plt.cm.binary)
    predicted_label = class_names[y_pred[i].argmax()]
    true_label = class_names[y_test[i]]
    color = 'blue' if predicted_label == true_label else 'red'
    plt.xlabel(f"{predicted_label} ({true_label})", color=color)
plt.show()
</code></pre>
<p>

Model Subclassing API

The Model Subclassing API in Keras provides the highest level of flexibility and control. It involves creating a custom model by subclassing the tf.keras.Model class and defining the layers and forward pass manually. This approach is ideal for research and for building novel or highly customized neural network architectures.

Mathematical Foundation

Custom Forward Pass: In this API, you explicitly define the forward pass in the call method, which gives you full control over how data flows through the layers.

    \[ y = \text{model}(x) \]

This method allows for implementing complex operations and unique behaviors that are not possible with the Sequential or Functional APIs.

Code

</p>
<pre><code class="language-python">import tensorflow as tf
import matplotlib.pyplot as plt
import pandas as pd

# Load Fashion MNIST dataset
fashion_mnist = tf.keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()

# Split data into training, validation, and test sets
X_train, y_train = X_train_full[:-5000], y_train_full[:-5000]
X_valid, y_valid = X_train_full[-5000:], y_train_full[-5000:]

# Scale pixel values to the 0-1 range
X_train, X_valid, X_test = X_train / 255.0, X_valid / 255.0, X_test / 255.0

# Class names for Fashion MNIST
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Display the first few images and labels
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(X_train[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[y_train[i]])
plt.show()

# Set random seed for reproducibility
tf.random.set_seed(42)

# Build the model using Model Subclassing API
class MyModel(tf.keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.flatten = tf.keras.layers.Flatten()
        self.dense1 = tf.keras.layers.Dense(300, activation="relu")
        self.dense2 = tf.keras.layers.Dense(100, activation="relu")
        self.dense3 = tf.keras.layers.Dense(10, activation="softmax")

    def call(self, inputs):
        x = self.flatten(inputs)
        x = self.dense1(x)
        x = self.dense2(x)
        return self.dense3(x)

model = MyModel()

# Compile the model
model.compile(loss="sparse_categorical_crossentropy", optimizer="sgd", metrics=["accuracy"])

# Train the model
history = model.fit(X_train, y_train, epochs=30, validation_data=(X_valid, y_valid))

# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Test accuracy: {test_acc:.4f}")

# Plot training and validation accuracy and loss
def plot_learning_curves(history):
    pd.DataFrame(history.history).plot(figsize=(8, 5))
    plt.grid(True)
    plt.gca().set_ylim(0, 1)  # set the vertical range to [0-1]
    plt.show()

plot_learning_curves(history)

# Make predictions
y_pred = model.predict(X_test)

# Plot the first 25 test images, their predicted labels, and the true labels.
# Color correct predictions in blue and incorrect predictions in red.
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(X_test[i], cmap=plt.cm.binary)
    predicted_label = class_names[y_pred[i].argmax()]
    true_label = class_names[y_test[i]]
    color = 'blue' if predicted_label == true_label else 'red'
    plt.xlabel(f"{predicted_label} ({true_label})", color=color)
plt.show()
</code></pre>
<p>

Detailed Explanations

Why Use the Model Subclassing API?

  • Flexibility: Provides the ability to define any architecture and custom behavior, which is especially useful for complex or non-standard models.
  • Customization: Allows for implementing custom layers and operations within the forward pass, offering unparalleled control over the model's inner workings.
  • Research and Innovation: Ideal for experimenting with novel architectures and approaches in deep learning research.

How It Affects the Results:

  • Precision and Control: By having full control over the model’s operations, you can optimize and customize the architecture to potentially achieve better performance and efficiency for specific tasks.
  • Complexity Management: This approach can manage more complex dependencies and layer interactions, which might not be possible with simpler APIs.

Mathematical Insight:

In the Model Subclassing API, you explicitly define the transformation of the input tensor x through various layers, leading to the output tensor y. This direct control allows for implementing sophisticated transformations and handling multiple inputs and outputs more effectively.

By understanding and leveraging these three different APIs in Keras, you can choose the best approach based on your specific needs, complexity, and the nature of your machine learning or deep learning tasks.

Differences in Model Definitions and Explanations

The three code examples differ only in the lines where the model is defined. Everything else (data loading, compilation, training, evaluation, and plotting) is identical, because each API is just a different way to define the architecture of the same neural network.

Sequential API

The model is defined using a sequential approach, where each layer is added one after another in a linear stack.

# Sequential API model definition
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=[28, 28]),
    tf.keras.layers.Dense(300, activation="relu"),
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax")
])

Functional API

The model is defined using a functional approach, which allows for more complex architectures with multiple inputs and outputs.

# Functional API model definition
inputs = tf.keras.Input(shape=(28, 28))
x = tf.keras.layers.Flatten()(inputs)
x = tf.keras.layers.Dense(300, activation="relu")(x)
x = tf.keras.layers.Dense(100, activation="relu")(x)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)

Model Subclassing API

The model is defined by subclassing the tf.keras.Model class, providing the most flexibility and control over the model's architecture and forward pass.

# Model Subclassing API model definition
class MyModel(tf.keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.flatten = tf.keras.layers.Flatten()
        self.dense1 = tf.keras.layers.Dense(300, activation="relu")
        self.dense2 = tf.keras.layers.Dense(100, activation="relu")
        self.dense3 = tf.keras.layers.Dense(10, activation="softmax")

    def call(self, inputs):
        x = self.flatten(inputs)
        x = self.dense1(x)
        x = self.dense2(x)
        return self.dense3(x)

model = MyModel()

These differences exist because:

  • Sequential API: Best for simple models where the layers are stacked in a linear order.
  • Functional API: Suitable for complex models with non-linear topology, multiple inputs/outputs, and shared layers.
  • Model Subclassing API: Provides full control and customization over the model, ideal for research and novel architectures.

Comparing the Three Code Examples: Are the Results the Same?

Results Comparison Across All 3 APIs

The results for models built using the Sequential API, Functional API, and Model Subclassing API will generally be the same, assuming the architectures, initialization, training process, and hyperparameters are identical. The underlying mathematical operations and computations are the same across these APIs; they merely offer different ways to define and build the model. Small differences between separate runs can still appear, for example from nondeterministic GPU operations.

Key Factors for Same Results

  • Identical Architecture: All three APIs use the same layers in the same order: Flatten -> Dense (300 units, ReLU) -> Dense (100 units, ReLU) -> Dense (10 units, Softmax).
  • Same Initialization and Training Process: Each script sets the same random seed for reproducibility. The models are compiled with the same loss function (sparse_categorical_crossentropy), optimizer (sgd), and metric (accuracy). The training setup (30 epochs, the default batch size of 32, and the same validation set) is identical.

Given these factors, the accuracy and loss should match closely across the three APIs because the underlying computations performed by the models are identical.
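
One way to check the "identical computations" claim without training anything is to give all three models the same weights and compare their outputs on the same batch. The sketch below does that. It assumes a recent TensorFlow 2 release, and the helper names (`build_sequential`, `build_functional`) and the random test batch are our own additions for this check.

```python
import numpy as np
import tensorflow as tf


def build_sequential():
    return tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(300, activation="relu"),
        tf.keras.layers.Dense(100, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])


def build_functional():
    inputs = tf.keras.Input(shape=(28, 28))
    x = tf.keras.layers.Flatten()(inputs)
    x = tf.keras.layers.Dense(300, activation="relu")(x)
    x = tf.keras.layers.Dense(100, activation="relu")(x)
    outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)


class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.flatten = tf.keras.layers.Flatten()
        self.dense1 = tf.keras.layers.Dense(300, activation="relu")
        self.dense2 = tf.keras.layers.Dense(100, activation="relu")
        self.dense3 = tf.keras.layers.Dense(10, activation="softmax")

    def call(self, inputs):
        x = self.flatten(inputs)
        x = self.dense1(x)
        x = self.dense2(x)
        return self.dense3(x)


# A random batch of 8 fake "images", only used to compare outputs.
batch = np.random.default_rng(0).random((8, 28, 28)).astype("float32")

seq_model = build_sequential()
func_model = build_functional()
sub_model = MyModel()
sub_model(batch)  # subclassed models create their weights on the first call

# Copy the Sequential model's weights into the other two models.
func_model.set_weights(seq_model.get_weights())
sub_model.set_weights(seq_model.get_weights())

seq_out = seq_model(batch).numpy()
func_out = func_model(batch).numpy()
sub_out = sub_model(batch).numpy()

print(np.allclose(seq_out, func_out), np.allclose(seq_out, sub_out))
```

With equal weights, the three outputs agree, so any difference seen after training comes from initialization and run-to-run variation rather than from the API.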

Why the Sequential API Is the Better Fit for This Task

Simplicity and Readability

The Sequential API allows for the simplest and most readable code when building straightforward, linear models. This reduces the potential for errors and makes the code easier to understand and maintain.

Ease of Use

For tasks that do not require complex architectures, the Sequential API is the quickest and easiest way to build a model. There’s no need to define the input and output layers explicitly as in the Functional API or to implement a custom class as in the Model Subclassing API.

Focused Scope

The Sequential API is designed specifically for models where layers are stacked sequentially. This focused scope makes it a better fit for simple models, avoiding the overhead and additional code required by the other APIs.

Results Comparison for the Provided Example

The models built using the Sequential API, Functional API, and Model Subclassing API perform the same computations. Therefore, the training and evaluation results (accuracy and loss) for all three models should be the same.

Expected Results

Assuming the same architecture, initialization, and training procedure, the models should produce matching results, up to small run-to-run variation. For example:

  • Test accuracy: 0.8745 (expected for all three APIs)

Conclusion

Result Similarity: The results are the same across all three APIs because the models perform identical computations.

Why Sequential is Better: For this particular task (a straightforward feedforward neural network), the Sequential API is better because it provides a simpler, more readable, and more efficient way to define and train the model without unnecessary complexity. It aligns perfectly with the problem's requirements, offering the quickest and most straightforward path to implementation.


Keras APIs Comparison

| Aspect | Sequential API | Functional API | Model Subclassing API |
| --- | --- | --- | --- |
| Mathematical Foundation | \( y = L_n \circ L_{n-1} \circ \ldots \circ L_1(x) \) | \( y = f(x_1, x_2, \ldots, x_n) \) | \( y = \text{model}(x) \) |
| Transformation Example | \( a_{i+1} = f_i(W_i \cdot a_i + b_i) \) | \( y_1 = f_1(W_1 \cdot x + b_1) \), \( y_2 = f_2(W_2 \cdot y_1 + b_2) \) | Custom, defined in the call method: \( a = f_1(W_1 \cdot x + b_1) \), \( b = f_2(W_2 \cdot a + b_2) \), \( y = f_3(a, b) \) |
| Algorithm | Linear flow from input to output | DAG with multiple branches and merges | Custom forward pass |
| Implementation | Layers added sequentially | Layers defined as a DAG | Layers and forward pass defined explicitly |
| Performance | Fast for simple models | Efficient for complex models | Varies, can be optimized for specific tasks |
| Flexibility | Limited (single input/output) | High (multiple inputs/outputs, shared layers) | Maximum (complete control over architecture) |
| Ease of Use | Easiest, ideal for beginners | Moderate, requires understanding of DAGs | Most complex, requires deep knowledge |
| Use Cases | Simple models, prototypes | Complex architectures (e.g., ResNet, Inception) | Custom architectures, novel research models |

Detailed Mathematical Examples

1. Sequential API

Transformation:

    \[ \begin{aligned} a_1 &= f_1(W_1 \cdot x + b_1) \\ y &= f_2(W_2 \cdot a_1 + b_2) \end{aligned} \]

Backpropagation:

    \[ \begin{aligned} \frac{\partial \mathcal{L}}{\partial W_2} &= \frac{\partial \mathcal{L}}{\partial y} \cdot \frac{\partial y}{\partial W_2} \\ \frac{\partial \mathcal{L}}{\partial W_1} &= \frac{\partial \mathcal{L}}{\partial y} \cdot \frac{\partial y}{\partial a_1} \cdot \frac{\partial a_1}{\partial W_1} \end{aligned} \]

2. Functional API

Transformation:

    \[ \begin{aligned} y_1 &= f_1(W_1 \cdot x + b_1) \\ y_2 &= f_2(W_2 \cdot y_1 + b_2) \\ y &= f_3(y_1, y_2) \end{aligned} \]

Backpropagation:

    \[ \begin{aligned} \frac{\partial \mathcal{L}}{\partial W_2} &= \frac{\partial \mathcal{L}}{\partial y} \cdot \frac{\partial y}{\partial y_2} \cdot \frac{\partial y_2}{\partial W_2} \\ \frac{\partial \mathcal{L}}{\partial W_1} &= \frac{\partial \mathcal{L}}{\partial y} \cdot \left( \frac{\partial y}{\partial y_1} \cdot \frac{\partial y_1}{\partial W_1} + \frac{\partial y}{\partial y_2} \cdot \frac{\partial y_2}{\partial y_1} \cdot \frac{\partial y_1}{\partial W_1} \right) \end{aligned} \]
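
The two-term gradient for W_1 can be checked numerically on a tiny scalar example. Everything concrete in the sketch below is an assumption made for the check: tanh for f_1 and f_2, a product for the merge f_3, a squared-error loss, and the specific numbers. The analytic gradient adds the direct path and the path through y_2, and it is compared against a finite-difference estimate.

```python
import math


def forward(w1, w2, x, b1=0.1, b2=-0.2):
    y1 = math.tanh(w1 * x + b1)
    y2 = math.tanh(w2 * y1 + b2)
    y = y1 * y2  # f_3 merges the two branches (an illustrative choice)
    return y1, y2, y


def loss(w1, w2, x, target):
    _, _, y = forward(w1, w2, x)
    return 0.5 * (y - target) ** 2


def grad_w1(w1, w2, x, target):
    y1, y2, y = forward(w1, w2, x)
    dL_dy = y - target
    dy1_dw1 = (1 - y1 ** 2) * x   # derivative of tanh(w1 * x + b1) w.r.t. w1
    dy2_dy1 = (1 - y2 ** 2) * w2  # derivative of tanh(w2 * y1 + b2) w.r.t. y1
    direct_path = y2 * dy1_dw1            # dy/dy1 * dy1/dw1, with dy/dy1 = y2
    via_y2_path = y1 * dy2_dy1 * dy1_dw1  # dy/dy2 * dy2/dy1 * dy1/dw1, with dy/dy2 = y1
    return dL_dy * (direct_path + via_y2_path)


w1, w2, x, target = 0.5, -1.5, 2.0, 1.0
eps = 1e-6

# Central finite difference as an independent estimate of dL/dw1.
numeric = (loss(w1 + eps, w2, x, target) - loss(w1 - eps, w2, x, target)) / (2 * eps)
analytic = grad_w1(w1, w2, x, target)

print(abs(numeric - analytic) < 1e-6)  # True
```

Dropping either path from `grad_w1` makes the two numbers disagree, which is why the branch in the graph shows up as a sum in the gradient.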

3. Model Subclassing API

Transformation:

class MyModel(tf.keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.dense1 = tf.keras.layers.Dense(64, activation='relu')
        self.dense2 = tf.keras.layers.Dense(10)

    def call(self, inputs):
        x = self.dense1(inputs)
        return self.dense2(x)
    

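Backpropagation (writing x_1 for the output of dense1, i.e. the variable x in call, y for the output of dense2, and W_1, W_2 for the two layers' weights):

    \[ \begin{aligned} \frac{\partial \mathcal{L}}{\partial W_1} &= \frac{\partial \mathcal{L}}{\partial y} \cdot \frac{\partial y}{\partial x_1} \cdot \frac{\partial x_1}{\partial W_1} \\ \frac{\partial \mathcal{L}}{\partial W_2} &= \frac{\partial \mathcal{L}}{\partial y} \cdot \frac{\partial y}{\partial W_2} \end{aligned} \]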

Gradient Descent and Optimization

In all APIs, the training process involves:

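Loss Calculation:

    \[ \mathcal{L} = \frac{1}{N} \sum_{j=1}^{N} \ell(\hat{y}_j, y_j) \]

where \( \ell \) is the loss on a single example, \( \hat{y}_j \) is the prediction, and \( y_j \) is the true label.

Gradient Calculation (Backpropagation):

    \[ \frac{\partial \mathcal{L}}{\partial W_i} \]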
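
In plain gradient descent, the gradient is then used to update each weight: W_i <- W_i - eta * dL/dW_i, where eta is the learning rate. The sketch below applies that update to a one-parameter toy problem. The data, the learning rate, and the number of steps are assumptions picked for the example, and it uses no Keras at all.

```python
# Toy data generated from y = 3 * x (hypothetical values for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]


def mean_squared_error(w):
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient(w):
    # Derivative of the mean squared error above with respect to w.
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


w = 0.0
learning_rate = 0.05
initial_loss = mean_squared_error(w)

for _ in range(100):
    w -= learning_rate * gradient(w)  # W <- W - eta * dL/dW

final_loss = mean_squared_error(w)
print(initial_loss, final_loss)
```

The same loop, with backpropagation supplying the gradients for every layer, is what the "sgd" optimizer runs on mini-batches during `model.fit`.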






Final Note: A Closer Look at the Differences Between the Keras Sequential and Functional APIs

Sequential API

Design Philosophy:
- Linear Structure: The Sequential API is designed to stack layers in a strictly linear order.
- Single Input and Output: It supports only one input tensor and one output tensor.
- Ease of Use: It's simple and intuitive, making it easy to build straightforward models quickly.

Internal Mechanism:
- Layer Management: Layers are added sequentially to an internal list. Each layer's input is the previous layer's output.
- Model Graph: The underlying computation graph is straightforward because it flows in one direction without any branches or merges.

Limitations:
- Fixed Architecture: You can't branch or merge layers, and you can't use multiple inputs or outputs.
- No Shared Layers: Each layer is used exactly once in the sequence.

Functional API

Design Philosophy:
- Graph Structure: The Functional API allows you to create models as directed acyclic graphs (DAGs).
- Multiple Inputs and Outputs: It supports models with multiple input and output tensors.
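- Flexibility and Versatility: It can handle complex architectures with non-linear connections, shared layers, and multiple branches. The graph itself is still static; behavior that changes from call to call belongs in a subclassed model.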

Internal Mechanism:
- Layer Management: Layers are connected via their inputs and outputs explicitly. You define the flow of data through the network by connecting tensors.
- Model Graph: The underlying computation graph can have branches, merges, and shared layers, which is more complex but also more powerful.

Capabilities:
- Branches and Merges: Layers can be connected in non-linear ways.
- Shared Layers: The same layer can be used in different parts of the model.
- Multiple Inputs/Outputs: Supports complex models that require processing different types of data simultaneously and producing multiple predictions.
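
To illustrate shared layers, here is a minimal Siamese-style sketch in which one Dense layer instance encodes two different inputs. The names (`shared_encoder`, `siamese_model`) and the sizes are assumptions chosen for the example.

```python
import tensorflow as tf

# One Dense layer instance, reused for both inputs, so both branches share weights.
shared_encoder = tf.keras.layers.Dense(16, activation="relu", name="shared_encoder")

left_input = tf.keras.Input(shape=(784,), name="left")
right_input = tf.keras.Input(shape=(784,), name="right")

left_code = shared_encoder(left_input)
right_code = shared_encoder(right_input)

merged = tf.keras.layers.Concatenate()([left_code, right_code])
similarity = tf.keras.layers.Dense(1, activation="sigmoid", name="similarity")(merged)

siamese_model = tf.keras.Model(inputs=[left_input, right_input], outputs=similarity)

# Kernel and bias of the shared layer, plus kernel and bias of the output layer.
print(len(siamese_model.trainable_weights))  # 4
```

Although the encoder appears twice in the graph, its weights are counted once, which is what "shared" means here.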

Sequential API Example


from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(64, activation='relu', input_shape=(784,)),
    Dense(32, activation='tanh'),
    Dense(10, activation='softmax')
])

Internals:

- Layer Order: Maintains an internal list of layers: [Dense(64), Dense(32), Dense(10)].
- Forward Pass: Iterates over the list and applies each layer in sequence.
- Graph: Simple linear graph: input -> Dense(64) -> Dense(32) -> Dense(10) -> output.
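
The "iterate over the list" description can be tried directly, since a Sequential model exposes its layers through `model.layers`. The sketch below rebuilds the same stack (with an explicit `Input` instead of `input_shape`, which is our choice) under the assumed name `stack_model`, applies the layers by hand, and compares the result with calling the model.

```python
import numpy as np
import tensorflow as tf

stack_model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="tanh"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# A random batch of two fake inputs, only used for the comparison.
x = np.random.default_rng(0).random((2, 784)).astype("float32")

# Apply the layers by hand, one after another.
manual = tf.constant(x)
for layer in stack_model.layers:
    manual = layer(manual)

matches = np.allclose(manual.numpy(), stack_model(x).numpy())
print(len(stack_model.layers), matches)  # 3 True
```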

Functional API Example


from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense, concatenate

# Define two input layers
input1 = Input(shape=(784,), name='input_1')
input2 = Input(shape=(784,), name='input_2')

# First branch
x1 = Dense(64, activation='relu', name='dense_1')(input1)

# Second branch
x2 = Dense(64, activation='relu', name='dense_2')(input2)

# Concatenate the outputs
concat = concatenate([x1, x2], name='concat')

# Final layers
x = Dense(32, activation='tanh', name='dense_3')(concat)
outputs = Dense(10, activation='softmax', name='output')(x)

# Define the model with two inputs and one output
model = Model(inputs=[input1, input2], outputs=outputs)

Internals:

- Layer Connections: Tracks each layer's input and output explicitly, creating a graph: input_1 -> Dense(64) -> concat <- Dense(64) <- input_2.
- Forward Pass: Computes the outputs based on the graph structure, handling branches and merges.
- Graph: More complex graph with branches and a merge:

      input_1 -> Dense(64) \
                            -> concat -> Dense(32) -> Dense(10) -> output
      input_2 -> Dense(64) /

Summary of These Differences

| Aspect | Sequential API | Functional API |
| --- | --- | --- |
| Design | Linear stack of layers | Directed acyclic graph (DAG) of layers |
| Input/Output | Single input and output | Multiple inputs and outputs |
| Layer Connections | Each layer connected to the previous layer in sequence | Layers connected via explicitly defined tensors, allowing for complex connections |
| Graph Structure | Simple, linear computation graph | Complex, branched, and merged computation graph |
| Flexibility | Limited to simple, sequential models | Supports complex architectures, including branches, merges, and shared layers |
| Use Cases | Quick prototyping, simple feed-forward networks | Advanced models requiring complex data flows, multi-modal inputs, and multi-task outputs |

Why Functional API Can Do What Sequential API Cannot

- Graph-Based Design: The Functional API's graph-based design allows for defining complex, non-linear relationships between layers, supporting multiple inputs and outputs, and enabling the creation of more sophisticated models.
- Explicit Connections: In the Functional API, you explicitly define how data flows between layers, which is not possible in the Sequential API that relies on a fixed sequence.
- Multiple Inputs and Outputs: The ability to handle multiple inputs and outputs directly comes from the Functional API's flexibility in defining and managing multiple data streams and their interactions.

Conclusion

The Sequential API and the Functional API are designed with different purposes in mind:
- Sequential API: For straightforward, linear models where simplicity and ease of use are prioritized.
- Functional API: For complex models requiring flexible architectures with multiple inputs, outputs, and sophisticated data flows.
Understanding these differences helps you choose the right API based on the complexity and requirements of your neural network model, rather than any underlying hardware differences.


See Day 15 for a simpler walkthrough of the differences between the Sequential and Functional APIs in Keras.