
Sequential, Functional, and Model Subclassing APIs in Keras (Day 14)




Understanding Keras APIs and Their Use Cases

In our previous blog post on Day 13, we introduced Keras and provided a code example using the Sequential API. In this post, we will delve into the different types of Keras APIs: Sequential, Functional, and Model Subclassing. We will explain each API, its inventor, appropriate use cases, and whether they can be used interchangeably. We will also analyze the code examples provided to illustrate the differences between these approaches.

Sequential API

Inventor: François Chollet, the creator of Keras.

Overview: The Sequential API is the simplest and most straightforward way to build a neural network in Keras. It allows you to create a model layer-by-layer in a linear stack.

Use Cases:
– Simple models with a single input and a single output.
– Beginners and quick prototyping.
– Basic feedforward neural networks and simple CNNs.

Mathematical Foundation:
The Sequential API models are compositions of functions, where each layer L_i applies a transformation f_i:

    \[y = L_n \circ L_{n-1} \circ \ldots \circ L_1(x)\]


This means the output of one layer is the input to the next.
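The composition above can be sketched in plain Python, without Keras. The three toy "layers" below are hypothetical scalar functions chosen only for illustration; a real layer would operate on tensors and hold trainable weights.

```python
from functools import reduce

def compose(*layers):
    """Return a function that applies the given layers in order, first to last."""
    return lambda x: reduce(lambda value, layer: layer(value), layers, x)

# Toy stand-ins for L_1, L_2, L_3 (hypothetical, scalar-valued).
def layer_1(x):
    return 2 * x        # L_1: scale

def layer_2(x):
    return x + 3        # L_2: shift

def layer_3(x):
    return max(0, x)    # L_3: ReLU-like clamp

model = compose(layer_1, layer_2, layer_3)   # y = L_3(L_2(L_1(x)))

print(model(4))    # L_3(L_2(8)) = L_3(11) = 11
print(model(-5))   # L_3(L_2(-10)) = L_3(-7) = 0
```

Each function only ever sees the output of the one before it, which is exactly the constraint a linear stack imposes.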

Advantages:
– Easy to use and understand.
– Ideal for simple, linear architectures.

Limitations:
– Limited flexibility: cannot handle models with multiple inputs/outputs or complex topologies like shared layers and residual connections.

Performance:
– Quick to set up for simple models. The limitation is expressiveness rather than speed: a Sequential model cannot represent non-linear connections or multiple inputs/outputs at all.

For more information, visit the [official Keras documentation](https://keras.io/api/models/sequential/).

Functional API

Inventor: François Chollet.

Overview: The Functional API is a more flexible way to build models. It allows for the creation of complex models with multiple inputs and outputs, shared layers, and non-linear connections.

Use Cases:

  • Models with multiple inputs and outputs.
  • Complex architectures like branching and merging paths.
  • Shared layers, such as in Siamese networks.

Mathematical Foundation:

The Functional API models are represented as Directed Acyclic Graphs (DAGs) of layers:

    \[ y = f(x_1, x_2, \ldots, x_n) \]

This flexibility allows for constructing more complex architectures.
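As a small illustration of a graph that is not a plain chain, the sketch below builds a residual (skip) connection with the Functional API. The layer sizes and the name `residual_model` are assumptions made for this example, not part of the original post.

```python
import tensorflow as tf

# Hypothetical sizes, chosen only for illustration.
inputs = tf.keras.Input(shape=(32,))
x = tf.keras.layers.Dense(32, activation="relu")(inputs)
x = tf.keras.layers.Dense(32)(x)

# Skip connection: the block's input is added back to its output,
# so there are two paths from `inputs` to `residual` in the graph.
residual = tf.keras.layers.Add()([inputs, x])
outputs = tf.keras.layers.Dense(10, activation="softmax")(residual)

residual_model = tf.keras.Model(inputs, outputs)
print(residual_model.output_shape)
```

The `Add` layer receives two tensors, which is something a linear stack of layers has no way to express.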

Advantages:

  • Supports arbitrary model architectures.
  • Suitable for advanced architectures such as residual networks and multi-modal inputs.

Limitations:

  • Slightly more complex to understand and use compared to the Sequential API.

Performance:

  • Efficient for complex architectures due to its ability to handle multiple inputs/outputs and non-linear connections.

For more details, visit the Keras documentation.

Model Subclassing API

Overview: The Model Subclassing API provides the highest level of flexibility and control. It involves creating a custom model by subclassing the tf.keras.Model class and defining the layers and forward pass manually.

Use Cases:

  • Research and development of novel architectures.
  • Models requiring custom behaviors and complex operations.

Mathematical Foundation:

In this API, you explicitly define the forward pass in the call method, giving full control over data flow through the layers:

    \[ y = \text{model}(x) \]

This method allows for implementing complex operations and unique behaviors.
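A minimal sketch of what "defining the forward pass yourself" can look like is shown below. The class name `ResidualMLP`, its constructor arguments, and the layer sizes are hypothetical; the point is only that `call` is ordinary Python, so loops and a `training` flag can be used directly.

```python
import tensorflow as tf

class ResidualMLP(tf.keras.Model):
    """Hypothetical model: a stack of dense blocks with skip connections."""

    def __init__(self, units=32, num_blocks=3, num_classes=10):
        super().__init__()
        self.project = tf.keras.layers.Dense(units, activation="relu")
        self.blocks = [tf.keras.layers.Dense(units, activation="relu")
                       for _ in range(num_blocks)]
        self.dropout = tf.keras.layers.Dropout(0.5)
        self.classifier = tf.keras.layers.Dense(num_classes, activation="softmax")

    def call(self, inputs, training=False):
        x = self.project(inputs)
        # Ordinary Python loop: each block's output is added to its input.
        for block in self.blocks:
            x = x + block(x)
        # Dropout is only active when training=True.
        x = self.dropout(x, training=training)
        return self.classifier(x)

model = ResidualMLP()
probs = model(tf.random.normal((4, 20)))
print(probs.shape)
```

The same architecture could also be written with the Functional API; subclassing becomes more useful as the logic inside `call` grows beyond what a static graph of layers describes comfortably.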

Advantages:

  • Maximum flexibility and control.
  • Ideal for custom behaviors and complex models.

Limitations:

  • Requires a deeper understanding of Keras and TensorFlow.
  • More complex to implement compared to the Sequential and Functional APIs.

Performance:

  • Optimal for custom and complex models due to the direct control over the model architecture.

For further information, refer to the Keras guide on making new layers and models via subclassing.

Comparing the APIs

  • Sequential API: Best for simple, linear models.
  • Functional API: Suitable for complex models with multiple inputs/outputs and non-linear connections.
  • Model Subclassing API: Provides full control and customization, ideal for research and highly specialized models.

Can All APIs Be Used for the Same Problem?

Flexibility and Choice:

Any model that the Sequential API can express can also be built with the Functional or Model Subclassing API, but not the other way around, so the choice depends on the complexity and requirements of the model.

  • Sequential API: Limited to simple, linear models. Not suitable for complex architectures.
  • Functional API: Offers more flexibility and is suitable for complex models. Preferred for most use cases where complexity is involved.
  • Model Subclassing API: Provides full control and is best for novel or highly customized models.

Example:

For a simple classification task, the Sequential API is sufficient. For a model with multiple inputs and outputs, the Functional API is better suited. If the model requires custom training loops or complex behaviors, the Model Subclassing API would be the best choice.

Other APIs in Keras

Keras also includes specialized APIs for preprocessing, tuning, and serialization, among other tasks. These APIs support a wide range of workflows, making Keras a versatile library for deep learning.

By understanding the strengths and appropriate use cases of each Keras API, you can select the most suitable approach for your machine learning projects and build models effectively and efficiently.

For further reading and detailed information, you can explore the Keras Models API documentation.

Now It Is Time to Code:




Detailed Explanations


Why Use the Model Subclassing API?



  • Flexibility: Provides the ability to define any architecture and custom behavior, which is especially useful for complex or non-standard models.

  • Customization: Allows for implementing custom layers and operations within the forward pass, offering unparalleled control over the model’s inner workings.

  • Research and Innovation: Ideal for experimenting with novel architectures and approaches in deep learning research.


How It Affects the Results:



  • Precision and Control: By having full control over the model’s operations, you can optimize and customize the architecture to potentially achieve better performance and efficiency for specific tasks.

  • Complexity Management: This approach can manage more complex dependencies and layer interactions, which might not be possible with simpler APIs.


Mathematical Insight:


In the Model Subclassing API, you explicitly define the transformation of the input tensor x through various layers, leading to the output tensor y. This direct control allows for implementing sophisticated transformations and handling multiple inputs and outputs more effectively.
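To make the point about multiple inputs and outputs concrete, here is a hedged sketch of a subclassed model that takes two tensors and returns two predictions. The class name `TwoHeadModel`, the branch names, and all shapes are assumptions for illustration only.

```python
import tensorflow as tf

class TwoHeadModel(tf.keras.Model):
    """Hypothetical model with two inputs and two outputs."""

    def __init__(self):
        super().__init__()
        self.image_branch = tf.keras.layers.Dense(16, activation="relu")
        self.meta_branch = tf.keras.layers.Dense(8, activation="relu")
        self.class_head = tf.keras.layers.Dense(10, activation="softmax")
        self.score_head = tf.keras.layers.Dense(1)

    def call(self, inputs):
        image_features, metadata = inputs        # unpack the two inputs
        a = self.image_branch(image_features)
        b = self.meta_branch(metadata)
        merged = tf.concat([a, b], axis=-1)      # merge the two branches
        # Two outputs computed from the same merged representation.
        return self.class_head(merged), self.score_head(merged)

model = TwoHeadModel()
class_probs, score = model((tf.zeros((2, 64)), tf.zeros((2, 5))))
print(class_probs.shape, score.shape)
```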


By understanding and leveraging these three different APIs in Keras, you can choose the best approach based on your specific needs, complexity, and the nature of your machine learning or deep learning tasks.


Differences in Model Definitions and Explanations


The exact lines in each of the three code examples that differ are the lines where the model is defined. These lines are different because each API provides a distinct way to define the architecture of the neural network.


Sequential API


The model is defined using a sequential approach, where each layer is added one after another in a linear stack.


    import tensorflow as tf

    # Sequential API model definition
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=[28, 28]),
        tf.keras.layers.Dense(300, activation="relu"),
        tf.keras.layers.Dense(100, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax")
    ])

Functional API


The model is defined using a functional approach, which allows for more complex architectures with multiple inputs and outputs.


    import tensorflow as tf

    # Functional API model definition
    inputs = tf.keras.Input(shape=(28, 28))
    x = tf.keras.layers.Flatten()(inputs)
    x = tf.keras.layers.Dense(300, activation="relu")(x)
    x = tf.keras.layers.Dense(100, activation="relu")(x)
    outputs = tf.keras.layers.Dense(10, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)

Model Subclassing API


The model is defined by subclassing the tf.keras.Model class, providing the most flexibility and control over the model’s architecture and forward pass.


    import tensorflow as tf

    # Model Subclassing API model definition
    class MyModel(tf.keras.Model):
        def __init__(self):
            super().__init__()
            # Layers are created once, in the constructor
            self.flatten = tf.keras.layers.Flatten()
            self.dense1 = tf.keras.layers.Dense(300, activation="relu")
            self.dense2 = tf.keras.layers.Dense(100, activation="relu")
            self.dense3 = tf.keras.layers.Dense(10, activation="softmax")

        def call(self, inputs):
            # The forward pass is written out explicitly
            x = self.flatten(inputs)
            x = self.dense1(x)
            x = self.dense2(x)
            return self.dense3(x)

    model = MyModel()

These differences exist because:



  • Sequential API: Best for simple models where the layers are stacked in a linear order.

  • Functional API: Suitable for complex models with non-linear topology, multiple inputs/outputs, and shared layers.

  • Model Subclassing API: Provides full control and customization over the model, ideal for research and novel architectures.







Comparison of Keras APIs: Are the Results the Same for the Sequential, Functional, and Model Subclassing Code Above?


Results Comparison Across All 3 APIs


The results for models built using the Sequential API, Functional API, and Model Subclassing API will generally be the same, assuming the architectures, initialization, training process, and hyperparameters are identical. The underlying mathematical operations and computations are the same across these APIs; they merely offer different ways to define and build the model.


Key Factors for Same Results



  • Identical Architecture: All three APIs use the same layers in the same order: Flatten -> Dense (300 units, ReLU) -> Dense (100 units, ReLU) -> Dense (10 units, Softmax).

  • Same Initialization and Training Process: The random seed is set for reproducibility. The models are compiled with the same loss function (sparse_categorical_crossentropy), optimizer (sgd), and metric (accuracy). The training process (epochs, batch size, validation split) is identical.


Given these factors, the results in terms of accuracy and loss should be the same across the three APIs because the underlying computations performed by the models are identical.
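One way to check that the three definitions compute the same function is to take initialization out of the picture: give all three models the same weights and compare their predictions. The sketch below does this; the helper names `build_sequential` and `build_functional` and the random input batch are assumptions for illustration, while the architecture matches the one used in this post.

```python
import numpy as np
import tensorflow as tf

def build_sequential():
    return tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(300, activation="relu"),
        tf.keras.layers.Dense(100, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

def build_functional():
    inputs = tf.keras.Input(shape=(28, 28))
    x = tf.keras.layers.Flatten()(inputs)
    x = tf.keras.layers.Dense(300, activation="relu")(x)
    x = tf.keras.layers.Dense(100, activation="relu")(x)
    outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)

class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.flatten = tf.keras.layers.Flatten()
        self.dense1 = tf.keras.layers.Dense(300, activation="relu")
        self.dense2 = tf.keras.layers.Dense(100, activation="relu")
        self.dense3 = tf.keras.layers.Dense(10, activation="softmax")

    def call(self, inputs):
        x = self.flatten(inputs)
        x = self.dense1(x)
        x = self.dense2(x)
        return self.dense3(x)

# Random stand-in for a batch of 28x28 images (illustration only).
batch = np.random.default_rng(0).random((8, 28, 28)).astype("float32")

sequential = build_sequential()
functional = build_functional()
subclassed = MyModel()
subclassed(batch)  # a subclassed model creates its weights on the first call

# Copy the Sequential model's weights into the other two models.
functional.set_weights(sequential.get_weights())
subclassed.set_weights(sequential.get_weights())

reference = sequential(batch).numpy()
same_functional = np.allclose(reference, functional(batch).numpy(), atol=1e-6)
same_subclassed = np.allclose(reference, subclassed(batch).numpy(), atol=1e-6)
print(same_functional, same_subclassed)
```

With identical weights the three models should agree on every input. Any difference seen after training would then come from initialization, data shuffling, or other sources of randomness rather than from the API used.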


Why the Sequential API Is Still the Better Choice for This Task


Simplicity and Readability


The Sequential API allows for the simplest and most readable code when building straightforward, linear models. This reduces the potential for errors and makes the code easier to understand and maintain.


Ease of Use


For tasks that do not require complex architectures, the Sequential API is the quickest and easiest way to build a model. There’s no need to define the input and output layers explicitly as in the Functional API or to implement a custom class as in the Model Subclassing API.


Focused Scope


The Sequential API is designed specifically for models where layers are stacked sequentially. This focused scope makes it a better fit for simple models, avoiding the overhead and additional code required by the other APIs.


Results Comparison for the Provided Example


The models built using the Sequential API, Functional API, and Model Subclassing API perform the same computations. Therefore, the training and evaluation results (accuracy and loss) for all three models should be the same.


Expected Results


Assuming the same architecture, initialization, and training procedure, the models should produce identical results. For example:



  • Test accuracy: 0.8745 (expected for all three APIs)


Conclusion


Result Similarity: The results are the same across all three APIs because the models perform identical computations.


Why Sequential is Better: For this particular task (a straightforward feedforward neural network), the Sequential API is better because it provides a simpler, more readable, and more efficient way to define and train the model without unnecessary complexity. It aligns perfectly with the problem’s requirements, offering the quickest and most straightforward path to implementation.


 


Keras APIs Comparison





























































| Aspect | Sequential API | Functional API | Model Subclassing API |
|---|---|---|---|
| Mathematical Foundation | \(y = L_n \circ L_{n-1} \circ \ldots \circ L_1(x)\) | \(y = f(x_1, x_2, \ldots, x_n)\) | \(y = \text{model}(x)\) |
| Transformation Example | \(a_{i+1} = f_i(W_i \cdot a_i + b_i)\) | \(y_1 = f_1(W_1 \cdot x + b_1)\), \(y_2 = f_2(W_2 \cdot y_1 + b_2)\) | Custom defined in call method: \(a = f_1(W_1 \cdot x + b_1)\), \(b = f_2(W_2 \cdot a + b_2)\), \(y = f_3(a, b)\) |
| Algorithm | Linear flow from input to output | DAG with multiple branches and merges | Custom forward pass |
| Implementation | Layers added sequentially | Layers defined as a DAG | Layers and forward pass defined explicitly |
| Performance | Fast for simple models | Efficient for complex models | Varies, can be optimized for specific tasks |
| Flexibility | Limited (single input/output) | High (multiple inputs/outputs, shared layers) | Maximum (complete control over architecture) |
| Ease of Use | Easiest, ideal for beginners | Moderate, requires understanding of DAGs | Most complex, requires deep knowledge |
| Use Cases | Simple models, prototypes | Complex architectures (e.g., ResNet, Inception) | Custom architectures, novel research models |

Detailed Mathematical Examples


1. Sequential API


Transformation:


    \[ \begin{aligned}&a_1 = f_1(W_1 \cdot x + b_1) \\&y = f_2(W_2 \cdot a_1 + b_2)\end{aligned} \]


Backpropagation:


    \[ \begin{aligned}&\frac{\partial \mathcal{L}}{\partial W_2} = \frac{\partial \mathcal{L}}{\partial y} \cdot \frac{\partial y}{\partial W_2} \\&\frac{\partial \mathcal{L}}{\partial W_1} = \frac{\partial \mathcal{L}}{\partial y} \cdot \frac{\partial y}{\partial a_1} \cdot \frac{\partial a_1}{\partial W_1}\end{aligned} \]


2. Functional API


Transformation:


    \[ \begin{aligned}&y_1 = f_1(W_1 \cdot x + b_1) \\&y_2 = f_2(W_2 \cdot y_1 + b_2) \\&y = f_3(y_1, y_2)\end{aligned} \]


Backpropagation:


    \[ \begin{aligned}&\frac{\partial \mathcal{L}}{\partial W_2} = \frac{\partial \mathcal{L}}{\partial y} \cdot \frac{\partial y}{\partial y_2} \cdot \frac{\partial y_2}{\partial W_2} \\&\frac{\partial \mathcal{L}}{\partial W_1} = \frac{\partial \mathcal{L}}{\partial y} \cdot \left( \frac{\partial y}{\partial y_1} \cdot \frac{\partial y_1}{\partial W_1} + \frac{\partial y}{\partial y_2} \cdot \frac{\partial y_2}{\partial y_1} \cdot \frac{\partial y_1}{\partial W_1} \right)\end{aligned} \]
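The two-term gradient for W_1 reflects the two paths from y_1 to y: one direct, one through y_2. A quick numerical sanity check is sketched below in plain Python. It assumes scalar weights, tanh for f_1 and f_2, a simple sum for f_3, and a squared-error loss; all of these are choices made for the example, not something the post prescribes.

```python
import math

def forward(w1, w2, x, b1=0.1, b2=-0.2):
    y1 = math.tanh(w1 * x + b1)     # y_1 = f_1(W_1 x + b_1)
    y2 = math.tanh(w2 * y1 + b2)    # y_2 = f_2(W_2 y_1 + b_2)
    y = y1 + y2                     # y = f_3(y_1, y_2), here a simple sum
    return y1, y2, y

def loss(y, target=1.0):
    return 0.5 * (y - target) ** 2  # assumed squared-error loss

w1, w2, x, target = 0.7, -1.3, 0.5, 1.0
y1, y2, y = forward(w1, w2, x)

# Analytic gradient using the two-path chain rule.
dL_dy = y - target
dy_dy1 = 1.0                        # direct path y_1 -> y
dy_dy2 = 1.0                        # path y_2 -> y
dy2_dy1 = (1 - y2 ** 2) * w2        # derivative of tanh(w2 * y1 + b2) w.r.t. y1
dy1_dw1 = (1 - y1 ** 2) * x         # derivative of tanh(w1 * x + b1) w.r.t. w1
analytic = dL_dy * (dy_dy1 * dy1_dw1 + dy_dy2 * dy2_dy1 * dy1_dw1)

# Numerical gradient via central finite differences.
eps = 1e-6
numeric = (loss(forward(w1 + eps, w2, x)[2], target)
           - loss(forward(w1 - eps, w2, x)[2], target)) / (2 * eps)

print(analytic, numeric)
```

The two printed values should agree to several decimal places. Dropping either term inside the parentheses makes them diverge, which is why a branched graph needs the summed form.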


3. Model Subclassing API


Transformation:


    import tensorflow as tf

    class MyModel(tf.keras.Model):
        def __init__(self):
            super().__init__()
            self.dense1 = tf.keras.layers.Dense(64, activation='relu')
            self.dense2 = tf.keras.layers.Dense(10)

        def call(self, inputs):
            # x is the hidden activation produced by the first layer
            x = self.dense1(inputs)
            return self.dense2(x)

Backpropagation:


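    \[ \begin{aligned}&\frac{\partial \mathcal{L}}{\partial W_1} = \frac{\partial \mathcal{L}}{\partial y} \cdot \frac{\partial y}{\partial x_1} \cdot \frac{\partial x_1}{\partial W_1} \\&\frac{\partial \mathcal{L}}{\partial W_2} = \frac{\partial \mathcal{L}}{\partial y} \cdot \frac{\partial y}{\partial W_2}\end{aligned} \]

Here x_1 denotes the output of dense1 (the variable x inside the call method), W_1 the weights of dense1, and W_2 the weights of dense2.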


Gradient Descent and Optimization


In all APIs, the training process involves:


Loss Calculation:


    \[ \mathcal{L} = \frac{1}{N} \sum_{i=1}^{N} \ell(\hat{y}_i, y_i) \]

where \ell is the per-example loss, \hat{y}_i the prediction, and y_i the label for example i.


Gradient Calculation (Backpropagation):


    \[ \frac{\partial \mathcal{L}}{\partial W_i} \]
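These steps can be written out by hand with `tf.GradientTape`, and the same loop works for a model built with any of the three APIs. The sketch below is a minimal example under stated assumptions: a tiny Sequential model, random synthetic data, and plain SGD with a learning rate of 0.1, none of which come from the original post.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

# Tiny synthetic batch (random data, for illustration only).
x_batch = tf.random.normal((16, 4))
y_batch = tf.random.uniform((16,), maxval=3, dtype=tf.int32)

def train_step(x, y):
    with tf.GradientTape() as tape:
        predictions = model(x, training=True)
        loss = loss_fn(y, predictions)        # mean loss over the batch
    # Backpropagation: one gradient per trainable weight, dL/dW_i.
    grads = tape.gradient(loss, model.trainable_variables)
    # SGD update: W_i <- W_i - learning_rate * dL/dW_i.
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return float(loss)

losses = [train_step(x_batch, y_batch) for _ in range(20)]
print(losses[0], losses[-1])
```

Repeating the step on the same batch should drive the loss down, which is a quick way to confirm that the gradients and the update are wired together correctly.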


 













Final Note: Let's Look at the Differences Between the Keras Sequential and Functional APIs Again for Better Understanding



Sequential API


Design Philosophy:

  • Linear Structure: The Sequential API is designed to stack layers in a strictly linear order.
  • Single Input and Output: It supports only one input tensor and one output tensor.
  • Ease of Use: It’s simple and intuitive, making it easy to build straightforward models quickly.


Internal Mechanism:

  • Layer Management: Layers are added sequentially to an internal list. Each layer’s input is the previous layer’s output.
  • Model Graph: The underlying computation graph is straightforward because it flows in one direction without any branches or merges.


Limitations:

  • Fixed Architecture: You can’t branch or merge layers, and you can’t use multiple inputs or outputs.
  • No Shared Layers: Each layer is used exactly once in the sequence.


Functional API


Design Philosophy:

  • Graph Structure: The Functional API allows you to create models as directed acyclic graphs (DAGs).
  • Multiple Inputs and Outputs: It supports models with multiple input and output tensors.
  • Flexibility and Versatility: It can handle complex architectures with non-linear connections and shared layers. The graph itself is static, so data-dependent Python control flow belongs in the Model Subclassing API instead.


Internal Mechanism:

  • Layer Management: Layers are connected via their inputs and outputs explicitly. You define the flow of data through the network by connecting tensors.
  • Model Graph: The underlying computation graph can have branches, merges, and shared layers, which is more complex but also more powerful.


Capabilities:

  • Branches and Merges: Layers can be connected in non-linear ways.
  • Shared Layers: The same layer can be used in different parts of the model.
  • Multiple Inputs/Outputs: Supports complex models that require processing different types of data simultaneously and producing multiple predictions.
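Layer sharing is easiest to see in a Siamese-style model, where one layer instance is called on two different inputs. The sketch below is a minimal illustration; the input size, layer names, and the `siamese` variable are assumptions made for this example.

```python
import tensorflow as tf

# Two inputs of the same shape (hypothetical 128-dimensional feature vectors).
left_input = tf.keras.Input(shape=(128,), name="left")
right_input = tf.keras.Input(shape=(128,), name="right")

# One Dense layer instance, called twice: both branches use the same weights.
shared_encoder = tf.keras.layers.Dense(32, activation="relu", name="shared_encoder")
left_encoded = shared_encoder(left_input)
right_encoded = shared_encoder(right_input)

merged = tf.keras.layers.Concatenate()([left_encoded, right_encoded])
similarity = tf.keras.layers.Dense(1, activation="sigmoid", name="similarity")(merged)

siamese = tf.keras.Model(inputs=[left_input, right_input], outputs=similarity)

# The encoder contributes one kernel and one bias even though it is used twice,
# so the model has four trainable weight tensors in total.
print(len(siamese.trainable_weights))
```

Because the encoder's weights are shared, a gradient update driven by either branch changes the representation used by both.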


Sequential API Example


    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense

    model = Sequential([
        Dense(64, activation='relu', input_shape=(784,)),
        Dense(32, activation='tanh'),
        Dense(10, activation='softmax')
    ])

Internals:


  • Layer Order: Maintains an internal list of layers: [Dense(64), Dense(32), Dense(10)].
  • Forward Pass: Iterates over the list and applies each layer in sequence.
  • Graph: Simple linear graph: input -> Dense(64) -> Dense(32) -> Dense(10) -> output.
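The "list of layers applied in order" view can be checked directly: iterating over `model.layers` by hand should give the same output as calling the model. The sketch below assumes a random input batch and declares the input shape with `tf.keras.Input` rather than `input_shape`; both are choices made for this example.

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="tanh"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# The layers are stored in the order in which they were added.
layer_types = [type(layer).__name__ for layer in model.layers]
print(layer_types)

# Manual forward pass: feed each layer the previous layer's output.
x = np.random.default_rng(0).random((2, 784)).astype("float32")
manual = tf.constant(x)
for layer in model.layers:
    manual = layer(manual)

matches = np.allclose(manual.numpy(), model(x).numpy())
print(matches)
```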


Functional API Example


    from tensorflow.keras.models import Model
    from tensorflow.keras.layers import Input, Dense, concatenate

    # Define two input layers
    input1 = Input(shape=(784,), name='input_1')
    input2 = Input(shape=(784,), name='input_2')

    # First branch
    x1 = Dense(64, activation='relu', name='dense_1')(input1)

    # Second branch
    x2 = Dense(64, activation='relu', name='dense_2')(input2)

    # Concatenate the outputs
    concat = concatenate([x1, x2], name='concat')

    # Final layers
    x = Dense(32, activation='tanh', name='dense_3')(concat)
    outputs = Dense(10, activation='softmax', name='output')(x)

    # Define the model with two inputs and one output
    model = Model(inputs=[input1, input2], outputs=outputs)

Internals:


  • Layer Connections: Tracks each layer’s input and output explicitly, creating a graph: input_1 -> Dense(64) -> concat <- Dense(64) <- input_2.
  • Forward Pass: Computes the outputs based on the graph structure, handling branches and merges.
  • Graph: More complex graph with branches and a merge:

    input_1 -> Dense(64) \
                          -> concat -> Dense(32) -> Dense(10) -> output
    input_2 -> Dense(64) /


Let's Compare the Sequential and Functional APIs One More Time





































| Aspect | Sequential API | Functional API |
|---|---|---|
| Design | Linear stack of layers | Directed acyclic graph (DAG) of layers |
| Input/Output | Single input and output | Multiple inputs and outputs |
| Layer Connections | Each layer connected to the previous layer in sequence | Layers connected via explicitly defined tensors, allowing for complex connections |
| Graph Structure | Simple, linear computation graph | Complex, branched, and merged computation graph |
| Flexibility | Limited to simple, sequential models | Supports complex architectures, including branches, merges, and shared layers |
| Use Cases | Quick prototyping, simple feed-forward networks | Advanced models requiring complex data flows, multi-modal inputs, and multi-task outputs |

Why Functional API Can Do What Sequential API Cannot


  • Graph-Based Design: The Functional API’s graph-based design allows for defining complex, non-linear relationships between layers, supporting multiple inputs and outputs, and enabling the creation of more sophisticated models.
  • Explicit Connections: In the Functional API, you explicitly define how data flows between layers, which is not possible in the Sequential API that relies on a fixed sequence.
  • Multiple Inputs and Outputs: The ability to handle multiple inputs and outputs directly comes from the Functional API’s flexibility in defining and managing multiple data streams and their interactions.


Conclusion


The three primary Keras APIs—Sequential, Functional, and Model Subclassing—are each designed with different modeling scenarios in mind:

  • Sequential API: Ideal for straightforward, linear stacks of layers where simplicity and ease of use are prioritized. It’s perfect for models that follow a single input-to-output pathway without branching or complex layer sharing.
  • Functional API: Suited for building complex architectures involving multiple inputs, multiple outputs, shared layers, or non-linear data flows. It provides a flexible way to construct models that require more than a simple stack of layers, allowing for arbitrary connectivity patterns among layers.
  • Model Subclassing API: Offers the highest level of customization and flexibility. By subclassing the Model class, you gain full control over the model’s architecture and forward pass, which is invaluable for implementing very complex or non-standard behaviors that are difficult or impossible to define with the Sequential or Functional APIs alone.

Understanding these differences:
Choosing the right API depends on the complexity and specific requirements of your neural network model:

  • Use the Sequential API for simple, linear stacks where clarity and ease-of-use are your primary goals.
  • Opt for the Functional API when your model involves multiple inputs/outputs, layer sharing, or non-linear architectures that require more flexible connections.
  • Select the Model Subclassing API when you need ultimate control over the training and inference process or wish to implement custom behaviors that go beyond the scope of the built-in layers and configurations.

This choice should be guided by the architectural needs of your project rather than any underlying hardware differences, ensuring that you pick the most appropriate tool for your model’s complexity and desired behavior.



 
