Deep learning is a branch of
machine learning that uses artificial neural networks to learn from large
amounts of data and perform complex tasks such as image recognition, natural
language processing, speech synthesis, and more. Deep learning models are composed
of multiple layers of neurons that process information and pass it to the next
layer. The more layers a model has, the deeper it is and the more capable it is
of learning abstract and high-level features from the data. There are many
types of deep learning models, each with its own advantages and disadvantages.
In this blog, we will introduce some of the most common and popular ones and
explain when and how to use them.
Supervised Models
Supervised models are trained
with labelled data, meaning that each input has a corresponding output or
target value. Supervised models can be used for tasks such as classification
and regression, where the goal is to predict a category or a numerical value
for a given input.
Classic Neural Networks (Multilayer Perceptrons)
Classic neural networks, also known as multilayer perceptrons (MLPs), are the
simplest type of deep learning model. They consist of an input layer, one or more hidden
layers, and an output layer. Each layer is fully connected to the next one,
meaning that every neuron in one layer receives input from every neuron in the
previous layer. Each neuron applies a nonlinear activation function to its
input and produces an output. Classic neural networks can be used
for tabular data formatted in rows and columns (CSV files), as well as for
classification and regression problems where a set of real values is given as
the input. They offer a high level of flexibility and can be applied to
different types of data.
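As a concrete illustration, here is a minimal numpy sketch of an MLP forward pass. The layer sizes, random weights, and the `relu`/`mlp_forward` helper names are illustrative choices, not from any particular library.

```python
import numpy as np

def relu(x):
    # Nonlinear activation: zero out negative values
    return np.maximum(0.0, x)

def mlp_forward(x, weights, biases):
    # Each layer applies a linear map followed by ReLU; the final
    # layer stays linear so the output can be any real value (regression).
    a = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = a @ W + b
        a = relu(z) if i < len(weights) - 1 else z
    return a

# Toy fully connected network: 3 inputs -> 4 hidden units -> 1 output
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 4)), rng.normal(size=(4, 1))]
biases = [np.zeros(4), np.zeros(1)]
y = mlp_forward(np.array([[0.5, -1.0, 2.0]]), weights, biases)
print(y.shape)  # (1, 1)
```

In a real setting the weights would be learned by gradient descent on labelled data; this sketch only shows how information flows layer to layer.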
Convolutional Neural
Networks (CNNs)
Convolutional neural networks
(CNNs) are a more advanced and powerful type of deep learning model, designed
for image data. They can also handle other types of data that have a
spatial structure, such as audio, video, or text. CNNs use convolutional layers
instead of fully connected layers to extract features from the input. A
convolutional layer applies a set of filters or kernels to the input, producing
feature maps that capture local patterns in the data. CNNs also use pooling
layers to reduce the size and complexity of the feature maps, as well as
activation functions and fully connected layers at the end. CNNs are very
effective for image classification problems, where the goal is to assign a
label to an image based on its content. They can also be used for other tasks
such as object detection, face recognition, image segmentation, and image
generation.
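To make the filter idea concrete, here is a hedged numpy sketch of a single convolutional filter sliding over a tiny image; the `conv2d` helper and the vertical-edge kernel are illustrative assumptions, not a library API.

```python
import numpy as np

def conv2d(image, kernel):
    # "Valid" 2D convolution (no padding, stride 1): slide the kernel
    # over the image and take an elementwise product-sum at each position.
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge filter responds where intensity changes left to right
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)
fmap = conv2d(image, kernel)
print(fmap)  # every window straddles the edge, so all entries are 3.0
```

Each entry of the feature map is one local pattern match; a real convolutional layer learns many such kernels and stacks their feature maps.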
Recurrent Neural Networks (RNNs)
Recurrent neural networks
(RNNs) are a type of deep learning model specialized for sequential
data, such as text, speech, or time series. RNNs have a recurrent structure
that allows them to process each element of a sequence in relation to the
previous ones. They have a hidden state that stores information from previous
inputs and updates it with each new input. RNNs can also have multiple layers
and different architectures, such as bidirectional RNNs or long short-term
memory (LSTM) networks. RNNs are widely used for natural language processing
tasks, such as text classification, sentiment analysis, machine translation,
text summarization, question answering, and more. They can also be used for
speech recognition, speech synthesis, music generation, and anomaly detection.
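The recurrent update described above can be sketched in a few lines of numpy; the weight names and sizes here are illustrative, and a practical RNN would use learned weights and an LSTM-style cell.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h):
    # Vanilla RNN: the hidden state h carries information forward,
    # updated from the current input and the previous state.
    h = np.zeros(W_hh.shape[0])
    for x in xs:
        h = np.tanh(x @ W_xh + h @ W_hh + b_h)
    return h  # final hidden state summarizes the whole sequence

rng = np.random.default_rng(1)
seq = rng.normal(size=(5, 3))         # 5 time steps, 3 features each
W_xh = rng.normal(size=(3, 4)) * 0.5  # input -> hidden
W_hh = rng.normal(size=(4, 4)) * 0.5  # hidden -> hidden (the recurrence)
b_h = np.zeros(4)
h = rnn_forward(seq, W_xh, W_hh, b_h)
print(h.shape)  # (4,)
```

Because the same weights are reused at every time step, the network can handle sequences of any length with a fixed number of parameters.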
Unsupervised Models
Unsupervised models are trained with unlabelled data, meaning that there is no output or target value for each input. Unsupervised models can be used for tasks such as clustering and association rule learning, where the goal is to discover patterns or relationships in the data without any prior knowledge.
Self-Organizing Maps
(SOMs)
Self-organizing maps (SOMs)
are a type of unsupervised model that uses a neural network to map
high-dimensional data onto a low-dimensional grid. Each node in the grid
represents a prototype or a cluster centre that is similar to some inputs in
the data. The nodes are arranged in such a way that neighbouring nodes are more
similar than distant ones. SOMs can be used for data visualization,
dimensionality reduction, clustering, and anomaly detection.
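A minimal SOM training loop might look like the following numpy sketch. The grid size, Gaussian neighbourhood, and fixed learning rate are simplifying assumptions; practical SOMs usually decay the learning rate and neighbourhood radius over time.

```python
import numpy as np

def som_train(data, grid_h, grid_w, epochs=20, lr=0.5, sigma=1.0, seed=0):
    # For each sample: find the best-matching unit (BMU), then pull the
    # BMU and its grid neighbours toward the sample, weighted by a
    # Gaussian of their distance on the grid.
    rng = np.random.default_rng(seed)
    nodes = rng.normal(size=(grid_h, grid_w, data.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(grid_h), np.arange(grid_w),
                                  indexing="ij"), axis=-1)
    for _ in range(epochs):
        for x in data:
            d = np.linalg.norm(nodes - x, axis=-1)
            bmu = np.unravel_index(np.argmin(d), d.shape)
            grid_dist = np.linalg.norm(coords - np.array(bmu), axis=-1)
            influence = np.exp(-grid_dist ** 2 / (2 * sigma ** 2))
            nodes += lr * influence[..., None] * (x - nodes)
    return nodes

# Two well-separated clusters in 2D, mapped onto a 3x3 grid
data = np.vstack([np.zeros((10, 2)), np.ones((10, 2)) * 5])
nodes = som_train(data, 3, 3)
```

Because neighbouring nodes are updated together, nearby grid cells end up representing similar regions of the input space, which is what makes the map useful for visualization.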
Boltzmann Machines
Boltzmann machines are a type
of unsupervised model that uses stochastic neural networks to model the
probability distribution of the data. A Boltzmann machine consists of a network
of binary units that can be either visible or hidden. The units are connected
by symmetric weights and have biases. The state of each unit is determined by a
stochastic function that depends on the energy of the network. The energy of
the network is defined as:

E = −( ∑_{i<j} w_ij s_i s_j + ∑_i θ_i s_i )

Where:
w_ij is the connection strength between unit j and unit i.
s_i is the state, s_i ∈ {0, 1}, of unit i.
θ_i is the bias of unit i in the global energy function. (−θ_i is the
activation threshold for the unit.)
Boltzmann machines can be
used for generative modelling, feature extraction, and dimensionality reduction.
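The energy function above is straightforward to compute directly; here is a small numpy sketch using a symmetric weight matrix with a zero diagonal, where the 1/2 factor counts each pair of units once. The example values are illustrative.

```python
import numpy as np

def boltzmann_energy(s, W, theta):
    # E = -( sum_{i<j} w_ij s_i s_j + sum_i theta_i s_i )
    # W is symmetric with zero diagonal, so 0.5 * s W s sums each pair once.
    pairwise = 0.5 * s @ W @ s
    return -(pairwise + theta @ s)

W = np.array([[0.0, 1.0, -2.0],
              [1.0, 0.0, 0.5],
              [-2.0, 0.5, 0.0]])
theta = np.array([0.1, -0.3, 0.2])
s = np.array([1.0, 1.0, 0.0])  # binary state: units 0 and 1 are on
print(boltzmann_energy(s, W, theta))  # ≈ -0.8, i.e. -(1.0 + (0.1 - 0.3))
```

Training a Boltzmann machine adjusts W and θ so that low-energy states correspond to configurations that resemble the data.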
Autoencoders
Autoencoders are a type of
unsupervised model that uses neural networks to learn a compressed
representation of the data. An autoencoder consists of two parts: an encoder
and a decoder. The encoder takes the input data and transforms it into a
lower-dimensional latent space. The decoder takes the latent representation and
reconstructs the original input data. The goal of an autoencoder is to minimize
the reconstruction error, which is the difference between the input and the
output. Autoencoders can be used for data compression, denoising, anomaly
detection, and generative modelling.
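The encoder–decoder structure and the reconstruction error can be sketched as follows; the bottleneck size and random (untrained) weights are illustrative assumptions, since a real autoencoder would learn the weights by minimizing the loss.

```python
import numpy as np

def autoencoder_forward(x, W_enc, W_dec):
    # Encoder compresses x into a lower-dimensional latent code z;
    # decoder maps z back out to a reconstruction of x.
    z = np.tanh(x @ W_enc)
    x_hat = z @ W_dec
    return z, x_hat

rng = np.random.default_rng(2)
x = rng.normal(size=(8, 6))            # 8 samples, 6 features
W_enc = rng.normal(size=(6, 2)) * 0.3  # 6 -> 2 bottleneck
W_dec = rng.normal(size=(2, 6)) * 0.3  # 2 -> 6 back out
z, x_hat = autoencoder_forward(x, W_enc, W_dec)
loss = np.mean((x - x_hat) ** 2)       # reconstruction error to minimize
print(z.shape, x_hat.shape)  # (8, 2) (8, 6)
```

Because the latent space is smaller than the input, the network is forced to keep only the most informative structure in the data, which is why the code z is useful for compression and anomaly detection.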
Conclusion
In this blog, we have introduced some of the most common and
popular deep learning models and explained when and how to use them. We have
also seen that deep learning models can be classified into supervised and
unsupervised models, depending on whether they use labelled or unlabelled data.
We hope that this blog has given you a brief overview of deep learning models
and inspired you to learn more about them.