Intro to Self Organizing Map and SOM Python Implementation
A Self Organizing Map (SOM), also known as a Kohonen Map, is an Artificial Neural Network model that resembles characteristics of the mammalian cerebral cortex. SOM is an unsupervised learning algorithm that employs the vector quantization method. In this tutorial, we are going to learn the core concepts of SOM and how to implement a SOM in Python 3. To follow along, you should be comfortable with vector spaces and have a basic understanding of Machine Learning. If you need the basics of ML, you can refer to the Zero to Hero in ML book.
Vector Quantization
Now you may ask what vector quantization is. Simply put, vector quantization partitions the vector space by distributing prototype vectors (the neurons, in SOM terms) across it. Once the space has been segregated into regions, each region is represented by a single vector called the centroid. A simplified quantization algorithm is:
- Pick a random data point (row in the dataset)
- Find the nearest neuron
- Move the nearest neuron towards the data point by a small fraction of the distance
- Repeat
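The four steps above can be sketched in a few lines of NumPy. This is only a minimal illustration: the function name `vector_quantize`, the parameter names, the step fraction of 0.05, and the choice to start the neurons at randomly chosen data points are all assumptions made for the example.

```python
import numpy as np

def vector_quantize(data, n_prototypes=2, n_steps=2000, step_fraction=0.05, seed=0):
    """Online vector quantization: nudge the nearest prototype toward a random sample."""
    rng = np.random.default_rng(seed)
    # Assumption: start the prototypes ("neurons") at randomly chosen data points.
    start = rng.choice(len(data), size=n_prototypes, replace=False)
    prototypes = data[start].astype(float)
    for _ in range(n_steps):
        x = data[rng.integers(len(data))]                      # pick a random data point
        distances = np.linalg.norm(prototypes - x, axis=1)     # distance to every neuron
        nearest = np.argmin(distances)                         # find the nearest neuron
        prototypes[nearest] += step_fraction * (x - prototypes[nearest])  # move it a little
    return prototypes

# Toy dataset: two well-separated blobs in 2D.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.1, (100, 2)), rng.normal(5.0, 0.1, (100, 2))])
prototypes = vector_quantize(data, n_prototypes=2)
```

Because every update moves a prototype only part of the way toward a data point, the prototypes always stay inside the region covered by the data.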
Self Organizing Map vs K Means Clustering
Both SOM and K Means can be used for clustering data. In K Means, you have to define the K value (the number of clusters you want) up front. In SOM, you don't need to pre-define the number of clusters, because SOM identifies the internal organization of the data points. The Euclidean distances between neighboring neurons, calculated after training the SOM, tell you how many clusters there are: vector quantization has segregated the vector space into regions, and large distances between neighboring neurons mark the boundaries between those regions. So, alongside the elbow method, SOM is another way to choose the K value for the K Means clustering algorithm.
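The distances between neurons are often collected into what is called a U-matrix (unified distance matrix). Here is a rough sketch, assuming a rectangular grid with at least two neurons and trained weights stored in an array of shape (rows, cols, features). The function name `u_matrix` and the toy weights are made up for the illustration.

```python
import numpy as np

def u_matrix(weights):
    """Mean Euclidean distance from each neuron to its up/down/left/right grid neighbours."""
    rows, cols, _ = weights.shape
    umat = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            dists = []
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:   # skip positions outside the grid
                    dists.append(np.linalg.norm(weights[i, j] - weights[ni, nj]))
            umat[i, j] = np.mean(dists)
    return umat

# Toy "trained" map (not real training output): the left half of the grid
# sits at 0 and the right half at 10 in every feature.
weights = np.zeros((4, 6, 2))
weights[:, 3:, :] = 10.0
umat = u_matrix(weights)
```

In this toy map, the values in `umat` are zero everywhere except along the two middle columns, where the two groups of neurons meet. Ridges of high values like that are what suggest a boundary between clusters.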
SOM Artificial Neural Network Structure
When we initialize the SOM structure, we first have to decide whether it is a rectangular grid or a hexagonal grid. Then we define the dimensions of the structure (the size of the 2D neural network). It can be a 10*10 neural network, a 15*20 neural network, or whatever size you need. The last thing to decide is how to initialize the neurons' weights: either assign random weights (random positions in the vector space) or assign the weights of randomly chosen data points (those data points' positions in the vector space).
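Both initialization options can be sketched as follows for a rectangular grid. The function names `init_random` and `init_from_samples` are hypothetical, and drawing the random weights uniformly within the range of each feature is just one reasonable assumption.

```python
import numpy as np

def init_random(rows, cols, data, rng):
    """Random positions, drawn uniformly within the per-feature range of the data."""
    low, high = data.min(axis=0), data.max(axis=0)
    return rng.uniform(low, high, size=(rows, cols, data.shape[1]))

def init_from_samples(rows, cols, data, rng):
    """Copy randomly chosen data points into the neurons."""
    idx = rng.integers(len(data), size=rows * cols)
    return data[idx].reshape(rows, cols, data.shape[1]).astype(float)

rng = np.random.default_rng(0)
data = rng.random((200, 3))                        # 200 data points with 3 features
w_random = init_random(10, 10, data, rng)          # a 10*10 network
w_sample = init_from_samples(15, 20, data, rng)    # a 15*20 network
```

Either way, the result is an array of shape (rows, cols, features): one weight vector per neuron, with the same number of components as a data point.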
Self Organizing Map Kohonen Algorithm
1. Define the structure of the neural network
2. Initialize weights for the neurons
3. Randomly pick an input vector x (a data point)
4. Select the Best Matching Unit (BMU), the neuron closest to x
5. Define the neighborhood N for the BMU
6. Update the weights of all the neurons within N to pull them closer to the selected input vector
7. Repeat steps 3 to 6 for the given number of iterations
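Putting the steps together, a training loop might look like the sketch below. It is not a reference implementation. It assumes a rectangular grid, weights initialized from random data points, Euclidean distance for the BMU, a hard circular neighborhood, and a linearly decaying learning rate and radius. The name `train_som` and all parameter names are hypothetical.

```python
import numpy as np

def train_som(data, rows=10, cols=10, n_iterations=1000, lr0=0.5, radius0=None, seed=0):
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    # Steps 1-2: rectangular grid, weights copied from random data points.
    idx = rng.integers(len(data), size=rows * cols)
    weights = data[idx].reshape(rows, cols, dim).astype(float)
    # Grid coordinates of every neuron, shape (rows, cols, 2).
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    if radius0 is None:
        radius0 = max(rows, cols) / 2

    for t in range(n_iterations):
        frac = t / n_iterations
        lr = lr0 * (1 - frac)                       # assumed linear decay of the learning rate
        radius = max(radius0 * (1 - frac), 1.0)     # shrinking radius, kept at 1 or more
        x = data[rng.integers(len(data))]           # step 3: random input vector
        dists = np.linalg.norm(weights - x, axis=2)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)    # step 4: BMU position
        grid_dist = np.linalg.norm(grid - np.array(bmu), axis=2)
        in_hood = grid_dist <= radius               # step 5: neighborhood N
        weights[in_hood] += lr * (x - weights[in_hood])           # step 6: pull toward x
    return weights

rng = np.random.default_rng(42)
data = rng.random((300, 3))
weights = train_som(data, rows=5, cols=5, n_iterations=500)
```

The hard cutoff at `radius` follows the "within N" wording of the list. Many implementations use a smooth (for example Gaussian) neighborhood instead, which is a design choice rather than a requirement.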
How to select the BMU - the closest neuron for the selected data point
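A minimal sketch of this step, assuming "closest" means the smallest Euclidean distance between the input vector and a neuron's weight vector, and that the weights are stored in an array of shape (rows, cols, features). The function name `find_bmu` is hypothetical.

```python
import numpy as np

def find_bmu(weights, x):
    """Return the (row, col) grid position of the neuron whose weights are closest to x."""
    dists = np.linalg.norm(weights - x, axis=2)             # distance from x to every neuron
    return np.unravel_index(np.argmin(dists), dists.shape)  # flat index -> (row, col)

weights = np.zeros((3, 3, 2))
weights[1, 2] = [1.0, 1.0]                       # only this neuron sits near the input
bmu = find_bmu(weights, np.array([0.9, 1.1]))    # row 1, column 2
```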
How to define the neighborhood for the BMU
After finding the BMU, we also have to find the neighbor neurons around it, because their weights must be updated too for the map to organize itself properly. To do that, we define an initial N value, and this value shrinks monotonically with time (with each iteration). You can think of N as a radius around the BMU that defines the neighborhood area.
How to adjust weights of the BMU neuron and the neighbor neurons
- Wi,j(t+1) = New weight of the neuron
- Wi,j(t) = Current weight of the neuron
- 𝛼(𝑡) = Learning rate, which is a function of time (iterations), so the size of each weight adjustment decreases with time.
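The symbols above fit the commonly used update rule Wi,j(t+1) = Wi,j(t) + 𝛼(𝑡) * (x - Wi,j(t)), applied to every neuron inside the neighborhood. That exact form is an assumption here, and so are the exponential decay schedules for the learning rate and the radius N in the sketch below. The names `exp_decay` and `update_weights` and their default parameters are hypothetical.

```python
import numpy as np

def exp_decay(initial, t, time_constant):
    """Value that shrinks monotonically as the iteration number t grows."""
    return initial * np.exp(-t / time_constant)

def update_weights(weights, x, bmu, t, lr0=0.5, radius0=1.0, time_constant=200.0):
    """Pull the BMU and its grid neighbours toward x (modifies weights in place)."""
    lr = exp_decay(lr0, t, time_constant)           # learning rate at iteration t
    radius = exp_decay(radius0, t, time_constant)   # neighborhood radius N at iteration t
    rows, cols, _ = weights.shape
    ii, jj = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    grid_dist = np.sqrt((ii - bmu[0]) ** 2 + (jj - bmu[1]) ** 2)
    in_hood = grid_dist <= radius                   # neurons inside the radius
    weights[in_hood] += lr * (x - weights[in_hood]) # assumed rule: W + lr * (x - W)
    return weights

weights = np.zeros((5, 5, 2))
x = np.array([1.0, 1.0])
update_weights(weights, x, bmu=(2, 2), t=0)
late_lr = exp_decay(0.5, 200, 200.0)                # learning rate after 200 iterations
```

At t = 0 the radius is 1, so only the BMU at (2, 2) and its four direct neighbors move halfway toward x, and every other neuron is left untouched. By iteration 200 the learning rate has dropped below its starting value of 0.5, so later adjustments are smaller.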