Algorithms_in_C 1.0.0
Set of algorithms implemented in C.
kohonen_som_topology.c File Reference

Kohonen self organizing map (topological map)

#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

Data Structures

struct  kohonen_array_3d
 to store info regarding 3D arrays
 

Macros

#define _USE_MATH_DEFINES
 required for MS Visual C
 
#define max(a, b)   (((a) > (b)) ? (a) : (b))
 shorthand for maximum value
 
#define min(a, b)   (((a) < (b)) ? (a) : (b))
 shorthand for minimum value
 

Functions

double * kohonen_data_3d (const struct kohonen_array_3d *arr, int x, int y, int z)
 Function that returns the pointer to the (x, y, z)-th location in the linear 3D array.
 
double _random (double a, double b)
 Helper function to generate a random number in a given interval.
 
int save_2d_data (const char *fname, double **X, int num_points, int num_features)
 Save a given n-dimensional data matrix to file.
 
int save_u_matrix (const char *fname, struct kohonen_array_3d *W)
 Create the distance matrix or U-matrix from the trained weights and save to disk.
 
void get_min_2d (double **X, int N, double *val, int *x_idx, int *y_idx)
 Get minimum value and index of the value in a matrix.
 
double kohonen_update_weights (const double *X, struct kohonen_array_3d *W, double **D, int num_out, int num_features, double alpha, int R)
 Update weights of the SOM using Kohonen algorithm.
 
void kohonen_som (double **X, struct kohonen_array_3d *W, int num_samples, int num_features, int num_out, double alpha_min)
 Apply incremental algorithm with updating neighborhood and learning rates on all samples in the given dataset.
 
void test_2d_classes (double *const *data, int N)
 Creates a random set of points distributed in four clusters in 2D space with centroids at the points.
 
void test1 ()
 Test that creates a random set of points distributed in four clusters in 2D space and trains an SOM that finds the topological pattern.
 
void test_3d_classes1 (double *const *data, int N)
 Creates a random set of points distributed in four clusters in 3D space with centroids at the points.
 
void test2 ()
 Test that creates a random set of points distributed in 4 clusters in 3D space and trains an SOM that finds the topological pattern.
 
void test_3d_classes2 (double *const *data, int N)
 Creates a random set of points distributed in eight clusters in 3D space with centroids at the points.
 
void test3 ()
 Test that creates a random set of points distributed in eight clusters in 3D space and trains an SOM that finds the topological pattern.
 
double get_clock_diff (clock_t start_t, clock_t end_t)
 Convert clock cycle difference to time in seconds.
 
int main (int argc, char **argv)
 Main function.
 

Detailed Description

Kohonen self organizing map (topological map)

This example implements an unsupervised learning algorithm called a self-organizing map (SOM). The algorithm creates a connected network of weights that closely follows the given data points. It thus creates a topological map of the given data, i.e., it preserves the relationships between data points in a much higher-dimensional space by creating an equivalent in a 2-dimensional space.

[Figure: trained topological maps for the test cases in the program]
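The description above can be illustrated with a single training step. The following is a minimal sketch, not the code of kohonen_update_weights(): it assumes a flat row-major weight buffer, a square neighbourhood of radius R in which every node is updated with the same learning rate alpha, and the hypothetical names som_idx and som_step. The file's own update rule may weight neighbours differently.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: offset of weight (i, j, k) in a flat, row-major
 * rows x cols x dim buffer. */
static size_t som_idx(int i, int j, int k, int cols, int dim)
{
    return ((size_t)i * cols + j) * dim + k;
}

/* One illustrative training step: find the best matching unit (BMU) for
 * sample x, then pull every node inside a square neighbourhood of radius R
 * towards x by a fraction alpha. Returns the squared distance to the BMU. */
double som_step(const double *x, double *w, int rows, int cols, int dim,
                double alpha, int R)
{
    int bi = 0, bj = 0;
    double best = INFINITY;

    /* Step 1: the BMU is the node whose weight vector is closest to x. */
    for (int i = 0; i < rows; i++)
    {
        for (int j = 0; j < cols; j++)
        {
            double d = 0.0;
            for (int k = 0; k < dim; k++)
            {
                double diff = w[som_idx(i, j, k, cols, dim)] - x[k];
                d += diff * diff;
            }
            if (d < best)
            {
                best = d;
                bi = i;
                bj = j;
            }
        }
    }

    /* Step 2: move the BMU and its grid neighbours towards x. */
    for (int i = bi - R; i <= bi + R; i++)
    {
        if (i < 0 || i >= rows)
            continue;  /* neighbourhood is clipped at the grid border */
        for (int j = bj - R; j <= bj + R; j++)
        {
            if (j < 0 || j >= cols)
                continue;
            for (int k = 0; k < dim; k++)
            {
                double *wk = &w[som_idx(i, j, k, cols, dim)];
                *wk += alpha * (x[k] - *wk);
            }
        }
    }
    return best;
}
```

Calling som_step() repeatedly over all samples while shrinking alpha and R is the usual way such a step is turned into a full training loop; the schedule used for that is a design choice and is not shown here.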

Author
Krishna Vedala
Warning
The MSVC 2019 compiler generates code that does not execute as expected. However, MinGW, Clang for GCC and Clang for MSVC compilers on Windows perform as expected. Any insights and suggestions should be directed to the author.
See also
kohonen_som_trace.c

Function Documentation

◆ get_clock_diff()

double get_clock_diff ( clock_t  start_t,
clock_t  end_t 
)

Convert clock cycle difference to time in seconds.

Parameters
[in]  start_t  start clock
[in]  end_t    end clock
Returns
time difference in seconds
{
    return (double)(end_t - start_t) / (double)CLOCKS_PER_SEC;
}
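A short usage sketch for the conversion above. The names clock_diff_seconds, busy_sum and time_busy_sum are hypothetical and exist only for this illustration. Note that clock() reports processor time used by the program, which is not necessarily the same as wall-clock time.

```c
#include <time.h>

/* Same conversion as get_clock_diff(), repeated to keep the sketch
 * self-contained. */
double clock_diff_seconds(clock_t start_t, clock_t end_t)
{
    return (double)(end_t - start_t) / (double)CLOCKS_PER_SEC;
}

/* Hypothetical workload, present only so there is something to time.
 * The accumulator is volatile so the loop is not optimised away. */
double busy_sum(long n)
{
    volatile double s = 0.0;
    for (long i = 0; i < n; i++) s += (double)i;
    return s;
}

/* Times one call of busy_sum(n) and returns the elapsed time in seconds. */
double time_busy_sum(long n)
{
    clock_t start = clock();
    busy_sum(n);
    clock_t end = clock();
    return clock_diff_seconds(start, end);
}
```

The same start/stop pattern brackets each of the test calls in main().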

◆ main()

int main ( int  argc,
char **  argv 
)

Main function.

{
#ifdef _OPENMP
    printf("Using OpenMP based parallelization\n");
#else
    printf("NOT using OpenMP based parallelization\n");
#endif
    clock_t start_clk, end_clk;

    start_clk = clock();
    test1();
    end_clk = clock();
    printf("Test 1 completed in %.4g sec\n",
           get_clock_diff(start_clk, end_clk));

    start_clk = clock();
    test2();
    end_clk = clock();
    printf("Test 2 completed in %.4g sec\n",
           get_clock_diff(start_clk, end_clk));

    start_clk = clock();
    test3();
    end_clk = clock();
    printf("Test 3 completed in %.4g sec\n",
           get_clock_diff(start_clk, end_clk));

    printf("(Note: Calculated times include: writing files to disk.)\n\n");
    return 0;
}

◆ test1()

void test1 ( )

Test that creates a random set of points distributed in four clusters in 2D space and trains an SOM that finds the topological pattern.

The following CSV files are created to validate the execution:

  • test1.csv: random test sample points in four clusters
  • w11.csv: initial random U-matrix
  • w12.csv: trained SOM U-matrix
{
    int j, N = 300;
    int features = 2;
    int num_out = 30;  // U-matrix image size: num_out x num_out

    // 2D space, hence size = number of rows * 2
    double **X = (double **)malloc(N * sizeof(double *));

    // cluster nodes in 'x' * cluster nodes in 'y' * features
    struct kohonen_array_3d W;
    W.dim1 = num_out;
    W.dim2 = num_out;
    W.dim3 = features;
    W.data = (double *)malloc(num_out * num_out * features *
                              sizeof(double));  // assign rows

    for (int i = 0; i < max(num_out, N); i++)  // loop till max(N, num_out)
    {
        if (i < N)  // only add new arrays if i < N
            X[i] = (double *)malloc(features * sizeof(double));
        if (i < num_out)  // only add new arrays if i < num_out
        {
            for (int k = 0; k < num_out; k++)
            {
#ifdef _OPENMP
#pragma omp for
#endif
                // preallocate with random initial weights
                for (j = 0; j < features; j++)
                {
                    double *w = kohonen_data_3d(&W, i, k, j);
                    w[0] = _random(-5, 5);
                }
            }
        }
    }

    test_2d_classes(X, N);  // create test data in four 2D clusters
    save_2d_data("test1.csv", X, N, features);  // save test data points
    save_u_matrix("w11.csv", &W);  // save initial random weights
    kohonen_som(X, &W, N, features, num_out, 1e-4);  // train the SOM
    save_u_matrix("w12.csv", &W);  // save the resultant weights

    for (int i = 0; i < N; i++) free(X[i]);
    free(X);
    free(W.data);
}
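The weights W in the test above live in one flat buffer of dim1 * dim2 * dim3 doubles and are addressed through kohonen_data_3d(). A minimal sketch of such an accessor, assuming row-major layout (the last index varies fastest); struct array_3d and array_3d_at are hypothetical stand-ins, and the layout actually used by kohonen_data_3d() is not shown on this page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kohonen_array_3d, using the field names
 * that appear in the test code (dim1, dim2, dim3, data). */
struct array_3d
{
    int dim1, dim2, dim3;
    double *data;
};

/* Returns a pointer to element (x, y, z), assuming row-major storage:
 * offset = (x * dim2 + y) * dim3 + z. */
double *array_3d_at(const struct array_3d *arr, int x, int y, int z)
{
    assert(x >= 0 && x < arr->dim1);
    assert(y >= 0 && y < arr->dim2);
    assert(z >= 0 && z < arr->dim3);
    size_t offset = ((size_t)x * arr->dim2 + y) * arr->dim3 + z;
    return arr->data + offset;
}
```

With this layout the dim3 features of one map node are contiguous in memory, so the innermost loop over features walks adjacent doubles.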

◆ test2()

void test2 ( )

Test that creates a random set of points distributed in 4 clusters in 3D space and trains an SOM that finds the topological pattern.

The following CSV files are created to validate the execution:

  • test2.csv: random test sample points
  • w21.csv: initial random U-matrix
  • w22.csv: trained SOM U-matrix
{
    int j, N = 500;
    int features = 3;
    int num_out = 30;  // U-matrix image size: num_out x num_out

    // 3D space, hence size = number of rows * 3
    double **X = (double **)malloc(N * sizeof(double *));

    // cluster nodes in 'x' * cluster nodes in 'y' * features
    struct kohonen_array_3d W;
    W.dim1 = num_out;
    W.dim2 = num_out;
    W.dim3 = features;
    W.data = (double *)malloc(num_out * num_out * features *
                              sizeof(double));  // assign rows

    for (int i = 0; i < max(num_out, N); i++)  // loop till max(N, num_out)
    {
        if (i < N)  // only add new arrays if i < N
            X[i] = (double *)malloc(features * sizeof(double));
        if (i < num_out)  // only add new arrays if i < num_out
        {
            for (int k = 0; k < num_out; k++)
            {
#ifdef _OPENMP
#pragma omp for
#endif
                for (j = 0; j < features; j++)
                {  // preallocate with random initial weights
                    double *w = kohonen_data_3d(&W, i, k, j);
                    w[0] = _random(-5, 5);
                }
            }
        }
    }

    test_3d_classes1(X, N);  // create test data
    save_2d_data("test2.csv", X, N, features);  // save test data points
    save_u_matrix("w21.csv", &W);  // save initial random weights
    kohonen_som(X, &W, N, features, num_out, 1e-4);  // train the SOM
    save_u_matrix("w22.csv", &W);  // save the resultant weights

    for (int i = 0; i < N; i++) free(X[i]);
    free(X);
    free(W.data);
}

◆ test3()

void test3 ( )

Test that creates a random set of points distributed in eight clusters in 3D space and trains an SOM that finds the topological pattern.

The following CSV files are created to validate the execution:

  • test3.csv: random test sample points
  • w31.csv: initial random U-matrix
  • w32.csv: trained SOM U-matrix
{
    int j, N = 500;
    int features = 3;
    int num_out = 30;
    double **X = (double **)malloc(N * sizeof(double *));

    // cluster nodes in 'x' * cluster nodes in 'y' * features
    struct kohonen_array_3d W;
    W.dim1 = num_out;
    W.dim2 = num_out;
    W.dim3 = features;
    W.data = (double *)malloc(num_out * num_out * features *
                              sizeof(double));  // assign rows

    for (int i = 0; i < max(num_out, N); i++)  // loop till max(N, num_out)
    {
        if (i < N)  // only add new arrays if i < N
            X[i] = (double *)malloc(features * sizeof(double));
        if (i < num_out)  // only add new arrays if i < num_out
        {
            for (int k = 0; k < num_out; k++)
            {
#ifdef _OPENMP
#pragma omp for
#endif
                // preallocate with random initial weights
                for (j = 0; j < features; j++)
                {
                    double *w = kohonen_data_3d(&W, i, k, j);
                    w[0] = _random(-5, 5);
                }
            }
        }
    }

    test_3d_classes2(X, N);  // create test data in eight 3D clusters
    save_2d_data("test3.csv", X, N, features);  // save test data points
    save_u_matrix("w31.csv", &W);  // save initial random weights
    kohonen_som(X, &W, N, features, num_out, 0.01);  // train the SOM
    save_u_matrix("w32.csv", &W);  // save the resultant weights

    for (int i = 0; i < N; i++) free(X[i]);
    free(X);
    free(W.data);
}

◆ test_2d_classes()

void test_2d_classes ( double *const *  data,
int  N 
)

Creates a random set of points distributed in four clusters in 2D space with centroids at the points

  • \((0.5, 0.5)\)
  • \((0.5, -0.5)\)
  • \((-0.5, 0.5)\)
  • \((-0.5, -0.5)\)
Parameters
[out] data  matrix to store data in
[in]  N     number of points required
{
    const double R = 0.3;  // radius of cluster
    int i;
    const int num_classes = 4;
    const double centres[][2] = {
        // centres of each class cluster
        {.5, .5},   // centre of class 1
        {.5, -.5},  // centre of class 2
        {-.5, .5},  // centre of class 3
        {-.5, -.5}  // centre of class 4
    };

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++)
    {
        int class =
            rand() % num_classes;  // select a random class for the point

        // create random coordinates (x,y) around the centre of the class
        data[i][0] = _random(centres[class][0] - R, centres[class][0] + R);
        data[i][1] = _random(centres[class][1] - R, centres[class][1] + R);

        /* The following can also be used
        for (int j = 0; j < 2; j++)
            data[i][j] = _random(centres[class][j] - R, centres[class][j] + R);
        */
    }
}
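The sampling pattern used above can be written once for any number of classes and dimensions. This is a sketch under stated assumptions: uniform_in is a hypothetical stand-in for _random() (whose implementation is not shown on this page), and sample_clusters with its flat centres array is introduced only for illustration.

```c
#include <stdlib.h>

/* Illustrative uniform draw in [a, b] built on rand(); an assumption about
 * what a helper like _random() provides, not its actual implementation. */
double uniform_in(double a, double b)
{
    return a + (b - a) * ((double)rand() / (double)RAND_MAX);
}

/* Fills data[i][0..dim-1] for i in [0, n) with a point drawn around a
 * randomly chosen centre. centres is a flat num_classes x dim array.
 * Each coordinate is drawn independently within +/- r of the centre. */
void sample_clusters(double *const *data, int n, const double *centres,
                     int num_classes, int dim, double r)
{
    for (int i = 0; i < n; i++)
    {
        int c = rand() % num_classes;  /* pick a class for this point */
        for (int j = 0; j < dim; j++)
        {
            double mid = centres[c * dim + j];
            data[i][j] = uniform_in(mid - r, mid + r);
        }
    }
}
```

Because every coordinate is drawn independently, each cluster fills a square (or cube in 3D) of half-width r around its centre rather than a disc, even though the constant is named a radius.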

◆ test_3d_classes1()

void test_3d_classes1 ( double *const *  data,
int  N 
)

Creates a random set of points distributed in four clusters in 3D space with centroids at the points

  • \((0.5, 0.5, 0.5)\)
  • \((0.5, -0.5, -0.5)\)
  • \((-0.5, 0.5, 0.5)\)
  • \((-0.5, -0.5, -0.5)\)
Parameters
[out] data  matrix to store data in
[in]  N     number of points required
{
    const double R = 0.2;  // radius of cluster
    int i;
    const int num_classes = 4;
    const double centres[][3] = {
        // centres of each class cluster
        {.5, .5, .5},    // centre of class 1
        {.5, -.5, -.5},  // centre of class 2
        {-.5, .5, .5},   // centre of class 3
        {-.5, -.5, -.5}  // centre of class 4
    };

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++)
    {
        int class =
            rand() % num_classes;  // select a random class for the point

        // create random coordinates (x,y,z) around the centre of the class
        data[i][0] = _random(centres[class][0] - R, centres[class][0] + R);
        data[i][1] = _random(centres[class][1] - R, centres[class][1] + R);
        data[i][2] = _random(centres[class][2] - R, centres[class][2] + R);

        /* The following can also be used
        for (int j = 0; j < 3; j++)
            data[i][j] = _random(centres[class][j] - R, centres[class][j] + R);
        */
    }
}

◆ test_3d_classes2()

void test_3d_classes2 ( double *const *  data,
int  N 
)

Creates a random set of points distributed in eight clusters in 3D space with centroids at the points

  • \((0.5, 0.5, 0.5)\)
  • \((0.5, 0.5, -0.5)\)
  • \((0.5, -0.5, 0.5)\)
  • \((0.5, -0.5, -0.5)\)
  • \((-0.5, 0.5, 0.5)\)
  • \((-0.5, 0.5, -0.5)\)
  • \((-0.5, -0.5, 0.5)\)
  • \((-0.5, -0.5, -0.5)\)
Parameters
[out] data  matrix to store data in
[in]  N     number of points required
{
    const double R = 0.2;  // radius of cluster
    int i;
    const int num_classes = 8;
    const double centres[][3] = {
        // centres of each class cluster
        {.5, .5, .5},    // centre of class 1
        {.5, .5, -.5},   // centre of class 2
        {.5, -.5, .5},   // centre of class 3
        {.5, -.5, -.5},  // centre of class 4
        {-.5, .5, .5},   // centre of class 5
        {-.5, .5, -.5},  // centre of class 6
        {-.5, -.5, .5},  // centre of class 7
        {-.5, -.5, -.5}  // centre of class 8
    };

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++)
    {
        int class =
            rand() % num_classes;  // select a random class for the point

        // create random coordinates (x,y,z) around the centre of the class
        data[i][0] = _random(centres[class][0] - R, centres[class][0] + R);
        data[i][1] = _random(centres[class][1] - R, centres[class][1] + R);
        data[i][2] = _random(centres[class][2] - R, centres[class][2] + R);

        /* The following can also be used
        for (int j = 0; j < 3; j++)
            data[i][j] = _random(centres[class][j] - R, centres[class][j] + R);
        */
    }
}
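The eight centres in the table above are the corners of a cube of side 1 centred on the origin, listed in binary counting order. As an alternative to the literal table, they could be generated from the class index; the helper cube_corner below is hypothetical and shown only to make that ordering explicit.

```c
/* Writes the centre of class c (0..7) into out[3]. Bit 2 of c selects the
 * sign of x, bit 1 the sign of y and bit 0 the sign of z; a set bit means
 * -0.5, a clear bit +0.5. This reproduces the order of the table above,
 * with class 1 corresponding to c = 0. */
void cube_corner(int c, double out[3])
{
    for (int j = 0; j < 3; j++)
        out[j] = ((c >> (2 - j)) & 1) ? -0.5 : 0.5;
}
```

The explicit table is easier to read and to edit by hand; a generated form mainly helps if the number of dimensions were to grow.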