Applied unsupervised learning

In neural networks, a number of architectures implement unsupervised learning; however, this book covers only two of them: the radial basis function (RBF) network and the Kohonen neural network.

Neural network of radial basis functions

This neural network architecture has three layers and combines two types of learning, as shown in the following figure:

Neural network of radial basis functions

For the hidden layer, competitive learning is applied in order to activate one of the radial basis functions in the hidden neurons. The radial basis function takes the form of Gaussian functions:

$$\varphi_i(x) = \exp\left(-\frac{\lVert d_i \rVert^2}{2\sigma_i^2}\right)$$

where $d_i$ is the distance vector between the input $x$ and the weight vector $w_i$ of hidden neuron $i$:

$$d_i = x - w_i$$

The output of the neural network is the linear sum of all the values produced by the hidden-layer neurons, weighted by the output weights $a_i$:

$$y = \sum_{i} a_i\,\varphi_i(x)$$

Radial basis function (RBF) networks perform clustering only in the hidden layer, whereas in the output layer, supervised learning is applied to find the output weights. Because the clusters defined in the RBF network are internal, we will not use this network in this chapter; it is detailed in Chapter 9, Neural Networks Optimization and Adaptation.
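As a concrete illustration, the RBF forward pass described above can be sketched as follows. This is a minimal sketch, not the book's own code; the names (`centers`, `sigma`, `out_weights`) are illustrative, and a single shared width is used instead of one per neuron for simplicity:

```python
import numpy as np

# Sketch of the RBF forward pass: Gaussian hidden activations around the
# learned centers w_i, followed by a linear (supervised) output layer.
def rbf_forward(x, centers, sigma, out_weights):
    d = x - centers                            # d_i = x - w_i, one row per hidden neuron
    sq_dist = np.sum(d ** 2, axis=1)           # ||d_i||^2
    phi = np.exp(-sq_dist / (2 * sigma ** 2))  # Gaussian hidden activations
    return phi @ out_weights                   # linear sum at the output

# Toy example: 3 hidden neurons with 2 input features, 1 output.
centers = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
out_weights = np.array([0.5, -1.0, 2.0])
y = rbf_forward(np.array([1.0, 1.0]), centers, sigma=1.0, out_weights=out_weights)
```

In a full RBF network, the centers would first be found by unsupervised (competitive) learning and the output weights by supervised learning, as described above.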

Kohonen neural network

Kohonen networks, covered in Chapter 4, Self-Organizing Maps, are used here in a modified fashion. A Kohonen network can produce a one- or two-dimensional shape at the output, but here we are interested in clustering, which can be reduced to a single dimension. In addition, clusters may or may not be related to each other, so the vicinity of neurons can be ignored in this chapter: only one neuron is activated, and its neighbors remain unchanged. The neural network therefore adjusts its weights to match the data to an array of clusters. The following figure shows the clustering layer of a Kohonen neural network:

Kohonen neural network

The training algorithm is competitive learning, wherein the neuron with the greatest output has its weights adjusted. By the end of training, all the clusters of the neural network are expected to be defined. Note that there are no links between output neurons, meaning that only one output neuron is active at a time.
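The winner-take-all update described above can be sketched as follows. This is a minimal illustration, not the book's code; the winner is selected here by smallest distance to the input, which for normalized weights is equivalent to picking the neuron with the greatest output:

```python
import numpy as np

# Sketch of one competitive-learning step for a clustering layer:
# only the winning neuron moves toward the input; its neighbors
# stay unchanged, as described in the text.
def competitive_step(weights, x, lr=0.1):
    winner = np.argmin(np.sum((weights - x) ** 2, axis=1))
    weights[winner] += lr * (x - weights[winner])
    return winner

# Toy run: two well-separated blobs, two cluster neurons.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(3, 0.1, (20, 2))])
weights = rng.normal(1.5, 0.5, (2, 2))  # 2 cluster neurons, 2 inputs each
for epoch in range(50):
    for x in data:
        competitive_step(weights, x, lr=0.05)
```

The learning rate and epoch count here are arbitrary; in practice the learning rate is usually decayed over training so the cluster centers stabilize.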

Types of data

In practical applications, data can be classified in the following ways:

  • Numerical
    • Continuous or real
    • Discrete
  • Categorical
    • Ordinal
    • Unscaled

Tip

So far, we have been working mostly with numerical data, which is in principle easier to handle with neural networks. However, in more complex applications, one needs to handle non-numerical data, which involves translating the data into a "numeric universe" where neural networks can be applied to it.

Examples of numerical data are temperature values (continuous) and numbers of days (discrete). Non-numerical (categorical) data can be ordinal, where there is a scale between the categories, or unscaled, where all categories are on the same level and no scale can be applied to them. An example of ordinal categorical data is satisfaction degree (dissatisfied, poorly satisfied, and very satisfied), whereas city names are an example of unscaled categorical data.

Numerical data can be easily fed into neural networks, requiring at most some normalization or preprocessing. However, categorical data needs some attention. If the data can be scaled (ordinal), it can be "discretized." Taking the example of satisfaction degree, we may create the following corresponding table:

| Satisfaction Degree | Scaled Value |
| ------------------- | ------------ |
| Dissatisfied        | 0            |
| Poorly Satisfied    | 1            |
| Very Satisfied      | 2            |
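The table above can be turned into a simple lookup, sketched here with a hypothetical dictionary (illustrative code, not from the book):

```python
# Hypothetical ordinal encoding of the satisfaction scale above; the
# integer codes preserve the order of the categories, which is exactly
# the information an ordinal variable carries.
SATISFACTION_SCALE = {
    "dissatisfied": 0,
    "poorly satisfied": 1,
    "very satisfied": 2,
}

def encode_ordinal(value):
    # Map a category label to its scaled value (case-insensitive).
    return SATISFACTION_SCALE[value.strip().lower()]
```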

However, for unscaled categorical data, it is not recommended to apply numbers that might induce scaling on the considered variable. So, it is better to treat each categorical value as one binary variable, meaning 1 in the presence of the considered value or 0 in the absence of this value:

| City Names | London | Tokyo | New York | Cape Town | Sydney |
| ---------- | ------ | ----- | -------- | --------- | ------ |
| London     | 1      | 0     | 0        | 0         | 0      |
| Tokyo      | 0      | 1     | 0        | 0         | 0      |
| New York   | 0      | 0     | 1        | 0         | 0      |
| Cape Town  | 0      | 0     | 0        | 1         | 0      |
| Sydney     | 0      | 0     | 0        | 0         | 1      |

Here, the five binary columns together form the neural input.

This mechanism of binary variables may eventually result in sparse data matrices containing a lot of zeros. However, there are techniques, such as singular value decomposition (SVD), that address this problem. The reader will learn more about this in the references.
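As a rough illustration of the idea, a truncated SVD can compress such a sparse one-hot matrix into a few dense features per sample. This is a sketch using plain NumPy; the sample list below is made up, and for large matrices dedicated sparse solvers (for example, scikit-learn's `TruncatedSVD`) are preferable:

```python
import numpy as np

# Build a sparse one-hot matrix from a hypothetical list of samples,
# then keep only the k largest singular values of its SVD.
CITIES = ["London", "Tokyo", "New York", "Cape Town", "Sydney"]
samples = ["London", "Tokyo", "London", "Sydney", "Tokyo", "London"]

# (n_samples x n_categories) matrix, mostly zeros.
X = np.array([[1.0 if c == s else 0.0 for c in CITIES] for s in samples])

U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2                              # number of singular values to keep
X_reduced = U[:, :k] * s[:k]       # each sample now has k dense features
```

Identical samples map to identical reduced vectors, so the compressed features still separate the categories that actually occur in the data.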
