Chapter 3. Handling Perceptrons

In this chapter, we are going to explore one of the most popular and basic neural network architectures: the perceptron. This chapter also presents its generalized extension, the so-called multilayer perceptron, along with its features, learning algorithms, and parameters. The reader will also learn how to implement these networks in Java and how to use them to solve some basic problems. We will cover the following topics:

  • Perceptrons
    • Applications and limitations
  • Multilayer perceptrons
    • Classification
    • Regression
  • Backpropagation algorithm
  • Java implementation
  • Practical problems

Studying the perceptron neural network

The perceptron is the simplest neural network architecture. Proposed by Frank Rosenblatt in 1957, it has just one layer of neurons, receiving a set of inputs and producing a set of outputs. This was one of the first neural network representations to gain attention, particularly because of its simplicity. The structure of a single neuron is shown as follows:

[Figure: the structure of a single neuron]
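To make this structure concrete, here is a minimal Java sketch of a single neuron. It assumes a weighted sum of the inputs plus a bias, passed through a simple step activation; this is only an illustration, not the implementation developed later in this chapter.

// A minimal, illustrative sketch of a single neuron: it computes the weighted
// sum of its inputs plus a bias and passes the result through an activation
// function (a step function here, which is an assumption for this sketch).
public class Neuron {

    private final double[] weights;
    private final double bias;

    public Neuron(double[] weights, double bias) {
        this.weights = weights;
        this.bias = bias;
    }

    // Weighted sum of the inputs plus the bias
    public double netInput(double[] inputs) {
        double sum = bias;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * inputs[i];
        }
        return sum;
    }

    // Step activation: outputs 1 when the net input is non-negative
    public double output(double[] inputs) {
        return netInput(inputs) >= 0.0 ? 1.0 : 0.0;
    }
}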

Applications and limitations of perceptrons

However, it did not take scientists long to conclude that, because of its simplicity, a perceptron neural network could only be applied to simple tasks. At that time, neural networks were being used for simple classification problems, but perceptrons usually failed when faced with more complex datasets. Let's review the first example (the AND function) from Chapter 2, How Neural Networks Learn, to better understand this issue.

Linear separation

The example consists of an AND function that takes two inputs x1 and x2. This function can be plotted in a two-dimensional chart as follows:

[Figure: the AND function plotted in a two-dimensional chart]

Now, let's examine how the neural network evolves during training using the perceptron rule, considering two weights, w1 and w2, both initially 0.5, and a bias value of 0.5. Assume that the learning rate η equals 0.2. At each step, the output is computed as y = x1·w1 + x2·w2 + b, the error as E = t - y (where t is the target), and the weights and bias are updated by Δwi = η·E·xi and Δb = η·E.

Epoch | x1 | x2 | w1    | w2    | b      | y      | t | E      | Δw1    | Δw2    | Δb
------+----+----+-------+-------+--------+--------+---+--------+--------+--------+--------
1     | 0  | 0  | 0.5   | 0.5   | 0.5    | 0.5    | 0 | -0.5   | 0      | 0      | -0.1
1     | 0  | 1  | 0.5   | 0.5   | 0.4    | 0.9    | 0 | -0.9   | 0      | -0.18  | -0.18
1     | 1  | 0  | 0.5   | 0.32  | 0.22   | 0.72   | 0 | -0.72  | -0.144 | 0      | -0.144
1     | 1  | 1  | 0.356 | 0.32  | 0.076  | 0.752  | 1 | 0.248  | 0.0496 | 0.0496 | 0.0496
2     | 0  | 0  | 0.406 | 0.370 | 0.126  | 0.126  | 0 | -0.126 | 0.000  | 0.000  | -0.025
2     | 0  | 1  | 0.406 | 0.370 | 0.100  | 0.470  | 0 | -0.470 | 0.000  | -0.094 | -0.094
2     | 1  | 0  | 0.406 | 0.276 | 0.006  | 0.412  | 0 | -0.412 | -0.082 | 0.000  | -0.082
2     | 1  | 1  | 0.323 | 0.276 | -0.076 | 0.523  | 1 | 0.477  | 0.095  | 0.095  | 0.095
...   |    |    |       |       |        |        |   |        |        |        |
89    | 0  | 0  | 0.625 | 0.562 | -0.312 | -0.312 | 0 | 0.312  | 0      | 0      | 0.062
89    | 0  | 1  | 0.625 | 0.562 | -0.25  | 0.313  | 0 | -0.313 | 0      | -0.063 | -0.063
89    | 1  | 0  | 0.625 | 0.500 | -0.312 | 0.313  | 0 | -0.313 | -0.063 | 0      | -0.063
89    | 1  | 1  | 0.562 | 0.500 | -0.375 | 0.687  | 1 | 0.313  | 0.063  | 0.063  | 0.063
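For reference, the values in this table can be reproduced with a short Java program. The following is only an illustrative sketch of the update rule assumed above (linear output y = x1·w1 + x2·w2 + b, error E = t - y, updates Δwi = η·E·xi and Δb = η·E), not the code presented later in this chapter:

// Illustrative sketch of the delta-rule training traced in the table above.
public class PerceptronAndExample {

    public static void main(String[] args) {
        double[][] x = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
        double[] t = { 0, 0, 0, 1 };          // AND targets
        double w1 = 0.5, w2 = 0.5, b = 0.5;   // initial weights and bias
        double eta = 0.2;                     // learning rate

        for (int epoch = 1; epoch <= 89; epoch++) {
            for (int i = 0; i < x.length; i++) {
                double y = x[i][0] * w1 + x[i][1] * w2 + b; // linear output
                double e = t[i] - y;                        // error
                w1 += eta * e * x[i][0];
                w2 += eta * e * x[i][1];
                b  += eta * e;
            }
        }
        // After 89 epochs the weights oscillate slightly around the values
        // reported in the table.
        System.out.printf("w1 = %.3f, w2 = %.3f, b = %.3f%n", w1, w2, b);
    }
}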

After 89 epochs, the network produces values close to the desired outputs. Since the outputs in this example are binary (zero or one), we can treat any value produced by the network below 0.5 as 0 and any value above 0.5 as 1. So, with the final weights and bias found by the learning algorithm (w1 = 0.562, w2 = 0.5, and b = -0.375), we can draw the boundary x1·w1 + x2·w2 + b = 0.5, defining the linear separation shown in the following chart:

[Figure: the linear boundary defined by the final weights and bias]

This boundary defines all the classifications given by the network. You can see that the boundary is linear because the network's function is linear. Thus, the perceptron network is suitable only for problems whose patterns are linearly separable.
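As a quick check, we can apply the 0.5 threshold to the network output with the final weights and bias. This minimal snippet (under the same assumptions as the earlier sketch) classifies all four AND patterns correctly:

// Classify the four AND patterns with the final weights and the 0.5 threshold.
public class AndBoundaryCheck {

    public static void main(String[] args) {
        double w1 = 0.562, w2 = 0.5, b = -0.375;    // final values from the table
        double[][] x = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };

        for (double[] p : x) {
            double y = p[0] * w1 + p[1] * w2 + b;   // network output
            int cls = y > 0.5 ? 1 : 0;              // values above 0.5 count as 1
            System.out.printf("(%.0f, %.0f) -> y = %.3f -> class %d%n", p[0], p[1], y, cls);
        }
    }
}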

Classical XOR case

Let's analyze the XOR case, whose chart can be seen in the following figure:

[Figure: the XOR function plotted in a two-dimensional chart]

We can see that in two dimensions it is impossible to draw a line that separates the two patterns. What would happen if we tried to train a single-layer perceptron to learn this function? Let's see the result in the following table:

Epoch | x1 | x2 | w1     | w2     | b     | y     | t | E      | Δw1    | Δw2    | Δb
------+----+----+--------+--------+-------+-------+---+--------+--------+--------+--------
1     | 0  | 0  | 0.5    | 0.5    | 0.5   | 0.5   | 0 | -0.5   | 0      | 0      | -0.1
1     | 0  | 1  | 0.5    | 0.5    | 0.4   | 0.9   | 1 | 0.1    | 0      | 0.02   | 0.02
1     | 1  | 0  | 0.5    | 0.52   | 0.42  | 0.92  | 1 | 0.08   | 0.016  | 0      | 0.016
1     | 1  | 1  | 0.516  | 0.52   | 0.436 | 1.472 | 0 | -1.472 | -0.294 | -0.294 | -0.294
2     | 0  | 0  | 0.222  | 0.226  | 0.142 | 0.142 | 0 | -0.142 | 0.000  | 0.000  | -0.028
2     | 0  | 1  | 0.222  | 0.226  | 0.113 | 0.339 | 1 | 0.661  | 0.000  | 0.132  | 0.132
2     | 1  | 0  | 0.222  | 0.358  | 0.246 | 0.467 | 1 | 0.533  | 0.107  | 0.000  | 0.107
2     | 1  | 1  | 0.328  | 0.358  | 0.352 | 1.038 | 0 | -1.038 | -0.208 | -0.208 | -0.208
...   |    |    |        |        |       |       |   |        |        |        |
127   | 0  | 0  | -0.250 | -0.125 | 0.625 | 0.625 | 0 | -0.625 | 0.000  | 0.000  | -0.125
127   | 0  | 1  | -0.250 | -0.125 | 0.500 | 0.375 | 1 | 0.625  | 0.000  | 0.125  | 0.125
127   | 1  | 0  | -0.250 | 0.000  | 0.625 | 0.375 | 1 | 0.625  | 0.125  | 0.000  | 0.125
127   | 1  | 1  | -0.125 | 0.000  | 0.750 | 0.625 | 0 | -0.625 | -0.125 | -0.125 | -0.125
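For comparison, the same illustrative training sketch used for the AND case can be run on the XOR targets; only the target array changes:

// Illustrative sketch: the same delta-rule training, now with XOR targets.
public class PerceptronXorExample {

    public static void main(String[] args) {
        double[][] x = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
        double[] t = { 0, 1, 1, 0 };                // XOR targets
        double w1 = 0.5, w2 = 0.5, b = 0.5;
        double eta = 0.2;

        for (int epoch = 1; epoch <= 127; epoch++) {
            for (int i = 0; i < x.length; i++) {
                double y = x[i][0] * w1 + x[i][1] * w2 + b;
                double e = t[i] - y;
                w1 += eta * e * x[i][0];
                w2 += eta * e * x[i][1];
                b  += eta * e;
            }
        }
        // The training never converges; the weights end up cycling through
        // values like those in the last rows of the table above.
        System.out.printf("w1 = %.3f, w2 = %.3f, b = %.3f%n", w1, w2, b);
    }
}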

The perceptron simply could not find any set of weights that would drive the error below 0.625. This can be explained mathematically, and we have already seen from the chart that this function is not linearly separable in two dimensions. So, what if we added another dimension? Let's see the previous XOR chart in three dimensions:

[Figure: the XOR function plotted in three dimensions]

In three dimensions, it is possible to draw a plane that separates the patterns, provided that this additional dimension properly transforms the input data. But now there is an additional problem: how can we derive this additional dimension, since we have only two input variables? One obvious, albeit "workaround", answer would be to add a third variable derived from the two original ones. With this third (derived) variable, our neural network would probably take the following shape:

[Figure: a perceptron with a third input derived from the original two]

Now the perceptron has three inputs, one of them being a composition of the other two. This leads to a new question: how should this composition be processed? We can see that this component can act as a neuron, giving the neural network a nested architecture. If so, another question arises: how would the weights of this new neuron be trained, since the error is measured at the output neuron?
