Chapter 2. Getting Neural Networks to Learn

Now that you have been introduced to neural networks, it is time to learn about their learning process. In this chapter, we will explore the concepts involved in neural network learning, along with their implementation in Java. We will review the foundations and inspirations of the neural learning process, which will guide us in implementing learning algorithms in Java and applying them to our neural network code. In summary, these are the concepts addressed in this chapter:

  • Learning ability
  • How learning helps
  • Learning paradigms
      • Supervised
      • Unsupervised
  • The learning process
      • Optimization foundations
      • The cost function
      • Error measurement
  • Learning algorithms
      • Delta rule
      • Hebbian rule
      • Adaline/perceptron
  • Training, test, and validation
      • Dataset splitting
      • Overfitting and overtraining
      • Generalization

Learning ability in neural networks

What is really amazing about neural networks is their capacity to learn from the environment, just as intelligent, brain-endowed beings do. We, as humans, experience the learning process through observation and repetition, until some task or concept is completely mastered. From the physiological point of view, the learning process in the human brain is a reconfiguration of the connections between neurons, which results in a new thinking structure.

The connectionist nature of neural networks distributes the learning process over the entire structure, and this makes the structure flexible enough to learn a wide variety of knowledge. As opposed to ordinary digital computers, which can execute only the tasks they are programmed to do, neural systems are able to improve themselves and perform new activities according to some satisfaction criterion. In other words, neural networks don't need to be programmed; they learn the program by themselves.

How learning helps solve problems

Considering that every task to be solved may have a huge number of theoretically possible solutions, the learning process seeks an optimal solution that produces a satisfactory result. The use of structures such as artificial neural networks (ANNs) is encouraged because of their ability to acquire knowledge of any type, strictly by receiving input stimuli, that is, data relevant to the task or problem. At first, the ANN produces a random result along with an error, and based on this error, the ANN parameters are adjusted.
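The loop described above, produce an output, measure the error, and adjust the parameters, can be sketched for a single linear neuron. This is a minimal illustration, not the book's actual code: the class and method names here are invented for the example, and the update used is the classic error-driven (delta) rule covered later in the chapter.

```java
// Minimal sketch of error-driven learning for a single linear neuron.
// Class and method names are illustrative, not from the book's code base.
public class ErrorDrivenNeuron {
    private final double[] weights;
    private final double learningRate;

    public ErrorDrivenNeuron(int inputs, double learningRate) {
        this.weights = new double[inputs]; // initial guess: all weights zero
        this.learningRate = learningRate;
    }

    // Weighted sum of the inputs: the neuron's current "answer"
    public double output(double[] x) {
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * x[i];
        }
        return sum;
    }

    // One learning step: compute the error and nudge each weight to reduce it
    public double train(double[] x, double target) {
        double error = target - output(x);
        for (int i = 0; i < weights.length; i++) {
            weights[i] += learningRate * error * x[i];
        }
        return error;
    }

    public static void main(String[] args) {
        ErrorDrivenNeuron neuron = new ErrorDrivenNeuron(2, 0.1);
        // Toy data generated by y = 2*x1 + 3*x2, so a perfect solution exists
        double[][] inputs = {{1.0, 0.0}, {0.0, 1.0}, {1.0, 1.0}};
        double[] targets = {2.0, 3.0, 5.0};
        for (int epoch = 0; epoch < 200; epoch++) {
            for (int i = 0; i < inputs.length; i++) {
                neuron.train(inputs[i], targets[i]);
            }
        }
        // After training, the output for (1, 1) approaches the target 5.0
        System.out.printf("output for (1,1): %.2f%n",
                neuron.output(new double[]{1.0, 1.0}));
    }
}
```

Note that the first outputs are essentially random (here, zero), and it is only the repeated error-based adjustment that drives the neuron toward the correct answers.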

Tip

We can then think of the ANN parameters (weights) as the components of a solution. Let's imagine that each weight corresponds to a dimension and one single solution represents a single point in the solution hyperspace. For each single solution, there is an error measure informing how far that solution is from the satisfaction criteria. The learning algorithm then iteratively seeks a solution closer to the satisfaction criteria.
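The idea of an error measure attached to each point in the weight hyperspace can be made concrete with a small sketch. Here we use mean squared error over a toy dataset as that measure; the class name and the data are invented for illustration and stand in for whatever cost function a given problem requires.

```java
// Sketch: each weight vector is one point in the solution hyperspace,
// and an error function scores how far that point is from a satisfactory
// solution. Names and data here are illustrative only.
public class SolutionError {
    // Mean squared error of a linear model with weights w over a dataset
    public static double meanSquaredError(double[] w, double[][] x, double[] t) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double y = 0.0;
            for (int j = 0; j < w.length; j++) {
                y += w[j] * x[i][j];
            }
            double e = t[i] - y;
            sum += e * e;
        }
        return sum / x.length;
    }

    public static void main(String[] args) {
        // Toy data generated by y = 2*x1 + 3*x2
        double[][] x = {{1.0, 0.0}, {0.0, 1.0}, {1.0, 1.0}};
        double[] t = {2.0, 3.0, 5.0};
        double[] guess = {0.0, 0.0};  // one point in the 2-D weight space
        double[] better = {2.0, 3.0}; // another point, matching the data
        System.out.println(meanSquaredError(guess, x, t));  // far from criteria
        System.out.println(meanSquaredError(better, x, t)); // satisfies them
    }
}
```

A learning algorithm is then just a rule for moving from one point, such as guess, toward points of lower error, such as better, until the satisfaction criterion is met.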
