1. Single Solution Based Metaheuristics

1.1. Introduction

In daily life, optimizing is a common activity for both individuals and professionals: it involves, for example, minimizing production time and expenditure, or maximizing profit and performance. As a result, optimization has become a discipline in itself for solving various problems, particularly in the economic sector [SAM 69], in the field of biology [GOT 82] and in various industries [YOS 00]. To solve these problems, algorithms are implemented that simulate real problems in order to achieve results that will subsequently be used according to the intended purpose. These simulations can be simple but also very complex. The developed algorithms can be exact or approximate, leading to the global optimum or to a solution close to the global optimum. The objective of optimization is to find the global and/or the local optimum or optima. Depending on the optimization problem being addressed, one or more methods can be applied, and one of them may be more suitable than the others.

These methods include the class of path-based methods, also called single-solution metaheuristics. These are algorithms in which the search for the optimum is achieved by manipulating a single solution throughout the progression of the algorithm. Starting from an initial solution, the latter evolves following a certain mechanism until the stopping criterion is reached. Furthermore, the acceptance of one solution instead of another is carried out in various ways, depending on the proposed model. The principal characteristic of path-based methods is that they mainly favor exploitation, by focusing their search on a given region of the search space.

In this chapter, we are going to introduce a number of path-based methods, starting with the most common algorithms, notably descent methods, simulated annealing, microcanonical annealing and tabu search. We will proceed with exploratory local search algorithms that incorporate other path-based methods, such as the Greedy Randomized Adaptive Search Procedure (GRASP), variable neighborhood search, guided local search and iterated local search. We are also going to present other methods such as Nelder and Mead’s simplex method, the noising method and smoothing methods.

1.2. The descent method

The descent method (called “hill climbing” for maximization problems) starts from a solution initialized randomly or with a greedy method. It then selects a solution in the neighborhood of the current solution that strictly improves it. This selection can be done in a number of different ways: the retained solution may be the first feasible solution that improves the objective function, or the best feasible solution of the entire neighborhood. Depending on this choice, the method is called the simple descent method or the steepest descent method, respectively.

Algorithm 1.1: The steepest descent method

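As an illustration, the steepest descent loop can be sketched in Python as follows; the quadratic objective and the ±1 neighborhood are hypothetical toy choices used only to make the sketch runnable:

```python
def steepest_descent(f, s, neighbors):
    """Steepest descent: repeatedly move to the best neighbor
    as long as it strictly improves the current solution."""
    while True:
        best = min(neighbors(s), key=f)
        if f(best) < f(s):
            s = best          # strict improvement: accept the move
        else:
            return s          # local optimum reached

# Toy problem (hypothetical): minimize (x - 3)^2 over the integers,
# with the neighborhood {x - 1, x + 1}.
f = lambda x: (x - 3) ** 2
print(steepest_descent(f, 20, lambda x: [x - 1, x + 1]))  # prints 3
```

On this convex toy problem the descent reaches the global optimum; on a multimodal landscape it would stop at the first local minimum encountered.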

The descent method is one of the simplest methods found in the literature; however, it presents a significant limitation: it may easily become trapped in a local minimum, and it therefore encourages exploitation rather than exploration. In order to overcome this problem and visit different regions of the search space, a variant called the restarted descent method executes the algorithm several times from different initial solutions. Another technique for escaping from local minima consists of accepting a degrading solution with a certain probability, a technique found in simulated annealing.

1.3. Simulated annealing

Simulated annealing [KIR 84] is known as the first metaheuristic to propose a mechanism for escaping local optima, by implementing the Metropolis algorithm proposed in 1953 [MET 53]. It is a method inspired by the annealing process used in metallurgy, which consists of heating a metal to a high temperature and then cooling it until a stable state, called thermodynamic equilibrium, is obtained. This stable state may be “good” or “bad”, that is, with minimal energy or not. Indeed, when the metal is cooled too quickly, deformations can arise, whereas a “perfect” metal is achieved if the cooling process is adequate. The temperature corresponds to the control parameter of the metal’s stability.

By analogy, based on a random solution s, a solution s′ is generated in the neighborhood of s. If this neighbor improves the current solution, s is updated. Otherwise, s′ can still be accepted with probability exp(−Δf/T). This probability makes it possible to accept a degrading solution when s′ presents only a small degradation Δf with respect to s, or when the temperature T is large enough; exploration is thereby favored. However, this probability becomes smaller as the search progresses, since the temperature follows a decreasing function and is updated at each iteration, thereby making exploitation more prominent.

Algorithm 1.2: Simulated annealing

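The scheme can be sketched as follows; the geometric cooling schedule, the parameter values and the toy problem are illustrative assumptions rather than part of the method’s definition:

```python
import math
import random

def simulated_annealing(f, s, neighbor, t0=10.0, alpha=0.95, iters=2000, rng=None):
    """Simulated annealing with geometric cooling T <- alpha * T."""
    rng = rng or random.Random(0)
    t, best = t0, s
    for _ in range(iters):
        s2 = neighbor(s, rng)
        delta = f(s2) - f(s)
        # improving moves are always accepted; degrading ones with exp(-delta/T)
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            s = s2
        if f(s) < f(best):
            best = s
        t *= alpha        # decreasing temperature: exploration, then exploitation
    return best

# Toy problem (hypothetical): minimize (x - 3)^2 over the integers.
f = lambda x: (x - 3) ** 2
result = simulated_annealing(f, 40, lambda x, r: x + r.choice((-1, 1)))
```

Early on, the high temperature lets the walk wander freely; once T is small, only improving moves are accepted and the run behaves like a plain descent.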

Simulated annealing therefore depends on the temperature and its evolution. If the temperature follows a too strongly decreasing function, the algorithm may become trapped in a local optimum; this is referred to as premature convergence. Otherwise, the global optimum can be reached, but optimality remains unguaranteed.

On the other hand, simulated annealing implements the Metropolis algorithm proposed in 1953 [MET 53], which may require quite significant computation time; hence the interest of microcanonical annealing.

1.4. Microcanonical annealing

Microcanonical annealing was introduced in 1987 [BAR 87] and its model is described by algorithm 1.3. This method shares principles with simulated annealing. Microcanonical annealing involves reducing the total energy, starting from a high total energy, by decreasing the kinetic energy between two levels. This reduction follows a given decreasing function and allows the algorithm to converge toward optimal solutions. The method utilizes the algorithm proposed by Creutz in 1983 [CRE 83], whose objective was to maximize the entropy at constant total energy by means of a succession of transitions.

Algorithm 1.3: Microcanonical annealing

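A minimal sketch of the approach, assuming a Creutz-style “demon” that pays for energy increases and a fixed-percentage energy reduction between levels; the cap on the demon energy, the level counts and the toy problem are illustrative choices:

```python
import random

def microcanonical_annealing(f, s, neighbor, demon=5.0, cut=0.9,
                             levels=100, steps=50, rng=None):
    """Sketch: Creutz sweeps at (roughly) constant total energy,
    with the available kinetic energy reduced between levels."""
    rng = rng or random.Random(1)
    best = s
    for _ in range(levels):
        cap = demon                       # kinetic energy available at this level
        for _ in range(steps):            # Creutz sweep
            s2 = neighbor(s, rng)
            delta = f(s2) - f(s)
            if delta <= demon:            # the demon can pay for the increase
                demon = min(demon - delta, cap)  # it absorbs decreases, capped
                s = s2
                if f(s) < f(best):
                    best = s
        demon *= cut                      # fixed-percentage energy reduction
    return best

# Toy problem (hypothetical): minimize (x - 3)^2 over the integers.
f = lambda x: (x - 3) ** 2
best = microcanonical_annealing(f, 40, lambda x, r: x + r.choice((-1, 1)))
```

As the demon energy is cut level after level, fewer and fewer degrading moves can be paid for, and the sweeps degenerate into a plain descent.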

In general, starting from a randomly drawn initial state s0 with a high energy E0 and a demon energy initialized to 0, an energy reduction of a fixed percentage is applied, followed by the Creutz algorithm until a thermodynamic equilibrium state is reached. These two operations are repeated until there is no further improvement.

Microcanonical annealing is known to be similar to simulated annealing. This method is just as effective as its predecessor in terms of results, while being simpler and faster in terms of computation time [HER 93]. This speed is due to the implementation of the Creutz algorithm. However, it remains possible to encounter the same local optimum repeatedly during the search, even after having escaped from it. This is what tabu search takes into consideration by making use of the notion of memory.

1.5. Tabu search

Tabu search, introduced by Glover in 1986 [GLO 86a], is based on the principle of human memory. This algorithm makes it possible to memorize previously encountered solutions by storing them in what we call a tabu list.

Tabu search consists of exploring the neighborhood from a given position and selecting the best solution encountered that is not included in the tabu list. Consequently, it is possible to retain a degrading solution, which allows escaping from local minima, while the prohibition of moves to already encountered solutions avoids falling back into them.

Algorithm 1.4: Tabu search

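A minimal sketch, assuming a short-term tabu list of recently visited solutions; the toy landscape (a local minimum at index 1, the global minimum at index 7) is hypothetical:

```python
from collections import deque

def tabu_search(f, s, neighbors, tenure=5, iters=15):
    """Tabu search sketch: always move to the best neighbor that is not
    in the tabu list, even if it degrades the current solution."""
    tabu = deque([s], maxlen=tenure)
    best = s
    for _ in range(iters):
        # fall back to all neighbors if every one of them is tabu
        cand = [n for n in neighbors(s) if n not in tabu] or neighbors(s)
        s = min(cand, key=f)
        tabu.append(s)
        if f(s) < f(best):
            best = s
    return best

# Toy landscape (hypothetical): local minimum at index 1, global at index 7.
landscape = [3, 2, 3, 4, 3, 2, 1, 0, 1, 2]
f = lambda i: landscape[i]
neighbors = lambda i: [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]
print(tabu_search(f, 0, neighbors))  # prints 7
```

A plain descent from index 0 would stop at index 1; here the tabu list forces degrading moves through indices 2–6 and the global minimum is reached.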

The size of this tabu list thus represents the major parameter of this method. By increasing the size of this list, the exploration mechanism is favored; by decreasing it, the exploitation mechanism is favored. This list can have a variable size during the search [TAI 91] or be reactive [BAT 94], depending on the solutions obtained: if the same solution is often encountered, the size of the list is increased; if the current solution is only rarely improved, it is decreased.

1.6. Pattern search algorithms

In this section, we are going to present more recent path-based algorithms, in particular the GRASP method, variable neighborhood search, guided local search and iterated local search.

1.6.1. The GRASP method

The GRASP method was proposed by Feo and Resende [FEO 95] and is one of the simplest metaheuristics. It is an iterative method in which each iteration comprises a solution-construction step, followed by a local search step, returning the best feasible solution found for the given problem, as described in algorithm 1.5.

Algorithm 1.5: The GRASP method


The construction step (algorithm 1.6) inserts one element at each iteration into the partial solution. This element is chosen randomly from a restricted candidate list comprising only the best candidates. The quality of a candidate is measured by the benefit it brings to the solution under construction and is updated during the following iterations. The result is therefore a solution whose local optimality cannot be guaranteed. A local search method is then applied, which brings us to the second step of the algorithm: based on the solution obtained during the construction phase, a local search method such as the descent method, simulated annealing or even tabu search is applied in order to improve this solution.

The GRASP method is a simple method that executes with a relatively small computation time and which leads to good results. It can therefore easily be integrated among other metaheuristics.

Algorithm 1.6: The construction step of the solution

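Both steps can be sketched together on a hypothetical subset-sum instance; the restricted candidate list is built here from the fraction alpha of the best candidates, and the local search is a simple 1-flip descent (both illustrative choices):

```python
import random

def construct(nums, target, alpha, rng):
    """Greedy randomized construction: repeatedly pick, at random, one of
    the best candidates (restricted candidate list) that reduce the gap."""
    sol, total = set(), 0
    while True:
        cand = [i for i in range(len(nums)) if i not in sol
                and abs(target - total - nums[i]) < abs(target - total)]
        if not cand:
            return sol
        cand.sort(key=lambda i: abs(target - total - nums[i]))
        i = rng.choice(cand[:max(1, int(alpha * len(cand)))])  # RCL
        sol.add(i)
        total += nums[i]

def local_search(sol, nums, target):
    """1-flip descent on the constructed solution."""
    err = lambda s: abs(target - sum(nums[i] for i in s))
    improved = True
    while improved:
        improved = False
        for i in range(len(nums)):
            s2 = sol ^ {i}                 # flip item i in or out
            if err(s2) < err(sol):
                sol, improved = s2, True
    return sol

def grasp(nums, target, alpha=0.5, iters=20, rng=None):
    rng = rng or random.Random(2)
    err = lambda s: abs(target - sum(nums[i] for i in s))
    best = set()
    for _ in range(iters):
        s = local_search(construct(nums, target, alpha, rng), nums, target)
        if err(s) < err(best):
            best = s
    return best

# Hypothetical instance: pick numbers summing as close to 16 as possible.
nums, target = [8, 5, 11, 3, 7, 2], 16
best = grasp(nums, target)
```

Each iteration is independent: a randomized construction followed by a descent, with only the best solution found being kept.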

1.6.2. Variable neighborhood search

Variable neighborhood search [MLA 97] is based on the systematic change in the neighborhood during the search.

We start by selecting a set of neighborhood structures Nk, where k = 1, 2, …, kmax, and determine an initial solution s. Then, three steps are applied: perturbation, local search, and the move (or continuation) starting from s. The perturbation consists of randomly generating a solution s′ in the neighborhood of s defined by the kth neighborhood structure. From s′, a local search method is applied, which returns s′′. The move makes it possible to refocus the search around s′′: if s′′ improves the best known solution, then s is updated with s′′ and the algorithm is restarted from the first neighborhood; otherwise, the following neighborhood is considered. This method thus allows the exploration of several types of neighborhoods until the current solution is improved. At this stage, the neighborhoods of the new solution are explored, and so on until a stopping criterion is satisfied.

Thus, when variable neighborhood search is used, it is necessary to determine the number and types of neighborhoods to utilize, the exploration order of the neighborhoods during the search (in general, by increasing neighborhood size), the strategy for changing neighborhoods, the local search method and the stopping criterion.

Algorithm 1.7: Variable neighborhood search

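A basic sketch of the three steps (perturbation, local search, move), assuming nested integer neighborhoods of radius k and a toy multimodal objective (both hypothetical choices):

```python
import random

def variable_neighborhood_search(f, s, kmax=3, iters=50, rng=None):
    """Basic VNS sketch: shake in the k-th neighborhood, run a local
    search, and move (resetting k to 1) only on improvement."""
    rng = rng or random.Random(3)

    def local_search(x):                  # plain descent with a +-1 neighborhood
        while True:
            n = min((x - 1, x + 1), key=f)
            if f(n) < f(x):
                x = n
            else:
                return x

    for _ in range(iters):
        k = 1
        while k <= kmax:
            s2 = local_search(s + rng.randint(-k, k))  # shake in N_k, then descend
            if f(s2) < f(s):
                s, k = s2, 1              # move: refocus and restart from N_1
            else:
                k += 1                    # try the next, larger neighborhood
    return s

# Toy objective (hypothetical): local minima at multiples of 3,
# global minimum at x = 12.
f = lambda x: abs(x - 12) + 4 * (x % 3 != 0)
result = variable_neighborhood_search(f, 0)
```

The ±1 descent alone is trapped at every multiple of 3; the shakes of growing radius repeatedly carry the search into the next basin until x = 12 is reached.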

Variants of this method have been proposed, such as variable neighborhood descent [HAN 01a], variable neighborhood decomposition search [HAN 01b] and biased variable neighborhood search [HAN 00].

Variable neighborhood descent makes use of the steepest descent method during the local search: if the solution is improved, the search resumes from the first neighborhood; otherwise, the following type of neighborhood is considered.

Variable neighborhood decomposition search follows the same steps as variable neighborhood search, except that a neighbor s′ of s is selected and its neighborhood is generated with the same type of neighborhood structure, in such a way that the intersection of the neighborhoods of s and s′ is empty; the steepest descent method is then applied, yielding a local optimum s′′. Next, the intersection of the neighborhoods of s and s′ is considered, s′′ is inserted into it, and the steepest descent method is applied once again. The move is then carried out from the last local optimum obtained.

Biased variable neighborhood search keeps the best solution sopt found throughout the whole search and, in order to encourage the exploration of the search space, refocuses the search using a parameter α and a function ρ that calculates the distance between two solutions. In effect, based on a solution s, a neighbor s′ is selected using the first type of neighborhood, then the local search method is applied to it, yielding the local optimum s′′. If s′′ improves sopt, then sopt is updated with s′′. The search is then refocused on s′′: if f(s′′) − αρ(s, s′′) < f(s), the search is restarted from the first neighborhood with s′′ as the starting solution; otherwise, the next neighborhood is considered.

1.6.3. Guided local search

To escape local optima, some methods use several neighborhood structures, such as variable neighborhood search, while others additionally employ some memory to avoid revisiting a solution, such as tabu search. Introduced by Voudouris, guided local search [VOU 99] utilizes a technique in which the solution set and the neighborhood structure remain identical throughout the search, but the objective function is dynamically modified. This technique makes it possible to orientate the search toward other regions of the search space, thus encouraging exploration.

In the case of minimization problems, equation [1.1] represents this modification, which is reflected in the increase of the objective function with a set of penalizations:

[1.1] f(s) = g(s) + λ · Σi=1..M pi·Ii(s)

where:

  • g(s) is the objective function of a given problem;
  • λ is the regularization parameter;
  • pi are the penalization parameters;
  • M is the number of attributes defining the solutions;
  • Ii(s) indicates whether the solution s contains the attribute i, and therefore takes 1 or 0 as its value.

The regularization parameter λ enables us to determine the significance of the variation in the objective function and to control the influence of the information related to the search.

The penalization parameters pi correspond to the weights of the constraints of each attribute i with respect to the solution. At each iteration, when the search is trapped in a local optimum, only the attributes that maximize the utility are penalized, by incrementing pi by 1. This utility is calculated according to expression [1.2], based on the cost ci of attribute i and its penalization parameter pi:

[1.2] utili(s) = Ii(s) · ci / (1 + pi)

Algorithm 1.8: Guided local search

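A minimal sketch of the scheme: descent on the augmented objective, then penalization of the max-utility attributes at each local optimum. As a deliberately simple (hypothetical) choice of attributes, each solution is here its own single feature with cost g(s):

```python
def guided_local_search(g, s, neighbors, features, lam=1.0, iters=30):
    """GLS sketch: local search on h(s) = g(s) + lam * sum of penalties
    over s's features; penalize max-utility features at local optima."""
    pen = {}                                    # penalties p_i, initially 0
    h = lambda x: g(x) + lam * sum(pen.get(i, 0) for i in features(x))
    best = s
    for _ in range(iters):
        while True:                             # descent on the augmented objective
            n = min(neighbors(s), key=h)
            if h(n) < h(s):
                s = n
            else:
                break
        if g(s) < g(best):
            best = s
        # utility util_i = c_i / (1 + p_i); each feature's cost is g(s) here
        util = {i: g(s) / (1 + pen.get(i, 0)) for i in features(s)}
        top = max(util.values())
        for i, u in util.items():
            if u == top:
                pen[i] = pen.get(i, 0) + 1
    return best

# Hypothetical landscape: local minimum at index 1, global minimum at index 3.
landscape = [2, 1, 2, 0]
g = lambda i: landscape[i]
neighbors = lambda i: [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]
print(guided_local_search(g, 0, neighbors, features=lambda i: {i}))  # prints 3
```

The accumulated penalties progressively raise h around the local minimum at index 1 until the descent is pushed over the ridge toward the global minimum.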

1.6.4. Iterated local search

Iterated local search, proposed by Lourenço in 2001 [LOU 01, LOU 03], is the most general model of exploratory algorithms. In effect, from a solution s, an intermediate solution s′ is obtained by applying a perturbation; a local search is then performed, culminating in a local optimum s′′ that may be kept, depending on the acceptance criterion.

Algorithm 1.9: Iterated local search


It should be noted that the perturbation is intended to direct the search toward another basin of attraction. It thus significantly impacts the search and must be neither too small nor too large. A perturbation that is too small will not make it possible to explore new solutions, and the algorithm will quickly stagnate. On the other hand, a perturbation that is too large will give the algorithm a character similar to that of “random-restart” algorithms, which will bias the search. A non-deterministic perturbation is a means to avoid cycling through the same solutions again. It is characterized by the so-called “perturbation force”, which can be fixed or variable, random or adaptive. If the force is adaptive, the exploration and exploitation criteria can be controlled: by increasing this force, exploration is favored; by decreasing it, exploitation is favored.

The objective of the acceptance criterion is to determine the conditions that the new local optimum must satisfy in order to replace the current solution. This criterion thus enables the implementation of an equilibrium by keeping the newly found local optimum or by restarting from the previously found local optimum.
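The perturbation / local search / acceptance loop can be sketched as follows; the jump-based perturbation and the toy landscape are illustrative assumptions:

```python
import random

def iterated_local_search(f, s, neighbors, perturb, iters=50, rng=None):
    """ILS sketch: perturb the current optimum, descend to a new local
    optimum, and keep it only if it passes the acceptance criterion."""
    rng = rng or random.Random(4)

    def descent(x):
        while True:
            n = min(neighbors(x), key=f)
            if f(n) < f(x):
                x = n
            else:
                return x

    s = descent(s)
    best = s
    for _ in range(iters):
        s2 = descent(perturb(s, rng))      # perturbation, then local search
        if f(s2) <= f(s):                  # acceptance criterion: no degradation
            s = s2
        if f(s) < f(best):
            best = s
    return best

# Toy landscape (hypothetical): local minimum at index 1, global at index 7.
landscape = [3, 2, 3, 4, 3, 2, 1, 0, 1, 2]
f = lambda i: landscape[i]
neighbors = lambda i: [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]
perturb = lambda i, r: max(0, min(len(landscape) - 1, i + r.choice((-3, -2, 2, 3))))
best = iterated_local_search(f, 0, neighbors, perturb)
```

The jumps of size 2–3 act as the perturbation force: small enough to stay near the current basin, large enough to occasionally land in the basin of the global minimum.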

1.7. Other methods

In this section, we will introduce other path-based algorithms, in particular the Nelder–Mead simplex method, smoothing methods and the noising method.

1.7.1. The Nelder–Mead simplex method

Initially proposed by Spendley in 1962 [SPE 62], then improved by Nelder and Mead in 1965 [NEL 65], the simplex method aims at solving unconstrained optimization problems by utilizing a succession of simplex transformations.

Initially, a set of (n+1) vertices x1, x2, …, xn+1 in an n-dimensional space is initialized, constituting the current simplex. This initialization can be achieved by generating a point x1, then defining the other points around it along each direction ei of the space with an amplitude a:

[1.3] xi+1 = x1 + a·ei, i = 1, …, n

Until the stopping condition is reached, the simplex is transformed with the reflection (α), expansion (γ) and contraction (β) operators applied to the worst vertex xh, as defined by equations [1.5]–[1.8], the goal being to replace the vertex with the highest cost by another having a lower cost.

In the following, the vector c̄ represents the centroid of all the vertices of the simplex except xh:

[1.4] c̄ = (1/n) Σi≠h xi
  • Reflection operator:

The reflection operator is applied to xh, whose cost is fh, in order to obtain xr with cost fr:

[1.5] xr = c̄ + α(c̄ − xh)

If fl ≤ fr < fs, vertex xh is replaced by vertex xr, with fl the cost of the best vertex xl and fs the cost of the second-worst vertex xs.

  • Expansion operator:

If fr < fl, the expansion operator is applied to vertex xr, obtaining vertex xe of cost fe:

[1.6] xe = c̄ + γ(xr − c̄)

In the case where fe < fr, vertex xe is accepted; otherwise, vertex xr is.
  • Contraction operator:

If fr ≥ fs, the contraction operator is applied and a vertex xc of cost fc is obtained. If fs ≤ fr < fh, vertex xc is computed with equation [1.7] and accepted if fc ≤ fr:

[1.7] xc = c̄ + β(xr − c̄)

On the other hand, if fr ≥ fh, vertex xc′ of cost fc′ is computed with equation [1.8] and accepted if fc′ < fh:

[1.8] xc′ = c̄ + β(xh − c̄)

If no vertex is accepted during the previous steps, all vertices xi are shrunk toward the best vertex xl:

[1.9] xi = (xi + xl)/2, i = 1, …, n+1

Just like all path-based algorithms, this method converges toward a local optimum. In order to overcome this limitation of the conventional method, it is possible to restart it once a local optimum is reached, reinitializing the simplex each time.
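A compact sketch of the transformations described above, with the usual parameter values α = 1, γ = 2 and β = 0.5 (an assumed setting; other values are possible), tested on a toy quadratic:

```python
def nelder_mead(f, x0, a=1.0, alpha=1.0, gamma=2.0, beta=0.5, iters=200):
    """Compact Nelder-Mead sketch: reflection, expansion, contraction
    and shrink steps applied to a simplex of n+1 vertices."""
    n = len(x0)
    # initial simplex: x0 plus a step of amplitude a along each axis
    simplex = [list(x0)] + [[x0[j] + (a if j == i else 0.0) for j in range(n)]
                            for i in range(n)]
    for _ in range(iters):
        simplex.sort(key=f)
        best, second_worst, worst = simplex[0], simplex[-2], simplex[-1]
        c = [sum(v[j] for v in simplex[:-1]) / n for j in range(n)]  # centroid
        xr = [c[j] + alpha * (c[j] - worst[j]) for j in range(n)]    # reflection
        if f(best) <= f(xr) < f(second_worst):
            simplex[-1] = xr
        elif f(xr) < f(best):
            xe = [c[j] + gamma * (xr[j] - c[j]) for j in range(n)]   # expansion
            simplex[-1] = xe if f(xe) < f(xr) else xr
        else:
            # contraction toward the better of xr and the worst vertex
            xh = xr if f(xr) < f(worst) else worst
            xc = [c[j] + beta * (xh[j] - c[j]) for j in range(n)]
            if f(xc) < min(f(xr), f(worst)):
                simplex[-1] = xc
            else:                                                    # shrink
                simplex = [best] + [[(v[j] + best[j]) / 2 for j in range(n)]
                                    for v in simplex[1:]]
    simplex.sort(key=f)
    return simplex[0]

# Toy problem (hypothetical): minimize (x - 1)^2 + (y - 2)^2 from (5, 5).
xmin = nelder_mead(lambda p: (p[0] - 1) ** 2 + (p[1] - 2) ** 2, [5.0, 5.0])
```

On this convex toy problem the simplex contracts onto the minimum near (1, 2); on multimodal functions the restart strategy mentioned above becomes necessary.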

1.7.2. The noising method

Introduced by Charon in 1993 [CHA 93], the noising method is a single-solution metaheuristic based on the descent approach. Starting from an initial solution, at each iteration noise is added to the values of the objective function before the descent is carried out. The added noise starts from a given initial value and decreases at each iteration until it reaches zero or a minimal value. The variation of this noise can be defined in various ways. Algorithm 1.10 summarizes the approach of the noising method.

Algorithm 1.10: The noising method

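A minimal sketch, assuming uniform noise U(−r, r) added to each evaluation and a geometrically decreasing noise rate; the parameter values and the toy landscape are hypothetical:

```python
import random

def noising_method(f, s, neighbors, r_max=3.0, theta=0.9,
                   sweeps=100, steps=20, rng=None):
    """Noising-method sketch: descend on the noised objective
    f(x) + U(-r, r), with the noise rate r decreasing toward 0."""
    rng = rng or random.Random(5)
    r, best = r_max, s
    for _ in range(sweeps):
        for _ in range(steps):
            # pick the best candidate under the noised objective
            s = min(neighbors(s) + [s], key=lambda x: f(x) + rng.uniform(-r, r))
            if f(s) < f(best):
                best = s
        r *= theta                # geometrically decreasing noise rate
    return best

# Toy landscape (hypothetical): local minimum at index 1, global at index 7.
landscape = [3, 2, 3, 4, 3, 2, 1, 0, 1, 2]
f = lambda i: landscape[i]
neighbors = lambda i: [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]
best = noising_method(f, 0, neighbors)
```

While the noise dominates the cost differences, the walk moves almost freely across the landscape; as r shrinks, the loop degenerates into a plain descent.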

In this method, two parameters have to be determined: the noise rate and the decreasing rate of this noise. The first consists of defining the values of rmax and rmin, with rmin usually set to 0, while the second parameter may follow the geometric decreasing function defined by equation [1.10] or other more advanced schemes [CHA 06]:

[1.10] rk+1 = θ · rk, with 0 < θ < 1

However, the choice of these parameters has a significant impact on the performance of this method.

1.7.3. Smoothing methods

The main goal of smoothing methods is to reduce the number of local optima as well as the depth of the basins of attraction by modifying the objective function, yet without changing the search space [GLO 86b, JUN 94]. Once this smoothing operation is performed, any path-based method or metaheuristic can generally be utilized.

This approach involves subdividing the problem being considered into several successive instances relating to a specific search space. The simplest instance is dealt with first, and a local search method is then applied. The solution obtained is kept as the initial solution of the problem related to the next instance, which becomes increasingly complex at each iteration, thereby enabling this solution to be improved.

Algorithm 1.11: The smoothing method

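A minimal sketch, assuming a (hypothetical) decomposition of the objective into a smooth part and a rugged part, interpolated by a parameter a that goes from the smoothest instance (a = 0) to the true one (a = 1):

```python
def smoothing_method(base, bump, s, neighbors,
                     stages=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Smoothing sketch: solve a sequence of objectives
    f_a(x) = base(x) + a * bump(x), reusing each solution
    as the starting point of the next, harder instance."""
    for a in stages:
        f = lambda x, a=a: base(x) + a * bump(x)
        while True:                      # plain descent on the current instance
            n = min(neighbors(s), key=f)
            if f(n) < f(s):
                s = n
            else:
                break
    return s

# Toy decomposition (hypothetical): convex base with rugged bumps.
base = lambda x: (x - 12) ** 2 / 10
bump = lambda x: 3 * (x % 4 != 0)
neighbors = lambda x: [x - 1, x + 1]

s_smooth = smoothing_method(base, bump, 30, neighbors)                 # -> 12
s_direct = smoothing_method(base, bump, 30, neighbors, stages=(1.0,))  # -> 24
```

A direct descent on the full objective from x = 30 is trapped at x = 24, whereas the smoothed sequence first solves the convex instance and then keeps the global minimum at x = 12 through the harder stages.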

Smoothing makes it possible to decrease the likelihood of being trapped in a local optimum, and this technique can reduce the complexity of the algorithm that implements it. However, the computation time of the algorithm can prove considerable.

1.8. Conclusion

All the methods presented in this chapter are considered to be local searches, because they rely on intensification, which allows a good-quality solution to be obtained. To improve the results obtained, distributing the search over a larger number of regions proves to be a complementary approach, making it possible to span several regions of the search space by generating multiple solutions [XU 14].
