Showing newest posts with label function optimization. Show older posts
Showing newest posts with label function optimization. Show older posts

Saturday, January 17, 2009

Multiple Restart Search: Multiple runs to address convergence

While searching for the global optimum of a given problem domain the may question arise as to whether it is worth continuing with the search or starting the search again. This strategy of repeatedly executing a search on a problem domain with new initial conditions is called Random, Early, or Multiple Restart Search.

The Multiple Restart Search algorithm involves the application of another search technique until a predefined condition is met, after which the embedded search technique is executed again with different starting conditions. The conditions that govern when a restart will occur are refereed to as the restart schedule and examples include: a fixed number of function evaluations, a fixed time, or the convergence of the search process (no improvements for a fixed number of iterations or function evaluations).

Restarting or multiple executions of a technique is a common practice with stochastic optimization algorithms as they are influenced by their initial starting conditions and convergent behavior that may lead to local rather than the desired global optimum of a problem's search space. Restart schedules are also used with local search and hill climbing searching algorithms as they allow potentially multiple local optima to be explored improving the algorithms approximation of the global optima. Such algorithms are commonly referred to a Random-Restart Hill Climbing algorithms.

The multiple executions of an embedded search algorithm (the restarts) are traditionally executed sequentially as the name suggests. Alternatively, the algorithm may exploit parallel or distributed hardware and execute each algorithm run in parallel, offering a linear (and in some cases a super-linear) speedup of locating or approximating a globally optimal solution.

Multiple Restart Search for Function Optimization
This guide provides an example of a the multiple restart search algorithm applied to a continuous function optimization problem. The embedded search algorithm that is repeatedly restarted is a hill climbing algorithm, and as such the generalized approach may be referred to as a Random-Restart Hill Climber.

The SquaringFunction class defines an instance of a function optimization problem. The problem has a configurable number of parameters (@num_dimensions) the squares of which are summed together (hence the name). The evaluate(vector) method expects a vector of real valued numbers, the in_bounds?(vector) convenience method determines whether or not a given vector is valid within the bounds of the problems search space [-5.12, 5.12], and the is_optimal?(scoring) method determines whether or not the given scoring is optimal (equal to zero).
class SquaringFunction
attr_reader :dimensions, :min, :max

def initialize(dimensions=2)
@dimensions = dimensions
@min, @max = -5.12, +5.12
end

def evaluate(vector)
vector.inject(0) {|sum, x| sum + (x ** 2.0)}
end

def in_bounds?(vector)
vector.each {|x| return false if x>@max or x<@min}
return true
end

def is_optimal?(scoring)
scoring == optimal_score
end

def optimal_score
0.0
end

# true if s1 has a better score than s2
def is_better?(s1, s2)
s1 < s2 # minimizing
end
end
The Solution class provides a container for candidate solutions. The @data instance variable maintains an immutable solution representation (in this case real valued vectors), whereas the @score instance variable maintains a mutable problem specific scoring for the maintained data.
class Solution
attr_reader :data
attr_accessor :score

def initialize(data)
@data = data
@score = 0.0/0.0 # NaN
end

def to_s
"[#{@data.inspect}] (#{@score})"
end
end
The MultipleRestartSearchAlgorithm provides an implementation of the random-restart hill climbing algorithm. The initialize(max_restarts) constructor stores some algorithm configuration parameters in instance variables, specifically the maximum number of restarts provided as an argument, and the number of non-improving steps taken per run of the embedded hill climbing algorithm (set to 100). The search(problem) method manages the restart schedule generating a random starting point which is used to seed a run of the embedded hill climbing algorithm on each restart.

The problem independent hill_climb(problem, current) method does the hard work for the search, performing a fixed step hill climbing procedure on a provided starting point. The step size is set to 10% of the possible search space, and is unchanged throughout the run. The main loop of this procedure involves the generation of two steps (a positive and negative application of the step size) for a single problem component (function parameter). If one of the two steps results in an improvement the candidate solution is taken as the current position otherwise the number of non-improving iterations is incremented. This counter is rest each time an improving move is made, and is used to measure the convergence of a run of the hill climbing algorithm. The procedure loops through all components of the problem (one per iteration) seeking improvements. A run completes on convergence, which is occurs when the no_improve_count variable exceeds the @max_no_improvements parameter.

The take_step(problem, current, step_size, index, add) method creates a single new candidate solution from a given solution by mutating a specific component of the solution in a specific direction. Interestingly, this is the only problem-specific part of the algorithm, that in turn calls the next_bfloat(min, max) utility method to generates random values that are used to perturb a given vector.
class MultipleRestartSearchAlgorithm
attr_accessor :max_iterations
attr_reader :best_solution

def initialize(max_restarts)
@max_restarts = max_restarts
@max_no_improvements = 100 # max steps for hill climber
end

# execute a random search on the provided problem
def search(problem)
curr_it = 0
begin
# generate a random solution
current = generate_random_solution(problem)
evaluate_candidate_solution(current, problem)
# hill climb until convergence
run_best = hill_climb(problem, current)
curr_it += 1
puts "> finished run #{curr_it} with: #{run_best.score}"
end until should_stop?(curr_it, problem)
return @best_solution
end

def should_stop?(curr_it, problem)
(curr_it >= @max_restarts) or problem.is_optimal?(best_solution.score)
end

def generate_random_solution(problem)
real_vector = Array.new(problem.dimensions) do
next_bfloat(problem.min, problem.max)
end
return Solution.new(real_vector)
end

def hill_climb(problem, current)
step_size = (problem.max-problem.min)*0.10
no_improve_count = 0
index = 0
begin
# take a step
step1 = take_step(problem, current, step_size, index, true)
evaluate_candidate_solution(step1, problem)
step2 = take_step(problem, current, step_size, index, false)
evaluate_candidate_solution(step2, problem)
# check for improvement
if problem.is_better?(step1.score, current.score) or
problem.is_better?(step2.score, current.score)
# store the best
if problem.is_better?(step1.score, current.score)
current = step1
else
current = step2
end
no_improve_count = 0 # reset
else
no_improve_count += 1 # count consecutative no improvements
end
index = (index==problem.dimensions-1) ? 0 : index+1
end until no_improve_count >= @max_no_improvements
return current
end

def take_step(problem, current, step_size, index, add)
vector = nil
begin # keep stepping until a valid point is generated
offset = next_bfloat(0, step_size)
vector = Array.new(current.data)
vector[index] += (add) ? offset : -offset
end until problem.in_bounds?(vector)
return Solution.new(vector)
end

def next_bfloat(min, max)
min + ((max - min) * rand)
end

def evaluate_candidate_solution(solution, problem)
solution.score = problem.evaluate(solution.data)
# keep track of the best solution found
if @best_solution.nil? or
problem.is_better?(solution.score, @best_solution.score)
@best_solution = solution
end
end
end
The algorithm is demonstrated by first seeding the global random number generator to ensure a consistent sequence of random numbers (useful for testing) and creating instance of the problem and algorithm. The search is executed by passing the problem instance to the algorithm's search method that displays the status of the algorithms as the best result found after each restart. The demonstration ends by displaying an approximation of the problem's global optima (which is 0.0 in each dimension) as the best solution encountered by the algorithm.
srand(1) # set the random number seed to 1
algorithm = MultipleRestartSearchAlgorithm.new(10) # limit to 10 restarts
problem = SquaringFunction.new(5) # create a problem with 5 dimensions
best = algorithm.search(problem) # execute the search
puts "Best Solution: #{best}" # display the best solution
This general procedure is called a meta-search algorithm as it can generally be applied to any other direct search technique. The demonstration may be extended by elaborating upon the hill climbing algorithm such as using an adaptive rather than a fixed step size. Further, the embedded hill climbing (local search) algorithm may be replaced with a population-based global stochastic search algorithm. Additionally, restarted algorithm runs may exploit the results found by previous runs, and alternate restart schedules may be used such as a maximum number of function evaluations or relative improvement over previous runs.

Source Code
Download: MultipleRestartSearch.rb

Further Reading

Papers:
Sites:

Have you found a bug? Know of a great reference? Got an idea for a guide? Please send an email to jasonb@InspiredAlgorithms.com.

Need Help? Want to discuss this post? Please visit the Inspired Algorithms User Group.

Thursday, January 15, 2009

Adaptive Random Search: Varying step size based on performance

The Random Search algorithm involves the blind generation and evaluation of candidate solutions to a problem, where as the Localized Random Search algorithm makes the simple addition of successively varying the best solution found by the search so far. The Adaptive Random Search algorithm, like Localized Random Search is concerned with searching the neighborhood of the best solution found by the search so far, although unlike that algorithm the definition of neighborhood is not fixed. Instead it is changed dynamically over the course of the search.

When exploring the neighborhood of a candidate solution, the size of the steps taken within the neighborhood may be fixed in what is referred to as Fixed Step Size Random Search. The Adaptive Random Search algorithm involves varying the step size during the search in response to the relative improvements or non-improving steps as defined by the objective function evaluation. A small number of samples are taken each iteration with varied small and large step sizes in an attempt to approximate an optimal step size to converge the search to the local optimum. The step size that generate an improved candidate solution is adopted as the current step size for the search, and improved solutions are taken as the current working position. When a given step size stops producing improvements, its size is reduced to encourage the search algorithm to more carefully explore the smaller localized neighborhood. Larger step sizes are frequently explored in an effort to potentially escape local optima, and very large step sizes are sampled with a low frequency in an effort to explore distant regions of the search space.

Adaptive Random Search for Continuous Function Optimization
This guide provides an example of the Adaptive Random Search algorithm applied to an instance of the continuous function optimization problem.

The ExponentFunction class defines an instance of a function optimization problem with a variable number of function parameters or dimensions. The evaluate(vector) method expects a vector of real values in the problem bounds and evaluates it as the sum of the fourth power of each function parameter. The global optimum of this minimization function is located at 0.0 which is accessible via the optimal_score method and is used by is_optimal?(scoring) to test whether the algorithm has located to the global optimum. Finally, the problem provides a in_bounds?(vector) method for assessing whether a given vector falls within the hypercube of the valid problem space, useful for algorithms like Adaptive Random Search that generate new samples from existing samples and may stray out of the problem's bounds.
class ExponentFunction
attr_reader :dimensions, :min, :max

def initialize(dimensions=2)
@dimensions = dimensions
@min, @max = -5.12, +5.12
end

def evaluate(vector)
vector.inject(0) {|sum, x| sum + (x ** 4.0)}
end

def in_bounds?(vector)
vector.each {|x| return false if x>@max or x<@min}
return true
end

def is_optimal?(scoring)
scoring == optimal_score
end

def optimal_score
0.0
end

# true if s1 has a better score than s2
def is_better?(s1, s2)
s1 < s2 # minimizing
end
end
The Solution class provides a convenience container for holding immutable real-valued vectors in the @data instance variable, and a mutable @score instance variable for holing the costing of the vector assigned by a problems objective function.
class Solution
attr_reader :data
attr_accessor :score

def initialize(data)
@data = data
@score = 0.0/0.0 # NaN
end

def to_s
"[#{@data.inspect}] (#{@score})"
end
end
The AdaptiveRandomSearchAlgorithm class provides a reasonably generic implementation of the algorithm. The initialize(max_iterations) constructor expects a maximum number of algorithm iterations as an argument and goes on to define a set of algorithm configuration parameters as class instance variables. The search(problem) method is large and contains all the problem agnostic conditional logic of the algorithm. The search starts by generating a random starting point for the search (current) and setting an initial step size of 10% of the size of the search domain. The main loop of the algorithm is concerned with first generating a new sample using the step_size and a second sample using a larger step size. The larger step size is fixed at @small_factor * step_size although periodically (once every @large_factor_multiple iterations) a much larger factor (@large_factor) of the current step size is used.

If one of the generated steps results in a candidate solution better than the current solution it is replaced. If the larger step size resulted in the improvement, it is used the current step size. If neither generated step sizes resulted in an improved candidate solution, a count of non-improving steps is incremented. If this count exceeds @maximum_no_improvements then the current step size is reduced by a constant factor (step_size / @small_factor).

The generate_random_solution(problem) method used for the starting point of the search generates a random vector in the bounds of the problem space. The take_step(problem, current, step_size) is responsible for stochastically generating a new solution from a provided existing solution within a neighborhood defined by the provided step_size. This works by generating a random vector between +/- step_size which is then added to the current sample. This means the neighbourhood is a step_size sized hypercube around the origin. Steps are taken by this method until a valid (within the problem bounds) candidate solution is generated.
class AdaptiveRandomSearchAlgorithm
attr_accessor :max_iterations
attr_reader :best_solution

def initialize(max_iterations)
@max_iterations = max_iterations
@small_factor = 1.3 # normal step size incrase(*) and decrease(/) rate
@large_factor = 10 # step size factor increase for larger jump
@large_factor_multiple = 100 # max steps before larger jump is tried
@maximum_no_improvements = 50 # max non-improvement steps before size decreased
end

# execute a random search on the provided problem
def search(problem)
# random starting point
current = generate_random_solution(problem)
evaluate_candidate_solution(current, problem)
curr_it = 0
step_size = (problem.max-problem.min)*0.1 # 10% of the domain
no_change_cnt = 0
begin
# current step
step = take_step(problem, current, step_size)
evaluate_candidate_solution(step, problem)
# take second larger step
factor = @small_factor
factor = @large_factor if curr_it.modulo(@large_factor_multiple) == 0
larger_step_size = step_size * factor
larger_step = take_step(problem, current, larger_step_size)
evaluate_candidate_solution(larger_step, problem)
# check for improvement
if problem.is_better?(step.score, current.score) or
problem.is_better?(larger_step.score, current.score)
# select new step size if larger step is better
if problem.is_better?(larger_step.score, step.score)
step_size = larger_step_size # increase step size
current = larger_step # new best solution
else
current = step # new best solution
end
no_change_cnt = 0 # reset counter
# check for step size decrease
elsif (no_change_cnt+=1) >= @maximum_no_improvements
no_change_cnt = 0 # reset counter
step_size /= @small_factor # decrease step size
end
curr_it += 1
end until should_stop?(curr_it, problem)
return @best_solution
end

def should_stop?(curr_it, problem)
(curr_it >= @max_iterations) or problem.is_optimal?(best_solution.score)
end

def generate_random_solution(problem)
real_vector = Array.new(problem.dimensions) do
next_bfloat(problem.min, problem.max)
end
return Solution.new(real_vector)
end

def take_step(problem, current, step_size)
vector = nil
begin # keep stepping until a valid point is generated
step = Array.new(problem.dimensions) do
v = next_bfloat(-step_size, +step_size)
end
data = current.data
vector = Array.new(data.length) {|i| data[i] + step[i]}
end while !problem.in_bounds?(vector)
return Solution.new(vector)
end

def next_bfloat(min, max)
min + ((max - min) * rand)
end

def evaluate_candidate_solution(solution, problem)
solution.score = problem.evaluate(solution.data)
# keep track of the best solution found
if @best_solution.nil? or
problem.is_better?(solution.score, @best_solution.score)
@best_solution = solution
puts "> new best: #{solution.score}"
end
end
end
The algorithm is demonstrated by first seeding the global random number number generator to 1 to ensure the same sequence of random numbers is ued for each run. A new instance of the algorithm is created allowing a maximum of 5000 iterations, and an instance of the ExponentFunction problem is created in 5 dimensions. The search is executed by passing the problem to the algorithm, and the best result achieved by the algorithm is printed once the search is completed.
srand(1) # set the random number seed to 1
algorithm = AdaptiveRandomSearchAlgorithm.new(5000) # limit to 5000 iterations
problem = ExponentFunction.new(5) # create a problem with 5 dimensions
best = algorithm.search(problem) # execute the search
puts "Best Solution: #{best}" # display the best solution
Once the general approach is understood, one may begin playing with the parameters defined in the algorithms constructor that govern the dynamic adaptation of the step size. For example, smaller or larger factors may be used, and the frequency of large or small step size increase and decreases may be varied. The acceptance criteria for when the current working solution is replaced may be made probabilistic rather than deterministically greedy. Finally, one may experiment with hyper-spherical sampling neighbourhoods in the next_step function and with sampling distributions other than uniform such as Gaussian with the origin as the mean and the step_size as the standard deviation.

Source Code
Download: AdaptiveRandomSearch.rb

Further Reading

Papers:
Sites:

Have you found a bug? Know of a great reference? Got an idea for a guide? Please send an email to jasonb@InspiredAlgorithms.com.

Need Help? Want to discuss this post? Please visit the Inspired Algorithms User Group.