What story am I trying to tell here?
The language of NK-Landscapes has enough expressive power to evolve fitness functions that favor a user-selected black-box search algorithm above all other black-box search algorithms in a set.
10000 inner evals
mu = 100, lambda = 10 for EA (SUS, rank-based fitness)
linear cooling schedule for SA
30 inner runs
32-bit genome length
starting k = 8
1000 outer evals, mu = 100, lambda = 10
NK-Landscapes
Rationale
When using NK-landscapes, the experiment is typically set up to use a large number of randomly generated landscapes, to lessen the impact of any one landscape on the results. This line of research, however, is explicitly searching for fitness landscapes that have an outsized impact. Each NK-landscape generated is evaluated on its ability to discriminate amongst a set of search algorithms. A high scoring landscape is one that shows a clear preference for one of the search algorithms, such that the chosen algorithm consistently finds a better solution than all other algorithms in the set. This is determined via the following methodology.
Implementation
In this paper, NK-landscapes are implemented as a pair of lists. The first list is the neighbors list. The neighbors list is n elements long, where each element is a list of k+1 integer indexes. Each element of the i'th inner list is a neighbor of i, and will participate in the calculation of that part of the overall fitness. An important implementation detail is that the first element of the i'th inner list is i, for all i. Making the element under consideration part of the underlying data (as opposed to a special case) simplifies and regularizes the code, an important consideration when metaprogramming is used. A second important implementation detail is that no number may appear in a neighbor list more than once. This forces the importance of a single index point to be visible in the second list, allowing for easier analysis. The first list is called the neighborses list, to indicate the nested plurality of its structure.
The second list is the subfunctions list. The subfunctions list is used in conjunction with the neighborses list to determine the overall fitness of the individual under evaluation. The subfunctions list is implemented as a list of key value stores, n elements long. Each key in the key value store is a binary tuple of length k+1, matching the k+1 indexes of the corresponding inner neighbor list, with every possible such tuple represented in every key value store. For example, if k is 1, then the possible keys are (0, 0), (0, 1), (1, 0), and (1, 1). The values for each key are real numbers, both positive and negative.
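To make the paired-list representation concrete, the following Python sketch generates such a structure. The function names and the Gaussian draw for subfunction values are illustrative assumptions; the text above only requires real values, both positive and negative.

    import itertools
    import random

    def make_neighborses(n, k, rng=random):
        """Build the neighborses list: n inner lists of k+1 indexes, self first."""
        neighborses = []
        for i in range(n):
            # Sample k distinct neighbors; i itself is excluded from the pool,
            # so no index can appear in an inner list more than once.
            others = rng.sample([j for j in range(n) if j != i], k)
            neighborses.append([i] + others)
        return neighborses

    def make_subfuncs(n, k, rng=random):
        """Build the subfuncs list: n tables keyed by every (k+1)-bit tuple."""
        subfuncs = []
        for _ in range(n):
            table = {bits: rng.gauss(0.0, 1.0)  # assumed value distribution
                     for bits in itertools.product((0, 1), repeat=k + 1)}
            subfuncs.append(table)
        return subfuncs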
Evaluation of a Bit String Individual
To evaluate an individual, the system runs down the pair of lists simultaneously. For each element in neighborses, it extracts the binary values of the individual at the listed indexes and assembles those binary values into a single tuple. It then looks at the corresponding key value store in the subfuncs list and finds the value associated with that tuple. The sum of the values found for each element in the pair of lists is the fitness of that individual in the context of this NK-landscape.
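A minimal sketch of that evaluation loop, reusing the structures from the sketch above:

    def evaluate(individual, neighborses, subfuncs):
        # individual is a sequence of n bits (0s and 1s).
        fitness = 0.0
        for neighbors, table in zip(neighborses, subfuncs):
            key = tuple(individual[j] for j in neighbors)
            fitness += table[key]
        return fitness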
Part of the design consideration for this structure was ease of metaprogramming for CUDA. The various components of the lists plug into a string template of C++ code, which is then compiled into a CUDA kernel. This kernel can then be run against a large number of individuals simultaneously. This approach is not expected to be as fast as a hand-tuned CUDA kernel that pays proper respect to the various memory subsystems available; however, it has proven to be faster than running the fitness evaluations on the CPU, given a sufficiently large number of individuals in need of evaluation.
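The actual template and toolchain are not reproduced here, but the sketch below illustrates the metaprogramming idea under stated assumptions: each subfunction table is flattened into a constant array, the lookups are spliced into a C++ kernel source string, and the result is JIT-compiled (CuPy's RawKernel stands in for whatever compilation path the real system uses).

    import itertools
    import cupy as cp

    def generate_source(neighborses, subfuncs, n):
        chunks = []
        for i, (neighbors, table) in enumerate(zip(neighborses, subfuncs)):
            k1 = len(neighbors)
            # Flatten the tuple-keyed table into an array indexed by the
            # neighbor bits read as a binary number, most significant first.
            values = [table[bits]
                      for bits in itertools.product((0, 1), repeat=k1)]
            vals = ", ".join(f"{v}f" for v in values)
            idx = " | ".join(f"(bits[{j}] << {k1 - 1 - pos})"
                             for pos, j in enumerate(neighbors))
            chunks.append(f"    {{ const float t{i}[] = {{{vals}}};"
                          f" f += t{i}[{idx}]; }}")
        body = "\n".join(chunks)
        return ('extern "C" __global__\n'
                "void evaluate(const unsigned char* pop,"
                " float* fit, int count) {\n"
                "    int ind = blockIdx.x * blockDim.x + threadIdx.x;\n"
                "    if (ind >= count) return;\n"
                f"    const unsigned char* bits = pop + (long)ind * {n};\n"
                "    float f = 0.0f;\n"
                f"{body}\n"
                "    fit[ind] = f;\n"
                "}\n")

    # JIT-compile the generated source; one thread evaluates one individual.
    kernel = cp.RawKernel(generate_source(neighborses, subfuncs, n), "evaluate")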
Evolutionary Operators
The search algorithm chosen to guide the modification of the NK-landscapes is a canonical mu+lambda evolutionary algorithm, with stochastic universal sampling used for both parent selection and survival selection. Such an algorithm needs to be able to mutate an individual, perform genetic crossover between individuals, and determine the fitness of an individual. Mutation and crossover are intrinsic to the representation of the individual, and will be covered first. Fitness evaluation is left for a later section.
Mutation of an NK-landscape is performed in three ways, and during any given mutation event all, some, or none of the three ways may be used. The first mutation method is to alter the neighbor list at a single location in the neighborses list. This does not alter the length of the list, nor may it ever alter the first element in the list. The second mutation method is to alter the subfunc at a single location. All possible tuple keys are still found, but the values associated with those keys are altered by a random amount.
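A minimal sketch of these first two mutation methods; the Gaussian jitter and its magnitude are assumptions, since the text only specifies "a random amount":

    import random

    def mutate_neighbor(neighborses, n, rng=random):
        # Method 1: swap one non-self neighbor for a fresh index, preserving
        # list length, the self index at position 0, and the no-duplicates
        # rule (assumes k + 1 < n so a fresh index exists).
        inner = neighborses[rng.randrange(len(neighborses))]
        pos = rng.randrange(1, len(inner))
        inner[pos] = rng.choice([j for j in range(n) if j not in inner])

    def mutate_subfunc(subfuncs, sigma=0.1, rng=random):
        # Method 2: jitter every value in one subfunction table; all tuple
        # keys remain present.
        table = subfuncs[rng.randrange(len(subfuncs))]
        for key in table:
            table[key] += rng.gauss(0.0, sigma)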
The third mutation method alters k. When k is increased, each element of the neighborses list gains one randomly chosen neighbor, with care taken that no neighbor can be in the same list twice, nor can k ever exceed n. Increasing k by 1 doubles the size of the subfunc key value stores, since each key in the parent has two corresponding entries in the child, one ending in 0, the other ending in 1. For example, the key (0, 1) in the original NK-landscape would need corresponding entries for (0, 1, 0) and (0, 1, 1) in the mutated NK-landscape. This implementation starts with the value in the original key and alters it by a different random amount for each entry in the mutated NK-landscape.
When k is decreased, a single position in the inner lists is randomly chosen, and the neighbor found at that position is removed from each of the lists. Care is taken so that the first neighbor is never removed, so k can never be less than zero. Each entry in the mutated subfuncs has two parents: for example, if the second position in the inner list is chosen, then both (0, 1, 0) and (0, 0, 0) map to (0, 0) in the mutated NK-landscape. This implementation averages the values of the two parent keys for each index in the subfuncs list.
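A sketch of how a single subfunction table changes under the two k mutations; the noise magnitude is an assumed parameter, and the parallel neighbor-list edits are elided:

    import random

    def grow_table(table, sigma=0.1, rng=random):
        # k -> k+1: each parent key spawns two child keys, ending in 0 and 1,
        # each starting from the parent value plus a different random offset.
        child = {}
        for key, value in table.items():
            child[key + (0,)] = value + rng.gauss(0.0, sigma)
            child[key + (1,)] = value + rng.gauss(0.0, sigma)
        return child

    def shrink_table(table, drop_pos):
        # k -> k-1: keys that differ only at drop_pos merge, and their values
        # are averaged (drop_pos is never 0, the self index).
        child = {}
        for key, value in table.items():
            short = key[:drop_pos] + key[drop_pos + 1:]
            child[short] = child.get(short, 0.0) + value / 2.0
        return child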
Genetic crossover is only possible in this implementation between individuals of identical n and k, via single point crossover of the neighborses and subfuncs lists. The system is therefore dependent on mutation to alter k, and holds n constant during any given system run.
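A sketch of that crossover, assuming a single cut index shared by both lists:

    import random

    def crossover(parent1, parent2, rng=random):
        # Both parents must share n and k for the pieces to be compatible.
        (nbrs1, subs1), (nbrs2, subs2) = parent1, parent2
        cut = rng.randrange(1, len(nbrs1))
        return nbrs1[:cut] + nbrs2[cut:], subs1[:cut] + subs2[cut:]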
Evaluation of NK-Landscape Fitness
All of this NK-landscape manipulation infrastructure is used to evolve landscapes that clearly favor a given search algorithm over all other algorithms in a set. Accordingly, a fitness score must be assigned to each NK-landscape in a population, so that natural selection can favor the better landscapes, guiding the meta-search towards an optimal landscape for the selected search algorithm.
This implementation defines the fitness of the NK-landscape as follows. First, the 'performance' of each search algorithm is found. The performance is defined as the mean of the fitness values of the optimal solutions found across several (n=30) runs. While performance is being calculated, all search algorithms also record the fitness of the worst individual they ever encountered in the NK-landscape. This provides a heuristic for an unknown value: the value of the worst possible individual in the NK-landscape. Once a performance value has been calculated for every search algorithm in the set, the performance values and 'worst ever encountered' value are linearly scaled into the range [0, 1], such that the worst ever encountered value maps to zero and the best ever encountered value maps to one. This provides a relative measure of the performance of the various search algorithms, as well as allowing for fair comparisons between NK-landscapes.
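A sketch of that rescaling, where perfs maps each algorithm to its mean-of-30-runs performance and worst_ever/best_ever are the extreme fitness values any algorithm encountered on the landscape:

    def normalize(perfs, worst_ever, best_ever):
        span = best_ever - worst_ever
        return {alg: (p - worst_ever) / span for alg, p in perfs.items()}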
Once each search algorithm has a normalized performance value, the system needs to judge how well this NK-landscape 'clearly favors a given search algorithm over all other algorithms in the set'. This implementation tried two approaches, one more successful than the other.
One versus All
The first heuristic used in this implementation was to calculate the set of differences between the performance of the favored algorithm and the performance of each of the other algorithms, and then take the minimum of that set of differences as the fitness of the NK-landscape. Note that the fitness ranges from -1 to 1. A fitness of 1 would correspond to an NK-landscape where the optimal individual found by every non-favored algorithm has fitness identical to the 'worst ever encountered' individual, while the favored algorithm finds any individual better than that. This approach suffered because it needed a 'multiple coincidence' to make any forward progress. The use of the minimum function meant that an NK-landscape needed to clearly favor one algorithm over all others. Favoring a pair of algorithms over all others was indistinguishable from not favoring any algorithms at all, so there was no gradient for the meta-search algorithm to climb.
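As a sketch, the one-versus-all fitness reduces to the minimum pairwise margin:

    def one_vs_all_fitness(normalized, favored):
        # normalized maps algorithm names to [0, 1] performance values.
        return min(normalized[favored] - p
                   for alg, p in normalized.items() if alg != favored)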
Pairwise Comparisons
Making pairwise comparisons, simply comparing the normalized optimal individuals found by two search algorithms, proved to provide a better gradient. Since any change in the relative fitness of the optimal individuals was reflected in the fitness score of the NK-landscape, the meta-search had immediate feedback, with no plateaus of unchanging fitness to cross.
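The pairwise score is then just the gap between two normalized performances, sketched as:

    def pairwise_fitness(normalized, favored, rival):
        return normalized[favored] - normalized[rival]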
Algorithms
Random Search simply generates random individuals and records their fitness until it runs out of evals. The individual found with the highest fitness is deemed optimal.
Climb Hill is a steepest-ascent restarting hill climber. It starts from a random location, and at each time step it examines all of its neighbors and moves to the neighbor with the highest fitness. Should it find itself at a peak, a point at which all neighbors are downhill, it restarts from a new random location. Note that this algorithm is vulnerable to plateaus, and will wander instead of restarting.
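A sketch of this climber; moving on ties (rather than only on strict improvement) is what produces the plateau wandering noted above:

    import random

    def climb_hill(evaluate, n, max_evals, rng=random):
        best_bits, best_fit = None, float("-inf")
        evals = 0
        while evals < max_evals:
            current = [rng.randint(0, 1) for _ in range(n)]
            fit = evaluate(current)
            evals += 1
            while evals < max_evals:
                # Examine all n single-bit-flip neighbors.
                neighbors = []
                for i in range(n):
                    flipped = current.copy()
                    flipped[i] ^= 1
                    neighbors.append((evaluate(flipped), flipped))
                    evals += 1
                nf, nb = max(neighbors, key=lambda t: t[0])
                if nf < fit:
                    break  # strict peak: restart from a random location
                current, fit = nb, nf  # climb, or wander along a plateau
            if fit > best_fit:
                best_bits, best_fit = current, fit
        return best_bits, best_fit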
At each time step, Simulated Annealing (SA) picks a random neighbor. If that neighbor has a higher fitness than the current location, then SA moves to that location. If that neighbor has a lower or equal fitness, then SA still may move to that location, with probability determined by a 'cooling schedule'. Earlier in the run, SA is more likely to move downhill. SA does not restart.
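The text specifies a linear cooling schedule but not the acceptance rule; the sketch below assumes the standard Metropolis form:

    import math
    import random

    def simulated_annealing(evaluate, n, max_evals, t0=1.0, rng=random):
        current = [rng.randint(0, 1) for _ in range(n)]
        fit = evaluate(current)
        best_bits, best_fit = list(current), fit
        for step in range(1, max_evals):
            temp = t0 * (1.0 - step / max_evals)  # linear cooling toward 0
            neighbor = current.copy()
            neighbor[rng.randrange(n)] ^= 1  # flip one random bit
            nf = evaluate(neighbor)
            # Always accept uphill moves; accept downhill (or equal) moves
            # with a probability that shrinks as the temperature falls.
            if nf > fit or rng.random() < math.exp((nf - fit) / temp):
                current, fit = neighbor, nf
            if fit > best_fit:
                best_bits, best_fit = list(current), fit
        return best_bits, best_fit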
As implemented for this paper, Evolutionary Algorithm (EA) is a mu+lambda (mu=100, lambda=10) evolutionary algorithm, using linear ranking (s=2.0) stochastic universal sampling for both parent selection and survival selection.
All algorithms are allowed 10000 fitness evaluations.
Distributed Architecture
Taking advantage of the parallelizable nature of evolutionary algorithms, this implementation used an asynchronous message queue (beanstalkd) and a web server (nginx) to distribute the workload across a small cluster of high-performance hardware. Each of the four machines in the cluster has a quad-core processor and two CUDA compute cards. A worker node system was developed whereby the head node could place a job request for a particular search algorithm to be run against a particular NK-landscape. This job request was put into the message queue, to be delivered to the next available worker node. When the worker node received the request, it first checked whether it had a copy of the requested NK-landscape in a local cache. If not, the worker node used the index number of the NK-landscape to place a web request with the head node, download the data sufficient to recreate the NK-landscape, and place it in the local cache. The web-based distribution channel was necessary because the representation of an NK-landscape can grow very large, and the chosen message queue has strict message size limits. For the results presented in this paper, the distributed system ran using 16 worker nodes utilizing the CPU cores, plus another 8 worker nodes running their fitness functions on the CUDA compute cards.
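A hypothetical sketch of the worker loop; the job fields, URL scheme, and helper functions are invented for illustration, with beanstalkc and requests assumed as the client libraries:

    import json
    import beanstalkc
    import requests

    CACHE = {}  # landscape index -> reconstructed NK-landscape

    def worker_loop(head_host):
        queue = beanstalkc.Connection(host=head_host, port=11300)
        while True:
            job = queue.reserve()  # block until the head node posts a job
            request = json.loads(job.body)  # e.g. {"landscape": 7, "algorithm": "SA"}
            idx = request["landscape"]
            if idx not in CACHE:
                # Large landscapes exceed the queue's message size limit,
                # so the data travels over HTTP from the head node instead.
                response = requests.get(f"http://{head_host}/landscape/{idx}")
                CACHE[idx] = rebuild_landscape(response.json())  # hypothetical
            result = run_algorithm(request["algorithm"], CACHE[idx])  # hypothetical
            report_result(head_host, request, result)  # hypothetical
            job.delete()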
Interestingly, the CUDA-based worker nodes did not outperform the CPU-based worker nodes. The CUDA-based fitness evaluation is very fast, but the number of individuals needing evaluation must be high before the speed difference becomes apparent. The exception is the random search algorithm, which was rewritten for CUDA-based evaluation: since each of its evaluations is independent, all evaluations can happen in parallel.
Results
Four graphs for the One vs All approach showing fitness and k of the best individual at that time
fitness on the left, BOLD the mindiff line, k on the right, dotted line
Set the graphs up to show the minimums thing
EA vs all
SA vs all
CH vs all
RA vs all
Four graphs, each with three lines, for the pairwise results, showing fitness
EA over SA, CH, RA
SA over EA, CH, RA
CH over EA, SA, RA
RA over EA, SA, CH
Discussion
THINGS THAT NEED TO BE LOGGED TO MAKE GRAPHS:
At each EA eval, I need the comparative fitness (4 numbers) and k of the best landscape in the population.