"The binary alphabet maximises the level of implicit parallelism"
A Schema is a template that identifies a subset of strings with similarities at certain string positions
Chromosome 1:  1 0 1 1 0 1 1 0
Chromosome 2:  0 0 1 0 0 1 0 0
Schema 1:      * 0 * * 0 1 * 0
Schema 2:      * * * * 0 1 * *
Where * is a wildcard representing 0 or 1
Implicit Parallelism
For binary strings of length $L$ there are $3^L$ possible schemata, since each position can take one of three values $\{0, 1, *\}$. A single chromosome of length $L$ is an instance of $2^L$ of these schemata (each position is either its actual bit or $*$), so for a population of $M$ individuals we sample up to $M \cdot 2^L$ schemata.
For example, the decimal number 4 is represented as 100 in binary encoding, which is an instance of the schemata $*00$, $1*0$, $1**$, $**0$, $*0*$ and $10*$ (as well as $100$ itself and $***$).
Some schemata are fitter than others; through the selection process we can drive the population towards being composed of fitter schemata.
We derive implicit parallelism from the fact that we are not only evolving $M$ individuals but also manipulating up to $M \cdot 2^L$ schemata. This means that with binary coding we require fewer strings to construct more schemata and so sample larger search spaces.
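As a minimal sketch of this counting argument (illustrative Python, not from the notes; the helper name schemata_of is mine), the code below enumerates every schema a given binary string is an instance of:

```python
from itertools import product

def schemata_of(string: str):
    """Return every schema the binary string is an instance of.

    Each position either keeps its bit or becomes the wildcard '*',
    giving 2**len(string) schemata in total.
    """
    options = [(bit, '*') for bit in string]
    return [''.join(choice) for choice in product(*options)]

schemata = schemata_of("100")
print(len(schemata))  # 8 == 2**3
print(schemata)       # ['100', '10*', '1*0', '1**', '*00', '*0*', '**0', '***']
```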
Drawbacks of Binary Coding
One drawback of binary coding is known as the Hamming cliff problem:
A 1-bit mutation can make a large (or small) jump in the encoded value, while a multi-bit mutation can make a small (or large) jump. For example, the adjacent integers 7 (0111) and 8 (1000) differ in every bit, so moving between them requires mutating all four bits.
Solution to the Hamming cliff problem
The solution is called Gray encoding, in which numbers are encoded so that adjacent numbers differ in exactly one bit.
For $a \in \{0,1\}^L$ and $b \in \{0,1\}^L$, where $a$ is the standard binary encoded string and $b$ is the Gray encoded string:
$$b_i = \begin{cases} a_i & \text{if } i = 1 \\ a_{i-1} \oplus a_i & \text{if } i > 1 \end{cases}$$
where $\oplus$ denotes exclusive or (XOR).
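A minimal sketch of this conversion (illustrative Python; the helper name binary_to_gray is mine), applying the formula above bit by bit:

```python
def binary_to_gray(a: str) -> str:
    """Convert a standard binary string to its Gray-coded form.

    b_1 = a_1, and b_i = a_{i-1} XOR a_i for i > 1.
    """
    b = [a[0]]
    for i in range(1, len(a)):
        b.append(str(int(a[i - 1]) ^ int(a[i])))
    return ''.join(b)

# Adjacent integers now differ in exactly one Gray-coded bit:
print(binary_to_gray("0111"))  # 7 -> 0100
print(binary_to_gray("1000"))  # 8 -> 1100
```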
Further issues with Binary Encoding
We also have issues with binary encoding in discrete search spaces:
The redundancy problem: when the variables belong to a finite discrete set whose cardinality is not a power of 2, some binary strings are redundant and correspond to infeasible solutions.
I.e. when the size of the search space is not a power of 2, the binary representation can encode more strings than there are feasible solutions. For example, a variable with 5 feasible values needs 3 bits, but 3 bits encode $2^3 = 8$ strings, so 3 of them are redundant.
And in continuous spaces:
The precision problem
Decoding function:
Divide $a \in \{0,1\}^L$ into $n$ segments of equal length: $s_i \in \{0,1\}^{L/n}$, $i = 1, \ldots, n$
Decode each segment into an integer $K_i$, $i = 1, \ldots, n$, where $K_i = \sum_{j=0}^{L/n - 1} s_{ij} \cdot 2^j$
Apply the decoding function $h(K_i)$, which maps the integer linearly onto the interval bound $x_i \in [u_i, v_i]$:
$$h(K_i) = u_i + K_i \cdot \frac{v_i - u_i}{2^{L/n} - 1}$$
Here the precision of each variable is $\frac{v_i - u_i}{2^{L/n} - 1}$, so it clearly depends on $L$: obtaining finer precision over a given interval requires more bits per variable, which can cause difficulties when the search space has high dimensionality ($n$ is large) and requires high numerical precision, since the strings become longer and the search more computationally expensive.
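A sketch of the decoding step under these definitions (illustrative Python; the function name decode_binary, the bit-order convention and the example bounds are my assumptions):

```python
def decode_binary(a: str, n: int, bounds: list[tuple[float, float]]) -> list[float]:
    """Decode a binary string into n real variables.

    Each segment of L/n bits is read as an integer K_i and mapped
    linearly onto the interval [u_i, v_i]:
        h(K_i) = u_i + K_i * (v_i - u_i) / (2**(L/n) - 1)
    """
    seg_len = len(a) // n
    values = []
    for i in range(n):
        segment = a[i * seg_len:(i + 1) * seg_len]
        k = int(segment, 2)  # segment read as an unsigned integer K_i
        u, v = bounds[i]
        values.append(u + k * (v - u) / (2 ** seg_len - 1))
    return values

# Two variables, 4 bits each, both bounded on [0, 1]:
print(decode_binary("10000001", n=2, bounds=[(0.0, 1.0), (0.0, 1.0)]))
```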
Real valued vector representation
In continuous optimisation problems, real-valued vector representations are a natural way to represent solutions.
With this method of representation there is no difference between genotype and phenotype: both are real-valued vectors representing the chromosome, e.g. $x = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^n$
Each gene in the chromosome represents a variable of the problem. The precision is not restricted by decoding/encoding functions
Evolution Strategies, Evolutionary Programming and Differential Evolution are all based on real-valued vector representation
Advantages include:
Being simple, natural and fast, as there is no need to encode or decode
Better precision and better handling of high-dimensional problems
Real valued mutation
Randomly select parents for mutation with probability $p_m \in [0,1]$, randomly select a gene $c_i$ and apply the mutation operator
Real number mutation operators include:
Uniform mutation
Non-uniform mutation
Gaussian mutation
Uniform/Gaussian Real-Valued Mutation
Uniform Mutation
Replace $c_i$ with a uniformly random number $c_i'$ generated from the interval bound of the variable $x_i \in [u_i, v_i]$
Gaussian Mutation
Replace $c_i$ with $c_i'$, which is calculated from:
$$c_i' = \min(\max(N(c_i, \sigma_i), u_i), v_i)$$
where $N(c_i, \sigma_i)$ is a Gaussian (normal) distribution with mean $c_i$ and standard deviation $\sigma_i$, which may depend on the length $l_i = v_i - u_i$ of the interval bound; typically $\sigma_i = \frac{1}{10} l_i$
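A minimal sketch of both operators (illustrative Python using the standard random module; the function names are mine, and σ defaults to one tenth of the interval length as above):

```python
import random

def uniform_mutation(c: float, u: float, v: float) -> float:
    """Replace the gene with a uniform random value from [u, v]."""
    return random.uniform(u, v)

def gaussian_mutation(c: float, u: float, v: float) -> float:
    """Perturb the gene with Gaussian noise, clamped to [u, v].

    sigma defaults to one tenth of the interval length l = v - u.
    """
    sigma = (v - u) / 10
    return min(max(random.gauss(c, sigma), u), v)

print(uniform_mutation(0.3, 0.0, 1.0))
print(gaussian_mutation(0.3, 0.0, 1.0))
```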
Non-Uniform Mutation
In non-uniform mutation the replacement $c_i'$ for $c_i$ is calculated from:
$$c_i' = \begin{cases} c_i + \Delta(t, v_i - c_i) & \text{if } \tau \le 0.5 \\ c_i - \Delta(t, c_i - u_i) & \text{if } \tau > 0.5 \end{cases}$$
where $t$ is the current generation number, $\tau$ is a random number in the range $[0,1]$, and:
$$\Delta(t, y) = y\left(1 - r^{1 - \frac{t}{g_m}}\right)^b$$
where $r$ is a random number in the range $[0,1]$, $g_m$ is the maximum number of generations and $b$ is a constant. Since $\Delta(t, y) \to 0$ as $t \to g_m$, mutation steps shrink in later generations.
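A sketch of non-uniform mutation under these definitions (illustrative Python; the function name and the example value b = 5 are my assumptions):

```python
import random

def non_uniform_mutation(c: float, u: float, v: float,
                         t: int, g_max: int, b: float = 5.0) -> float:
    """Mutate gene c within [u, v]; the step size shrinks as t approaches g_max."""
    def delta(y: float) -> float:
        r = random.random()
        return y * (1 - r ** (1 - t / g_max)) ** b

    if random.random() <= 0.5:
        return c + delta(v - c)   # move towards the upper bound
    return c - delta(c - u)       # move towards the lower bound

# Early vs. late generations (out of 100):
print(non_uniform_mutation(0.5, 0.0, 1.0, t=1, g_max=100))
print(non_uniform_mutation(0.5, 0.0, 1.0, t=99, g_max=100))
```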
Real Valued Crossover
Randomly select two parents $x^{[1]} = \{x_1^{[1]}, x_2^{[1]}, \ldots, x_n^{[1]}\}$ and $x^{[2]} = \{x_1^{[2]}, x_2^{[2]}, \ldots, x_n^{[2]}\}$, then apply a crossover operator
Crossover operators include:
Flat crossover
One offspring $h = \{h_1, h_2, \ldots, h_n\}$ is generated, where $h_i$ is a uniformly random value from the interval $[x_i^{[1]}, x_i^{[2]}]$ if $x_i^{[1]} < x_i^{[2]}$, or from $[x_i^{[2]}, x_i^{[1]}]$ if $x_i^{[2]} < x_i^{[1]}$
Simple Crossover
A crossover point $i \in \{1, \ldots, n\}$ is randomly chosen and the variables beyond this point are swapped to create two new offspring.
Whole arithmetic crossover
Two offspring $h^{[k]} = \{h_1^{[k]}, h_2^{[k]}, \ldots, h_n^{[k]}\}$, $k = 1, 2$, are generated, where $h_i^{[1]} = \alpha x_i^{[1]} + (1 - \alpha) x_i^{[2]}$, $h_i^{[2]} = \alpha x_i^{[2]} + (1 - \alpha) x_i^{[1]}$ and the parameter $\alpha$ is a random number in the range $[0,1]$
Local arithmetic crossover
The same as whole arithmetic crossover, except $\alpha \in \mathbb{R}^n$ is a vector whose elements are each a random number in the range $[0,1]$
Single arithmetic crossover
Choose a gene position and replace it with the arithmetic average of the parents' genes at that position; the other genes are copied unchanged from the parents.
BLX-α crossover
An offspring $h = \{h_1, h_2, \ldots, h_n\}$ is generated, where $h_i$ is a uniformly random number from the interval $[h_{\min} - I \cdot \alpha, h_{\max} + I \cdot \alpha]$, with $h_{\max} = \max(x_i^{[1]}, x_i^{[2]})$, $h_{\min} = \min(x_i^{[1]}, x_i^{[2]})$ and $I = h_{\max} - h_{\min}$
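As a sketch of how such operators look in code (illustrative Python; the function names are mine), here are whole arithmetic crossover and BLX-α crossover applied gene by gene:

```python
import random

def whole_arithmetic_crossover(x1: list[float], x2: list[float]):
    """Blend every gene of the two parents with a single random alpha."""
    alpha = random.random()
    h1 = [alpha * a + (1 - alpha) * b for a, b in zip(x1, x2)]
    h2 = [alpha * b + (1 - alpha) * a for a, b in zip(x1, x2)]
    return h1, h2

def blx_alpha_crossover(x1: list[float], x2: list[float], alpha: float = 0.5):
    """Sample each offspring gene from an interval widened by alpha * I."""
    h = []
    for a, b in zip(x1, x2):
        h_min, h_max = min(a, b), max(a, b)
        i = h_max - h_min
        h.append(random.uniform(h_min - i * alpha, h_max + i * alpha))
    return h

parents = ([0.2, 0.8, 0.5], [0.6, 0.1, 0.9])
print(whole_arithmetic_crossover(*parents))
print(blx_alpha_crossover(*parents))
```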