Lecture 7: Real-valued Coded Evolutionary Algorithms

Arguments for Binary Coding

"The binary alphabet maximises the level of implicit parallelism"

A Schema is a template that identifies a subset of strings with similarities at certain string positions

Chromosome 1 1 0 1 1 0 1 1 0
Chromosome 2 0 0 1 0 0 1 0 0
Schema 1 * 0 * * 0 1 * 0
Schema 2 * * * * 0 1 * *

Where * is a wildcard representing 0 or 1

Implicit Parallelism

If a chromosome is of length LL then it contains 3L3^L schemata as for each position there are 3 possible values {0,1,}\{0,1,*\}. Therefore, for a population of MM individuals we will have to evaluate up to M3LM\cdot 3^L schemata.

For example, the decimal number 4 is represented as 100 in binary encoding, which contains the following schemata: 00,10,1,0,0 and 10*00, 1*0, 1**, **0, *0* \text{ and } 10*

Some schemata are fitter than others, by our selection process we can produce a population of entirely fitter schemata

We derive implicit parallelism from the fact that we are not only evolving MM individuals but also manipulating M3LM\cdot 3^L schemata. This fact means that with binary coding we require fewer strings to construct more schemata to sample larger search spaces

Drawbacks of Binary Coding

One drawback of binary coding is known as the Hamming cliff problem:

Solution to the Hamming cliff problem

The solution is called Gray encoding. This is the process by which encoding of numbers is done such that adjacent numbers have a single digit differing by 1

For a{0,1}La \in \{0,1\}^L and b{0,1}Lb\in \{0,1\}^L where aa us the standard binary encoded string and bb is Gray encoded, then:

bi={aiif i=1ai1aiif i>1b_i = \begin{cases} a_i & \text{if } i =1 \\ a_{i-1} \otimes a_i & \text{if } i > 1 \end{cases}

where \otimes means exclusive or

Further issues with Binary Encoding

We also have issues with binary encoding with discrete search spaces:

I.e. when we have a search space of a size not equal to a power of 2 then we require a binary representation that is more powerful than the space requires

And in continuous spaces:

h(Ki)=ui+Kiviui2Ln1h(K_i) = u_i + K_i \cdot \frac{v_i-ui}{2^{\frac{L}{n}}-1}

Here clearly the precision depends on LL, meaning that this process might produce difficulties if the search space has a high dimensionality (nn is large) and requires greater numerical precision. The smaller the interval bound, the more precise the process is but also the more computationally expensive it is

Real valued vector representation

In continuous optimisation problems, real-valued vector representations are a natural way to represent solutions.

With this method of representation, there are no differences between genotypes and phenotypes. They are both real-valued vectors representing chromosomes e.g. x=[x1,x2,,xn]Rn\vec{x} = [x_1,x_2,\ldots,x_n] \in \R^n
Each gene in the chromosome represents a variable in the problem. The precision is not restricted by the decoding/ encoding functions

Evolution Strategies, Evolutionary Programming and Differential Evolution are all based on real-valued vector representation

Advantages include:

Real valued mutation

Randomly select parents with a probability pm[0,1]p_m \in [0,1] for mutation and randomly select a gene cic_i and apply the mutation operator

Real number mutation operators include:

Uniform/ Gaussian Real valued mutation

Uniform Mutation

Replace cic_i with a uniformly random number cic_i' generated from the interval bound of the variable xi[ui,vi]x_i \in [u_i,v_i]

Gaussian Mutation

Replace cic_i with cic_i' which is calculated from:

cimin(max(N(ci,σi),ui),vi)c_i' \min(\max(N(c_i,\sigma_i),u_i),v_i)

where N(ci,σi)N(c_i,\sigma_i) is a gaussian (normal) distribution with a mean cic_i and standard deviation σi\sigma_i which ma depend on the length li=viuil_i = v_i - u_i of the interval bound and typically σi=110li\sigma_i = \frac{1}{10}l_i

Non-Uniform Mutation

In non-uniform mutation the replacement cic_i' for cic_i is calculated from:

ci={ci+Δ(t,vici)if τ0.5ciΔ(t,ciui)if τ<0.5c_i' = \begin{cases} c_i + \Delta (t, v_i - c_i) & \text{if } \tau \geq 0.5 \\ c_i - \Delta (t,c_i - u_i) & \text{if } \tau < 0.5 \end{cases}

where tt is the number of current generation solutions and τ\tau is a random number in the range [0,1][0,1] and :

Δ(t,y)=y(1r1tgm)b\Delta (t,y) = y(1-r^{1-\frac{t}{g_m}})^b

where rr is a random number in the range [0,1][0,1], gmg_m is the maximum number of generations and bb is a constant.

Real Valued Crossover

Randomly select two parents x1={xi[1],x2[1],,xn[1]}\vec{x_1} = \{x_i^{[1]},x_2^{[1]},\ldots, x_n^{[1]}\} and x2={xi[2],x2[2],,xn[2]}\vec{x_2} = \{x_i^{[2]},x_2^{[2]},\ldots, x_n^{[2]}\} , then apply a crossover operators

Crossover operators include:

Flat crossover

One offspring h={h1,h2,,,hn}\vec{h} = \{h_1,h_2,,\ldots,h_n\} is generated where hih_i is a uniformly randomly generated value in the interval [xi[1],xi[2]][x_i^{[1]},x_i^{[2]}]. If xi[1]<xi[2]x_i^{[1]}<x_i^{[2]} or [xi[2],xi[1]][x_i^{[2]},x_i^{[1]}] if xi[2]<xi[1]x_i^{[2]} < x_i^{[1]}

Simple Crossover

A crossover point i{1,,n}i \in \{1,\ldots,n\} is randomly chosen and the variables beyond this point are swapped to create to new offspring:

h1={x1[1],x2[1],,xi[1],xi+1[2],,xn[2]}\vec{h_1} = \{ x_1^{[1]},x_2^{[1]},\ldots,x_i^{[1]},x_{i+1}^{[2]},\ldots,x_n^{[2]}\}
h2={x1[2],x1[2],,xi[2],xi+1[1],,xn[1]}\vec{h_2} = \{ x_1^{[2]},x_1^{[2]},\ldots,x_i^{[2]},x_{i+1}^{[1]},\ldots,x_n^{[1]}\}

Whole arithmetic crossover

Two offspring hk{hik,h2k,,hnk}\vec{h_k} - \{h_i^k, h_2^k, \ldots, h_n^k\}, k=1,2k = 1,2 are generated, where hi[1]=αxi[1]+(1α)xi[2]h_i^{[1]} = \alpha x_i^{[1]} + (1-\alpha)x_i^{[2]} and hi[2]=αxi[2]+(1α)xi[i]h_i^{[2]} = \alpha x_i^{[2]} + (1-\alpha)x_i^{[i]} and parameter α\alpha is a random number in the range of [0,1][0,1]

Local arithmetic crossover

The same as whole arithmetic crossover, except αRn\alpha \in \R^n is a vector of which each element is random number in the range [0,1][0,1]

Single arithmetic crossover

Choose gene and then replace it with the arithmetic average of genes at the position of two parents, other genes are copied from the parents.

BLX-α\alpha crossover

An offspring s generated: h={h1,h2,,hn}\vec{h} = \{h_1,h_2,\ldots,h_n\} where hih_i is a uniformly randomly generated number over the interval [hminIα,hmax+Iα][h_{\min} - I \cdot \alpha, h_{\max} + I \cdot \alpha],
hmax=max(xi[1],xi[2])h_{\max} = \max(x_i^{[1]},x_i^{[2]}) , hmin=min(xi[1],xi[2])h_{\min} = \min(x_i^{[1]},x_i^{[2]}) and I=hmaxhminI = h_{\max} - h_{\min}