2014年10月18日 星期六

Discrete UMDA oneMax algorithm using Mateda2.0

This blog is about how to run a discrete UMDA for solving oneMax problem using Mateda2.0

The following is the matlab code.

PopSize = 10;  # no. of population
n = 3; # no. of ones, or length of the string of the oneMax problem
cache  = [1,1,1,1,1]; # matrix to record 1th one.Entire population, 2nd one. Selected population, 3rd one. Probabilistic model, 4th one. Fitness values of the entire population and 5th one. Fitness values of the selected population. Change the value from one to zero for turning off the logging.

Card = 2*ones(1,n); #vector of dimension of the discrete variables, or a vector showing number of possible values each node can hold, in this case, each node can hold either '0' or '1'


MaxGen=5; # Maximum number of generation to stop the algorithm

MaxVal=n; # Maximum value of the fitness function to stop the algorithm

stop_cond_params={MaxGen,MaxVal}; #Define the stopping criteria with the above two settings.


NumbVar=n; #Number of variables 

Cliques=CreateMarkovModel(NumbVar,0); #This creates a markov chain model with each variable/node depending on how many previous variables (i.e. specify by '0' in this case meaning all variables are independent.

edaparams{1}={'learning_method','LearnFDA',{Cliques}}; #define LearnFDA as the learning method, which computes the marginal frequencies for each variable.

edaparams{2} = {'sampling_method','SampleFDA',{PopSize}}; #define sampling method
edaparams{3} = {'stop_cond_method','maxgen_maxval',stop_cond_params}; #define stopping conditions
F = 'sum'; % Onemax function;
[AllStat,Cache]=RunEDA(PopSize,n,F,Card,cache,edaparams) #Run the EDA using above settings.

Below is the result:


General Settings

PopSize: 10
Number of variables: 3
Function: sum
seeding_pop_method:  RandomInit
Params: 
sampling_method:  SampleFDA
    'Params: '    [10]
repairing_method:  none
Params: 
local_opt_method:  none
Params: 
replacement_method:  best_elitism
Params: fitness_ordering
selection_method:  truncation_selection
    'Params: '    [0.5000]    'fitness_ordering'
learning_method:  LearnFDA
    'Params: '    [3x3 double]
statistics_method:  simple_pop_statistics
    'Params: '    'fitness_ordering'
verbose_method:  simple_verbose
Params: 
stop_cond_method:  maxgen_maxval
    'Params: '    [5]    [3]

*****************  Generation 1 ********************** 

Max objective values: 2 

Sum of max objective values: 2 
Mean objective values: 1.400000e+00 
Sum of mean objective values: 1.400000e+00 
Median objective values: 1 
Sum of median objective values: 1 
Min objective values: 1 
Sum of min objective values: 1 
Variance of the objective values: 2.666667e-01 
Best individual: 0 1 1 
Number of different individuals: 5 
Max values of the variables: 1 1 1 
Mean values of the variables: 2.000000e-01 7.000000e-01 5.000000e-01 
Median values of the variables: 0 1 5.000000e-01 
Min values of the variables: 0 0 0 
Variance of the variables: 1.777778e-01 2.333333e-01 2.777778e-01 
number of evaluations: 10 
Time: sampling 0, repairing 0, evaluation 0, local_opt 0, replacement 0, selection 0, learning 0, and total 1.560010e-02 

*****************  Generation 2 ********************** 

Max objective values: 3 

Sum of max objective values: 3 
Mean objective values: 2 
Sum of mean objective values: 2 
Median objective values: 2 
Sum of median objective values: 2 
Min objective values: 1 
Sum of min objective values: 1 
Variance of the objective values: 2.222222e-01 
Best individual: 1 1 1 <<<< best solution found then stop iteration
Number of different individuals: 5 
Max values of the variables: 1 1 1 
Mean values of the variables: 3.000000e-01 8.000000e-01 9.000000e-01 
Median values of the variables: 0 1 1 
Min values of the variables: 0 0 0 
Variance of the variables: 2.333333e-01 1.777778e-01 1.000000e-01 
number of evaluations: 20 
Time: sampling 3.120020e-02, repairing 0, evaluation 0, local_opt 0, replacement 0, selection 0, learning 0, and total 3.120020e-02 

We should see the following output matrices 
AllStat = 
    [5x1 double]    [1x3 double]    [5]    [5x3 double]    [10]    [1x8 double]
    [5x1 double]    [1x3 double]    [5]    [5x3 double]    [20]    [1x8 double]

AllStat stores information about population in each generation. For each generation, it provides six groups of information and we take the above case as an example such that:

1st generation
[2;1.40;1;1;0.267] [0,1,1] 5 5x3 double 10 [0,0,0,0,0,0,0,0.0156]
2nd generation
[3;2;2;1;0.22] [1,1,1] 5 5x3 double 20 [0.0312002000000007,0,0,0,0,0,0,0.0312]

There are two rows because two generations of population were computed only. For each row, it contains the following (Take the 1st row as an example):

1st item: [2;1.40;1;1;0.267] are max,mean,median,min and variance of the fitness function in that population
2nd item: [0,1,1] provides the best individual in the population
3rd item: 5 is the no.of different individuals in the population
4th item: 5x3 double provides a matrix of 5 rows x no.of variables (i.e.3). The five values in each rows provide max,mean,median,min and variance of each variables. Values found in the 1st generation:
(max)1 1 1
(mean)  0.20 0.70 0.50
(median)0 1 0.50
(min)   0 0 0

(var)   0.17 0.23 0.278
5th item: 10, this is no. of function evaluations until this generation. 
6th item: [0,0,0,0,0,0,0,0.0156], time for each of the steps in seconds:sampling, repairing, evaluation, local optimization, replacement, selection, learning and total. 

Cache = 

    [10x3 double]    [10x3 double]
    [ 5x3 double]    [ 5x3 double]
    { 1x2 cell  }    { 1x2 cell  }
    [10x1 double]    [10x1 double]
    [ 5x1 double]    [ 5x1 double]

The following is result of the cache:

10x3 double       10x3 double           (Entire population)
5x3 double                 5x3 double            (Selected population)
1x2 cell                 1x2 cell              (Probabilistic model)
[2;1;2;1;1;1;1;2;2;1] [3;2;2;2;2;2;2;2;2;1] (Fitness values of the population)

[2;2;2;2;1]         [3;2;2;2;2]           (Fitness values of selected pop)

Regarding the probabilistic model, I have the following:
1st generation

[0,1,1;0,1,2;0,1,3] This is the structure of the clique. This indicates three independent variables.
1x3 cell 

[0.714285714285714,0.285714285714286] This indicates probability of '0' and '1'
[0.285714285714286,0.714285714285714] ditto
[0.285714285714286,0.714285714285714] ditto

It is interesting to see the probability of getting '0' is higher than that of '1' in the 1st variable. Supposingly, '1' should get the higher probability. This situation is reversed meaning the probability of getting '1' for all variables becomes high when the number of generation is high, say 5.


Nevertheless, once the probabilistic model indicating bias to '1', it is not necessary gain higher probability, higher fitness value can easily be generated.


ENDS.













沒有留言: