Natural Computation Methods for Machine Learning Note 13

2020年3月20日 2873点热度 0人点赞 0条评论

Natural Computation Methods for Machine Learning Note 13

Properties of EC

  • EC is parallel search (many balls in the landscape).
    • Want to prevent all individuals from converging to the same
      solution (premature convergence – a common problem)
    • The task of mutation is to prevent this – to maintain diversity in the
      population. To inject new material in the population.
  • EC is unlikely to get stuck in local optima.
    • due to the parallel search, and since it does not follow gradients
    • mutation helps
  • EC does not require the objective function (nor any other function) to be differentiable.
    • The objective function (fitness) is used only in selection of individuals, not to decide how to change them.
  • RL is perhaps more clever in its search (more directed), but EC compensates by its more global view.

EC (in particular GA) can be a very effective way to search for the best combination of parameters to control/define something.

An ANN is a parameterized system ...

EC and neural networks

EC (in particular GA) can be a very effective way to search for the best combination of parameters to control/define something.

An ANN is a parameterized system ==> Neuro evolution

  • GA can be used to evolve, rather than train, artificial neural networks
    • (crossover easily becomes destructive, though, explain why)
  • Can evolve structure as well, not only weight values
  • Google this: NeuroEvolution of Augmented Topologies (NEAT)

  • Could combine the two forms of learning – evolve initial network configurations and training parameters, then train the networks using neural learning algorithms (≈ Lamarckian evolution).
  • EC does not have to follow natural laws – could have more than two parents, Lamarckian evolution (where learned experience is hereditary), etc.

EC Issues

  • Generational or steady state? (last lecture taught the generational view)
  • Population size, constant or variable (if so, how?)
  • Introns, useful or not?
  • The importance of mutation
  • Is crossover constructive or just a form of macro mutation?
  • Crossover moves parts of genotypes – finding a good encoding under this operator can be difficult.

    • Is a substring moved from A to B still meaningful in B?
  • Evaluation (the fitness function) should be simple and efficient
    • This is where EC spends most computation time
    • Necessary to evaluate the whole population every generation?

Genetic programming

GP = EC where genotype = parse tree (of a programming language)

Example: The expression 5 + 3x


GP operates directly on (subsets of) computer programs => semi automatic programming.

Crossover and mutation in GP

Crossover: Swap randomly selected subtrees in the two parents


Many forms of mutation

  • Function node: replace function with new of same arity
  • Terminal node: replace terminal
  • Swap: swap arguments of a function
  • Grow: replace node by randomly generated subtree
  • Gaussian: jog constants
  • Trunc: replace node by terminal

Target languages

  • Any language can be used, but Lisp is particularly suitable (explain why
  • Assembler/machine code (AIM-GP by Nordin et al) (project idea)
    • assembler instruction = integers
    • store in an unsigned integer array
    • cast to function pointer and call it
  • Redcode (Corewars)

Most methods are very specialized for its particular target language

Decision trees (common)

Here, GP has been used to evolve computer network protocols

Other applications: See (not updated since 2007
though). gives a better overview (but few applications)

Advanced topics

Automatically Defined Functions (ADF), how to deal with types, if-statements etc.

Grammatical evolution (GE) (Ryan & O'Neill)

GP methods often specific to their target language.

GE = language independent genetic programming

Takes a BNF grammar as input ==> can generate programs in any language!

Genetic operators, as in GA – the genotype is just a sequence of numbers


The meaning of codons (integers) is extremely context dependent.

Crossover destructive (variants that address this problem exist)

Possible application: Grammatical Swarm – Train the GE with a particle swarm instead of GA (has no crossover).

Dong Wang

I will work as a PhD student of TU Graz in Austria. My research interests include Embedded/Edge AI, federated learning, computer vision, and IoT.