Natural Computation Methods for Machine Learning Note 13

2020年3月20日 2431点热度 0人点赞 0条评论

Natural Computation Methods for Machine Learning Note 13

Properties of EC

  • EC is parallel search (many balls in the landscape).
    • Want to prevent all individuals from converging to the same
      solution (premature convergence – a common problem)
    • The task of mutation is to prevent this – to maintain diversity in the
      population. To inject new material in the population.
  • EC is unlikely to get stuck in local optima.
    • due to the parallel search, and since it does not follow gradients
    • mutation helps
  • EC does not require the objective function (nor any other function) to be differentiable.
    • The objective function (fitness) is used only in selection of individuals, not to decide how to change them.
  • RL is perhaps more clever in its search (more directed), but EC compensates by its more global view.

EC (in particular GA) can be a very effective way to search for the best combination of parameters to control/define something.

An ANN is a parameterized system ...

EC and neural networks

EC (in particular GA) can be a very effective way to search for the best combination of parameters to control/define something.

An ANN is a parameterized system ==> Neuro evolution

  • GA can be used to evolve, rather than train, artificial neural networks
    • (crossover easily becomes destructive, though, explain why)
  • Can evolve structure as well, not only weight values
  • Google this: NeuroEvolution of Augmented Topologies (NEAT)

  • Could combine the two forms of learning – evolve initial network configurations and training parameters, then train the networks using neural learning algorithms (≈ Lamarckian evolution).
  • EC does not have to follow natural laws – could have more than two parents, Lamarckian evolution (where learned experience is hereditary), etc.

EC Issues

  • Generational or steady state? (last lecture taught the generational view)
  • Population size, constant or variable (if so, how?)
  • Introns, useful or not?
  • The importance of mutation
  • Is crossover constructive or just a form of macro mutation?
  • Crossover moves parts of genotypes – finding a good encoding under this operator can be difficult.

    • Is a substring moved from A to B still meaningful in B?
  • Evaluation (the fitness function) should be simple and efficient
    • This is where EC spends most computation time
    • Necessary to evaluate the whole population every generation?

Genetic programming

GP = EC where genotype = parse tree (of a programming language)

Example: The expression 5 + 3x


GP operates directly on (subsets of) computer programs => semi automatic programming.

Crossover and mutation in GP

Crossover: Swap randomly selected subtrees in the two parents


Many forms of mutation

  • Function node: replace function with new of same arity
  • Terminal node: replace terminal
  • Swap: swap arguments of a function
  • Grow: replace node by randomly generated subtree
  • Gaussian: jog constants
  • Trunc: replace node by terminal

Target languages

  • Any language can be used, but Lisp is particularly suitable (explain why
  • Assembler/machine code (AIM-GP by Nordin et al) (project idea)
    • assembler instruction = integers
    • store in an unsigned integer array
    • cast to function pointer and call it
  • Redcode (Corewars)

Most methods are very specialized for its particular target language

Decision trees (common)

Here, GP has been used to evolve computer network protocols

Other applications: See (not updated since 2007
though). gives a better overview (but few applications)

Advanced topics

Automatically Defined Functions (ADF), how to deal with types, if-statements etc.

Grammatical evolution (GE) (Ryan & O'Neill)

GP methods often specific to their target language.

GE = language independent genetic programming

Takes a BNF grammar as input ==> can generate programs in any language!

Genetic operators, as in GA – the genotype is just a sequence of numbers


The meaning of codons (integers) is extremely context dependent.

Crossover destructive (variants that address this problem exist)

Possible application: Grammatical Swarm – Train the GE with a particle swarm instead of GA (has no crossover).

Dong Wang

Master student of computer science at Uppsala University in Sweden. My primary research interests are deep learning, computer vision, federated learning and internet-of-things.