Natural Computation Methods for Machine Learning Note 11

March 4, 2020


Let us first revisit competitive learning (CL) and SOFM.

Review

CL and SOFM

  • A node position (in the input space) is its weight vector, i.e. nodes move around in the same space as the data.
  • The purpose of unsupervised learning is (usually) to cluster for classification, or to model distributions (several nodes in high-density areas, fewer in other areas)

CL with bad (random) initialization (simulation)

  • The winner-takes-all scenario
  • Nodes which are never used (= no data in that Voronoi region)
  • Nodes which cover more than one cluster, and therefore move back and forth between them

Nodes moving back and forth between clusters = nodes that need help? Node "instability" as an error measure?

CL initialized from the data (simulation)

  • Better coverage. All nodes are used.
  • But still not optimal, unless we're very lucky. Bad distribution of nodes.

Rectangular distribution example (simulation)

What should we expect if we run CL with 10 nodes on this distribution? What if the distribution moves slowly? What if it jumps?

SOFM on the same data. SHOW grid lines. It should follow jumps, were it not for the decaying parameters.

SOFM (simulation)

SOFM forms a density function of the data. For example, uniformly distributed input data leads to uniformly distributed weight vectors. Three steps:

  1. Random initialization. Edges join nodes that are neighbours on the map.
  2. A fishing net: neighbours in the network (the grid) tend to react to the same data (a topological map).
  3. A stretched net, as the nodes move towards the different clusters.

Should two such nodes still be neighbours? Is it possible to have a dynamic neighbourhood, where edges can be created and destroyed? Rubber bands that snap?

Growing Neural Gas

Unsupervised growing algorithm for clustering.

  • Dynamic size – a growing/shrinking algorithm
    • though it grows faster than it shrinks
  • Dynamic neighbourhood – who is neighbour to whom is not fixed
    • defined by a graph (SHOW). Not weighted connections, just edges in a graph! (like the SOFM grid lines).
  • All parameters are constant
    • great for on-line learning and for following moving targets
    • GNG will even follow jumping targets

Three things to consider:

  • Which nodes to move (given an input), and by how much
  • How to define and update the neighbourhood graph
  • How to grow (when and where should we insert new nodes)

Which nodes to move

Move not only the winner (k), but also its (current) neighbours. The winner is moved using a much greater gain factor than the neighbours:
\begin{aligned}
\Delta w_{k} &= \varepsilon_{k}\left(x-w_{k}\right) \\
\Delta w_{n} &= \varepsilon_{n}\left(x-w_{n}\right), \qquad \varepsilon_{k} \gg \varepsilon_{n}
\end{aligned}

where k is the winner, n is a neighbour, x is the input vector and w is the node position (its weight vector). Both gain factors are constant. (note that n in the equations is a name, not an index)

Neighbourhood (and shrinking)

All current neighbours are connected by edges in a graph.
Each edge has an associated age.

For each input vector:

  1. Find the closest node (k, the winner), and the second closest (r).
  2. If k and r are not already connected (neighbours), connect them with a new edge
  3. Set the age of the edge between k and r to 0.
  4. The age of all other edges from k is incremented by one.
    • If any edge becomes too old (> a_max), remove it. If any node loses its last edge this way, remove that node as well.

Growth

Every time a winner, k, is found, add the distance (from the input) to a local error variable.

(The error is proportional to the accumulated distance this node has moved, as a winner.)

At fixed time intervals, insert a new node where it is most likely needed:

  • halfway between the node with the largest error, and the node among its current neighbours with the largest error.

The error of a node is decayed over time (this is not a decaying parameter, though)

See the algorithm in Fritzke's paper (in the handouts).
Possible problem: Dead units (show using a jumping rectangular distribution) – could be an advantage, though

Extension: GNG-U. Removes nodes with low utility, based on frequency of winning and closeness to other nodes.

Show Delaunay triangulation and that GNG forms a subset of this.

GNG is used mostly for modelling distributions. It can, for example, be used to train the hidden nodes in RBF networks.

Tags: competitive learning, Natural Computation, unsupervised learning
Last updated: March 4, 2020

Dong Wang

I am a PhD student of TU Graz in Austria. My research interests include Embedded/Edge AI, efficient machine learning, model sparsity, deep learning, computer vision, and IoT. I would like to understand the foundational problems in deep learning.


