Natural Computation Methods for Machine Learning Note 03

Pattern recognition and the Perceptron

In this lecture, I learned about basic pattern recognition, the perceptron, and an overview of how to train/adjust the perceptron.

Pattern recognition

Here are some basic terms we should know.

pattern recognition = feature extraction + classification

Feature extraction = finding "good" features for classification => a feature vector X (this step is very sensitive to assumptions).

Classification = finding a discriminant that separates the classes (there is an infinite number of possible solutions).

Example: nearest-neighbour classifiers
Classify the unknown sample (vector) X according to the majority class among its k nearest training samples.

How do we measure the distance to decide which samples are 'nearest'?

Distance measures

Define the distance between two vectors

a = (a_1, a_2, \cdots, a_n) \text{ and } b = (b_1, b_2, \cdots, b_n)

using the l_p norm

l_p(\bar{x}) = \left( \sum_{i=1}^{n} |x_i|^p \right)^{\frac{1}{p}}

of their difference, i.e. the distance is l_p(a - b).

In particular, l_2 gives the Euclidean distance and l_1 the city-block (Manhattan) distance.
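As a concrete illustration, here is a minimal Python sketch of the l_p distance and a nearest-neighbour classifier built on it (not from the lecture; the names lp_distance and knn_classify and the toy data are my own):

```python
from collections import Counter

def lp_distance(a, b, p=2):
    """l_p distance between two equal-length vectors a and b."""
    return sum(abs(ai - bi) ** p for ai, bi in zip(a, b)) ** (1.0 / p)

def knn_classify(x, samples, labels, k=3, p=2):
    """Classify x by the majority class among its k nearest samples."""
    order = sorted(range(len(samples)), key=lambda i: lp_distance(x, samples[i], p))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# p=2 gives the Euclidean distance, p=1 the city-block (Manhattan) distance.
samples = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1)]
labels = ["A", "A", "B", "B"]
print(knn_classify((0.8, 0.9), samples, labels, k=3))  # -> B
```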
When the perceptron is used as a classifier, we have

f = f_h(S) = \begin{cases} 1 & \text{if } S > 0 \\ 0 & \text{if } S \leq 0 \end{cases}

S = \sum_{i=1}^n w_i x_i - \theta = \sum_{i=0}^n w_i x_i \quad \text{where} \quad \begin{cases} x_0 = -1 \\ w_0 = \theta \end{cases}

\theta is the bias/threshold, and writing S = \sum_{i=0}^n w_i x_i with x_0 = -1 and w_0 = \theta is the augmented vector notation. This defines a hyperplane in the n-dimensional input space.
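A small sketch of the perceptron output in this augmented-vector notation, assuming the step function f_h above (1 if S > 0, else 0); the function name and the AND example are illustrative choices of mine:

```python
def perceptron_output(weights, x, theta):
    """f_h(S) with S = sum_i w_i*x_i - theta, written via the
    augmented vector: prepend x_0 = -1 and w_0 = theta."""
    aug_x = [-1.0] + list(x)           # x_0 = -1
    aug_w = [theta] + list(weights)    # w_0 = theta
    s = sum(w * xi for w, xi in zip(aug_w, aug_x))
    return 1 if s > 0 else 0

# Example: weights and threshold chosen so the unit computes logical AND.
print(perceptron_output([1.0, 1.0], (1, 1), theta=1.5))  # -> 1
print(perceptron_output([1.0, 1.0], (1, 0), theta=1.5))  # -> 0
```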

Let us consider a hyperplane example (2D):

S = \sum_{i=1}^{2} w_i x_i - \theta = w_1 x_1 + w_2 x_2 - \theta

The discriminant is found by setting S = 0: w_1 x_1 + w_2 x_2 - \theta = 0 \Rightarrow x_2 = \frac{\theta - w_1 x_1}{w_2} = -\frac{w_1}{w_2} x_1 + \frac{\theta}{w_2} = k x_1 + m. This is a line.

(TODO: there should be 3 figures here.)
Conclusion: The weights define the position and slope of the line (in the general case, a hyperplane). The threshold (\theta) moves the hyperplane.
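A quick numeric sketch of this 2D case (the weight and threshold values below are made up for illustration):

```python
# Decision line x_2 = k*x_1 + m with k = -w_1/w_2 and m = theta/w_2.
w1, w2, theta = 2.0, 1.0, 1.0          # made-up values for illustration
k, m = -w1 / w2, theta / w2
print(f"x2 = {k} * x1 + {m}")          # x2 = -2.0 * x1 + 1.0

# With these values (w2 > 0), a point above the line gives S > 0 (class 1).
x1, x2 = 0.0, 2.0                      # above the line, which passes (0, 1)
S = w1 * x1 + w2 * x2 - theta
print(S > 0)                           # True
```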

Training (adjusting the line automatically)

We have a number of pairs (x, d) of feature vectors (x) and desired responses (d).

For each such pair, compare the desired response d with the perceptron output y:

If y = d, do nothing.

If y = 0 and d = 1, reinforce the connections (to increase the weighted sum).

If y = 1 and d = 0, weaken the connections (to decrease the weighted sum).

(Reinforce/weaken = add/subtract the corresponding input value.)

But when to stop?

Multiply the weight change by a gain factor / learning rate / step length \eta, where 0 \leq \eta \leq 1:

\Delta w_i = \eta \, \delta \, x_i \quad \text{where } \delta = d - y
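A minimal sketch of this training rule in Python (the function name, the stopping criterion of one error-free pass, and the OR example are my own choices, not from the lecture):

```python
def train_perceptron(data, eta=0.1, max_epochs=100):
    """data: list of (x, d) pairs, x a tuple of inputs, d in {0, 1}."""
    n = len(data[0][0])
    w = [0.0] * (n + 1)                    # w[0] plays the role of theta
    for _ in range(max_epochs):
        errors = 0
        for x, d in data:
            aug_x = [-1.0] + list(x)       # augmented input, x_0 = -1
            s = sum(wi * xi for wi, xi in zip(w, aug_x))
            y = 1 if s > 0 else 0
            delta = d - y
            if delta != 0:                 # apply delta_w_i = eta*delta*x_i
                errors += 1
                for i in range(len(w)):
                    w[i] += eta * delta * aug_x[i]
        if errors == 0:                    # one error-free pass: stop
            break
    return w

# Logical OR is linearly separable, so training converges.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
print(train_perceptron(data))
```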

Perceptron

The algorithm converges to an optimal discriminant in a finite number of steps, if such a discriminant exists (which is not always the case).

Limitations
The perceptron requires linear separability; e.g. a single perceptron cannot solve XOR.

In the following figure, multiple perceptrons are combined (a multi-layer perceptron) to solve XOR.

[Figure: XOR solved by a multi-layer perceptron (XOR-MLP)]
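Since the figure did not survive extraction, here is a small sketch of how two layers of step units can solve XOR; the particular weights and thresholds are hand-picked for illustration and need not match the figure:

```python
def unit(weights, x, theta):
    """One step unit: 1 if sum_i w_i*x_i - theta > 0, else 0."""
    s = sum(w * xi for w, xi in zip(weights, x)) - theta
    return 1 if s > 0 else 0

def xor_mlp(x1, x2):
    h1 = unit([1, 1], (x1, x2), theta=0.5)     # behaves like OR
    h2 = unit([1, 1], (x1, x2), theta=1.5)     # behaves like AND
    return unit([1, -1], (h1, h2), theta=0.5)  # OR and-not AND = XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_mlp(a, b))  # prints the XOR truth table
```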
