Question: What is AI?
Artificial Intelligence: It is not the art of making computers behave as they do in the movies. It is inspired by the brain, but the analogy should not be pushed too far. We usually regard it as a black box. Opening this box, we see a network of simple processing units (nodes/neurons) working in parallel. The key word here is 'parallel'. According to the 100-step rule (neurons need milliseconds per operation, yet tasks such as recognition take only a few tenths of a second, which leaves room for roughly 100 sequential steps), it is parallelism that matters, not the speed of the individual units.
An artificial neuron consists of:
- Activation function / step function (it can be discrete or continuous)
For example, a binary neuron looks something like this,
where f is a binary activation function, x_1, x_2, \cdots, x_n are the inputs, and w_1, w_2, \cdots, w_n are the weights. A binary neuron may very well have more than two inputs.
The activation function can be,
[Figures: node structures of AND, OR, and NAND neurons.]
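Since the figures are not reproduced here, a minimal sketch may help. It assumes the usual form of a binary neuron, output = f(w_1 x_1 + ... + w_n x_n), with f a step function that fires when the weighted sum reaches a threshold. The function names, the weights, and the thresholds below are hand-picked for illustration and are not taken from the original figures.

```python
def step(net, threshold):
    """Binary activation: 1 if the net input reaches the threshold, else 0."""
    return 1 if net >= threshold else 0


def binary_neuron(inputs, weights, threshold):
    """Weighted sum of the inputs followed by the step activation."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return step(net, threshold)


# Assumed (weights, threshold) pairs; many other choices work equally well.
GATES = {
    "AND": ([1, 1], 2),      # fires only when both inputs are 1
    "OR": ([1, 1], 1),       # fires when at least one input is 1
    "NAND": ([-1, -1], -1),  # fires unless both inputs are 1
}


def truth_table(gate):
    """Outputs for the inputs (0,0), (0,1), (1,0), (1,1), in that order."""
    weights, threshold = GATES[gate]
    return [binary_neuron((a, b), weights, threshold)
            for a in (0, 1) for b in (0, 1)]


for name in GATES:
    print(name, truth_table(name))
```

The same node structure realizes all three gates; only the weights and the threshold differ, which is where the information is stored.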
- Common properties of ANNs
- Information is stored in the connections (as weights), not in the nodes.
- ANNs are trained (by modifying the weights), not programmed. [Motivate the advantages of this (e.g. first lab, Volvo's car engines)]
- Ability to generalize, i.e. to work in situations slightly different from those seen before (without retraining).
- Adaptivity, i.e. ability to adapt to new circumstances (by retraining).
- Fault tolerance
Two examples: Hebb's rule (if two nodes are active at the same time, reinforce the connection between them) and Rosenblatt's Perceptron Convergence Procedure.
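Both rules can be sketched in a few lines. The version below is an illustration under stated assumptions, not the exact formulation used in the course: the Hebbian update is the simple product form, the perceptron uses a bias term with a threshold at zero, and the learning rates, function names, and the AND training set are all chosen for this example.

```python
def hebb_update(weight, pre, post, rate=0.1):
    """Hebb's rule: the weight grows only when both nodes are active."""
    return weight + rate * pre * post


def predict(weights, bias, inputs):
    """Binary perceptron output with the threshold folded into the bias."""
    net = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= 0 else 0


def train_perceptron(samples, rate=1, max_epochs=100):
    """Perceptron learning: correct the weights only after a wrong answer."""
    n_inputs = len(samples[0][0])
    weights, bias = [0] * n_inputs, 0
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in samples:
            delta = target - predict(weights, bias, inputs)  # -1, 0 or +1
            if delta != 0:
                errors += 1
                weights = [w + rate * delta * x
                           for w, x in zip(weights, inputs)]
                bias += rate * delta
        if errors == 0:  # a full pass without mistakes: training is done
            break
    return weights, bias


# Assumed toy task: learn the AND function from its truth table.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_SAMPLES)
print(weights, bias)
```

Note the difference between the two: the Hebbian update needs no target value, while the perceptron update is driven by the error between the target and the actual output.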
Learning to imitate
Examples: PCP (Perceptron Convergence Procedure), backpropagation, intro lab, learning to walk by copying a teacher's gait.
Learning by trial and error
Examples: Q-learning, playing a game (you may learn the
rules from a teacher, but you learn to play well by playing over and over again)
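As a rough sketch of trial-and-error learning, the following tabular Q-learning example uses a made-up task: a corridor of five states where only reaching the rightmost state is rewarded. The environment, the parameter values (`alpha`, `gamma`, the number of episodes), and all names are assumptions chosen for illustration.

```python
import random

N_STATES = 5        # states 0..4; state 4 is the goal (assumed toy task)
ACTIONS = (-1, +1)  # step left or step right


def move(state, action):
    """Walk along the corridor; reward 1 only when the goal is reached."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values purely from repeated play."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore at random; Q-learning still learns the greedy values.
            a = rng.randrange(len(ACTIONS))
            nxt, reward, done = move(state, ACTIONS[a])
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


q_table = q_learning()
# Greedy policy for the non-terminal states: the best action in each state.
policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
print(policy)
```

No teacher ever tells the learner which move is correct; it only sees the reward at the end and has to propagate that information backwards through repeated play.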
Unsupervised learning (UL)
Examples: Hebb, recognizing similarities, topological maps
Connection strategies (architectures)
Feedforward networks (focus of this course)
Description, information flow
Applications: Classification, function approximation, perception
Training: Most often supervised using some variant of backprop (overview).
Common issues: Dimensioning, weight information
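To make the information flow concrete, here is a small sketch of a two-layer feedforward network built from binary threshold nodes. The weights are hand-picked (not trained) so that the hidden layer computes OR and NAND and the output node computes their AND, which gives XOR; the function names and the weight values are assumptions for illustration.

```python
def layer(inputs, weights, thresholds):
    """One layer of binary nodes; each row of weights feeds one node."""
    return [
        1 if sum(w * x for w, x in zip(row, inputs)) >= t else 0
        for row, t in zip(weights, thresholds)
    ]


# Hidden layer: node 0 computes OR, node 1 computes NAND (assumed weights).
HIDDEN_W, HIDDEN_T = [[1, 1], [-1, -1]], [1, -1]
# Output layer: a single AND node over the two hidden nodes.
OUTPUT_W, OUTPUT_T = [[1, 1]], [2]


def feedforward(inputs):
    """Information flows strictly from input to hidden to output."""
    hidden = layer(inputs, HIDDEN_W, HIDDEN_T)
    return layer(hidden, OUTPUT_W, OUTPUT_T)[0]


for a in (0, 1):
    for b in (0, 1):
        print((a, b), feedforward((a, b)))
```

XOR cannot be computed by a single binary node, so this also hints at the dimensioning issue: the number of layers and hidden nodes limits what the network can represent.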
//TODO need figure
Layered networks with recurrent connections between layers
Short-term memory; also used for sequential problems
LSTMs (Long Short-Term Memory), commonly used now, are also recurrent
but not layered in quite the same way
Applications: Recognizing/generating sequences of patterns. Linguistics.
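A minimal sketch of how a recurrent connection provides short-term memory: the network below is a small layered network of binary nodes whose output is fed back as an extra input at the next time step. With the hand-picked weights (an assumption for illustration, as are all names), it tracks the running parity of a bit sequence, something that requires remembering the past.

```python
def neuron(inputs, weights, threshold):
    """Binary threshold node."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= threshold else 0


def step_network(x, previous_output):
    """One time step: the previous output re-enters as a second input."""
    inputs = (x, previous_output)
    h_or = neuron(inputs, (1, 1), 1)       # hidden node computing OR
    h_nand = neuron(inputs, (-1, -1), -1)  # hidden node computing NAND
    return neuron((h_or, h_nand), (1, 1), 2)  # output node computing AND


def running_parity(sequence):
    """Feed a bit sequence through the network one element at a time."""
    output, outputs = 0, []
    for x in sequence:
        output = step_network(x, output)  # the fed-back output is the memory
        outputs.append(output)
    return outputs


print(running_parity([1, 0, 1, 1, 0]))
```

The same input bit produces different outputs depending on what came before, which is exactly what a purely feedforward network cannot do.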
Fully interconnected recurrent networks
Description, information flow
Applications: Associative memories, combinatorial optimization problems
Training: Often some version of Hebb
Common issues: Convergence, capacity.
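The associative-memory application can be sketched with a Hopfield-style network, which is one common instance of a fully interconnected recurrent network (the notes do not name a specific model, so this choice is an assumption). Patterns use the values +1/-1, the weights are set with a Hebbian rule, and recall updates one node at a time until nothing changes. The stored patterns and all names are made up for the example.

```python
def train_hebb(patterns):
    """Hebbian weights: reinforce connections between nodes that agree."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, state, max_sweeps=10):
    """Update nodes one at a time until the state stops changing."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            net = sum(w[i][j] * state[j] for j in range(len(state)))
            new_value = 1 if net >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:  # converged to a stable state
            break
    return state


# Two assumed patterns with eight nodes each.
PATTERNS = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
weights = train_hebb(PATTERNS)

noisy = [-1, 1, 1, 1, -1, -1, -1, -1]  # first pattern with one node flipped
print(recall(weights, noisy))
```

The two issues listed above show up directly in such a sketch: recall has to be iterated until it converges, and storing too many patterns in the same weights makes them interfere with each other.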
Question: Why neural networks?
Why not use statistics or a rule-based expert system?
- ANN is a statistical method! (not "model free" though, as sometimes said)
- Currently, neural networks outperform other methods for many applications, but they have been used for a long time for other reasons as well:
- Speed (at least if implemented in hardware)
- Economic reasons: Projects, interviewing experts, etc. (Example NETtalk: Three months vs. several years for DECtalk). Prototyping