ECS 271, Machine Learning: Homework #4
Due: 29 April 2004


Instructor: Prof. Rao Vemuri, rvemuri@ucdavis.edu

1.      (60 points) Write a backpropagation program to recognize the 7 alphabetical characters stored in this file. I am not just interested in looking at your program listing per se; I am most interested in the results, their analysis, and such other things.

I suggest that you write your program and make it flexible enough that you can easily change certain parameters. Work with one hidden layer. I am especially interested in your ability to change (a) the number of hidden-layer neurons, (b) the learning parameter η, and (c) the encoding of TRUE and FALSE:

Case 1. Encode FALSE = 0 and TRUE = 1 for the input values.

Case 2. Encode FALSE = -1 and TRUE = +1 for the input values.

Remember to include the weight w0 as part of the training process, and set your x0 to +1 at all times.

Submit your code, the results of your runs, and your analysis and comments.
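As a starting point, here is a minimal sketch of a one-hidden-layer backpropagation network with the knobs the problem asks you to vary (number of hidden neurons, the learning parameter η, and the TRUE/FALSE encoding). This is an illustration only, not the assigned solution: the character data from the course file is replaced by a toy XOR dataset, and all function and variable names are my own.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, T, n_hidden=5, eta=0.5, epochs=2000):
    """On-line backpropagation for a single-hidden-layer network."""
    # Append the bias input x0 = +1 so w0 is trained like any other weight.
    X = np.hstack([np.ones((X.shape[0], 1)), X])
    n_in, n_out = X.shape[1], T.shape[1]
    W1 = rng.uniform(-0.5, 0.5, (n_hidden, n_in))       # input -> hidden
    W2 = rng.uniform(-0.5, 0.5, (n_out, n_hidden + 1))  # hidden(+bias) -> output
    for _ in range(epochs):
        for x, t in zip(X, T):
            h = sigmoid(W1 @ x)                          # hidden activations
            h = np.concatenate([[1.0], h])               # hidden bias unit
            y = sigmoid(W2 @ h)                          # network output
            delta_out = (y - t) * y * (1 - y)            # output-layer delta
            delta_hid = (W2[:, 1:].T @ delta_out) * h[1:] * (1 - h[1:])
            W2 -= eta * np.outer(delta_out, h)           # gradient-descent steps
            W1 -= eta * np.outer(delta_hid, x)
    return W1, W2

def predict(X, W1, W2):
    X = np.hstack([np.ones((X.shape[0], 1)), X])
    H = sigmoid(X @ W1.T)
    H = np.hstack([np.ones((H.shape[0], 1)), H])
    return sigmoid(H @ W2.T)

# Toy run using the Case 1 encoding (FALSE = 0, TRUE = 1) on XOR.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)
W1, W2 = train(X, T, n_hidden=4, eta=1.0, epochs=5000)
print((predict(X, W1, W2) > 0.5).astype(int).ravel())
```

For the actual assignment you would replace the toy arrays with the 7 character bitmaps, and rerun while varying `n_hidden`, `eta`, and the input encoding to compare Case 1 against Case 2.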

2.      (20 points) Start with the perceptron learning rule in vector notation, as shown below.

w(m+1) = w(m) + η ( t(m) - d(m) ) x(m)

 

That is, upon presentation of the m-th training pattern x(m), the weight vector is updated as shown, where t(m) is the target output and d(m) is the perceptron's actual output. Here η is a positive quantity called the learning parameter, or step size. Now prove the Perceptron Convergence Theorem, stated as follows.
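The update rule above can be transcribed directly into code. The sketch below (my own illustrative names, with ±1 targets and a sign-threshold output) stops once an epoch produces no mistakes, which the theorem guarantees will happen for linearly separable data:

```python
import numpy as np

def perceptron_train(X, t, eta=1.0, max_epochs=100):
    """Perceptron learning rule: w(m+1) = w(m) + eta * (t(m) - d(m)) * x(m)."""
    # Prepend x0 = +1 so the bias weight w0 is trained with the rest.
    X = np.hstack([np.ones((X.shape[0], 1)), X])
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        errors = 0
        for x, target in zip(X, t):
            d = 1.0 if w @ x >= 0 else -1.0   # perceptron output
            if d != target:                    # weights change only on a mistake
                w += eta * (target - d) * x    # the update rule above
                errors += 1
        if errors == 0:                        # a full pass with no errors: converged
            break
    return w

# Linearly separable example: the OR function with the +/-1 encoding.
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
t = np.array([-1, 1, 1, 1], dtype=float)
w = perceptron_train(X, t)
print(w)
```

Note that when t and d agree, t(m) - d(m) = 0 and the weights are untouched; the proof below only needs to consider the misclassification steps.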

 

Theorem: If a set of training patterns is linearly separable, then the linear perceptron learning algorithm shown above converges to a correct solution in a finite number of iterations.

 

Hint: To prove this, you may want to follow the steps shown below.

 

(a)    Assume that w* is the correct solution. After m iterations, show that

 

||w(m+1) - w*||^2  =  ||w(m) - w*||^2  +  η^2 ||x(m)||^2  -  2η ( |w*^T x(m)| + |w(m)^T x(m)| )

(b) Then give a cogent argument explaining why the error strictly decreases. That concludes the proof.

Hint: It is easy to prove if you stay with the vector notation.


3. (20 points) Assuming that w* is known ahead of time, find an expression for the optimal step size η.