**An Introduction to Artificial Intelligence**

Let's look at the algorithm neural networks use to learn, and then run through an example using a Java implementation. The following presents the basic idea:

1. Initialize the weightings.
   - Set each weighting on each perceptron in the network to a small random value, usually between -0.05 and +0.05.
2. Process a training example (feed forward).
   - Multiply each input value by its corresponding weighting.
   - Sum the weighted input values.
   - Pass the sum through a thresholding function to scale the output.
3. Calculate the error (back propagation).
   - For each output value, calculate the associated error.
   - Adjust each weighting in the network by an amount proportional to the error and the learning rate.
4. Continue processing training examples (go to Step 2) for some number of epochs.
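The steps above can be sketched for a single perceptron. The class below is a minimal standalone illustration of the feed-forward and weight-update steps; it is not the API of the nnet.jar package described next.

```java
// Minimal single perceptron, mirroring the steps above.
// Illustrative sketch only; not the nnet.jar API.
public class PerceptronStep {
    double[] weights;       // one weighting per input, plus a trailing bias weighting
    double learningRate;

    PerceptronStep(double[] weights, double learningRate) {
        this.weights = weights;
        this.learningRate = learningRate;
    }

    // Feed forward: weighted sum of the inputs, passed through a step threshold.
    double feedForward(double[] inputs) {
        double sum = weights[weights.length - 1];   // bias input fixed at 1.0
        for (int i = 0; i < inputs.length; i++)
            sum += inputs[i] * weights[i];
        return sum > 0.0 ? 1.0 : 0.0;               // step thresholding function
    }

    // Error correction: adjust each weighting in proportion to the error
    // and the learning rate.
    void update(double[] inputs, double target) {
        double error = target - feedForward(inputs);
        for (int i = 0; i < inputs.length; i++)
            weights[i] += learningRate * error * inputs[i];
        weights[weights.length - 1] += learningRate * error;  // bias weighting
    }
}
```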

You can download a flexible, easy-to-use neural network written in Java and released under the GNU GPL, complete with an accompanying test harness, here. Follow these steps to get it running:

- Place *nnet.jar* in a location of your choice.
- Add *nnet.jar* to your `CLASSPATH`.
  - If you're using a Bash shell, you can do it with `export CLASSPATH=$CLASSPATH:/path/to/file/nnet.jar`.
- Inspect the source of *testHarness.java* to see how it works.
- Run the *AND* example.
  - Ensure that the *AND* training data is uncommented in *testHarness.java*.
  - Compile *testHarness.java* with `javac testHarness.java`.
  - Type `java testHarness 100 0.05` (100 training epochs and a learning rate of 0.05) to run it.
  - Vary the learning rate and number of training epochs to get a feel for the tradeoff.
- Try out the *OR* example the same way by making the adjustments noted in the *testHarness.java* source code.

As you experiment with the *AND* and *OR* examples, you'll notice that lower learning rates require more training epochs, and vice versa. As you try to approximate more advanced functions with partial data sets, you generally want to keep the learning rate as low as practical: high enough to do the job, but no higher than it needs to be. This allows your network to adapt more easily to future training examples that may be unknown at present.
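To see the tradeoff concretely, the following standalone sketch (again, not part of the nnet.jar package) trains a single step-threshold perceptron on the *AND* data with the classic perceptron rule and counts the epochs needed to converge at a given learning rate:

```java
// Standalone sketch (not part of nnet.jar): a single step-threshold
// perceptron trained on AND with the classic perceptron learning rule.
public class RateDemo {
    // Returns the number of epochs until the perceptron classifies every
    // AND example correctly, or -1 if it hasn't converged after 10,000 epochs.
    static int epochsToConverge(double rate) {
        double[][] inputs  = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
        double[]   targets = {0, 0, 0, 1};
        double[]   w = {0.03, -0.02, 0.01};  // small initial weightings; w[2] is the bias
        for (int epoch = 1; epoch <= 10000; epoch++) {
            int errors = 0;
            for (int i = 0; i < inputs.length; i++) {
                double sum = w[0] * inputs[i][0] + w[1] * inputs[i][1] + w[2]; // bias input is 1.0
                double out = sum > 0 ? 1.0 : 0.0;                              // step threshold
                double err = targets[i] - out;
                if (err != 0) {
                    errors++;
                    w[0] += rate * err * inputs[i][0];
                    w[1] += rate * err * inputs[i][1];
                    w[2] += rate * err;
                }
            }
            if (errors == 0) return epoch;
        }
        return -1;
    }

    public static void main(String[] args) {
        for (double rate : new double[]{0.5, 0.05, 0.005})
            System.out.println("rate " + rate + ": " + epochsToConverge(rate) + " epochs");
    }
}
```

Because *AND* is linearly separable, the rule converges for any positive learning rate; comparing the epoch counts across rates gives a feel for how rate and epochs trade off.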

*The XOR function does not have linearly separable output.*

Let's take a look at a more complex function, the exclusive-or (*XOR*) function. This function produces an output if and only if exactly one of its two input values is equal to 1.0. If you try to use the same tactics that you used with the *AND* and *OR* functions, you'll frustrate yourself and eventually become very irritated. To approximate this nonlinear function (and avoid irritation), we'll need to add an additional layer to the network in between the input and output layers. This layer is often referred to as a "hidden" layer. To add this hidden layer to the network, make the following changes to *testHarness.java*:

- Change `int[] networkTopology = {3,1};` to `int[] networkTopology = {3,2,1};`.
- Change `int[] thresholdTopology = {TANH,STEP};` to `int[] thresholdTopology = {TANH,TANH,TANH};`.
- Uncomment the lines specifying the *XOR* training data.
- Recompile testHarness with `javac testHarness.java`.
- Type `java testHarness 1000 0.06` to run it, and note the effect of using *TANH* as the output layer's thresholding function.
- Mix up the learning rate and number of epochs to see the difference it makes.
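To see why the hidden layer makes the difference, here is a hand-weighted 2-2-1 network that computes *XOR*. It uses step thresholds and weights chosen by inspection rather than *TANH* units and learned weights, so it illustrates the topology, not what the training run produces:

```java
// Hand-weighted two-layer (2-2-1) network computing XOR with step thresholds.
// Weights chosen by inspection, not learned; illustrative only.
public class XorByHand {
    static double step(double x) { return x > 0.0 ? 1.0 : 0.0; }

    static double xor(double x1, double x2) {
        double h1 = step(x1 + x2 - 0.5);   // hidden unit: fires like OR
        double h2 = step(x1 + x2 - 1.5);   // hidden unit: fires like AND
        return step(h1 - h2 - 0.5);        // output: OR but not AND
    }
}
```

The first hidden unit fires when either input is on, the second only when both are, and the output fires when the first is on but the second is not; no single-layer perceptron can draw that boundary.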

Although the neural net package is flexible enough to accommodate more than one hidden layer and various thresholding functions, these parameters work fine. Feel free to try comparing the sigmoid and hyperbolic tangent functions if you want a better feel for how they work. Be advised that expected output values can vary considerably between the two.
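As a quick illustration of that difference, the snippet below evaluates the standard definitions of the two functions (not the package's own implementations) at a few points:

```java
public class Thresholds {
    // Standard logistic sigmoid: output range (0, 1).
    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    // Hyperbolic tangent: output range (-1, 1).
    static double tanh(double x) { return Math.tanh(x); }

    public static void main(String[] args) {
        for (double x : new double[]{-2.0, 0.0, 2.0})
            System.out.printf("x=%5.1f  sigmoid=%.3f  tanh=%.3f%n",
                              x, sigmoid(x), tanh(x));
    }
}
```

A strongly "off" unit settles near 0.0 under the sigmoid but near -1.0 under tanh, so training data written for one function may need its expected outputs adjusted for the other.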

*To represent the XOR function, the network needs a hidden layer.*