An Introduction to Artificial Intelligence

Let's look at the algorithm neural networks use to learn, and then run through an example using a Java implementation. The following presents the basic idea:

  1. Initialize weightings.
    • Set each weighting on each perceptron in the network to a small random value, usually between -0.05 and +0.05.
  2. Process a training example (feed forward).
    • Multiply each input value by its corresponding weighting.
    • Sum the weighted input values.
    • Pass the weighted sum through a thresholding function to scale the output.
  3. Calculate the error (back propagation).
    • For each output value, calculate the associated error.
    • Adjust each weighting in the network by an amount proportional to the error and learning rate.
  4. Continue processing all training examples for some number of epochs (go to Step 2).
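
The four steps above can be sketched for a single perceptron as follows. This is a minimal illustration, not the API of the downloadable package: the class name PerceptronSketch, its method names, the fixed random seed, and the convention of appending a constant 1.0 bias input to each example are all assumptions made for this example.

```java
import java.util.Random;

// Hypothetical single-perceptron sketch of the four steps; not the nnet.jar API.
class PerceptronSketch {

    // Step thresholding function: fire (1.0) only when the weighted sum is positive.
    static double step(double sum) {
        return sum > 0.0 ? 1.0 : 0.0;
    }

    // Step 2 (feed forward): multiply inputs by weightings, sum, then threshold.
    static double feedForward(double[] weights, double[] inputs) {
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * inputs[i];
        }
        return step(sum);
    }

    static double[] train(double[][] inputs, double[] targets,
                          int epochs, double learningRate, long seed) {
        Random random = new Random(seed);
        double[] weights = new double[inputs[0].length];

        // Step 1: initialize each weighting to a small random value in [-0.05, 0.05).
        for (int i = 0; i < weights.length; i++) {
            weights[i] = (random.nextDouble() * 2.0 - 1.0) * 0.05;
        }

        // Step 4: repeat over all training examples for a fixed number of epochs.
        for (int epoch = 0; epoch < epochs; epoch++) {
            for (int n = 0; n < inputs.length; n++) {
                // Step 3: the error is the expected output minus the actual output.
                double error = targets[n] - feedForward(weights, inputs[n]);
                // Adjust each weighting in proportion to the error and learning rate.
                for (int i = 0; i < weights.length; i++) {
                    weights[i] += learningRate * error * inputs[n][i];
                }
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        // AND training data; the third column is the assumed constant bias input.
        double[][] inputs = {{0, 0, 1}, {0, 1, 1}, {1, 0, 1}, {1, 1, 1}};
        double[] targets = {0, 0, 0, 1};
        double[] weights = train(inputs, targets, 100, 0.05, 42L);
        for (double[] in : inputs) {
            System.out.println((int) in[0] + " AND " + (int) in[1]
                    + " -> " + feedForward(weights, in));
        }
    }
}
```

Because AND is linearly separable, this simple update rule settles on a correct set of weightings well within 100 epochs at a learning rate of 0.05. A multi-layer network needs true back propagation, which also passes the error back through the hidden layers; the sketch covers only the single-layer case.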

You can download a flexible and easy-to-use neural network written in Java and released under the GNU GPL, complete with an accompanying test harness, here. Follow these steps to get it running:

  • Place nnet.jar in a location of your choice.
  • Add nnet.jar to your CLASSPATH.
    • If you're using a Bash shell, you can do it with export CLASSPATH=$CLASSPATH:/path/to/file/nnet.jar.
  • Inspect the source of to see how it works.
  • Run the AND example.
    • Ensure that the AND training data is uncommented in
    • Compile with javac
    • Type java testHarness 100 0.05 (100 training epochs and a learning rate of 0.05) to run it.
    • Vary the learning rate and number of training epochs to get a feel for the tradeoff.
  • Try out the OR example the same way by making the adjustments noted in the source code.

As you experiment with the AND and OR examples, you'll notice that lower learning rates require more training epochs, and vice versa. As you try to approximate more advanced functions with partial data sets, you generally want to keep the learning rate as low as possible: high enough to do the job in a reasonable number of epochs, but no higher than it needs to be. This allows your network to adapt more easily to future training examples that may be unknown at the present time.
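
To get a feel for the tradeoff without the full network, here is a deliberately tiny sketch: one linear unit with a single weighting, trained on a single example (input 1.0, expected output 1.0). The class LearningRateSketch and its method are hypothetical names for this illustration and are not part of the downloadable package. In this toy setup, each update shrinks the remaining error by a factor of one minus the learning rate, so a smaller rate needs more epochs to reach the same tolerance.

```java
// Hypothetical toy example: one linear unit, one weighting, one training example.
class LearningRateSketch {

    // Returns the number of epochs needed before the absolute error drops
    // below the tolerance, or -1 if that never happens within maxEpochs.
    static int epochsToConverge(double learningRate, double tolerance, int maxEpochs) {
        double weight = 0.0;   // starting weighting
        double input = 1.0;    // the single training input
        double target = 1.0;   // the expected output for that input

        for (int epoch = 1; epoch <= maxEpochs; epoch++) {
            double error = target - weight * input;
            // Adjust the weighting in proportion to the error and learning rate.
            weight += learningRate * error * input;
            if (Math.abs(target - weight * input) < tolerance) {
                return epoch;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        double[] rates = {0.5, 0.05, 0.005};
        for (double rate : rates) {
            System.out.println("learning rate " + rate + ": "
                    + epochsToConverge(rate, 0.01, 10000) + " epochs");
        }
    }
}
```

A real network with several weightings and a nonlinear thresholding function behaves less tidily, but the general trend is the same one you see when varying the test harness arguments.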

Figure: XOR function. The XOR function does not have linearly separable output.

Let's take a look at a more complex function, the exclusive-or (XOR) function. This function produces an output of 1.0 if and only if exactly one of its two input values is equal to 1.0. If you try the same tactics that you used with the AND and OR functions, you'll only frustrate yourself, because no single straight line separates the inputs that should produce 1.0 from those that should not. To approximate this nonlinear function (and avoid irritation), we'll need to add an additional layer to the network between the input and output layers. This layer is often referred to as a "hidden" layer. To add this hidden layer to the network, make the following changes to the test harness source:

  • Change int[] networkTopology = {3,1}; to int[] networkTopology = {3,2,1};.
  • Change int[] thresholdTopology = {TANH,STEP}; to int[] thresholdTopology = {TANH,TANH,TANH};.
  • Uncomment the lines specifying the XOR training data.
  • Recompile testHarness: javac
  • Type java testHarness 1000 0.06 to run it, and note the effect of using TANH as the thresholding function on the output layer.
  • Vary the learning rate and number of epochs to see the difference they make.
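
To see why two hidden perceptrons are enough, the sketch below wires up the same three-layer shape by hand instead of training it. The weightings are picked manually, the step function is used throughout for readability (the changes above use TANH), and the constant 1.0 bias input on each layer is an assumption of this example. XorTopologySketch is a hypothetical class and does not come from the downloadable package.

```java
// Hypothetical hand-wired network: 2 inputs plus bias, 2 hidden units, 1 output.
class XorTopologySketch {

    // Step thresholding function: fire (1.0) only when the weighted sum is positive.
    static double step(double sum) {
        return sum > 0.0 ? 1.0 : 0.0;
    }

    // One perceptron: weighted sum of the inputs followed by the step function.
    static double neuron(double[] weights, double[] inputs) {
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * inputs[i];
        }
        return step(sum);
    }

    static double xor(double x1, double x2) {
        double[] in = {x1, x2, 1.0};                                   // inputs plus bias
        double either = neuron(new double[] {1.0, 1.0, -0.5}, in);     // hidden unit acting as OR
        double both = neuron(new double[] {1.0, 1.0, -1.5}, in);       // hidden unit acting as AND
        double[] hidden = {either, both, 1.0};                         // hidden outputs plus bias
        // Output unit fires when "either" is on and "both" is off.
        return neuron(new double[] {1.0, -1.0, -0.5}, hidden);
    }

    public static void main(String[] args) {
        double[][] cases = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
        for (double[] c : cases) {
            System.out.println((int) c[0] + " XOR " + (int) c[1]
                    + " -> " + xor(c[0], c[1]));
        }
    }
}
```

Each hidden unit draws one straight line through the input space, and the output unit combines the two regions. Training has to discover weightings that play a similar role, which is why the XOR run needs more epochs than AND or OR.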

Although the neural net package is flexible enough to accommodate more than one hidden layer and various thresholding functions, these parameters work fine. Feel free to compare the sigmoid and hyperbolic tangent functions if you want a better feel for how they work. Be advised that expected output values differ considerably between the two: the sigmoid produces values between 0 and 1, while the hyperbolic tangent produces values between -1 and 1.
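
For comparison, here is a small sketch of the three thresholding functions written with the standard java.lang.Math methods. The class ThresholdSketch is a hypothetical helper for this illustration; the package's own implementations of STEP and TANH may differ in detail.

```java
// Hypothetical helper comparing common thresholding functions.
class ThresholdSketch {

    // Hard threshold: output is exactly 0.0 or 1.0.
    static double step(double x) {
        return x > 0.0 ? 1.0 : 0.0;
    }

    // Logistic sigmoid: smooth output strictly between 0 and 1.
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // Hyperbolic tangent: smooth output strictly between -1 and 1.
    static double tanh(double x) {
        return Math.tanh(x);
    }

    public static void main(String[] args) {
        double[] samples = {-2.0, 0.0, 2.0};
        for (double x : samples) {
            System.out.println("x=" + x
                    + "  step=" + step(x)
                    + "  sigmoid=" + sigmoid(x)
                    + "  tanh=" + tanh(x));
        }
    }
}
```

Because the hyperbolic tangent is a rescaled sigmoid, tanh(x) = 2 * sigmoid(2x) - 1, training data written for one function generally needs its expected outputs remapped before it is used with the other.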

Figure: XOR topology. To represent the XOR function, the network needs a hidden layer.
