


An Introduction to Artificial Intelligence

by Matthew Russell

Editor's note: When I recently saw the new iMac with the iSight built in, it reminded me of a project we've been working on. In a nutshell, Matthew Russell and I have been talking about using the iSight to take and classify images, such as those of a user sitting at the iMac, so it knows who's using it. (Face-sensing engines have been in the news lately.) Aside from being a cool hack, this could possibly be used in addition to your user password for authentication.

It became clear as we worked on this project that an introduction to the technology behind this operation was in order. So as we chip away at the actual tutorial for enabling the iSight, Matthew has put together this introduction to artificial intelligence. Call it weekend reading. I hope you enjoy it.

Artificial intelligence (AI) is gradually weaving itself into virtually every aspect of our daily lives. It already turns up in places as diverse as the transmission systems of cars, vacuum cleaners, computer games, hospital equipment, Google Labs, and postal equipment. The richest man in the world is rumored to have AI systems integrated into his home that even adjust room temperatures, lighting, background music, etc., depending on who walks into the room. In this installment, we'll investigate artificial neural networks, a powerful AI learning technique that can be used to accomplish some of these interesting things.

A Refresher in Biology

A conceptual representation of a synapse (from Wikipedia). If enough neurotransmitters are released into the synapse, adjoining neurons may fire an electrical impulse that continues the signal transmission.

In the blockbuster hit Terminator 2: Judgment Day, a Terminator robot divulges the inner workings of its intellect when it says, "My CPU is a neural net processor--a learning computer." Although the jargon "neural network" sounds intimidating, it's actually pretty simple. From its etymology, we already know that "neural" refers to an association with the nervous system, and a "network" is a connected group or system. Thus, neural networks are just a learning model based on the way that neurons (nerve cells) store information. For the purpose of employing neural networks to solve problems, it's helpful to quickly review the basics of how biological neurons work.

Neurons found in living organisms are highly interconnected but physically separated from one another by a tiny gap called a synapse. As an electrical impulse travels through a neuron, it eventually reaches the physical end of the cell and may cause a chemical signal to be released into the synapse. Adjoining neurons, sometimes as many as 10,000 or so, detect the chemical signal and may transform it back into an electrical impulse if a certain chemical threshold is reached. This process repeats over and over, eventually reaching such things as muscle fibers and enabling us to walk, run, swim, etc. These signals travel incredibly fast and massively in parallel. Although biologists would gasp at this gross oversimplification, it provides sufficient background to develop a neural network learning model. Let's get on with the geekery.

Artificial Neural Networks

Neural networks are usually categorized under a specific division of artificial intelligence known as machine learning. While general AI is interested in systems that can demonstrate a perceived intelligence, machine learning focuses on learning models that primarily use statistical analysis to accomplish the learning process. In the case of neural networks, we provide the network with sets of input and output values for "training" purposes, and then expect the network to perform well on previously unseen data sets. This isn't all that different from the way we'd expect our own brains to learn.

For each of the input values, the network outputs a value that is compared to the expected training output value, and an error value is computed based on the difference. Internal network parameters are adjusted by an amount proportional to the error and the network's learning rate, so that the network will perform with higher accuracy the next time it classifies the example. This process is repeated many times for each of the distinct training values. Once a network has learned a training set with sufficient accuracy, it can often classify "new" data sets surprisingly well.
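The training loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the network built later in this article: the "network" is a single adjustable weight, the function name `train_single_weight`, the training pairs, the learning rate of 0.1, and the 50 passes are all made up for the example, and the adjustment is also scaled by the input value (a common choice that the text above does not spell out).

```python
# Illustrative sketch: one adjustable parameter trained by error-driven updates.
# The names, the data, and the hyperparameters are assumptions for this example.

def train_single_weight(examples, learning_rate=0.1, epochs=50):
    """Learn a weight such that weight * x approximates the expected output."""
    weight = 0.0  # the single "internal network parameter"
    for _ in range(epochs):              # repeat many times...
        for x, expected in examples:     # ...for each training value
            output = weight * x          # what the "network" currently says
            error = expected - output    # difference from the expected output
            # Adjust by an amount proportional to the error and the
            # learning rate (and, by assumption, to the input itself).
            weight += learning_rate * error * x
    return weight

# Hypothetical training set: every expected output is twice its input.
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
weight = train_single_weight(examples)

# A "new" input the model never saw during training.
prediction = weight * 10.0
print(round(weight, 3), round(prediction, 3))
```

With these particular numbers the weight settles very close to 2.0, so the previously unseen input 10.0 maps to roughly 20.0. A much larger learning rate would overshoot instead of settling, which is why the rate is kept small.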

Given some biological inspiration and some object-oriented programming, let's model a perceptron--the basic object in a feed-forward neural network that is analogous to a neuron in a real biological model. A good starting point is to represent a simple Boolean logic function like the AND function. Recall that the binary AND function's output is activated if and only if both inputs are activated.

From our uber-simplified model of a neuron, we know that each neuron in our model has multiple input units and a single output unit, so we'll model a perceptron the same way. We'll provide it with two binary input values and a single binary output value. An additional bias value is used to stabilize the perceptron's learning and is equal to a constant value of 1.0. The inputs are each multiplied by a corresponding weight, summed up, and then passed through a step function that produces only 0 or 1 for output values, since we want a strictly binary output. For weighted sums less than 1.0, the step function outputs a 0; otherwise, it outputs a 1. Since the AND function has linearly separable output when put on a grid, it can be represented using a single perceptron.
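As a rough sketch of the perceptron just described, the class below multiplies each input by its weight, adds the weighted bias, and passes the sum through a step function with a threshold of 1.0. The class and method names are hypothetical, and the weights (1.0, 1.0, and -1.0 for the bias) are picked by hand for illustration rather than learned, so treat this as one possible arrangement, not the implementation this article arrives at.

```python
# Illustrative sketch of a two-input perceptron computing AND.
# Class name, method names, and the hand-picked weights are assumptions.

def step(weighted_sum, threshold=1.0):
    """Output 0 for sums below the threshold; otherwise output 1."""
    return 1 if weighted_sum >= threshold else 0

class Perceptron:
    BIAS = 1.0  # constant bias value, as described in the text

    def __init__(self, weights, bias_weight):
        self.weights = list(weights)    # one weight per input
        self.bias_weight = bias_weight  # weight applied to the bias

    def output(self, inputs):
        # Weighted sum of the bias and of every input.
        weighted_sum = self.BIAS * self.bias_weight
        for value, weight in zip(inputs, self.weights):
            weighted_sum += value * weight
        return step(weighted_sum)

# Hand-picked weights: only the input (1, 1) reaches the 1.0 threshold.
and_gate = Perceptron(weights=[1.0, 1.0], bias_weight=-1.0)

truth_table = {(a, b): and_gate.output((a, b)) for a in (0, 1) for b in (0, 1)}
for pair, result in truth_table.items():
    print(pair, "->", result)
```

With these weights the four possible weighted sums are -1.0, 0.0, 0.0, and 1.0, so only the case where both inputs are 1 crosses the threshold. Many other weight combinations would work just as well, which is what makes it possible for a training procedure to find one.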

Figures: a perceptron (left) and the AND function (right).
On the left: A perceptron. It accepts two input values and produces a single binary output value. Each input value is multiplied by a corresponding weighting and added together to form a weighted sum. The weighted sum is input into the step function to produce a final output.

On the right: The AND function has output that is linearly separable.
