In fancy words: it is a system that can be trained to approximate any function.
In common words: it is a system that "learns" how an input relates to an output, so that after you train it enough times it readjusts itself until it "looks" like it really understands how the input gets transformed into the output.
But what really is a network? What does it have in its guts?
A neural network is formed of neurons. Yes, good naming conventions make things easier. What is a neuron? Simply put, a neuron receives some numbers and returns another one. It works like this:
- A neuron will always receive the same amount of numbers; that amount depends on the rest of the neurons of the network.
I will explain later how they transform these numbers into an output.
Neurons are organized in layers. Typically, neural networks have one input layer, one output layer, and one or more layers in between that are called hidden layers (two in this example).
In the picture, the yellow nodes represent the neurons of the input layer, the green ones form the hidden layers, and the blue one is the output layer. Now we have a structure with lots of neurons, each of which transforms a vector into a single number. By combining several neurons, you can transform an input vector into an output vector.
The way neurons transform an input can be seen in the following picture:
For each input (remember, the amount of them is fixed) a neuron has a weight. You multiply each input by its weight, and then sum it all together with the threshold. You can see the threshold as a weight that doesn't depend on any input. Once you have the sum, you pass the result to the activation function. The output of the neuron is the output of the activation function.
I didn't write the j index, because that refers to each neuron, and so far we are only interested in one.
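As a concrete numeric sketch of that recipe (all the values here are made up purely for illustration), this is the part of a neuron that happens before the activation function is applied:

```csharp
using System;

static class NeuronMath
{
    // Weighted sum: multiply each input by its weight, sum the
    // products, and add the threshold on top.
    public static double WeightedSum(double[] inputs, double[] weights, double threshold)
    {
        double sum = threshold;
        for (int i = 0; i < inputs.Length; i++)
            sum += inputs[i] * weights[i];
        return sum;
    }

    static void Main()
    {
        // Made-up values, for illustration only.
        double sum = WeightedSum(new[] { 1.0, 2.0 }, new[] { 0.5, -1.0 }, 0.1);
        Console.WriteLine(sum); // 0.5*1 + (-1)*2 + 0.1 = -1.4
    }
}
```

That sum is what gets handed to the activation function next.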
Activation functions are normally differentiable functions that have an upper and a lower bound. The most common one for neural networks is the sigmoid function, although the hyperbolic tangent is also used.
This function simulates the way real neurons work: a neuron receives signals from other neurons and, if the combined signal is strong enough, it produces an output. Looking at the shape of the sigmoid function, this is, in a simplified way, what our model of a neuron does.
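As a sketch, both activation functions are one-liners in C# (the hyperbolic tangent is even provided by .NET directly):

```csharp
using System;

static class Activations
{
    // Sigmoid: bounded between 0 and 1, with Sigmoid(0) = 0.5.
    public static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    // Hyperbolic tangent: bounded between -1 and 1.
    public static double Tanh(double x) => Math.Tanh(x);

    static void Main()
    {
        Console.WriteLine(Sigmoid(0.0)); // 0.5
        Console.WriteLine(Tanh(0.0));    // 0
    }
}
```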
Now that we know how neurons work, we can actually understand how the network generates an output: for each layer, you pass the output of the previous layer (the array of the outputs of its neurons) to each neuron. The input layer is a little different: it doesn't actually transform the data; you pass your input directly to the first hidden layer.
But I want to do things differently, just to have some fun. Instead of using the mathematical abstraction of matrices, I will express the same ideas in terms of LINQ operators. I will need some non-standard ones, but thanks to extension methods that will not be a problem.
There are already other implementations of neural networks in C# and other languages. The reason for doing this is that I think that if you take advantage of the lazy evaluation of LINQ queries, together with their superior expressiveness, you get a network that is fast enough, and much clearer and simpler to understand. Here is the code for a neuron and an activation function:
public sealed class Neuron
{
    public double[] Weights { get; set; }
    public double Threshold { get; set; }
    public readonly IActivationFunction ActivationFunction;

    public Neuron(IActivationFunction activationFunction)
    {
        ActivationFunction = activationFunction;
    }

    // Weighted sum of the signal, plus the threshold.
    public double Input(double[] signal)
    {
        return signal.Dot(Weights) + Threshold;
    }

    // The neuron's output: the activation function applied to Input.
    public double Output(double[] signal)
    {
        return ActivationFunction.Process(Input(signal));
    }
}
public interface IActivationFunction
{
    double Process(double x);
}
The reason for the activation function being an interface and not a Func delegate is that later we will need to extend it, adding a method to get its derivative. Also, the weights and threshold are public because they are supposed to change during training. Note that the output generation process is split into two steps, because we will need them separately later.
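Dot, used in Input above, is one of those non-standard operators I mentioned earlier. It is not part of the listing, but a possible sketch of it as an extension method would be:

```csharp
using System.Linq;

public static class EnumerableExtensions
{
    // Dot product of two equally sized vectors: pair the elements
    // with Zip, multiply each pair, and sum the products.
    public static double Dot(this double[] a, double[] b)
    {
        return a.Zip(b, (x, y) => x * y).Sum();
    }
}
```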
So far this is easy. In fact it won't get any harder. That's the benefit of using LINQ instead of matrices.
Now the code for the layer and the network:
public class Layer
{
    public readonly List<Neuron> Neurons = new List<Neuron>();

    // The layer's output: each neuron processes the same incoming signal.
    public double[] Output(double[] signal)
    {
        return Neurons.Select(n => n.Output(signal)).ToArray();
    }
}

public class Network
{
    public readonly List<Layer> Layers = new List<Layer>();

    // Fold the input through the layers: each layer's output
    // becomes the next layer's input.
    public double[] Output(double[] input)
    {
        return Layers.Aggregate(input, (i, l) => l.Output(i)).ToArray();
    }
}
And, excluding training, this finishes the network implementation! Probably the most confusing point is the use of the Aggregate extension method. What it does is basically calculate the output of a layer, pass it to the next one, and so on, until you run out of layers.
The problem with this implementation is that it is a little tedious to use, because you need to add each neuron and layer manually. However, I think the point is proved: LINQ allows us to create a minimal and beautiful implementation of a neural network, good for understanding it before digging into other matters such as efficiency.
I have left out the training on purpose, because it is much harder to understand and it makes the implementation much more complicated. With this network, however, given that you get the weights, thresholds and activation function from somewhere else, you have a working neural network.
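For example, a toy two-layer network could be wired up by hand like this. This is only a sketch: it assumes a Sigmoid class implementing IActivationFunction and the Dot extension method, neither of which is part of the listing above, and all the weights are made up.

```csharp
var f = new Sigmoid(); // assumed IActivationFunction returning 1/(1 + e^-x)

// Hidden layer: two neurons, each expecting two inputs.
var hidden = new Layer();
hidden.Neurons.Add(new Neuron(f) { Weights = new[] { 0.5, -0.5 }, Threshold = 0.0 });
hidden.Neurons.Add(new Neuron(f) { Weights = new[] { 1.0, 1.0 }, Threshold = -1.0 });

// Output layer: one neuron with two inputs, one per hidden neuron.
var output = new Layer();
output.Neurons.Add(new Neuron(f) { Weights = new[] { 1.0, 1.0 }, Threshold = 0.0 });

var network = new Network();
network.Layers.Add(hidden);
network.Layers.Add(output);

// A single number strictly between 0 and 1, since the sigmoid bounds it.
double[] result = network.Output(new[] { 1.0, 0.0 });
```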

