Some Mathematics

XOR: A Simple Neural Network in Detail

There exist mathematical entities which transform values to give other values. We say that the transformed value is dependent on or has a dependency on the value being transformed. In mathematical language, we say that the dependent or transformed value is a function of the original value, and express it in mathematical notation as follows:

y = f(x)

x is the original value and y is the transformed or dependent value. For example, we can have a function f defined as y = ½ times x, or y = (½)x. So if x = 2, y will be 1, and if x = 6, y will be 3. We can also have a function y = x^{2}, which is x times x. So if x = 1, y will be 1 times 1, which is 1; if x = 2, y will be 2 times 2, which is 4; and if x = -1, y will be -1 times -1, which is 1.
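The two example functions can also be written as a short Python sketch. The function names half and squared are made up for this illustration and do not come from the rest of the site.

```python
def half(x):
    """Return y = (1/2) * x."""
    return 0.5 * x


def squared(x):
    """Return y = x^2, that is, x times x."""
    return x * x


# The same x values used in the text above.
for x in (2, 6):
    print(f"half({x}) = {half(x)}")

for x in (1, 2, -1):
    print(f"squared({x}) = {squared(x)}")
```

Each call takes an original value x and hands back the dependent value y, which is all that "y is a function of x" means.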

You saw a function in the Weights and Neurons section of the Introducing Neural Networks page defined as 1 / (1 + e^{-x}). This is based on the strange letter e, which stands for a special number. You can see a description of this special number in the next section on Irrational Numbers. You can use the scientific calculator on your computer to calculate e raised to the power of a value. For example, if x = 0, e^{-0} = e^{0} = 1, and 1 / (1 + e^{0}) = 1 / (1 + 1) = 1/2. If x = 1, e^{-1} ≈ 0.368, and 1 / (1 + e^{-1}) ≈ 1 / (1 + 0.368) = 1 / 1.368 ≈ 0.731. If x = -1, e^{-(-1)} = e^{1} ≈ 2.718, and 1 / (1 + 2.718) = 1 / 3.718 ≈ 0.269 (these values are rounded to three decimal places).
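If you prefer a few lines of code to a calculator, the same arithmetic can be sketched in Python. This is only an illustration: the function name sigmoid is chosen here for convenience, and math.exp(x) from Python's standard library computes e raised to the power x.

```python
import math


def sigmoid(x):
    """Return 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


# The three x values worked through in the text, rounded to three decimal places.
for x in (0, 1, -1):
    print(f"x = {x}: 1 / (1 + e^(-x)) = {sigmoid(x):.3f}")
```

Rounded to three decimal places, the results agree with the hand calculation: 0.500, 0.731 and 0.269.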

We can visualize a function by plotting the x values along a horizontal axis, which we call the x-axis, and the corresponding y values along a vertical axis, which we call the y-axis. As x varies continuously along the x-axis, y varies correspondingly along the y-axis. You can play with the functions y = (½)x, y = x^{2}, and y = 1 / (1 + e^{-x}) in the Interaction below. Just press a function button to see the corresponding function. Press the "line" button to play with y = (½)x, the "squared" button to play with y = x^{2}, and the "sigmoid" button to play with y = 1 / (1 + e^{-x}) (this last function is called the "sigmoid" function). The sigmoid is selected by default. The horizontal line at y = 1 on the sigmoid shows the limiting value of the function as x becomes large. Slide the slider left and right to see how the y values change with the x values.
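As a rough stand-in for sliding the slider, the following Python sketch prints a small table of y values for each of the three functions. The function names and the chosen x values are assumptions made for this example only.

```python
import math


def line(x):
    """y = (1/2) * x"""
    return 0.5 * x


def squared(x):
    """y = x^2"""
    return x * x


def sigmoid(x):
    """y = 1 / (1 + e^(-x))"""
    return 1.0 / (1.0 + math.exp(-x))


# A handful of x values, from left to right along the x-axis.
xs = [-6, -4, -2, 0, 2, 4, 6]
rows = [(x, line(x), squared(x), sigmoid(x)) for x in xs]

print(f"{'x':>4} {'line':>8} {'squared':>8} {'sigmoid':>8}")
for x, y_line, y_squared, y_sigmoid in rows:
    print(f"{x:>4} {y_line:>8.3f} {y_squared:>8.3f} {y_sigmoid:>8.3f}")
```

Reading down the last column, the sigmoid values climb toward 1 as x grows but stay below it, which is what the horizontal line at y = 1 marks.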

In mathematics, there exists a category of mysterious numbers which cannot be expressed as decimal numbers which end at a certain point. In fact, their digits go on forever and do not even have a repeating pattern. The positive number which, when multiplied by itself, gives the number 2 is one such number. It is called the “square root” of 2, and written as √2. Below, we try writing down the square root of 2 with more and more digits:

1.41 × 1.41 = 1.9881

1.4142 × 1.4142 = 1.99996164

1.414213 × 1.414213 = 1.999998409369

1.41421356 × 1.41421356 = 1.9999999932878736

1.4142135623 × 1.4142135623 = 1.99999999979325598129

In fact, you can keep adding digits to your approximation of the square root of 2 and multiplying the approximation by itself, and you will keep getting closer to 2, but you will never get exactly 2. You could write a digit on every atom in our galaxy as an approximation to the square root of 2, multiply that number by itself, and you still would not get exactly 2. Such numbers, like the square root of 2, are called “irrational numbers”.
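The multiplications above can be reproduced exactly with Python's decimal module, which does decimal arithmetic without the rounding of ordinary floating-point numbers. This is a minimal sketch; the variable names are chosen only for this example.

```python
from decimal import Decimal, getcontext

# Allow enough significant digits so that every square below is exact.
getcontext().prec = 50

approximations = ["1.41", "1.4142", "1.414213", "1.41421356", "1.4142135623"]

squares = []
for text in approximations:
    a = Decimal(text)
    square = a * a            # the approximation multiplied by itself
    gap = Decimal(2) - square  # how far the square falls short of 2
    squares.append(square)
    print(f"{text} x {text} = {square}   (short of 2 by {gap:f})")
```

Each extra pair of digits shrinks the gap, but the gap never reaches zero for any of these approximations.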

The number denoted by the letter e, which you encounter frequently in the mathematics of neural networks, is another such irrational number. The approximate value of e is: 2.718281828459045.

A special property of the number e, and e alone, is that e raised to the power of a very small number is, to a very close approximation, 1 plus that number. That is,

e^{n} ≈ 1 + n for very small n. So:

e^{0.01} approximates 1.01

e^{0.002} approximates 1.002

e^{0.00048} approximates 1.00048

e^{0.0000759} approximates 1.0000759

You can try this out on your scientific calculator.
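You can also try it in Python instead of a calculator. The sketch below compares e^{n} with 1 + n for the four values above, and then, as an extra check that is not part of the examples above, tries the bases 2 and 10 to see that they do not share the property. The helper name gap is made up for this illustration.

```python
import math


def gap(base, n):
    """Return how far base^n is from 1 + n."""
    return base ** n - (1 + n)


# The four small values of n used in the text.
for n in (0.01, 0.002, 0.00048, 0.0000759):
    print(f"n = {n}: e^n = {math.exp(n):.10f}, 1 + n = {1 + n:.10f}, "
          f"difference = {gap(math.e, n):.2e}")

# Other bases are noticeably further from 1 + n for the same small n.
n = 0.01
for base in (2, math.e, 10):
    print(f"base {base:.3f}: base^n = {base ** n:.6f}, 1 + n = {1 + n:.6f}")
```

For base e the difference is tiny compared with n itself, while for bases 2 and 10 the result visibly misses 1 + n.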

Copyright © 2022 by Sandeep Jain. All rights reserved.